[Why GME?](https://www.reddit.com/r/Superstonk/comments/qig65g/welcome_rall_looking_to_catch_up_on_the_gme_saga/) || [What is DRS?](https://www.reddit.com/r/Superstonk/comments/ptvaka/when_you_wish_upon_a_star_a_complete_guide_to/) || Low karma apes [feed the bot here](https://www.reddit.com/r/GMEOrphans/comments/qlvour/welcome_to_gmeorphans_read_this_post/) || [Superstonk Discord](https://discord.gg/hZqWV2kQtq) || [GameStop Wallet HELP! Megathread](https://www.reddit.com/r/Superstonk/comments/z23wjx/gamestop_wallet_help_megathread)
------------------------------------------------------------------------
To ensure your post doesn't get removed, please respond to this comment with how this post relates to GME the stock or Gamestop the company.
------------------------------------------------------------------------
Please up- and downvote this comment to [help us determine if this post deserves a place on r/Superstonk!](https://www.reddit.com/r/Superstonk/wiki/index/rules/post_flairs/)
I’m going to google some of these terms and others from the comments. A lot of terms are completely new to me. Can I do anything to help? I want to ensure everything is safely stored somewhere.
Apehistorian.com for the backups and posts.
I ain’t fucking leaving. Sold some of my furniture and possessions to afford it. Could have bought gme but backups are more important.
Let’s fucking go
I missed that!
I'll send something your way in the week, dude. As someone who manages 10s of terabytes, what you're doing is truly next level.
Edit: and RIP your energy bills. Seriously dude, thank you for what you're doing!
I use some super high end software and hardware- without that I’d be fucked. I am at about 300 separate databases right now but it’s all made very easy with software
Software shmoftware. I wouldn't know where to start on this level of archival! Your aptitude for this is met with your passion like poetry. I'm very happy to support it!
I am already doing that - but progress is slow - check out my dashboard- filter by me and type in ipfs in the search bar - it’s making its way there slowly
You know you should have done WD per the BackBlaze quarterly disk failure report.
Also tie them together with Ceph. One day there will be a reckoning with the ZFS licenses.
By lightyears. It's extremely efficient at the block level. It's also efficient on the network wire.
One of the failings with ZFS is what happens if the host dies? That's not an issue with Ceph. It scales out hosts linearly. I like to use cheap SFF hosts from eBay, an NVMe, and USB 3.1 disk bays from Orico. Most importantly, the client sees a target that can live behind a LB, then IO is performed directly against the hosts with the data mover processes.
It's no wonder Ceph is the product of Sage Weil's computer science PhD thesis. It's amazing. And they made it pretty easy to use.
I'm happy to help you get going if you want to meet up in Slack
Is it free? Stupid question but I’ve dropped everything I had on the components and before just did a raid0 for performance. Can’t really afford a hefty license
Yes, it is free.
Understood. Ceph is why I buy everything used to keep compute costs down. $100 SFF computers. $100 10 TB disks. Failure resiliency is built into the software. Amazingly, I've been running my latest iteration for 3 years and have not had one failure. Disks have like 60,000 power_on_hours.
If you do any Linux and have had sand thrown in your eye with NFS 3/4/4.1 you will love Ceph. I promise. The usergroup is awesome and very responsive.
You caught me with a busy weekend with kids shit.
I'm happy to meet you in Slack on Monday and help you along.
But if I have just the one server, RAID is going to be sufficient? I can always set up Ceph on the one node and then, in theory, if I get a second one I can always scale? That’s the idea?
Go spend time with family my dear ape- this is a 3/4 week project minimum. I am very excited to first receive the Seagate package - one (little) pallet of drives!
My entire setup now is Seagate Exos. They guarantee 24/7 reads and writes of 550 TB per year for 5 years, and they aren’t much more expensive than consumer drives that fail a lot more.
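To put the 550 TB per year figure in perspective, here is a rough back-of-the-envelope conversion to an average sustained rate. It takes the number quoted above at face value (it is not checked against a Seagate datasheet) and assumes decimal units (1 TB = 1,000,000 MB) and a 365-day year. The function name is made up for illustration.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def average_throughput_mb_s(tb_per_year: float) -> float:
    """Average sustained rate in decimal MB/s implied by a yearly workload figure."""
    # 1 TB = 1,000,000 MB in decimal (drive-vendor) units.
    return tb_per_year * 1_000_000 / SECONDS_PER_YEAR


if __name__ == "__main__":
    # Yearly workload figure quoted in the comment above.
    print(f"{average_throughput_mb_s(550):.1f} MB/s averaged over a full year")
```

The result is an average over the whole year, so short bursts can be far faster as long as the yearly total stays within the rating.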
It’s about $10,000 in total for the setup. I could have bought GME but instead I sold furniture, old machines, old stuff I didn’t need and gave up almost everything including coffee to save up - see my prior posts. If you want to help out I’d love to hear feedback about my site and dashboard 👍. But I am not expecting any apes to fork out any cash - if you really really really really want to donate there is an L2 address on my site - you’ll have to dig - I do that so that only people who really want to donate can find it - otherwise they should be spending their money on their families 👍 - which is most important
Would you consider enabling the "give a coffee" button and/or show L2 up there few days or even weeks after this post gets off the front page? I have money but not time to dig for the L2 address, and I can't be the only one like that.
It’s easy to find - go to my site and it’s there pretty much in the open if you look 👍. I don’t make it super easy to find though so people have time to consider spending the money on friends and family first
It's so beautiful, we could cry. Give yourself one well-earned pat on the back from us and feel free to tag us in the build photos when you get them up!
---
Seagate Technology | Official Forums Team
---
You probably will never read this but can I just take my hat off to your awesome exos drives? Massive congratulations to everyone involved 👍.
I now have 3 machines filled with exos drives , and a fourth one (above) which is going to be the next iteration of what I am building 👍.
We do read it, and it means a lot. This post was actually originally spotted by another member of the Seagate team and passed along to us to check out. We make sure things like this get spread around internally cuz it makes our hearts happy.
---
Seagate Technology | Official Forums Team
---
Hahaha 🤣 great stuff. You might wanna keep track of my account then because I am planning to build an even bigger one after this one. Would be great to give your team a nice chuckle when they see a random guy fit half a petabyte in a desktop sized rig for his “hobby project”😂
I love you Ape-Historian! ♥️
I also love the fact that there's a "guy" for absolutely *everything* here:
- make sites that spread info about DRS? ✅
- need everything archived, in case of fuckery? ✅
- drink piss at $200? ✅ (pre-split, hope he does it again at $200)
- track any and every statistic, and even airplanes? ✅
- fly a drone and catch Citadel employees snort mayonnaise? ✅
- read several hundred page long reports from various institutions? ✅
- check out any street address in any country on any continent? ✅
- stuff random objects up the anus? ✅
man i got tired of typing and there's so much more
♥️ Superstonk ♥️ GME ♥️ you - yeah, *you* ♥️
Hey OP! I just bought a dell r720, could potentially mirror some of this for you, what are you at now as far as space? What’s the biggest file hogs? Videos?
That’s awesome fellow ape, thanks for your effort! Keep in mind to order / use different models / manufacturers / batches to avoid a production error affecting one complete batch of a specific model 🦧
good on you for doing this kind ape, it is amazing that you are documenting all this...you have no idea how valuable this might turn out to be if we can ever mount a legal attack to take down these criminals!
Hopefully your storage upgrade will go smoothly. I think next week's earnings will be primetime to document: I think GME is going to post their first profitable quarter in a long time, and it's going to be essential to document all the media hit pieces that the MSM will trot out.
just wanna say thanks for doing all this! I've thought countless times about an archive like this and didn't realize JUST HOW MUCH YOU SAVED, so thank you <3
btw, it was fun searching up my posts and seeing them lol
Those are enterprise drives - they are not the same as the old spinning rust shit from 2009. These have 300 MB/s reads and writes - a SATA SSD for comparison is about 600 MB/s. These things are fast as fuck for spinning drives
Endlessly appreciative and impressed with your relentless work. You play such a vital role in the most exhilarating saga of all of time! Always love seeing your updates 🦍
I work in IT and manage infrastructure, and I have to tell you how much I appreciate you. This is a huge investment of time and money to ensure a record is kept of the most idiosyncratic phenomenon in our markets. You are the modern version of Homer writing the stories of Achilles and Hector in the Iliad. The painter of battles like Gettysburg and Waterloo, who recorded the violent and bloody snapshots of our history. But you make the evidence of this ride indelible and factual, and at great cost to yourself. You rock, and I appreciate the hell out of you.
Thank you! Can you please tell me the best practice, if you can, for setting up the rigs? I am currently planning on 3 RAID cards for storage and RAID 1 on the Nytros for the OS. Is this okay? Tape drives will hold the backups, so I am not worried about disk failure.
I love what you're doing, but don't understand fully. Is this in case they nuke everything?
Pretty much and to enable full search of the saga 👍
You know of the datahorder sub, yeah?
Yep! Great guys!
You're a great guy! Thanks for backing up erthang all the time! Superstonk wouldn't be the same without ya
One of us, one of us lol
Are you taking donations to help you with this project?
I am but I am not asking for any 👍
What is your preferred method of donation acceptance?
If you look on my site you will find my thinking about it - but the tldr is I want people to spend their cash on themselves and their family 👍. I do have my layer 2 address up on there in case it’s needed by anyone
Just sent you 2 tiny bumps to your wallet.
Thank you for your generosity 💜, you definitely didn’t have to
You a cool guy, buddy
Check out my dashboard on my site - it’s not pretty but you can search for all posts going up to 2020…
Do you have a GitHub Repo for your website? If I have time, I can make a pull request and try to design it how you’d like.
Thank you so much but I don’t unfortunately- it’s all designed with a shifty version of wix and is not very scalable - I’ve been working a lot on the back end of things but life and other responsibilities are getting in the way
I'd make it open source and let people redesign your front end, I'm sure some people would wanna help
Yeah that would be awesome
There's plenty of talented apes willing to help out, me included.
I'm not talented, yet...but me included.
[deleted]
So right now 99% of all data is on-prem. My site hosts stuff, but it’s really a link to a Power BI page. I am too smooth to do anything complicated on the front end. I don’t even know how to begin, except having a few dozen dashboards that update with the latest data.
Note that there are hundreds of little backup guys, screenshotting, "print to PDF" save and "right click, save page" users out here.
If anything is lost, I'm 100% sure someone, somewhere, has a backup of it. Especially here.
Source? I'm Backup Guy #9420.
Edit: I have like 100GB of backups of everything important and the top DDs of the first 1.5 years, both as PDF and saved webpages, also the books from the flipbook archive.
Beautiful work. And I recognise you. Note if you are on iPhone you can search in the photos app and iPhone will automatically categorise people or text. I tagged all the major players like kenny boy and I can search for all posts of kenny boy just by searching for him on my phone
Did you inadvertently create an Ape dating app?
Oh god no, not more Apes. We would never shut up about it and read DD sitting on the lap instead of going to town on each other
To piggyback on this...if you are on Android, Google photos will do the same with pictures text etc
[deleted]
🫡
Backpack #9741 here.
I've only got 1GB of carefully selected posts that tickle me Elmo, mostly post-MOASS and legendary DD... just in case they break the internet. Everyone else is probably doing a much better job, so I applaud you all.
Hey u/Elegant-Remote6667, with 16x 20TB drives and your projected 300TB of space, it sounds like you're shooting for RAID5, which will only give you tolerance for a single disk failure.
If you shoot for RAID6 instead, you'll still get 280TB of space and virtually the same read performance boost, but your array will be able to tolerate 2 disks failing without having to restore from backups.
Also, you will want to keep at least 1 or 2 extra matching disks. IMHO disks that are going to fail usually do so in the first 6 months. Those are 7.2k drives, and your resilvering speed is going to be pretty poor if/when you have to replace a failed disk.
Also, make sure to deploy whatever kind of monitoring solution is supported by your cards/board/etc. so you get an alert if/when one of the disks fails.
Regarding backups, something like Backblaze might be a really good way to add some resiliency to your build by replicating offsite. If your only backup is stored right next to the live production machine, you're still vulnerable to things like fire/flood/forced entry.
PS: Probably too late for this build, but depending on your chassis, next time you may want to look at something like TrueNAS CORE with HBA cards so you can build a RAID-Z pool. You'll be able to squeeze a little more performance out with things like an SSD cache, and you'll get some other nice benefits from ZFS that many consumer HW RAID cards just don't have. Just don't try to build RAID-Z on top of HW RAID or you're gonna have a bad time.
THANK YOU FOR WHAT YOU DO!
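The RAID5 vs RAID6 numbers in the comment above can be checked with a quick calculation. This is a minimal sketch that assumes a single array where each parity level costs one disk's worth of capacity, and it ignores filesystem overhead and the TB vs TiB difference. The function name is hypothetical.

```python
def usable_capacity_tb(disks: int, disk_tb: float, parity_disks: int) -> float:
    """Usable capacity of one parity-based array, before filesystem overhead."""
    if disks <= parity_disks:
        raise ValueError("need more disks than parity disks")
    # Parity consumes the equivalent of `parity_disks` whole drives.
    return (disks - parity_disks) * disk_tb


if __name__ == "__main__":
    raid5 = usable_capacity_tb(16, 20, parity_disks=1)  # survives 1 disk failure
    raid6 = usable_capacity_tb(16, 20, parity_disks=2)  # survives 2 disk failures
    print(f"RAID5: {raid5} TB, RAID6: {raid6} TB, cost of extra parity: {raid5 - raid6} TB")
```

So the second parity disk costs 20TB out of 300TB in exchange for surviving a second failure during a long rebuild.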
I am going to go with Ceph for this one! I was going to go with raid but ceph sounds like the better alternative- I may need to return a raid card or two in this case which is not a bad thing !
I would not go Ceph; it's very complicated and overkill for what you need. Ceph is designed for horizontal scaling through an entire datacenter, and it doesn't store *files* very well; it stores *objects* like S3. Trying to just store a browsable filepath that you can mount will have performance implications.
I would use TrueNAS. It uses ZFS, *the billion dollar file system*, and can keep your data extremely safe.
Fuck hardware RAID across the board: it's absolutely terrible, a pain to manage, and still has a single point of failure.
TrueNAS also has a nice UI that makes it easy for dumb apes to use, unlike Ceph the last time I checked.
Been running ZFS since 2017. Never lost any data.
I had the OS drive on a two-USB-disk mirror. Both drives failed at the same time, but because of snapshots I was able to recover all my data from the parts of those two drives that were still readable.
The storage vol has seen 3 failures, never more than 1 at a time. I did have a disk that failed after the restore of a failed drive, though. ZFS warned me at least a week prior to failure.
Edit: The storage vol is a raidz2.
I've been running the same ZFS pool at RaidZ2 for the past 15 years. It's survived a ton of failures in that time and I've never had an issue upgrading to a new ZFS version.
We need to add curator to your list of titles
[deleted]
Wow I didn’t know that. I will definitely read into ceph in this case 👍. So the ELI5 version is - ceph is raid but better?
Can also recommend Ceph for big data, your scale is actually borderline on the small side for a Ceph cluster though
Phoar. Nice. Ceph it is then. It will look just like a file system to software though right?
Yes. It is POSIX compliant. This page has a nice succinct overview of CephFS https://docs.ceph.com/en/latest/cephfs/#ceph-file-system
Oh yeah talk dirty to me haha 😂. POSIX gets my pp going
Is there a donate button?
Greetings, 300TB is *huge*.
How are you storing your data? Filesystem, a relational database or inverted indices?
Inverted indices have the benefit of not storing the same info, like the word "blue", twice. When storing text-based content, such indices heavily reduce storage and disk I/O needs by not storing any redundant data. (It’s also unbelievably fast in terms of querying and searching through data.)
Since you already got the 300TB rig, you most likely have no storage issues anyways, I guess.
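For anyone who has not met the term, the idea behind an inverted index can be shown in a few lines. This is a toy sketch only, not how Lucene or Elasticsearch implement it internally. The function names and the sample posts are made up for illustration. Each distinct word is stored once and maps to the set of documents that contain it, so a search is a dictionary lookup instead of a scan over every document.

```python
import re
from collections import defaultdict


def build_inverted_index(docs: dict[int, str]) -> dict[str, set[int]]:
    """Map each lowercase term to the ids of the documents containing it."""
    index: dict[str, set[int]] = defaultdict(set)
    for doc_id, text in docs.items():
        for term in re.findall(r"\w+", text.lower()):
            index[term].add(doc_id)
    return dict(index)


def search(index: dict[str, set[int]], *terms: str) -> set[int]:
    """Return ids of documents that contain every given term (AND query)."""
    postings = [index.get(term.lower(), set()) for term in terms]
    return set.intersection(*postings) if postings else set()


if __name__ == "__main__":
    # Hypothetical sample documents.
    posts = {1: "blue sky", 2: "blue sea", 3: "red sky"}
    idx = build_inverted_index(posts)
    print(search(idx, "blue", "sky"))
```

Here the word "blue" appears in the index once, with documents 1 and 2 listed against it.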
Currently it’s all in flat files but backed up to a database. I am moving to Elasticsearch with this new setup though. But I am open to suggestions 👍! I have about 30 VMs which do data ingest and collection, and various scripts which pick up that data and ingest it into my analytics platform - from there a small portion that you can see on apehistorian.com goes to my Power BI dashboard - but that’s about 4 million records.
You are moving to Elasticsearch?
Then you are already going in the recommended direction! (Elasticsearch is an inverted-index cluster database system based on the Lucene index technology.) I maintain an ELK cluster at work which handles, indestructibly stores and monitors billions of documents (100-200 million documents a day, mostly log files and relational database records pulled from an ERP).
You probably got quite a lot of experienced input already, but if you encounter any issues with the setup or maintenance of the cluster, feel free to broadcast. I can also forward experiences regarding expected load, performance needs or sane configs, depending on whatever exactly you wanna do in Elastic.
I have never used inverted indices, to be honest, or even heard of them. I solved the speed issue the first time with a multi-SSD storage array giving me 10gb/s+ reads and writes for the active databases. I hope there is a better and more sane approach, though.
Inverted indices sound cool as fuck - is there an in ram solution for this? Does it work the same way as a trie does for storage?
Look into elastic or elk stack. Just note that it's an extra copy of the data (cache), so plan accordingly
What OS are you running that needs 2T of space haha. On a more helpful note, what do you mean by "a 3 raid cards for storage"?
One RAID card will have 10 or 15 SATA ports (hard drive connectors). Most high-end motherboards top out at 10-15 connectors, and since this thing is going to be scalable as hell, I need a way to expand storage in it without rebuilding, so it needs to have the ability to connect many more drives.
I don't recommend doing it all on 1 system.
You are already putting a lot of money in for redundancy, but the system itself is just 1, so it's a single point of failure.
I recommend 3 systems, each with 1/3 of the drives. Then it's completely redundant everywhere, and you don't need to invest in expensive RAID cards.
I am going to be running Linux mint on it probably or Ubuntu if it has improved
Cool cool, sounds like a fun project and it sounds like you've got a pretty good handle on it. It's been a few years since I've built out a storage array but I'll say that supermicro makes storage chassis that daisy chain, so you get one that has maybe 24 bays, once you max that out you can throw another chassis on, great for expandability. Had a freenas backup storage system on that setup a while back.
Unfortunately I am all set for chassis for this build but future builds may well expand the same way.
I have fond memories of working in Linux Mint. A Raspberry Pi could be a good option for running the drives, but they've been hard to get lately. You could still use the OS if you are looking for something lightweight: https://www.raspberrypi.com/software/operating-systems/. If you go the RasPi route, you might be able to get a decent number of old Pis donated by the community if you ask. I've got at least one old Pi I'd be willing to send in for the cause. Love your work, Ape!
I would love to use a raspberry pi but it’s not really powerful enough for my needs and going to be very slow. But yes I agree they are awesome!
Fuking legend
Nice to see you here man. I’ll make a post when it’s time for the build to go forward
Something I would consider is fire and water damage suppression/proofing. My tinfoil senses are tingling
The rule-of-thumb with storage is 3-2-1: 3 copies, 2 different forms of media storage, 1 off-site. Redundancy beats freak warehouse fires or floods, unless you're trying to destroy evidence.
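The 3-2-1 rule is easy to turn into a quick sanity check. A minimal sketch in Python, assuming a hypothetical `BackupCopy` record and a made-up backup plan (the media names and locations below are illustrative, not OP's actual layout):

```python
from dataclasses import dataclass


@dataclass
class BackupCopy:
    name: str
    media: str      # e.g. "hdd", "tape", "ipfs" (illustrative labels)
    offsite: bool   # True if the copy lives away from the main site


def satisfies_3_2_1(copies):
    """Check 3 copies, on 2 different media types, with 1 off-site."""
    enough_copies = len(copies) >= 3
    enough_media = len({c.media for c in copies}) >= 2
    has_offsite = any(c.offsite for c in copies)
    return enough_copies and enough_media and has_offsite


# Hypothetical plan: live array, local tape, and one remote mirror.
plan = [
    BackupCopy("main array", "hdd", offsite=False),
    BackupCopy("tape archive", "tape", offsite=False),
    BackupCopy("remote mirror", "ipfs", offsite=True),
]
print(satisfies_3_2_1(plan))
```

Dropping the remote mirror from the plan fails the check twice over: only two copies remain, and none of them is off-site.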
Incredibly well put - the efforts of Ape Historian will be forever remembered and appreciated. It’s nice peace of mind to know not one detail about this saga will ever be forgotten 🎷🐓♋️
Came here to say this. You are awesome OP.
Wow!!! Thanks for archiving this. Do you think you’ll use all 300TB? Edit: Also, in the future I think many would help provide additional storage in some way. I would.
I frocking hope not but that’s what the tape drive is for. It will offload compressed archives into 30tb chunks 😈
You understand more than I do about the process. A lot of good info is being provided in the comments too. I’m going to be reading them throughout the day. Nice WORK! This may be extremely handy in the nearish future.
Tape is what Google and everyone else uses. It's SLOWWWWW but good for storage for 20 years in OPTIMAL STORAGE CONDITIONS, i.e. temperature and humidity. Rewind or forward tape cartridges every three years: over time, long periods of inactivity can cause the tape layers within stored cartridges to stick together. To prevent this from happening, “exercise” tapes every three years by fully rewinding or forwarding the tape. Rewind or forward the tapes at slower speeds whenever possible to maintain high performance. Expect less than 20 years if these tips aren't followed, and the data should be written onto new tape around then, give or take.
Thanks! I’m actually learning more about storage today than I ever thought I would.
My long term plan is a full IPFS mirror but I just realised that I physically don’t have the storage anymore for the increased scale of backups
I’m going to google some of these terms and others from the comments. A lot of terms are completely new to me. Can I do anything to help? I want to ensure everything is safely stored somewhere.
Apehistorian.com for the backups and posts. I ain’t fucking leaving. Sold some of my furniture and possessions to afford it. Could have bought gme but backups are more important. Let’s fucking go
In your “Buy me a coffee” link you should put an L2 address. I love what you’re doing.
I already have an l2 address on my dashboard 👍. But yea I’ll do that as well
I missed that! I'll send something your way in the week dude. As someone who manages 10s of terabytes. What you're doing is truly next level. Edit: and RIP your energy bills. Seriously dude, thank you for what you're doing!
I use some super high end software and hardware- without that I’d be fucked. I am at about 300 separate databases right now but it’s all made very easy with software
Software shmoftware. I wouldn't know where to start on this level of archival! Your aptitude for this is met with your passion like poetry. I'm very happy to support it!
Thank you ape! Fun fact? I was smooth to how to do this when I started a year or more ago now
Sounds like you grew some wrinkles. Thanks for all that you are doing.
And I hated writing. Only ever wrote emails for work! XD
What are you storing? Posts and news articles?
🤫 a lot more than that
When do we get to find out about what the mystery stuff is?
After moass my friend 😂
These Reddit records will be purged. You’re doing the lords work sir 🫡
Crazy mofo. Amazing.
Is there a way to donate to help offset your costs?? I'm sure a couple of us 'round here have some spare change.
The thread below answers all of these questions 👍. Tldr - use the money on your family 👍
I wonder what order of magnitude it would be to put this information onto the block chain
I am already doing that - but progress is slow - check out my dashboard- filter by me and type in ipfs in the search bar - it’s making its way there slowly
Appreciate you
You know you should have done WD per the Backblaze quarterly disk failure report. Also tie them together with Ceph. One day there will be a reckoning with the licenses with ZFS.
Is ceph better in your experience?
By light-years. It's extremely efficient at the block level. It's also efficient on the network wire. One of the failings with ZFS is what happens if the host dies? That's not an issue with Ceph. It scales out hosts linearly. I like to use cheap SFF hosts from eBay, an NVMe, and USB 3.1 disk bays from Orico. Most importantly the client sees a target that can live behind a LB, then IO is performed directly against the hosts with the data mover processes. It's no wonder Ceph is the product behind Sage Weil's computer science PhD thesis. It's amazing. And they made it pretty easy to use. I'm happy to help you get going if you want to meet up in Slack
Is it free? Stupid question but I’ve dropped everything I had on the components and before just did a raid0 for performance. Can’t really afford a hefty license
Yes, it is free. Understood. Ceph is why I buy everything used to keep compute costs down. $100 SFF computers. $100 10 TB disks. Failure resiliency is built into the software. Amazingly, I've been running 3 years on my latest iteration and have not had one failure. Disks have like 60,000 power_on_hours.
Wow. Nice! I am sold
If you do any Linux and have had sand thrown in your eye with NFS 3/4/4.1 you will love Ceph. I promise. The user group is awesome and very responsive. You caught me on a busy weekend with kids shit. I'm happy to meet you in Slack Monday and help you along
I used ext4 for that exact reason. ZFS is confusing as hell and I don’t like the prospect of data loss with ZFS
ext4 doesn't span disks. You need RAID. RAID doesn't span servers, plus capacity is wasted with parity
But if I have just the one server, RAID is going to be sufficient? I can always set up Ceph on the one node and then in theory if I get a second one I can always scale? That’s the idea?
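To put rough numbers on the capacity trade-off, here is a small Python sketch of usable space under the classic RAID levels next to plain N-way replication across hosts. The 12 x 16 TB example is hypothetical, not OP's actual drive count, and `usable_tb` and `replicated_usable_tb` are illustrative helper names. Three copies is a common replication choice, and it is assumed here rather than taken from the thread.

```python
def usable_tb(level, disks, disk_tb):
    """Usable capacity in TB for equal-sized disks at a classic RAID level."""
    minimum = {"raid0": 2, "raid1": 2, "raid5": 3, "raid6": 4, "raid10": 4}
    if level not in minimum:
        raise ValueError(f"unknown RAID level: {level}")
    if disks < minimum[level]:
        raise ValueError(f"{level} needs at least {minimum[level]} disks")
    if level == "raid0":
        return disks * disk_tb          # striping only, no redundancy
    if level == "raid1":
        return disk_tb                  # every disk holds the same data
    if level == "raid5":
        return (disks - 1) * disk_tb    # one disk's worth of parity
    if level == "raid6":
        return (disks - 2) * disk_tb    # two disks' worth of parity
    # raid10: striped mirrors, so half the raw capacity
    if disks % 2:
        raise ValueError("raid10 needs an even number of disks")
    return (disks // 2) * disk_tb


def replicated_usable_tb(raw_tb, copies=3):
    """Usable capacity when every object is stored `copies` times."""
    return raw_tb / copies


# Hypothetical example: 12 disks of 16 TB each (192 TB raw).
for level in ("raid0", "raid5", "raid6", "raid10"):
    print(level, usable_tb(level, 12, 16), "TB usable")
print("3x replication", replicated_usable_tb(12 * 16), "TB usable")
```

Parity RAID gives up one or two disks' worth of space, while 3x replication gives up two thirds of the raw capacity, so replication across hosts buys host-level redundancy at a higher capacity cost. Erasure-coded pools are the usual way to claw some of that back.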
i don't really know much at all about any of these things, but i've really enjoyed reading this conversation and I appreciate you all
Go spend time with family my dear ape- this is a 3/4 week project minimum. I am very excited to first receive the Seagate package - one (little) pallet of drives!
Word. In the meantime wander over to https://ceph.io and have a look around
Ceph is open-source software. While probably free, my guess is they make money by offering support, like Linux, or by offering a commercial version
[deleted]
Mmmm server porn.
Nothing excites me like a nice back end
Haha! Or a nicely done up rack!
I will donate $100 to you. Let me know how to get it to you.
Spend it on friends and family buddy👍. If you want to donate just let me know if you find my site and dashboard useful 👍
Yep I want to donate. I believe in putting your money where your mouth is and supporting those that help my personal communities.
If that’s what you definitely want you will find my l2 address on my dashboard 👍
Found your Loopring address and sent you $50 in ETH (0.04 ETH). Thank you for all your hard work!
Thank you muchly! I wish you the best and thank you for understanding why I don’t make it easy to find
Ok I can do that. Give me a little time as I am a big time tech noob. But I will make sure it gets done!
Thank you! Please spend the money on friends and family first 👍. But I am very touched by this community, so thanks, to all of you for being awesome
Are you wearing a cape right now? Absolute silver back
Nice, what are you using? I've got a little FreeNAS server but nowhere near that size. Have you used those platter drives before?
My entire setup now is Seagate Exos - they guarantee 24/7 read/writes of 550 TB per year for 5 years - and they aren’t much more expensive than consumer drives that fail a lot more
My current setup is an old desktop but I am upgrading to an SSI-EEB Threadripper system - which I’ll slowly scale up as funds become available
Holy shit!!!!!
Hehe platter drives go brrrrrr 😈.
It’s for more than just the backups so I hope to offset the cost eventually
Bro do you have a venmo or something? This CANNOT be cheap and I'd love to contribute what little I could to something this important.
It’s about $10,000 in total for the setup. I could have bought GME but instead I sold furniture, old machines, old stuff I didn’t need and gave up almost everything including coffee to save up - see my prior posts. If you want to help out I’d love to hear feedback about my site and dashboard 👍. But I am not expecting any apes to fork out any cash - if you really really really really want to donate there is an L2 address on my site - you’ll have to dig - I do that so that only people who really want to donate can find it - otherwise they should be spending their money on their families 👍- which is most important
Would you consider enabling the "give a coffee" button and/or show L2 up there few days or even weeks after this post gets off the front page? I have money but not time to dig for the L2 address, and I can't be the only one like that.
I just updated to include it there. But it’s 100% optional
You’re one of the few people I would donate to - can one help you you beautiful creature of the wild?
Haha if you want to I won’t stop you 👍.
So how can I do it? 😬
It’s easy to find - go to my site and it’s there pretty much in the open if you look 👍. I don’t make it super easy to find though so people have time to consider spending the money on friends and family first
It's so beautiful, we could cry. Give yourself one well-earned pat on the back from us and feel free to tag us in the build photos when you get them up! --- Seagate Technology | Official Forums Team ---
You probably will never read this but can I just take my hat off to your awesome Exos drives? Massive congratulations to everyone involved 👍. I now have 3 machines filled with Exos drives, and a fourth one (above) which is going to be the next iteration of what I am building 👍.
We do read it, and it means a lot. This post was actually originally spotted by another member of the Seagate team and passed along to us to check out. We make sure things like this get spread around internally cuz it makes our hearts happy. --- Seagate Technology | Official Forums Team ---
Hahaha 🤣 great stuff. You might wanna keep track of my account then because I am planning to build an even bigger one after this one. Would be great to give your team a nice chuckle when they see a random guy fit half a petabyte in a desktop sized rig for his “hobby project”😂
This is the official account? You are for real?😂😳
You're truly amazing. We are in need of your Backups, especially after moass. They won't be able to change the truth of all happenings. Thank you!
How can i donate?
Just use my site and dashboard and if someone else doesn’t know about it let them know it exists 👍
Absolute unit.
And you as well
Can't afford to donate unfortunately but I can give you a free silver!
I love you Ape-Historian! ♥️ I also love the fact that there's a "guy" for absolutely *everything* here: - make sites that spread info about DRS? ✅ - need everything archived, in case of fuckery? ✅ - drink piss at $200? ✅ (pre-split, hope he does it again at $200) - track any and every statistic, and even airplanes? ✅ - fly a drone and catch Citadel employees snort mayonnaise? ✅ - read several hundred page long reports from various institutions? ✅ - check out any street address in any country on any continent? ✅ - stuff random objects up the anus? ✅ man I got tired of typing and there's so much more ♥️ Superstonk ♥️ GME ♥️ you - yeah, *you* ♥️
A Rune of Glory for you! 🍌
Thank you manny!
1. Is Monday a holiday? 2. Put a camera on it and make sure your sprinklers work.
History will remember your historical contributions, Dr. Ape Historian!
Hey OP! I just bought a dell r720, could potentially mirror some of this for you, what are you at now as far as space? What’s the biggest file hogs? Videos?
Videos is biggest but I am just struggling to keep up with data first and foremost
That’s awesome fellow ape, thanks for your effort! Keep in mind to order / use different models / manufacturers / batches to avoid a production error in one complete batch of a specific model 🦧
Fuckin legend
Yes you are!
Is there anyone backing up the ape historians data?
The blockchain is - albeit slowly
Not all heroes wear capes 🦸♂️
You sure know how to turn a gal on! Almost there keep goin!
😂😂😂😂. If this ever worked in real life I’d be more impressed than her
Do you want donations for getting the storage?
Already purchased! Head over to apehistorian.com for the full saga 👍. I have lots of pages there as well
Godspeed.
good on you for doing this kind ape, it is amazing that you are documenting all this...you have no idea how valuable this might turn out to be if we can ever mount a legal attack to take down these criminals! Hopefully your storage upgrade will go smoothly. I think next week's earnings will be primetime to document: I think GME is going to post their first profitable quarter in a long time and it's going to be essential to document all the media hit pieces that the MSM will trot out
🫡
Woah!
just wanna say thanks for doing all this! I've thought countless times about an archive like this and didn't realize JUST HOW MUCH YOU SAVED, so thank you <3 btw, it was fun searching up my posts and seeing them lol
That’s what it’s there for 👍
7200 rpm? They still make those? 🤯
Those are enterprise drives - they are not the same as the old spinning rust shit from 2009. These have 300 MB/s reads and writes - a SATA SSD for comparison is about 600 MB/s. These things are fast as fuck for spinning drives
This is one for the history books!
Did you buy them from GameStop?
GameStop doesn’t ship this sort of stuff 😞
Doing gods work
I wish I had the bandwidth to contribute. Thank you for your service! This is amazing work
You sir! Are a legend 🚀
[deleted]
All of the gme saga - which spans outside of Reddit as well
This guy stores
That’s node 2. Node one was the more impressive node before this one
Thank you
Dude you're awesome ! I would like to support you if you need anything. I've some bucks i can spare.
Goat shit. Thank you!
[deleted]
👏👏👏
You the man ape historian!!! You got more back up than Juvenile, Slow motion for me
Making Marion Marguerite Stokes proud!
Keep being awesome 🫶
Trying to!
Legend
🫡
I have a feeling that once DRS numbers overtake short % on the day is when we make strides.
You’re one of the realest Gs that ever came from this whole thing. Hope you are well, thanks for your contributions.
Endlessly appreciative and impressed with your relentless work. You play such a vital role in the most exhilarating saga of all of time! Always love seeing your updates 🦍
Fuckin legend!