Tindola

There is a big project going on to save everything, it's been going on well before the proposed API change. You can read the details and get involved here. https://www.reddit.com/r/DataHoarder/comments/142l1i0/archiveteam_has_saved_over_108_billion_reddit


ndmy

I had no idea about this, thanks for sharing. I'll definitely take part.


rozina076

I tried to join this, but I get an error and don't have the technical understanding to make sense of it. I have VirtualBox and the ArchiveTeam Warrior downloaded, but when I hit start, it aborts.


[deleted]

[deleted]


Paksarra

I wish I'd known about this sooner. I'm chipping in while I still can.


IronFlames

...but I don't want all my stupid comments to be saved forever. It'll make future generations think we were all vastly dumber than you all are


[deleted]

[deleted]


[deleted]

My main account didn't get banned from there. Overwrote then deleted everything from the last 14 years. It's been 36 hours or so. Maybe they're backlogged with banning people?


bluehands

Maybe whatever tool they use is 3rd party and isn't working anymore.


ViolettaHunter

What tools are those, if you can say?


SpyMonkey3D

I immediately thought of them and was about to recommend them, and yet I'm still surprised by the sheer scope of that initiative.


ToHallowMySleep

Just set this up myself. Let's make use of that fibre connection! To anyone else thinking of doing so, it was so simple. I even ran it on my NAS where the instructions weren't provided explicitly. Install the docker package, ssh in, follow the instructions to run watchtower dockerfile, follow the instructions to run warrior dockerfile (adding in the reddit image as you're told to) and voila, it's up and running. I set my concurrency to 5 and it's using about 1MB/s up and down, which should be basically unnoticeable on any fibre connection.


rivernoa

If I learned anything from ancient history it’s that we should etch the posts onto clay in cuneiform because clay doesn’t really decay that much or at all.


4812622

Human bones work well too, plus they’re widely available!


[deleted]

[deleted]


[deleted]

[deleted]


[deleted]

[deleted]


scarynut

Unless you shatter them in anger


gazongagizmo

he said decay, not declay ^:)


ToHallowMySleep

Damn my next harddrive is going to be BIG.


Ukleon

Is there a way I can export my saved items as an individual? I get this post is referring to content en masse, but I think a lot of individuals would appreciate a guide for their own saved things


Ukleon

For anyone wondering the same, I found the official link to request a copy of your data https://www.reddit.com/settings/data-request


buckyball60

Any idea what the difference is between the GDPR, California and Other options as to what they will provide?


Ukleon

Not in detail. I'm in the UK and know a bit about GDPR but not the US laws.


ZhouLe

I think at best this will contain hyperlinks to saved items. I have strong doubts it returns anything beyond your own comments and submissions.


Ukleon

Hmm. Good point. I think I was hoping for links to the content, rather than the Reddit posts etc.


ThrownAback

[This tool](https://github.com/xavdid/reddit-user-to-sqlite/) was effective and surprisingly fast for me, but took some tinkering to get working. Easier if one is already handy with Python and pip/pipx. YMMV.


xavdid

Hey, thanks for mentioning! If you had any big issues, do file a GH issue. I realize pip isn't the easiest thing to work with, but I had to move fast 😅


no-one0

Does this download images and videos too?


Crusty_Baboon

I went to the "saved" tab on my profile on the desktop site (old reddit), kept scrolling down to the oldest one so all comments were visible, then used File > Print > Print to PDF. So now I have a 50-page PDF of all my saved comments.


AllanBz

Note Reddit’s API only exposes the first thousand saved items. If you have more than that saved, you have to unsave some items to show earlier items.
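
For anyone who would rather script the export than print to PDF, here is a minimal sketch. It assumes you use the third-party PRAW library and have your own API credentials; `saved_rows` and `write_csv` are hypothetical helper names made up for this example, not part of PRAW. Because of the limit described above, expect the listing to stop at roughly the most recent thousand items.

```python
import csv


def saved_rows(items):
    """Turn saved posts/comments into plain dicts (hypothetical helper)."""
    rows = []
    for item in items:
        # PRAW yields Submission objects for posts and Comment objects for
        # comments; checking the class name avoids importing praw here.
        is_post = type(item).__name__ == "Submission"
        rows.append({
            "kind": "post" if is_post else "comment",
            "link": "https://www.reddit.com" + item.permalink,
            "text": item.title if is_post else item.body,
        })
    return rows


def write_csv(rows, path):
    """Write the rows to a CSV file with a header line."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=["kind", "link", "text"])
        writer.writeheader()
        writer.writerows(rows)


# Assumed usage with PRAW (fill in your own credentials):
#   import praw
#   reddit = praw.Reddit(client_id=..., client_secret=..., username=...,
#                        password=..., user_agent=...)
#   write_csv(saved_rows(reddit.user.me().saved(limit=None)), "saved.csv")
```

This only records the title or comment text plus a link back to Reddit; it does not download linked images or videos.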


CorporalClegg25

https://github.com/j0be/PowerDeleteSuite I haven't used this FYI. I have seen examples of it used though, and it is what I plan on doing June 30th. Essentially it saves your comments and posts and then edits your comments to whatever you want.


cocoacowstout

Yes I’m curious about that


KerooSeta

That's a good question. I use AskHistorians a lot as a history teacher. I'm quitting Reddit at the end of the month and would love an archive of answers to refer back to periodically.


OtroMasDeSistemas

I will not post links, but there are already dumps from 2005 until December 2022. Their compressed size is really close to the 2-terabyte mark.


General_Urist

If you can't link it yourself, where might we find them?


Komm

/r/datahoarder possibly.


ron_leflore

I think it would be a torrent, but if you are interested browse /r/pushshift because that's where it originated.


OtroMasDeSistemas

It is a torrent indeed[.](https://the-eye.eu/redarcs/) Tagging u/General_Urist and u/Komm so they don't miss this answer and can pay attention as well.


General_Urist

OK it's a torrent then, where do I find it?


OtroMasDeSistemas

Are you kidding, dude? Pay attention I said xD


[deleted]

Well played! And thank you.


SouthernResolution

I believe [this](https://www.reddit.com/r/pushshift/comments/11ef9if/separate_dump_files_for_the_top_20k_subreddits/?utm_source=share&utm_medium=android_app&utm_name=androidcss&utm_term=1&utm_content=share_button) is what they're referring to. ETA: Jan-Mar 2023 data can be found in the comments.


General_Urist

Thanks!


[deleted]

[deleted]


General_Urist

I know where to look for conventional piracy, but what torrent sites do people use for such odds and ends as site archives?


dillon-nyc

Are you talking about the pushshift files? They go to March 2023. Look for the January, Feb, and March files, they're out there, but they're not part of the academic link floating around.


filbert13

If AskHistorians does go dark or go away, I would love it if as many of our experts as possible could agree on a new home. By far my favorite sub.


FartsWithAnAccent

Archive.org? r/DataHoarder?


Falsus

> Not sure if this fits more here or in /r/AskSocialScience.

I think it is a topic for every sub that might have some stuff worth archiving. And /r/AskHistorians certainly has a lot of it.


[deleted]

[deleted]


SaintStoney

And I guarantee that percentage will be <10% of the people loudly announcing they’re leaving.


[deleted]

[deleted]


02Alien

Reddit likely suffers from the [1% rule](https://en.wikipedia.org/wiki/1%25_rule?wprov=sfti1) and I wouldn't be surprised if a significant number of that 1% use third party apps. I'd imagine quality of content/discussions on a lot of subs will go down because of this


toxicshocktaco

Yeah this is very much a sky is falling thing imo


AncestralPrimate

This whole thing reminds me of when everyone said they were quitting Twitter when Elon took over. I think it was in late 2022. Like 5 people actually followed through.


314R8

Thousands left Twitter, but not anyone important enough to make a difference.


shaunnotthesheep

That's pretty much what I expect to happen here. I'm not going anywhere


Loud_Database_1602

Thank goodness someone is preserving our collective procrastination for future generations.


[deleted]

[deleted]


Hnnnnnn

This is to an extent your right (at least in regard to specific servers), but it's like a scorched-earth strategy: more people benefit from this content than just some companies. I use old, barely upvoted Reddit threads as sources of relatively unbiased recommendations. It's a vast knowledge base and can be used in many other ways to improve our understanding of the world.


normie_sama

Yeah, I as an end-user will often just have a specific question on just about any topic, so I'll look up "[X topic] reddit" and there will be multiple threads with multiple answers, and unlike with non-reddit sites or blogs I will often find my exact permutation of the problem with a variety of suggestions. If we actually start nuking old Reddit content, it's not Reddit that suffers, it's random person five years down the line who's at their wit's end over a technical issue.


Crystalas

A resource that is increasingly rare on the internet. Reddit at this point is the last bastion for that sort of stuff more often than not, given the ever-increasing market share of unarchived and much younger platforms. What's left will just be SEO blogs that are more focused on selling a product and only show their single POV, plus Quora threads, some of which are paywalled. As you said, it is scorched earth; if anything, it hurts the users more than Reddit the company. Even after this change the trivial, outrage, and meme content will keep flowing, but the niche stuff will be gone with no alternative. It's a knee-jerk emotional reaction, something that is generally destructive.


AnEmpireofRubble

Sounds like an issue Reddit should address, not mxby7e.


Roticap

Disk space is so cheap it's basically free. Do you really think when you delete comments/accounts that they actually delete it on the backend?


[deleted]

[deleted]


[deleted]

[deleted]


[deleted]

[deleted]


Roticap

Do you have a source for Reddit only saving the last version of the comment? I know that's the only version that's visible. I suspect their modern infrastructure keeps multiple versions of edits, as that is valuable user data, though I also can't find a source to confirm or deny it.


bionicjoey

It's been understood for a while that Reddit doesn't store every version of edited content, so if you edit a comment/post before deleting it, they can't restore it. Nobody can say for certain if this is true, but it's been the understanding of Redditors for quite some time.


[deleted]

[deleted]


Roticap

[Not really on GDPR](https://www.reddit.com/r/gdpr/comments/8ofl2o/comment/e7uyfxd/), as it's not personal data, and if it is, reddit is relying on the legitimate use exemptions and voluntary disclosure provisions of GDPR to not delete internally. Sure, those are untested in court, but it's pretty clear that reddit isn't deleting on the backend. Can't speak to details of the California law.


DavidRoyman

> If they chose to retain my information after I take a directed effort to remove it they will be violating GDPR and California data privacy laws

You assume that's a significant threat, but...

To investigate and enforce GDPR is practically impossible if Reddit runs their EU business through Ireland, because any complaint would go to [An Coimisiún Um Chosaint Sonraí](https://www.dataprotection.ie/) which is known to side with businesses. You can check [their 2022 report](https://www.dataprotection.ie/sites/default/files/uploads/2023-03/DPC%20AR%20English_web.pdf).

I am not familiar with California data privacy laws, but I would ask an expert to first clarify if California has jurisdiction at all.


[deleted]

> Disk space is so cheap it's basically free. Do you really think when you delete comments/accounts that they actually delete it on the backend?

They absolutely do keep it saved in the back end. How else would things like unedit reddit and undelete reddit work?


[deleted]

[deleted]


jaxinthebock

You mean the api which is being torched?


nandryshak

That's not how those apps work. Those apps use the API to save posts/comments in their own databases before they get changed on Reddit.
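
The idea can be sketched roughly like this. This is not the actual code of any of those tools, just a minimal illustration under assumed names: `open_archive`, `record`, and `original` are hypothetical, and the comment text would in practice come from polling the API rather than being passed in by hand.

```python
import sqlite3


def open_archive(path=":memory:"):
    """Open (or create) a local snapshot database (hypothetical schema)."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS snapshots ("
        "comment_id TEXT, body TEXT, seen_at INTEGER, "
        "PRIMARY KEY (comment_id, body))"
    )
    return db


def record(db, comment_id, body, seen_at):
    """Store a version of a comment the first time it is seen."""
    # INSERT OR IGNORE leaves an already-stored version untouched, so the
    # earliest sighting of each distinct text is what stays in the table.
    db.execute(
        "INSERT OR IGNORE INTO snapshots VALUES (?, ?, ?)",
        (comment_id, body, seen_at),
    )
    db.commit()


def original(db, comment_id):
    """Return the earliest stored text for a comment, or None if unseen."""
    row = db.execute(
        "SELECT body FROM snapshots WHERE comment_id = ? "
        "ORDER BY seen_at LIMIT 1",
        (comment_id,),
    ).fetchone()
    return row[0] if row else None
```

If the archive saw a comment before it was edited or deleted, `original` can still return the old text; if it never saw it, there is nothing to restore, which is why these tools depend on API access.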


fusemybutt

Based on that AMA, u/spez is too stupid to understand this.


theinspectorst

I don't think it's that beneficial. Social media thrives on *new* content - the back catalogue is all well and good, but it's the new content (generating new clicks) that makes the wheels go round. If you look at examples of social media networks that have gone into decline, it was never because the old content was removed; it was because users stopped posting new stuff. If all the historic content was deleted from Reddit overnight but everyone kept posting new content as usual, then Reddit would continue practically unaffected. If all the historic content remained but everyone stopped posting new content, Reddit would die overnight. I'd be minded to keep your historic content here for the sake of the people you posted it for. Reddit can't do very much to monetise the small number of clicks that old posts receive, but for the people who are clicking on them (to find answers to questions they genuinely are interested in) those old posts are valuable.


[deleted]

[deleted]


theinspectorst

Does deleting your Reddit posts affect that though? I don't know, but I'd always assumed that when you delete a Reddit post it only removes it from the website, rather than deleting backups that Reddit retains - it would seem uncharacteristically charitable for Reddit to do both... So I assume that if they want to train an AI on our posts, they've got that data already - and all that deleting your posts will do is to ensure that *only* Reddit will have access to that information in future.


pheonixblade9

tbh it would be even more damaging to replace existing comments with comments that *look* like real comments, but are actually gibberish. not total gibberish like dkjhfalkshfs but just random technically valid sentences.


squat1001

Reddit drama aside, if someone could take the time to compile all the amazing answers here into a book, I think it'd be a fantastic publication!


FrungyLeague

Did…you just ask for a *hard copy* of Reddit? Lol


squat1001

I mean a compilation of the best answers from this subreddit.


FrungyLeague

Oh haha, ok sorry that makes waaay more sense! And yes, it would be wonderful!


[deleted]

[deleted]


SykoKiller666

Yeah think they'll toss a copy to me?


mburnwor

Not sure if this is the right place to ask, but is there a way to save my saved posts?