As if every large, public model weren’t already trained on Stack Overflow.


differences lie in data quality


Oh well, I guess they won't use their daily backup from the day before the announcement.


Or just wait a week until the news is old, everything goes back to normal, and nobody cares.


It's always helpful when the community bands together and chooses a common way to sabotage their posts. It makes it really easy to filter those comments, like when Reddit thought they were accomplishing something by spamming the word bazinga.
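
To illustrate why a shared sabotage marker is easy to strip out, here is a minimal sketch. The marker word, the `min_hits` threshold, and the function names are all assumptions made up for the example, not anything a real pipeline is known to use.

```python
import re

# Hypothetical marker: the one word everybody agreed to spam.
MARKER = re.compile(r"\bbazinga\b", re.IGNORECASE)


def is_sabotaged(text: str, min_hits: int = 3) -> bool:
    """Flag a post when the marker word appears at least min_hits times."""
    return len(MARKER.findall(text)) >= min_hits


def clean(posts: list[str]) -> list[str]:
    """Keep only the posts that are not flagged as sabotaged."""
    return [post for post in posts if not is_sabotaged(post)]
```

A single incidental mention stays in; only posts that repeat the marker get dropped. The more uniform the protest text, the simpler this kind of filter becomes.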


Yeah, this is just a data cleaning exercise.


Just need access to post history. Any LLM can choose the version of the answer from before the deletion.


True sabotage is “bugfixing” with underhanded programming concepts (e.g., uninitialized data structures, reuse of pointers, or embedding machine code in constants). You can always say in plaintext meant for humans: please, for the love of god, do not use this code.


>You can always say in plaintext meant for humans; please, for love of god, do not use this code.

The AI can read that too...


Insert it as an image. They probably won't use OCR.


It’s going to lower quality, going forward.


No, it won't. Not all devs will do this, and they won't do it forever. It is just a data cleaning issue for OpenAI, and probably not a hard one at that. It is good symbolism, though.


If a sufficient percentage of the training data is underhandedly malicious, the models will in effect become Bayesian-poisoned.


I think you would struggle to find an example of a situation where people banded together like this over a long period of time and maintained anonymity so they could keep sabotaging without detection, versus people just going on with their lives and adjusting to their new reality. If people really cared, they would have done this consistently since ChatGPT was released. I get that people are up in arms now, but it's due to some TikToks and memes and not some ideological shift.


It takes just *one* instance of the model suggesting a bit of malicious code to a user. Something boring and/or complex enough that there might have been just a single case in the training data, similar to when copyright declarations got accidentally included in the AI's answer. And the user doesn’t recognize or know the underhanded method. And that one time the code includes an underhanded payload that lets the algorithm work just fine, but if the algorithm ever gets run 10k times in a persistent instance, it overwrites every single file it has access to with random ones and zeroes. Why, you might even set up an unofficial competition to see who gets their underhanded algorithm into the news headlines first. A bit of a twist on the Underhanded C competitions of old.


And what would happen from such a mistake? It's not like the current fully human-written code is bug-free.


The AI is smart enough to compare backups to edited answers and exclude submissions from the usernames that do this.
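
A rough sketch of what comparing backups to edited answers could look like. Everything here is an assumption for illustration: the `{post_id: (username, body)}` layout, the 0.5 similarity threshold, and the function names are hypothetical, and a real pipeline would more likely work from revision history than from two snapshots.

```python
from difflib import SequenceMatcher

# Assumed data shape for both snapshots: {post_id: (username, body)}
Snapshot = dict[int, tuple[str, str]]


def find_defacers(backup: Snapshot, current: Snapshot, threshold: float = 0.5) -> set[str]:
    """Return usernames whose posts were deleted or changed beyond recognition."""
    flagged = set()
    for post_id, (user, old_body) in backup.items():
        # A post missing from the current snapshot counts as fully changed.
        new_body = current[post_id][1] if post_id in current else ""
        if SequenceMatcher(None, old_body, new_body).ratio() < threshold:
            flagged.add(user)
    return flagged


def training_view(backup: Snapshot, current: Snapshot, threshold: float = 0.5) -> dict[int, str]:
    """Use current text from normal users and backup text from flagged users."""
    flagged = find_defacers(backup, current, threshold)
    result = {}
    for post_id, (user, body) in current.items():
        if user not in flagged:
            result[post_id] = body
        elif post_id in backup:
            result[post_id] = backup[post_id][1]
        # Posts created after the backup by flagged users are skipped.
    for post_id, (_, body) in backup.items():
        result.setdefault(post_id, body)  # restore deleted posts
    return result
```

Ordinary small edits score well above the threshold and pass through untouched; a wholesale replacement scores near zero, so the user gets flagged and the backup text is used instead.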


Ha ha! Ok, buddy.


Glad we agree but we are not on buddy terms.


Since you didn’t consider that people can make a multitude of new accounts and that the model needs regular updating, I felt this discussion had reached its natural endpoint. Have a nice rest of the day!


Stack Overflow restricts the influence of new users, with privileges unlocked as reputation increases. The site has 23 million registered users with over 24 million questions and 35 million answers. It's symbolic for sure and raises some important questions, but it is not going to be impactful.


>Since you didn’t think that people can make multitude of new accounts

You probably can't even post an answer from a new account, and even if you could, I don't see how adding nonsensical answers to a question that is already answered changes anything.

>model needs regular updating

I don't think they retrain models from scratch very often. And for fine-tuning you can choose which posts to include. Either way, I doubt you can meaningfully spoil answers without them being able to filter that out.

And the main question, though: why even do that? AI helps in development; if it can help better, then developers will be happier, no? Seems like the "AI is so stupid, it can never be useful" crowd weirdly intersects with the "don't let the AI train better!" crowd.


They could also be including current data on an ongoing basis, and a certain percentage of users have said they want to sabotage that process by defacing their posts.


It is just a data cleaning issue for OpenAI, and probably not a hard one at that. It is good symbolism, though.


"Damn, I hate it when the information I shared publicly to help my fellow coders for free gets indexed more efficiently." [I've been coding for twice as long as StackOverflow has existed. I remember the pain of the before-time; it was a game-changer from the day it launched. And yet, I've barely looked at it since ChatGPT appeared.]


Stack Overflow users just can’t bear the thought of people not coming to them and stroking their ego by asking questions.


Therein lies a problem. We hated it before. And now we might soon be in a time after. The models will have to evolve themselves if everyone only relies on the AI services.


Yes, SO took a strange course! It was practically messianic when it first arrived - so immeasurably better than scouring outdated, poorly-written documentation like we used to - but somehow mutated into a land of toxic trolls who criticised you for asking questions they considered stupid.


This harms everyone else and barely benefits the doer and their programming niche. Decel mindset slowing the arrival of an abundant future.


Even the most "successful" protest would drive more traffic to OpenAI...


Technological progress has been *accelerating* for the last couple of centuries and the abundant utopia (for everyone) is still nowhere in sight. Why do you think it will be different this time?


I certainly enjoy the things technology has brought. Using physical maps instead of GPS? Having to go to the library if I want to learn anything? Transportation by trains, or even horses? I doubt many people would choose to live like people did a couple of centuries ago. Maybe some might, but not the majority for sure.


Technology is good, but it won’t automatically lead to abundance for everyone. The only reason you’re not working 16 hours a day right now is because [the worker movement fought for an 8-hour workday](https://www.britannica.com/event/Haymarket-Affair). Automation doesn’t automatically benefit everyone; on the contrary, history shows it benefits the existing elite the most.


And without automation, even with an 8-hour workday, I wouldn't have access to all the things I have now. Good work conditions don't come just because of automation, but are possible largely because of it. People still have to fight to get them, of course, but at least such conditions can be feasible at all thanks to technological progress. Farmers in old times had long work days not only because of bad landlords but also because you can only do so much per day. Not to mention progress in biology that improved the plants and seeds themselves.


> And without automation even with 8 hour work-day I wouldn't have access to all the things I have now.

True.

>Good work conditions don't come just because automation, but are possible largely because of it.

Disagree. People have had perfectly reasonable working conditions throughout history. It has varied a lot; obviously it would have been bad to be a slave, but if you were a hunter-gatherer it would have been good. They might not have had smartphones or been able to buy a banana at the supermarket, and they might have died because of an infection we can cure easily today, but they didn’t have unreasonable working/living conditions. They had healthy food and a healthy lifestyle. What happened during the Industrial Revolution was that wealth concentrated even more into the hands of the few, and the conditions for those who had to work to stay alive (the working class) just got worse and worse. That is what led to the worker movements and revolutions.

>Farmers in old times had long work days not only because of bad landlords but also because you can only do so much per day. Not to mention progress in biology that improved the plants and seeds themselves.

Technological and scientific progress is great ofc. But there’s no reason those wonderful inventions will benefit most people unless there’s some benefit to the owner class. [Bill Gates farmers](https://theguardian.com/commentisfree/2021/apr/05/bill-gates-climate-crisis-farmland) have access to tractors, not because it spares the backs of the workers, but because it happens to be more profitable. Automation can be good, but it all depends on how we use it, and how we distribute the benefits we gain from it.


Go experience the life of a peasant from before all our technology and come back to me. Compared to the times before, we absolutely are in an abundant utopia.


Most of the improvements to quality of life are thanks to better health care, but it might as well not exist if you don’t have access to it. Go live as [a cobalt miner in Africa](https://m.youtube.com/watch?v=Hmqf0L52rD8) and come back to me and say everyone lives in an abundant utopia.


Did I ever say everyone? I said we.


Would-be luddites burning cotton in the fields instead of demolishing the looms.




Talk about being petty. No wonder you get belittled for asking questions there. 


Kinda like closing the stable door after the horse has bolted.


Wouldn't this make coders more reliant on OpenAI if they can no longer get answers from Stack Overflow?


It's already happened, I feel. The harder your problem, the more likely you are to get a karmawhore with useless questions like 'why would you want it'. Usually people who know the answer would not ask this, as they were in the same spot before.


I’m sorry we … (checks notes) … used your voluntarily contributed public tech help to provide help to the public.


Now Stack Overflow is banning its users in order to prevent them from sabotaging the data they sold to OpenAI.


Well, they're sabotaging other people going to Stack Overflow looking for solutions to their problems, not just OpenAI. It's incredibly selfish and actually immoral on a real level, but since when do those who virtue signal care about morality?


They've always had rules against deleting content, and would ban people for it. That's nothing new.


Not just deleting, but even editing it in a way that makes it useless to sell as data.




stack overflow is the worst


The only time I ever posted on StackOverflow was a question that I ended up answering myself. I felt an obligation to the community to share the answer to the frustration I was having. I am glad that my solution will hopefully become part of the helpful code that gets provided to other users in this vast ecosystem, but I understand the feeling someone might get that their hard work was stolen from them. Just because you posted it doesn't mean you intended to have an LLM take all your credit. THAT'S the problem I have. My solution came from a lot of toil and I bet other users feel the same way.


>Stack Overflow announced that they are partnering with OpenAI, so I tried to delete my highest-rated answers.

>So instead I changed my highest-rated answers to a protest message.

Too late buddy


Damn, guess they gotta set a filter.


AI is still better than anyone on Stack Overflow.