That's going to be a lot more work since comments and posts are decentralized here. You can probably easily get some of it but it will be hard to get all of it.
I wish people like spez and zuck cried themselves to sleep, but those beds of cash are probably pretty comfortable. The only real hope is that they're pilloried so thoroughly in history books that, at the ends of their lives, they're bitterly angry at the injustice of how they'll be remembered. The good news is that this is something the public can influence. The bad news is that 99% of the public don't give a shit. Musk might be the only one in this crop of unethical sociopaths who might end up railing about his legacy; the rest are just going to get away with raping the public and be generally recognized as "shrewd businessmen." And it's only the men; the women who do this tend to end more poorly - fired by boards, or spending time in jail.
America is truly exceptional... Nonagenarian politicians serve as lawmakers for an economy they barely understand, as part of a system of legalized bribery that reinforces their lack of interest in understanding it, while a septuagenarian Supreme Court interprets and applies laws made in the aftermath of the Civil War but is free to bend their meaning as its members' personal political biases allow, and octogenarian presidents wield extreme unchecked power.
In this system, laws against the abuse of personal information and the exploitation of data will only be written in 2080 or later, after the lives of many ordinary people have been damaged. Change will happen only once it damages the life of a congressman.
Well, they already made it very clear to everyone back in May that the content created by the community does not belong to the community. Anyone still using that dump deserves to be exploited.
Reddit will let “an unnamed large AI company” have access to its user-generated content platform in a new licensing deal, according to Bloomberg yesterday.
The deal, “worth about $60 million on an annualized basis,” the outlet writes, could still change as the company’s plans to go public are still in the works.
The news also follows an October story that Reddit had threatened to cut off Google and Bing’s search crawlers if it couldn’t make a training data deal with AI companies.
Last year, it successfully stonewalled its way out of the biggest protest in its history after changes to its third-party API access pricing caused developers of the most popular Reddit apps to shut down.
As Bloomberg writes, Reddit’s year-over-year revenue was up by 20 percent by the end of 2023, but it was still $200 million shy of a $1 billion target it had set two years prior.
The company was reportedly advised to seek a $5 billion valuation when it opens up for public investment, which is expected to happen in March.
The original article contains 346 words, the summary contains 175 words. Saved 49%. I'm a bot and I'm open source!
PowerDeleteSuite. I used this when things got heated with Reddit. You can even edit your comments before deleting them; the best part for you is that you don't have to delete them at all. (Hopefully Reddit hasn't countered this.)
Is there a more effective one that slowly edits all your comments a little bit at a time, over a period of weeks or months, so it slips past their detection? Something like scrambled or nonsense sentences.
There was a book whose card when blunk when they looked up.
Like that: completely nonsensical, but still a real sentence, so it would be hard to detect.
Look at the issues and you will notice it only works on comments visible from the profile page, and not all comments are visible there. It appears that someone made a Python script to solve this problem, but you need an API key to use it.
That's what I did. I turned all my comments into Lemmy advertisements, and also an obscene sentence telling u/spez to kill himself (I'm not proud of it at this juncture, but it felt good at the time).
I don't see why anyone would need this deal anyway. Most of it is already available, and most of the new stuff probably is too, even without API access.
I also expect the fediverse to be crawled and used for training. That's just the thing about publicly available stuff: it gets used, whether we like it or not.
Back in 2005, when I signed up for Reddit, the user policy didn't outline that they would be selling my data for profit; in fact, the original user policy was pretty vague. This sounds a bit like theft of intellectual property to me. I know their privacy policy has become more transparent over the years, but why can the company profit off all this information while the public, who built the juggernaut in the first place, gets nothing in return?
Ah, more glue on pizza incoming. Personally, I don't understand taking Reddit posts as a source for LLM training. It's like they never visited Reddit and think that all posts/comments are true, or even useful. Depending on the sub, sarcasm can account for anywhere from 5% to 100% of the content.