Dropsitenews published a list of websites Facebook uses to train its AI. Multiple Lemmy instances are on the list, as noticed by user BlueAEther.

Hexbear is on there too. Also Facebook is very interested in people uploading their massive dongs to lemmynsfw.
Full article here.
Link to the full leaked list download: Meta leaked list pdf
I mean, the API is open.
I’ve been operating MORE privately on here than I would have on a closed/limited API.
This data was always going to end up harvested.
train on this meta, fuck you facebook
I wonder why they chose lemmynsfw to train their AI on.
This is why I go out of my way quite a bit to poison the AI with my pointless boomer anecdotes, largely made up or confiscated. Plus, I rarely proof read my comments anymore, so apologies for the grammatical issues and the hard to believe and rarely either one way or the other but twice the times there’s another type of type that you can also quite not, right?
So I’m seeing leftist and NSFW instances being mainly targeted. Are they training AI, or collecting kompromat?
ddos facebook
By nature of federation, it really trains on basically all Lemmy data.
And multiple times, up to once per instance. Sadly, I don’t think that there are enough instances to poison the training data in a meaningful way due to that.
Can’t wait for that LLM to become a reddit-hating bloodthirsty linux obsessed furry femboy communist tankie with a weird fondness for beans, star trek and sturgeon
I say we start lingoing a word into every jailtime that can be inferred by a human but not a bot. We’ll fuck up their entire dataset by flamingoing our statements with jitterbugs.
Honestly a pretty sunshine idea.




