13 Poisoned Words on Reddit Can Make AI Search Lie
Cornell researchers showed a 13-word poisoned comment reliably steers ChatGPT and Google AI search toward scams.
"a single poisoned Reddit comment can influence generated outputs for an entire cluster"
404 Media, Jason Koebler
Cornell Tech researchers built an attack called WARP and found that a tiny snippet of user-generated text, as short as 13 words, is often enough to manipulate the AI agents that power tools like ChatGPT and Google's AI search. The trick works for a structural reason: deep research agents cite user-generated content from sites like Reddit or Wikipedia in roughly half of all queries, and nearly a quarter of all citations come from user-generated websites. The machine that was sold to rescue you from the open web turns out to trust a rando subreddit comment with the same gravity as a federal database, and the going rate to rewrite reality is thirteen words. Read the 404 Media writeup on AI search poisoning.
Source: 404 Media, Jason Koebler
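The structural weakness described above (agents leaning on user-generated sources for a large share of their citations) suggests a simple defender-side check: audit the provenance of the citations behind an answer and flag answers whose support comes mostly from user-generated sites. The script below is an added illustration, not code from the article or from the Cornell Tech researchers; the domain list and the 50% threshold are assumptions chosen for the example, and the citation URLs are constructed sample input.

```python
from urllib.parse import urlparse

# Assumed, illustrative list of domains that host user-generated content.
# A real deployment would maintain and review this list; these entries are
# examples, not a list taken from the article.
USER_GENERATED_DOMAINS = {
    "reddit.com",
    "wikipedia.org",
    "quora.com",
    "stackexchange.com",
    "medium.com",
}


def is_user_generated(url: str) -> bool:
    """Return True when the URL's host is (a subdomain of) a user-generated site."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in USER_GENERATED_DOMAINS)


def audit_citations(citations: list[str]) -> dict:
    """Summarise how much of an answer's support comes from user-generated sources."""
    ugc = [c for c in citations if is_user_generated(c)]
    total = len(citations)
    return {
        "total": total,
        "user_generated": len(ugc),
        # Share of citations that a single poisoned comment could have shaped.
        "share": (len(ugc) / total) if total else 0.0,
        "flagged_urls": ugc,
    }


def should_flag(audit: dict, threshold: float = 0.5) -> bool:
    """Flag an answer when user-generated sources dominate its citations.

    The 0.5 threshold is an assumption for this example; it mirrors the
    'roughly half of all queries' figure only as a talking point, not as a
    validated cut-off.
    """
    return audit["total"] > 0 and audit["share"] >= threshold


def has_independent_corroboration(citations: list[str]) -> bool:
    """Require at least one citation that is NOT user-generated."""
    return any(not is_user_generated(c) for c in citations)


# Constructed sample citations, as an agent might attach them to one answer.
SAMPLE_CITATIONS = [
    "https://www.reddit.com/r/example/comments/abc123/thread/",
    "https://en.wikipedia.org/wiki/Example",
    "https://www.ftc.gov/news-events/example-advisory",
    "https://docs.example-vendor.test/manual/",
]

if __name__ == "__main__":
    report = audit_citations(SAMPLE_CITATIONS)
    print(report)
    print("flag for review:", should_flag(report))
    print("has non-UGC corroboration:", has_independent_corroboration(SAMPLE_CITATIONS))
```

Run against the constructed sample, half of the citations resolve to user-generated hosts, so the answer is flagged for review even though one authoritative source is present. The two checks are deliberately separate: the share tells you how exposed an answer is to a single poisoned post, while the corroboration check is the minimum bar an answer should clear before its claims are surfaced without a warning.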