13 Poisoned Words on Reddit Can Make AI Search Lie

Cornell researchers showed a 13-word poisoned comment reliably steers ChatGPT and Google AI search toward scams.

"a single poisoned Reddit comment can influence generated outputs for an entire cluster"
404 Media, Jason Koebler

Cornell Tech researchers built an attack called WARP and found that a tiny snippet of user-generated text as short as 13 words long is often enough to manipulate the AI agents that power tools like ChatGPT and Google’s AI search. The reason the trick works is structural, because deep research agents cite user-generated content from sites like Reddit or Wikipedia in roughly half of all queries, and nearly a quarter of all citations come from user-generated websites. The machine that was sold to rescue you from the open web turns out to trust a rando subreddit comment with the same gravity as a federal database, and the going rate to rewrite reality is thirteen words. Read the 404 Media writeup on AI search poisoning.

Source: 404 Media, Jason Koebler · Jason Koebler

#slop #hallucination

13 Poisoned Words on Reddit Can Make AI Search Lie

More from Slop

arXiv Will Bar Scientists a Full Year for Submitting AI Sludge

Half the Videos Babysitting Your Kid Are AI Sludge