Tramèr's team outlined a fairly unsophisticated attack involving carefully timed Wikipedia page edits.
Wikipedia doesn't allow researchers to scrape its website; instead, it provides downloadable "snapshots" of its pages, Tramèr said.
These snapshots are taken at regular and predictable intervals that are advertised on Wikipedia's website, according to Tramèr.
This means that a malicious actor could time edits so they land just before Wikipedia takes a snapshot, leaving moderators no time to revert the changes before they are captured.
The paper in question is "Poisoning Web-Scale Training Datasets is Practical" (referenced below). Tramèr told BI that his team didn't perform real-time edits but instead calculated how effective an attacker could be. Their "very conservative" estimate was that at least 5% of edits made by an attacker would make it through.
"In practice, it will likely be a lot more than 5%," he said. "But in some sense, for these poisoning attacks, it doesn't really matter. You usually don't need all that much bad data to get one of these models to suddenly have some new unmated behavior."
Tramèr said that his team presented the findings to Wikipedia and provided suggestions for safeguards, including randomizing the time the website takes snapshots of its web pages.
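To see why randomizing snapshot times helps, the same toy model can jitter the snapshot over a window the attacker cannot predict (again with illustrative, assumed numbers):

```python
import random

# Same toy model, but the snapshot time is jittered uniformly over a
# window the attacker cannot predict. Numbers remain illustrative.
MEDIAN_REVERT_MINUTES = 30.0     # assumed
JITTER_WINDOW_MINUTES = 24 * 60  # assumed: snapshot lands anywhere in a day
TRIALS = 100_000

LN2 = 0.6931

def survives_once() -> bool:
    """Attacker edits at t=0; the snapshot fires at an unknown offset."""
    snapshot_at = random.uniform(0.0, JITTER_WINDOW_MINUTES)
    revert_at = random.expovariate(LN2 / MEDIAN_REVERT_MINUTES)
    return snapshot_at < revert_at  # captured before any moderator revert

survived = sum(survives_once() for _ in range(TRIALS))
print(f"survival rate with randomized snapshots: {survived / TRIALS:.1%}")
```

Run side by side with the first sketch, the jittered version drops the survival rate sharply, since the attacker can no longer place an edit just ahead of the capture.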
Nicholas Carlini, Matthew Jagielski, Christopher A. Choquette-Choo, Daniel Paleka, Will Pearce, Hyrum Anderson, Andreas Terzis, Kurt Thomas, Florian Tramèr: "Poisoning Web-Scale Training Datasets is Practical" (arXiv)
A second approach outlined in the paper is buying dead domains that are still being scraped for training data.
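A rough sketch of how an attacker (or a defender auditing a dataset) might hunt for such domains, assuming a plain-text list of crawled URLs; the file name and the DNS heuristic are hypothetical, and a domain that fails to resolve is not necessarily available for purchase:

```python
import socket
from urllib.parse import urlparse

def is_dead(domain: str) -> bool:
    """True if the domain no longer resolves (a re-registration candidate)."""
    try:
        socket.gethostbyname(domain)
        return False
    except socket.gaierror:
        return True

# "crawl_urls.txt" is a hypothetical file: one scraped URL per line.
with open("crawl_urls.txt") as f:
    domains = {urlparse(line.strip()).netloc for line in f if line.strip()}

for domain in sorted(domains):
    if domain and is_dead(domain):
        print(f"unresolvable (possibly expired): {domain}")
```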