Stephen Harrison has a new essay out in which he talks about Wikipedia and AI (mostly in regard to ChatGPT). He makes a case that AI will be used in Wikipedia to assist editors, not replace them. I think he's been spoon-fed a load of codswallop by the WMF.
Harrison writes:

For context, there have been elements of artificial intelligence and machine learning on Wikipedia since 2002. Automated bots on Wikipedia must be approved, as set forth in the bot policy, and generally must be supervised by a human. Content review is assisted by bots such as ClueBot NG, which identifies profanity and unencyclopedic punctuation like "!!!11." Another use case is machine translation, which has helped provide content for the 334 different language versions of the encyclopedia, again generally with human supervision. "At the end of the day, Wikipedians are really, really practical—that's the fundamental characteristic," said Chris Albon, director of machine learning at the Wikimedia Foundation, the nonprofit organization that supports the project. "Wikipedians have been using A.I. and M.L. from 2002 because it just saved time in ways that were useful to them."

Correct me if I'm wrong, but isn't ClueBot (and most other bots) just a bunch of regex that gets run against recent changes? Perhaps the latest version uses some form of machine learning, but I would be surprised to learn it was using machine learning in 2002. And I think we know how well human-supervised machine translation worked out, considering English Wikipedia banned it.
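For what it's worth, the kind of rule-based filtering being described here is trivially simple. A minimal sketch of regex-over-recent-changes vandalism flagging, in the spirit of the "!!!11" example from the essay — the patterns below are hypothetical illustrations, not ClueBot's actual rule set (ClueBot NG's current incarnation reportedly uses machine learning, not plain regex):

```python
import re

# Hypothetical patterns in the style of early rule-based anti-vandalism
# bots: plain regex applied to the text an edit adds, no ML involved.
VANDALISM_PATTERNS = [
    re.compile(r"!{2,}1+"),       # "!!!11"-style unencyclopedic punctuation
    re.compile(r"(.)\1{9,}"),     # a character repeated ten or more times
    re.compile(r"\b(?:poop|lol+)\b", re.IGNORECASE),  # toy profanity/slang list
]

def looks_like_vandalism(added_text: str) -> bool:
    """Flag an edit if any pattern matches the text it added."""
    return any(p.search(added_text) for p in VANDALISM_PATTERNS)

print(looks_like_vandalism("This article is great!!!111"))     # True
print(looks_like_vandalism("A sourced, neutral sentence."))    # False
```

Point being: you don't need to call this "artificial intelligence" to make it work, and in 2002 nobody would have.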
Whether Wikipedia is incorporated into A.I. via the training data or as a plug-in, it’s clear that it’s important to keep humans interested in curating information for the site. Albon told me about several proposals to leverage LLMs to help make the editing process more enjoyable. One idea proposed by the community is to allow LLMs to summarize the lengthy discussions on talk pages, the non-article spaces where editors delve into the site’s policies. Since Wikipedia is more than 20 years old, some of these walls of text are now lengthier than War and Peace. Few people have the time to review all of the discussion that has taken place since 2005 about what qualifies as a reliable source for Wikipedia, much less perennial sources. Rather than expecting new contributors to review multiyear discussions about the issue, the LLM could just summarize them at the top. “The reason that’s important is to draw in new editors, to make it so it’s not so daunting,” Albon said.
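Mechanically, the proposal Albon describes amounts to prepending a generated summary to a long discussion. A sketch under stated assumptions — `call_llm` is a hypothetical stand-in for whatever model endpoint would actually be used, and the prompt wording is mine, not the WMF's:

```python
def call_llm(prompt: str) -> str:
    # Placeholder for a real model call; returns canned text here so the
    # sketch is self-contained and runnable.
    return "Summary: editors have long disagreed over source reliability."

def summarize_talk_page(wikitext: str, max_chars: int = 8000) -> str:
    """Prepend an LLM-generated summary to a talk-page discussion."""
    # Multiyear discussions can exceed a model's context window, so a real
    # implementation would need chunking; this sketch just truncates.
    excerpt = wikitext[:max_chars]
    summary = call_llm(
        "Summarize the key arguments and any outcome of this Wikipedia "
        "talk-page discussion in three sentences:\n\n" + excerpt
    )
    return summary + "\n\n" + wikitext
```

Note what this glosses over: the summary is only as trustworthy as the model, and nothing here checks it against what editors actually said.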
Does anyone want summaries of years-old discussions? Outside of a small handful of academics, who would be interested in a 2010 argument about the reliability of Fox News? Those old discussions are often not applicable to current circumstances: publishers change, editorial policies change, the political environment changes, and editor opinions change.
Is the WMF going to expand the use of AI/ML in Wikipedia? Yes. Will it be useful to editors or readers? I suspect it will be as useful as most WMF projects. Will someone put together a plug-in tool that takes a query and spits out a fully formed Wikipedia article? Undoubtedly, but it won't be the WMF.