The Hunt for Wikipedia's Disinformation Moles
Oct 17, 2022 4:00 AM
Custodians of the crowdsourced encyclopedia are charged with protecting it from state-sponsored manipulators. A new study reveals how.
Photo-illustration: WIRED Staff; Getty Images
As social platforms such as Facebook, YouTube, and Twitter have struggled with the onslaught of fake news, disinformation, and bots, Wikipedia has transformed itself into a source of trusted information, not just for its readers but also for other tech platforms. The challenge now is to keep it that way.
Some researchers believe that Wikipedia could be an overlooked venue for information warfare, and they have been developing technologies and methods similar to the ones used on Facebook and Twitter to uncover it. A team from the UK-based Institute for Strategic Dialogue (ISD) and the Centre for the Analysis of Social Media (CASM) published a paper today exploring how to uncover disinformation on Wikipedia. They also believe that the data mapping may have uncovered a strategy that states could use to introduce disinformation. The trick, they say, is playing the long and subtle game.
“We can see what's happening on YouTube and Facebook and Twitter and Telegram, we can see how much effort states are putting into trying to control and maneuver in those spaces,” says Carl Miller, a research director at the CASM under UK public-policy think tank Demos. “There's nothing to me that suggests that Wikipedia would be immune to as much effort and time and thought as in any of those other areas.”
Governments have good reasons to influence Wikipedia: 1.8 billion unique devices are used to visit Wikimedia Foundation sites each month, and its pages are regularly among the top results for Google searches. Rising distrust in institutions and mainstream media has made sources of reliable information all the more coveted.
“Because of its transparency and auditability, Wikipedia became one of the few places where you can actually build a sense of trust in information,” says Mathieu O’Neil, an associate professor of communication at the University of Canberra in Australia who studies Wikipedia. “Governments and states that want to promote a particularly strategic perspective have every reason to try and be there and kind of try and influence it.”
Establishing government intervention, however, has proved difficult, even as some cases have raised suspicion. In 2021, the Wikimedia Foundation banned an “unrecognized group” of seven Wikipedia users from mainland China and revoked administrator access and other privileges for 12 other users over doxing and threats to Hong Kong editors. Speculation of “pro-China infiltration,” however, was never proven.
Miller can’t say whether coordinated disinformation campaigns already happen on Wikipedia or whether such attempts would be successful in avoiding the platform’s intricate disinformation rules. But, he says, new tools might shed more light on it: “We've never tried to analyze Wikipedia data in that way before.”
The research tracked 86 editors who are already banned from Wikipedia. The editors tried to sway narratives on the English-language Wikipedia page for the Russo-Ukrainian war towards pro-Kremlin views, through subtle changes like casting doubt on the objectivity of pro-Western accounts, changing historical context, and adding links from Russian state-owned news and websites.
“Wikipedia has quite a lot of defenses that it's built up to stop vandals just randomly adding bad information onto the site,” says Miller. “But when you look at the way that states can attack Wikipedia, the kind of threat looks completely different. It would be much like these editors.”
This network mapping may also identify a particular strategy used by bad actors of splitting their edit histories between a number of accounts to evade detection. The editors put in the effort to build reputation and status within the Wikipedia community, mixing legitimate page edits with the more politically sensitive ones.
“The main message that I have taken away from all of this is that the main danger is not vandalism. It's entryism,” Miller says.
If the theory is correct, however, it means that it could also take years of work for state actors to mount a disinformation campaign capable of slipping by unnoticed.
“Russian influence operations can be quite sophisticated and go on for a long time, but it's unclear to me whether the benefits would be that great,” says O’Neil.
Governments also often have more blunt tools at their disposal. Over the years, authoritarian leaders have blocked the site, taken its governing organization to court, and arrested its editors.
Wikipedia has been battling inaccuracies and false information for 21 years. One of the longest-running disinformation attempts went on for more than a decade after a group of ultra-nationalists gamed Wikipedia’s administrator rules to take over the Croatian-language community, rewriting history to rehabilitate World War II fascist leaders of the country. The platform has also been vulnerable to “reputation management” efforts aimed at embellishing powerful people’s biographies. Then there are outright hoaxes. In 2021, a Chinese Wikipedia editor was found to have spent years writing 200 articles of fabricated history of medieval Russia, complete with imaginary states, aristocrats, and battles.
To fight this, Wikipedia has developed a collection of intricate rules, governing bodies, and public discussion forums wielded by a self-organizing and self-governing body of 43 million registered users across the world.
Nadee Gunasena, chief of staff and executive communications at the Wikimedia Foundation, says the organization “welcomes deep dives into the Wikimedia model and our projects,” particularly in the area of disinformation. But she also adds that the research covers only a part of the article’s edit history.
“Wikipedia content is protected through a combination of machine learning tools and rigorous human oversight from volunteer editors,” says Gunasena. All content, including the history of every article, is public, while sourcing is vetted for neutrality and reliability.
The fact that the research focused on bad actors who were already found and rooted out may also show that Wikipedia’s system is working, adds O’Neil. But while the study did not produce a “smoking gun,” it could be invaluable to Wikipedia: “The study is really a first attempt at describing suspicious editing behavior so we can use those signals to find it elsewhere,” says Miller.
Victoria Doronina, a member of the Wikimedia Foundation’s board of trustees and a molecular biologist, says that Wikipedia has historically been targeted by coordinated attacks by “cabals” that aim to bias its content.
“While individual editors act in good faith, and a combination of different points of view allows the creation of neutral content, off-Wiki coordination of a specific group allows it to skew the narrative,” she says. If Miller and his researchers are correct in identifying state strategies for influencing Wikipedia, the next struggle on the horizon could be “Wikimedians versus state propaganda,” Doronina adds.
The analyzed behavior of the bad actors, Miller says, could be used to create models that can detect disinformation and find out just how vulnerable the platform is to the forms of systematic manipulation that have been exposed on Facebook, Twitter, YouTube, Reddit, and other major platforms.
The English-language edition of Wikipedia has 1,026 administrators monitoring over 6.5 million pages, the most articles of any edition. Tracking down bad actors has mostly relied on someone reporting suspicious behavior. But much of this behavior may not be visible without the right tools. In terms of data science, it's difficult to analyze Wikipedia data because, unlike a tweet or a Facebook post, Wikipedia has many versions of the same text.
As Miller explains it, “a human brain just simply can't identify hundreds of thousands of edits across hundreds of thousands of pages to see what the patterns are like.”