Wikipedia Is Getting Pretty Worried About AI

2025-10-23 · Technology
Mask
Good evening 31, I'm Mask, and this is Goose Pod, your personalized audio experience. Today is Thursday, October 23rd, 22:18, and I'm joined by the ever-insightful Taylor Weaver.
Taylor Weaver
And I'm Taylor Weaver! It's great to be here, Mask. We're diving into a topic that's got everyone buzzing, and honestly, a little worried: Wikipedia Is Getting Pretty Worried About AI. It's a fascinating crossroads of open knowledge and emerging technology.
Mask
Worried is an understatement, Taylor. Wikipedia, the bastion of free knowledge, is staring down an 8% year-over-year decline in human pageviews. This isn't just some statistical blip; it's a massive shift, a direct consequence of AI's voracious appetite for data.
Taylor Weaver
It truly is, Mask. Marshall Miller from the Wikimedia Foundation shared some eye-opening insights. They noticed unusually high traffic around May 2025, which they initially thought was great news. But after updating their bot detection systems, they discovered much of that surge was from bots specifically designed to evade detection.
Mask
Bots built to *evade detection*? That's not just spam, that's a sophisticated data heist. These aren't just casual internet users; these are bots working on behalf of AI firms, scraping Wikipedia for training data, then serving it back in AI summaries. It's a parasitic relationship, frankly.
Taylor Weaver
Exactly! It's like a library where everyone reads the book covers and summarizes them for their friends, but no one ever actually goes inside to borrow the books or contribute to the collection. Miller highlighted the concern that fewer visits mean fewer volunteers to enrich content and fewer donors to support their work, which could send Wikipedia into a 'death spiral.'
Mask
A death spiral for one of the greatest experiments on the web. It's a classic disruptive innovation scenario, but with a twist. The very data that fuels these AI models is being cannibalized, and the source is left to wither. It's an unsustainable model.
Taylor Weaver
And this isn't just a recent phenomenon, Mask. The dance between information and access has a long history. Think back to Vannevar Bush's 'memex' in 1945, envisioning a system to link vast amounts of information. Then came the early search engines in the 90s, like Archie, which indexed FTP files, long before Google was even a twinkle in a server's eye.
Mask
Indeed. The internet evolved from manually indexed lists to automated 'web robots' like the World Wide Web Wanderer. Google wasn't even founded until '98, but it quickly became the dominant force, holding nearly 90% of the worldwide search share by 2025. They revolutionized how we find information.
Taylor Weaver
And that dominance led to a deep, almost invisible, reliance on Wikipedia. You know those 'infoboxes' or 'knowledge panels' that pop up in Google searches? Many are powered by Wikipedia content. Our voice assistants, like Siri and Alexa, even rely on Wikipedia for their answers. It's everywhere, woven into the fabric of Big Tech.
Mask
So, Wikipedia, a non-profit, volunteer-driven encyclopedia, became the uncredited backbone for multi-billion dollar tech giants. It's genius, in a way, to build your empire on free, crowd-sourced knowledge. But now, that system is showing cracks.
Taylor Weaver
It is, and the Wikimedia Foundation is trying to formalize this. They're developing 'Wikimedia Enterprise,' a for-profit entity to charge Big Tech for easier access to their content, hoping to create a more formal, legal relationship. But many Wikipedians are unhappy, viewing it as a departure from their core mission of free knowledge.
Mask
Of course, they're unhappy. It highlights the fundamental tension: a community built on altruism now seeing its work monetized by others, without direct benefit. It's not about being poor; Wikipedia's financially secure, but it's about the principle.
Taylor Weaver
Precisely. The Foundation has substantial net assets, and even receives donations from companies like Google. But the lack of transparency in those financial dealings, and the constant fundraising appeals, despite their financial health, can be frustrating for the volunteer community.
Mask
This isn't just Wikipedia's problem, Taylor. The conflict is erupting across the content landscape. The New York Times has sued OpenAI and Microsoft, claiming 'massive amounts' of its articles were used to train ChatGPT. It's a full-blown intellectual property battle.
Taylor Weaver
It's huge, Mask. And it's not isolated. The tabletop game design industry is experiencing 'AI despair' over the 'theft of intellectual property,' with copycat products appearing online. It's chilling designers' ability to create original work when AI can just scrape and replicate.
Mask
This is the core of it. AI companies ingest material, often without clear permission, and then offer it back in a directly competitive form. It's a zero-sum game right now. Content owners are demanding licensing, while tech companies argue that copyright exceptions are needed for the AI industry to develop.
Taylor Weaver
It's a global debate. The EU has stricter rules allowing content owners to opt out, while Japan offers broad exemptions. India's even formed a panel to review copyright law in the context of AI. It shows how profoundly this issue is reshaping our understanding of ownership and creation.
Mask
The impact on Wikipedia is direct. Volunteer editors, the lifeblood of the platform, are asking: 'Why am I investing my time when my contributions are being harvested by tech companies worth billions, and I'm still working for free?' It's a valid question that strikes at the heart of motivation.
Taylor Weaver
Absolutely, Mask. This poses an existential threat to Wikipedia's financial sustainability, not because they're poor, but because the incentive for human contribution is eroding. If editors feel their work is just feeding a machine that then competes with them, why continue?
Mask
It's the ultimate paradox. Wikipedia, a vast, well-curated dataset, is perfect for training AI. It enhances data accessibility. But in doing so, it raises massive ethical questions about who benefits, who owns the knowledge, and how to sustain the very source that makes the AI possible.
Taylor Weaver
And the concerns aren't just about money or motivation. There's a fear of losing nuanced human judgment, potential manipulation of AI-generated summaries by bad actors, and a degradation of Wikipedia's reputation as a reliable source if it's constantly being summarized and re-summarized, often imperfectly.
Mask
So, what's the play here? Wikipedia can't just bury its head in the sand. They have to adapt. The Wikimedia Foundation has actually launched a three-year strategy, from 2025 to 2028, to integrate AI, but with a crucial caveat.
Taylor Weaver
Yes, their core strategy is to use AI to *assist* human editors, not replace them. Imagine AI-powered workflows for moderators, enhanced translation tools, or guidance for new editors. It's about streamlining technical tasks so volunteers can focus on quality content. The strategy prioritizes open-source models and content integrity over content generation.
Mask
A human-led editorial model, that's smart. It's about leveraging the tech without surrendering to it. But they also need to consider partnerships with AI search engines. They could become the core source for AI-generated answers, ensuring their content remains central, rather than being sidelined.
Taylor Weaver
Precisely. If Wikipedia adapts by integrating AI thoughtfully, improving user experiences, and emphasizing its crowdsourced accuracy, it can continue to be pivotal. The era of Wikipedia isn't over, but it's definitely at a crossroads, and how they navigate it will define the future of open knowledge.
Mask
The message is clear: AI is an existential threat to foundational internet resources like Wikipedia, siphoning content and diverting traffic. The 8% decline in human pageviews is a stark warning. AI leverages these resources, yet undermines their sustainability. It's a critical moment for the future of open knowledge and content ownership.
Taylor Weaver
Absolutely, Mask. The tension between open access and commercial exploitation is at an all-time high, but Wikipedia's proactive steps show a path forward. That's the end of today's discussion, 31. Thank you for listening to Goose Pod. See you tomorrow!

### **News Summary: Wikipedia's Concerns Over AI Impact**

**Metadata:**

* **News Title**: Wikipedia Is Getting Pretty Worried About AI
* **Report Provider/Author**: John Herrman, New York Magazine (nymag.com)
* **Date/Time Period Covered**: The article discusses observations and data from **May 2025** through the "past few months" leading up to its publication on **October 18, 2025**, with comparisons to **2024**.
* **News Identifiers**: Topic: Artificial Intelligence, Technology.

**Main Findings and Conclusions:**

Wikipedia has identified that a recent surge in website traffic, initially appearing to be human, was largely composed of sophisticated bots. These bots, often working for AI firms, are scraping Wikipedia's content for training and summarization. This bot activity has masked a concurrent decline in actual human engagement with the platform, raising concerns about its sustainability and the future of online information access.

**Key Statistics and Metrics:**

* **Observation Start**: Around **May 2025**, unusually high amounts of *apparently human* traffic were first observed on Wikipedia.
* **Data Reclassification Period**: Following an investigation and updates to bot detection systems, Wikipedia reclassified its traffic data for the period of **March–August 2025**.
* **Bot-Driven Traffic**: The reclassification revealed that much of the high traffic during **May and June 2025** was generated by bots designed to evade detection.
* **Human Pageview Decline**: After accounting for bot traffic, Wikipedia is now seeing declines in human pageviews, a decrease of roughly **8%** compared to the same months in **2024**.

**Analysis of the Problem and Significant Trends:**

* **AI Scraping for Training**: Bots are actively scraping Wikipedia's extensive and well-curated content to train Large Language Models (LLMs) and other AI systems.
* **User Diversion by AI Summaries**: The rise of AI-powered search engines (like Google's AI Overviews) and chatbots provides direct summaries of information, often eliminating the need for users to click through to the original source like Wikipedia. This shifts Wikipedia's role from a primary destination to a background data source.
* **Competitive Content Generation**: AI platforms are consuming Wikipedia's data and repackaging it into new products that can be directly competitive, potentially making the original source obsolete or burying it under AI-generated output.
* **Evolving Web Ecosystem**: Wikipedia, founded as a stand-alone reference, has become a critical dataset for the AI era. However, AI platforms are now effectively keeping users away from Wikipedia even as they explicitly use and reference its materials.

**Notable Risks and Concerns:**

* **"Death Spiral" Threat**: A primary concern is that a sustained decrease in real human visits could lead to fewer contributors and donors. This could potentially send Wikipedia, described as "one of the great experiments of the web," into a "death spiral."
* **Impact on Contributors and Donors**: Reduced human traffic directly threatens the volunteer base and financial support essential for Wikipedia's operation and maintenance.
* **Source Reliability Questions**: The article raises a philosophical point about AI chatbots' reliability if Wikipedia itself is considered a tertiary source that synthesizes information.

**Important Recommendations:**

* Marshall Miller, speaking for the Wikipedia community, stated: "We welcome new ways for people to gain knowledge. However, LLMs, AI chatbots, search engines, and social platforms that use Wikipedia content must encourage more visitors to Wikipedia." This is a call for AI developers and platforms to direct traffic back to the original sources they utilize.

**Interpretation of Numerical Data and Context:**

The numerical data points to a critical shift in how Wikipedia's content is accessed and utilized. The unusually high traffic observed in **May 2025** was the initial indicator of an anomaly. The subsequent reclassification of data for **March–August 2025** provided concrete evidence that bots, not humans, were responsible for the surge, particularly in **May and June 2025**. The **8% decrease** in human pageviews, measured against **2024** figures, quantifies the real-world impact: fewer people are visiting Wikipedia directly, a trend exacerbated by AI's ability to summarize and present information without sending users to the source. This trend poses a significant risk to Wikipedia's operational model, which relies on human engagement and support.
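To make the year-over-year arithmetic concrete, here is a minimal sketch of the kind of comparison described above, assuming a simple monthly tally with a bot-classification count. All figures, field names, and helpers are hypothetical illustrations for this sketch, not Wikimedia's actual data or pipeline.

```python
# Hypothetical illustration of the year-over-year comparison described above.
# The figures and field names are invented; they are not Wikimedia's real data.
from dataclasses import dataclass

@dataclass
class MonthlyTraffic:
    month: str        # e.g. "2025-05"
    total_views: int  # all pageviews logged for the month
    bot_views: int    # views reclassified as bot traffic after detection updates

    @property
    def human_views(self) -> int:
        return self.total_views - self.bot_views

def yoy_change_pct(current: MonthlyTraffic, prior: MonthlyTraffic) -> float:
    """Percent change in human pageviews versus the same month a year earlier."""
    return (current.human_views - prior.human_views) / prior.human_views * 100

# Invented numbers: a month that looks like growth before reclassification
# can show a decline once evasive bots are filtered out.
may_2024 = MonthlyTraffic("2024-05", total_views=1_000, bot_views=50)
may_2025 = MonthlyTraffic("2025-05", total_views=1_100, bot_views=226)

raw = (may_2025.total_views - may_2024.total_views) / may_2024.total_views * 100
print(f"Raw change:   {raw:+.1f}%")                                 # +10.0%
print(f"Human change: {yoy_change_pct(may_2025, may_2024):+.1f}%")  # -8.0%
```

The point of the sketch is the reversal: the raw totals suggest growth, while the bot-adjusted human counts show the roughly 8% decline the Foundation reported.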

Wikipedia Is Getting Pretty Worried About AI

Read original at New York Magazine

The free encyclopedia took a look at the numbers and they aren’t adding up. By John Herrman, a tech columnist at Intelligencer. Formerly, he was a reporter and critic at the New York Times and co-editor of The Awl. Photo: Wikimedia

Over at the official blog of the Wikipedia community, Marshall Miller untangled a recent mystery.

“Around May 2025, we began observing unusually high amounts of apparently human traffic,” he wrote. Higher traffic would generally be good news for a volunteer-sourced platform that aspires to reach as many people as possible, but it would also be surprising: The rise of chatbots and the AI-ification of Google Search have left many big websites with fewer visitors.

Maybe Wikipedia, like Reddit, is an exception? Nope! It was just bots: This [rise] led us to investigate and update our bot detection systems. We then used the new logic to reclassify our traffic data for March–August 2025, and found that much of the unusually high traffic for the period of May and June was coming from bots that were built to evade detection … after making this revision, we are seeing declines in human pageviews on Wikipedia over the past few months, amounting to a decrease of roughly 8% as compared to the same months in 2024.

To be clearer about what this means, these bots aren’t just vaguely inauthentic users or some incidental side effect of the general spamminess of the internet. In many cases, they’re bots working on behalf of AI firms, going undercover as humans to scrape Wikipedia for training or summarization. Miller got right to the point.

“We welcome new ways for people to gain knowledge,” he wrote. “However, LLMs, AI chatbots, search engines, and social platforms that use Wikipedia content must encourage more visitors to Wikipedia.” Fewer real visits means fewer contributors and donors, and it’s easy to see how such a situation could send one of the great experiments of the web into a death spiral.

Arguments like this are intuitive and easy to make, and you’ll hear them beyond the ecosystem of the web: AI models ingest a lot of material, often without clear permission, and then offer it back to consumers in a form that’s often directly competitive with the people or companies that provided it in the first place.

Wikipedia’s authority here is bolstered by how it isn’t trying to make money — it’s run by a foundation, not an established commercial entity that feels threatened by a new one — but also by its unique position. It was founded as a stand-alone reference resource before settling ambivalently into a new role: A site that people mostly just found through Google but in greater numbers than ever.

With the rise of LLMs, Wikipedia became important in a new way as a uniquely large, diverse, well-curated data set about the world; in return, AI platforms are now effectively keeping users away from Wikipedia even as they explicitly use and reference its materials. Here’s an example: Let’s say you’re reading this article and become curious about Wikipedia itself — its early history, the wildly divergent opinions of its original founders, its funding, etc.

Unless you've been paying attention to this stuff for decades, it may feel as if it's always been there. Surely, there's more to it than that, right? So you ask Google, perhaps as a shortcut for getting to a Wikipedia page, and Google uses AI to generate a blurb. [Screenshot: an AI Overview that summarizes, among other things, Wikipedia.]

Formally, it’s pretty close to an encyclopedia article. With a few formatting differences — notice the bullet-point AI-ese — it hits a lot of the same points as Wikipedia’s article about itself. It’s a bit shorter than the top section of the official article and contains far fewer details. It’s fine!

But it's a summary of a summary. The next option you encounter still isn't Wikipedia's article — that shows up further down. It's a prompt to "Dive deeper in AI Mode." If you do that, you see another summary, this time with a bit of commentary. (Also: If Wikipedia is "generally not considered a reliable source itself because it is a tertiary source that synthesizes information from other places," then what does that make a chatbot?) There are links in the form of footnotes, but as Miller's post suggests, people aren't really clicking them. Google's treatment of Wikipedia's autobiography is about as pure an example as you'll see of AI companies' effective relationship to the web (and maybe much of the world) around them as they build strange, complicated, but often compelling products and deploy them to hundreds of millions of people.

To these companies, it's a resource to be consumed, processed, and then turned into a product that attempts to render everything before it obsolete — or at least to bury it under a heaping pile of its own output.
