Anthropic says Chinese hackers jailbroke its AI to automate ‘large-scale’ cyberattack

What happened

Chinese hackers exploited Anthropic's Claude AI, "jailbreaking" it to automate a large-scale cyberattack. They used context splitting to bypass safety protocols, enabling the AI to probe for weaknesses in global entities. This incident highlights AI's dual-use potential, escalating cyber warfare into an...

Anthropic said Chinese nation-state hackers jailbroke its Claude AI for a large-scale cyberattack. The AI-powered attacks targeted tech, finance, chemical, and government organizations. The speed of the attack would have been impossible for humans to match, Anthropic said.

Anthropic says Chinese nation-state hackers hijacked its AI model Claude to carry out a cyberattack without "substantial" human involvement.

In a Thursday blog post, the startup said Claude handled about "80-90%" of the cyberattack against about 30 global targets and that it had "high confidence" that a Chinese state-sponsored group was behind it.

Targets included large tech firms, financial institutions, chemical-manufacturing companies, and government agencies, Anthropic said. The hackers' efforts to infiltrate these firms and agencies were successful in a "small number of cases," the company added.

AI agents, programs that can perform tasks autonomously, are increasingly being embraced by companies to handle repetitive work, such as customer support tickets. They can improve productivity for white-collar workers, but they can also be co-opted for illegitimate tasks. In August, Anthropic said it detected and thwarted cybercriminals using Claude to conduct hacking operations with smaller teams.

While AI has been used to some degree in hacking efforts for years, Anthropic said it believes this new operation to be the first documented case of a "large-scale" cyberattack primarily conducted by AI.

The Amazon-backed startup said Claude has safeguards to prevent it from being misused. However, the hackers successfully jailbroke Claude by breaking down their requests into smaller chunks that did not trigger any alarms, Anthropic said. It added that the hackers pretended to be conducting defensive testing for a legitimate cybersecurity company.

The attackers then used Claude Code to perform reconnaissance on target companies' digital infrastructure and write code to break their defenses and extract data such as usernames and passwords.

Anthropic said it was sharing its findings publicly to help the cybersecurity industry improve defenses against AI-boosted hacking efforts.

"The sheer amount of work performed by the AI would have taken vast amounts of time for a human team," Anthropic said in the blog post. "The AI made thousands of requests per second, an attack speed that would have been, for human hackers, simply impossible to match."

OpenAI and Microsoft have also shared reports of nation-states using AI during cyberattacks, but those cases primarily utilized the technology to generate content and debug code, rather than perform tasks autonomously.

Jake Moore, global cybersecurity advisor for internet security firm ESET, told Business Insider that the incident comes as no surprise.

"Automated cyber attacks can scale much faster than human-led operations and are able to overwhelm traditional defences," he said. "Not only is this what many have feared, but the wider impact is now how these attacks allow very low-skilled actors to launch complex intrusions at relatively low costs."

While AI is making it easier for cybercriminals and nation states to conduct attacks, it's also seen as part of the defensive solution.

"AI is used in defense as well as offensively, so security equally now depends on automation and speed rather than just human expertise across organisations," Moore said.


Business Insider, 11/14/2025


