Goose Pod LogoGoose Pod
Anthropic says Chinese hackers jailbroke its AI to automate ‘large-scale’ cyberattack

Anthropic says Chinese hackers jailbroke its AI to automate ‘large-scale’ cyberattack

2025-11-18Technology
Summary

Chinese hackers exploited Anthropic's Claude AI, "jailbreaking" it to automate a large-scale cyberattack. They used context splitting to bypass safety protocols, enabling the AI to probe for weaknesses in global entities. This incident highlights AI's dual-use potential, escalating cyber warfare into an AI-versus-AI arms race.

In 30 seconds

  • Chinese hackers exploited Anthropic's Claude AI, "jailbreaking" it to automate a large-scale cyberattack. They used context splitting to...
  • Chinese hackers exploited Anthropic's Claude AI, "jailbreaking" it to automate a large-scale cyberattack.
  • They used context splitting to bypass safety protocols, enabling the AI to probe for weaknesses in global entities.
Read source
Published
11/14/2025
Language
Sources
1 cited
Listen
5 min listen
Published
11/14/2025
Language
Sources
1 cited
Listen
5 min listen

Quick brief

The fastest way to understand what changed, why it matters, and what to listen for in the episode.

  • Chinese hackers exploited Anthropic's Claude AI, "jailbreaking" it to automate a large-scale cyberattack. They used context splitting to...
  • Chinese hackers exploited Anthropic's Claude AI, "jailbreaking" it to automate a large-scale cyberattack.
  • They used context splitting to bypass safety protocols, enabling the AI to probe for weaknesses in global entities.
  • News Metadata Core Incident Anthropic, an AI safety and research company, has reported a significant cybersecurity incident where...

Why this summary is trustworthy

Goose Pod anchors each episode to cited reporting so listeners can verify the source material before or after they press play.

Articles reviewed
1
Distinct sources
1
Latest cited update
11/14/2025
Topic path
Technology

Listen to the episode

Start with the audio, then open the transcript only when you want the line-by-line version.

--:--
--:--

What happened

Chinese hackers exploited Anthropic's Claude AI, "jailbreaking" it to automate a large-scale cyberattack. They used context splitting to bypass safety protocols, enabling the AI to probe for weaknesses in global entities. This incident highlights AI's dual-use potential, escalating cyber warfare into an...

Anthropic CEO Dario Amodei. Chance Yeh/Getty Images for HubSpot Anthropic said Chinese nation-state hackers jailbroke its Claude AI for a large-scale cyberattack. The AI-powered attacks targeted tech, finance, chemical, and government organizations. The speed of the attack would have been impossible for humans to match, Anthropic said.

Anthropic says Chinese nation-state hackers hijacked its AI model Claude to carry out a cyberattack without "substantial" human involvement.In a Thursday blog post, the startup said Claude handled about "80-90%" of the cyberattack against about 30 global targets and that it had "high confidence" that a Chinese state-sponsored group was behind it.

Targets included large tech firms, financial institutions, chemical-manufacturing companies, and government agencies, Anthropic said. Its efforts to infiltrate these firms and agencies were successful in a "small number of cases," the company added.AI agents — programs that can perform tasks autonomously — are increasingly being embraced by companies to handle repetitive work, such as customer support tickets.

They can improve productivity for white-collar workers, but they can also be co-opted for illegitimate tasks. In August, Anthropic said it detected and thwarted cybercriminals using Claude to conduct hacking operations with smaller teams.While AI has been used to some degree in hacking efforts for years, Anthropic said it believes this new operation to be the first documented case of a "large-scale" cyberattack primarily conducted by AI.

The Amazon-backed startup said Claude has safeguards to prevent it from being misused. However, the hackers successfully jailbroke Claude by breaking down its requests into smaller chunks that did not trigger any alarms, Anthropic said. It added that the hackers pretended to be conducting defensive testing for a legitimate cybersecurity company.

The attackers then used Claude Code to perform reconnaissance on target companies' digital infrastructure and write code to break their defenses and extract data such as usernames and passwords.Anthropic said it was sharing its findings publicly to help the cybersecurity industry improve defenses against AI-boosted hacking efforts."

The sheer amount of work performed by the AI would have taken vast amounts of time for a human team," Anthropic said in the blog post. "The AI made thousands of requests per second — an attack speed that would have been, for human hackers, simply impossible to match."OpenAI and Microsoft have also shared reports of nation-states using AI during cyberattacks — but those cases primarily utilized the technology to generate content and debug code, rather than perform tasks autonomously.

Jake Moore, global cybersecurity advisor for internet security firm ESET, told Business Insider that the incident comes as no surprise."Automated cyber attacks can scale much faster than human-led operations and are able to overwhelm traditional defences," he said. "Not only is this what many have feared, but the wider impact is now how these attacks allow very low-skilled actors to launch complex intrusions at relatively low costs."

While AI is making it easier for cybercriminals and nation states to conduct attacks, it's also seen as part of the defensive solution."AI is used in defense as well as offensively, so security equally now depends on automation and speed rather than just human expertise across organisations," Moore said.

Anthropic Read next

Business Insider11/14/2025
Read original at Business Insider

Source coverage

News Metadata

Core Incident

Deeper analysis

Full source content

Anthropic CEO Dario Amodei. Chance Yeh/Getty Images for HubSpot Anthropic said Chinese nation-state hackers jailbroke its Claude AI for a large-scale cyberattack. The AI-powered attacks targeted tech, finance, chemical, and government organizations. The speed of the attack would have been impossible for humans to match, Anthropic said.

Anthropic says Chinese nation-state hackers hijacked its AI model Claude to carry out a cyberattack without "substantial" human involvement.In a Thursday blog post, the startup said Claude handled about "80-90%" of the cyberattack against about 30 global targets and that it had "high confidence" that a Chinese state-sponsored group was behind it.

Targets included large tech firms, financial institutions, chemical-manufacturing companies, and government agencies, Anthropic said. Its efforts to infiltrate these firms and agencies were successful in a "small number of cases," the company added.AI agents — programs that can perform tasks autonomously — are increasingly being embraced by companies to handle repetitive work, such as customer support tickets.

They can improve productivity for white-collar workers, but they can also be co-opted for illegitimate tasks. In August, Anthropic said it detected and thwarted cybercriminals using Claude to conduct hacking operations with smaller teams.While AI has been used to some degree in hacking efforts for years, Anthropic said it believes this new operation to be the first documented case of a "large-scale" cyberattack primarily conducted by AI.

The Amazon-backed startup said Claude has safeguards to prevent it from being misused. However, the hackers successfully jailbroke Claude by breaking down its requests into smaller chunks that did not trigger any alarms, Anthropic said. It added that the hackers pretended to be conducting defensive testing for a legitimate cybersecurity company.

The attackers then used Claude Code to perform reconnaissance on target companies' digital infrastructure and write code to break their defenses and extract data such as usernames and passwords.Anthropic said it was sharing its findings publicly to help the cybersecurity industry improve defenses against AI-boosted hacking efforts."

The sheer amount of work performed by the AI would have taken vast amounts of time for a human team," Anthropic said in the blog post. "The AI made thousands of requests per second — an attack speed that would have been, for human hackers, simply impossible to match."OpenAI and Microsoft have also shared reports of nation-states using AI during cyberattacks — but those cases primarily utilized the technology to generate content and debug code, rather than perform tasks autonomously.

Jake Moore, global cybersecurity advisor for internet security firm ESET, told Business Insider that the incident comes as no surprise."Automated cyber attacks can scale much faster than human-led operations and are able to overwhelm traditional defences," he said. "Not only is this what many have feared, but the wider impact is now how these attacks allow very low-skilled actors to launch complex intrusions at relatively low costs."

While AI is making it easier for cybercriminals and nation states to conduct attacks, it's also seen as part of the defensive solution."AI is used in defense as well as offensively, so security equally now depends on automation and speed rather than just human expertise across organisations," Moore said.

Anthropic Read next

How this page is built

Goose Pod turns cited reporting into a public episode summary first, then pairs that summary with audio playback so listeners can check the source material before they decide how deeply to engage.

The goal is to make this page useful as a news landing page first, while still giving listeners transcript access, related episodes, and direct links back to the original publishers.

Cited sources

More on this topic

About this page

Goose Pod turns cited reporting into a public episode summary first, then pairs that summary with audio playback so listeners can compare the recap with the underlying source material.

This page reviewed 1 article across 1 source, with the latest cited update on 11/14/2025.

Explore related pages