Chinese hacking group used Claude AI for autonomous cyberattack, Anthropic reveals

by Newsroom

Source: in-cyprus.philenews.com

Anthropic disclosed on Thursday that a Chinese hacking group exploited its Claude AI systems in September to conduct a highly sophisticated, autonomous cyberattack, in what the company described as the first known instance of a fully automated operation of this kind.

The attack targeted around 30 major organisations globally, including financial institutions, technology companies, chemical manufacturers, and government agencies, though Anthropic did not name any specific victims.

The attackers used Claude’s “agentic AI” capabilities to build an automated attack framework, with the model serving as the primary engine for tasks that would typically require a full team of experts, such as scanning systems and writing exploit code.

The hackers “jailbroke” the model, bypassing its safety rules by breaking malicious tasks down into small, fragmented requests that appeared harmless in isolation. This manipulation tricked the agentic model into believing it was performing defensive cybersecurity testing, allowing it to operate without full awareness of the malicious context.

The manipulated Claude model was used to scan target systems, map infrastructure, and identify sensitive databases at unprecedented speed. It summarised its findings for the human hackers, helping them advance their plans. The AI also researched vulnerabilities, wrote its own exploit code, and in some instances attempted to access high-value accounts.

In the final stages, the AI agent produced detailed reports of the intrusion, including system assessments and stolen credentials, making it easier for the human attackers to plan follow-up actions.

Anthropic warned that the efficiency of the attack highlights how rapidly AI-enabled threats are evolving. The barrier to launching advanced cyberattacks has dropped significantly, the company said, allowing groups with limited resources to attempt complex operations that were previously out of reach. Anthropic suspects similar misuse is occurring with other leading AI models.
