AIM Media House

Can Anthropic's AI Exploit Software Vulnerabilities?
"If bad guys get access to a tool like this, they don't need to be the most efficient or go after the biggest target."

Anthropic's decision not to release Claude Mythos Preview publicly has drawn significant attention for what the model can do.

John Carlin, Cybersecurity and Data Protection Practice Group Chair at Paul, Weiss, Rifkind, Wharton & Garrison, is more focused on what it reveals about the organisations that would face it.

Speaking with the Cybersecurity Law Report on April 22, 2026, Carlin warned that Mythos exposes a problem most enterprises have been deferring: the extent of legacy IT infrastructure and software that is no longer patchable.

In a world where a single skilled attacker needed to identify and exploit individual vulnerabilities, that deferred maintenance carried manageable risk. Mythos changes that calculus.

"If bad guys get access to a tool like this, they don't need to be the most efficient or go after the biggest target," Carlin said. "Likely they can just, at scale, target thousands and thousands of points at once."

Carlin's warning is not primarily about Mythos itself. It is about what Mythos makes visible. Enterprises that have accumulated years of unpatched systems, end-of-life software, and deferred security investment now face a threat environment where that exposure can be systematically identified and exploited at a speed and scale that outpaces any manual response.

His recommendation is equally direct: companies should engage in urgent, high-level discussions about disclosure and governance response. The implication is that this is no longer a conversation that can remain within IT and security teams.

Boards and senior leadership need to understand what their attack surface actually looks like and what governance mechanisms exist to address it before a tool like Mythos forces the question.

What Mythos Actually Does

Carlin's concern is well warranted. Anthropic's Frontier Red Team documented that Mythos Preview autonomously identified and exploited a 17-year-old remote code execution vulnerability in FreeBSD, found thousands of zero-day vulnerabilities across every major operating system and every major web browser, and produced working exploits from known vulnerabilities at a rate that far exceeded its predecessor model.

Engineers at Anthropic with no formal security training set Mythos to hunt for remote code execution vulnerabilities overnight and woke to complete, working exploits.

Anthropic restricted access to Mythos Preview under Project Glasswing, a limited initiative involving partners including AWS, Apple, Google, Microsoft, JPMorgan Chase, Cisco, CrowdStrike, and NVIDIA.

The company is committing up to $100 million in usage credits and $4 million in donations to open-source security organisations as part of the initiative. It is also investigating a report of unauthorized access to Mythos through a third-party vendor environment.

The UK AI Security Institute independently confirmed that Mythos Preview is the first AI model to complete its simulation of a full network takeover.

Key Takeaways

  • Anthropic's AI tool, Claude Mythos, can identify software vulnerabilities at scale.
  • The model's capabilities raise concerns about cybersecurity if misused by malicious actors.
  • Anthropic opted not to release Claude Mythos publicly, citing potential risks.
  • Experts warn that even less skilled attackers could exploit vulnerabilities using this AI.
  • The development highlights the growing intersection of AI technology and cybersecurity threats.