Experts and software engineers warn that Anthropic’s new AI model could usher in a new era of hacking and cybersecurity as AI systems capable of advanced reasoning identify and exploit a growing number of software vulnerabilities.
Citing the potential damage that could result from a wider public release, leading AI company Anthropic released the cutting-edge model, called Claude Mythos Preview, to a limited group of tech companies Tuesday.
The model is the latest in Anthropic’s Claude series of AI systems. Its release was previewed at the end of March, when Fortune identified a mention of it in an unsecured database on Anthropic’s website.
Anthropic’s researchers say Mythos Preview was able to detect thousands of high- and critical-severity bugs and software defects, with vulnerabilities identified in most major operating systems and web browsers. Anthropic said some of the vulnerabilities had gone undiscovered for decades. While some outside experts called for caution in interpreting the new results given the limited public information about the identified vulnerabilities, many others said the model’s debut and Anthropic’s warning were significant.
“It’s all very much real,” Katie Moussouris, the CEO and co-founder of Luta Security, a company that connects cybersecurity researchers with companies that have software vulnerabilities, said of the hype around Anthropic’s claims.
“I’m not a Chicken Little kind of person when it comes to these things,” Moussouris said. “We’re definitely going to see some enormous ramifications.”
Instead of a public release, Anthropic is giving tech companies like Microsoft, Nvidia and Cisco access to Mythos Preview to shore up cyber defenses. As part of the new effort, called Project Glasswing, Anthropic will give over 50 tech organizations access to Mythos Preview with over $100 million in usage credits.
“Project Glasswing partners will receive access to Claude Mythos Preview to find and fix vulnerabilities or weaknesses in their foundational systems — systems that represent a very large portion of the world’s shared cyberattack surface,” Anthropic announced in a blog post. “Project Glasswing is a critical step toward giving defenders a durable advantage in the coming AI-driven era of cybersecurity.”
It’s unclear exactly what the vulnerabilities Mythos Preview identified are or how many were previously discovered or reported. Because of the sensitive nature of the vulnerabilities, Anthropic said it will disclose the nature of the currently undisclosed vulnerabilities within 135 days of sharing them with the organizations or parties responsible for the software.
It’s the first time in nearly seven years that a major AI company has so publicly withheld a model over safety concerns. In 2019, OpenAI — now one of Anthropic’s main rivals — decided to withhold its GPT-2 system “due to concerns about large language models being used to generate deceptive, biased, or abusive language at scale.”
Mythos Preview is a general-purpose model, the type of system that powers products like Claude Code or ChatGPT. But in pre-release testing, Anthropic found its cybersecurity capabilities in particular were surprisingly advanced compared with those of earlier models, which led to the creation of Project Glasswing.
Logan Graham, who leads offensive cyber research at Anthropic, said the Mythos Preview model was advanced enough not only to identify undiscovered software vulnerabilities but also to weaponize them. The model can single-handedly perform complex, effective hacking tasks, including identifying multiple undisclosed vulnerabilities, writing code that can exploit them and then chaining those together to form a way to penetrate complex software, he said.
“We’ve often seen it chain vulnerabilities together. The degree of its autonomy and sort of long ranged-ness, the ability to put multiple things together, I think, is a particular thing about this model,” Graham told NBC News.
That capability means the company is so far reluctant to release even a carefully guardrailed version of the model to the public, he said, at least until some Western companies can use it to identify defenses to build around their systems.
“We’re not confident that everybody should have access right now,” Graham said. “We need to start figuring out how we’d prepare for a world like this first before we can handle the idea of black hat [criminal or adversarial] hackers having access.”
Anthropic has also briefed the federal government on Mythos Preview’s cybersecurity capabilities. Anthropic is embroiled in a heated dispute with the Trump administration over the federal government’s use of its models after Defense Secretary Pete Hegseth declared Anthropic a “supply chain risk to national security” in late February. A federal judge late last month issued a preliminary injunction against the designation, which the Trump administration is appealing.
According to an Anthropic employee, the company “briefed senior officials across the U.S. government on Mythos Preview’s full capabilities, including both its offensive and defensive cyber applications. That engagement has included ongoing discussions with CISA [the Cybersecurity and Infrastructure Security Agency] and CAISI [the Center for AI Standards and Innovation], among others.”
“Bringing government into the loop early — on what the model can do, where the risks are, and how we’re managing them — was a priority from the start,” the employee said.
CISA and the National Institute of Standards and Technology, the agency that includes CAISI, did not respond to requests for comment before publication. A spokesperson for the National Security Agency, widely regarded as the most sophisticated hacking agency in the world, declined to comment when asked whether it had been briefed on Mythos.
Not everyone is convinced Mythos Preview represents the leap Anthropic claims.
Heidy Khlaaf, chief AI scientist at the AI Now Institute, a U.S.-based research institute, said Anthropic’s detailed blog post explaining the new vulnerabilities omitted many key details needed to verify its claims.
Writing on X, Khlaaf warned against “taking these claims at face value” without more information, such as the rates of false positives and clearer explanations of how humans conducted manual reviews of the identified vulnerabilities.
Beyond Mythos Preview’s cybersecurity implications, the model also demonstrated concerning behavior during Anthropic’s pre-release safety testing.
As part of releasing new models, Anthropic creates detailed reports called “system cards” that share how the models perform on a variety of evaluations. The system card for Mythos Preview, released Tuesday, says that “Claude Mythos Preview’s large increase in capabilities has led us to decide not to make it generally available.”
According to the system card, in one evaluation, the model showed some form of awareness that it was being evaluated in around 29% of transcripts — though it did not explicitly tell researchers that it knew it was being evaluated. Some researchers worry that a model might change its behavior and act more safely if it is aware it is being evaluated.
Anthropic and researchers working with the company to evaluate the model’s safety also said Mythos Preview appeared to intentionally perform worse on one evaluation than it could have in order to seem less suspicious. “We find this overall pattern of behavior concerning, and have not seen it before in similar evaluations of previous Claude models,” Anthropic wrote in the system card.
In another experiment with an earlier version of Mythos Preview, an Anthropic researcher instructed a model essentially isolated on a secured computer to try to escape and send the researcher a message. The researcher, Sam Bowman, was eating a sandwich in a park when he received an email from the model.
“That instance wasn’t supposed to have access to the internet,” Bowman wrote on X.

