
“This is the scary phase of AI—a model deemed so powerful that its full release into the wild could unleash untold catastrophe.”
—Jim VandeHei, CEO of the news organization Axios and cofounder of Politico
“We formed Project Glasswing because of capabilities we’ve observed in a new frontier model trained by Anthropic that we believe could reshape cybersecurity. Claude Mythos2 Preview is a general-purpose, unreleased frontier model that reveals a stark fact: AI models have reached a level of coding capability where they can surpass all but the most skilled humans at finding and exploiting software vulnerabilities. …[It] has already found thousands of high-severity vulnerabilities, including some in every major operating system and web browser.”
—An announcement by the AI company Anthropic about its plan to provide the new system to multiple companies and organizations as a means of solving the cybersecurity issues presented by AI
Artificial Intelligence is big business. The major AI companies are locked in a battle for dollars and influence, both of which they need because running their programs is incredibly expensive.
So when an AI company says that they aren’t going to sell you access to their program because they think it’s too dangerous, people’s ears go up.
That’s what happened last week when Anthropic—best known for its AI chatbot Claude—announced that it wouldn’t be making its newest model, Mythos Preview (often just called Mythos) available to the public. Instead, it would be providing it only to select large companies and organizations to help protect their systems from hackers.
Why wouldn’t they make it widely available? Because, they claim, it’s so good at identifying software vulnerabilities that hackers could potentially use it to breach any computer in the world, leading to global havoc. Hackers have already blackmailed major corporations after breaking into their systems and locking them down; imagine that threat magnified to every enterprise on the planet. State-sponsored baddies could potentially dismantle their adversaries’ electrical grids, banking systems or government infrastructure almost instantly.
Has Anthropic pushed us to the edge of a dangerous precipice, or are they preventing us from falling over it?
Understanding the Mythos of Mythos
Mythos Preview isn’t fundamentally different from the chatbots many people use daily, such as ChatGPT. It wasn’t specifically designed to be a whiz at breaking into computers. Rather, like most generative AI programs, it was trained to excel at general reasoning and problem solving.
We aren’t privy to the details of Anthropic’s training methodology. As tech writer Zvi Mowshowitz put it, “They are not about to tell us the [ingredients in the]secret sauce.”
They have, however, revealed a great deal about their concerns. While much of it is technical, the primary danger centers around a few critical breakthroughs.
Most alarming is that Mythos was able to find thousands of zero-day exploits across every major operating system (e.g., Windows, Apple’s iOS or Linux) and every major web browser from Google Chrome to Apple’s Safari. A zero-day exploit is a flaw in a computer program that has not yet been discovered by the developer or vendor, leaving no time (“zero days”) for a fix before a hacker can strike.
Zero-day vulnerabilities are extremely valuable to hackers, because not only do computer security experts not know how to deal with them but no one even sees them coming. The day after zero-day, some experts can reverse-analyze the problem and eventually spread the word about a patch to remedy it. But on that first day, the hackers can basically do anything they want.
Some of these vulnerabilities were in very old software. For example, Mythos found an exploit in OpenBSD, the operating system that is used to run firewalls and other very important software infrastructure. This flaw, which would allow a hacker to remotely crash any computer running OpenBSD, had existed for 27 years undetected. Other similar security flaws that were over a decade old were discovered by Mythos in moments.
Mythos Preview is also capable of using its hacking ability to escape containment. Placing it in a so-called sandbox—a secure environment isolated from the Internet and the host computer’s controls—didn’t prevent it from breaking free. (The details of those escapes aren’t being made public, as the holes it used have yet to be plugged.)
Anthropic’s anxiety with regard to Mythos was clear from the fact that a whole section of their public discussion was about their concern that using it in their own offices would place the company’s security at risk.
Some critics suggest that Anthropic is overstating Mythos’ capabilities. But if they are telling the truth, it’s certainly unsettling.
Project Glasswing
So instead of releasing Mythos Preview into the wild, Anthropic launched Project Glasswing, a collaboration with industry giants to make sure that their software (and hardware in some cases) is safe.
The select partners announced as participating in the project included Amazon Web Services, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorganChase, the Linux Foundation, Microsoft, NVIDIA and Palo Alto Networks.
Anthropic also said that some 40 other entities involved in critical infrastructure would join the initiative and receive access to Mythos to help shore up their defenses. (The decision to keep these participants anonymous may have been a security measure to protect them from hackers.)
Furthermore, Anthropic is committing $100 million worth of Mythos usage to this effort, in addition to a $4 million donation to open-source security organizations (nonprofits that develop tools and make them publicly available).
To read more, subscribe to Ami

