Anthropic's Glasswing update is the kind of AI safety story that looks reassuring until you sit with the logistics. The lab says Claude Mythos Preview found more than 10,000 high- or critical-severity vulnerabilities across partner software. Not theoretical weaknesses, not a neat benchmark category, but things that need triage, verification, disclosure, fixes, retesting, and the awful meeting where someone decides which production system can be touched this week.

Project Glasswing was announced as a defensive coalition with AWS, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorgan Chase, the Linux Foundation, Microsoft, NVIDIA, Palo Alto Networks, and others involved. Anthropic put $100 million in credits behind it. The basic argument is sound: if frontier models are becoming unusually good at vulnerability discovery, defenders should see that capability before attackers do. I buy that. I also think the update reveals a nastier bottleneck than model access. Finding the hole is only the start of the work.

Security already had this problem before Mythos arrived. Every serious organisation owns more old code than it wants to admit, and plenty of it has dependencies nobody has enjoyed thinking about since the person who wrote the integration left for a different badge system and a better coffee machine. A model that can surface ancient defects at speed doesn't magically create the change windows, test environments, maintainers, legal coordination, or user patience required to repair them. It turns buried debt into visible debt. Visibility is useful. It is also a queue.

That queue is what makes the Palo Alto Networks numbers so interesting. The company says it scanned more than 130 products with frontier AI systems and its May security advisory disclosed 26 CVEs covering 75 security issues. Before this, Palo Alto says a typical month involved five or fewer CVEs. This is the uncomfortable middle stage of defensive AI: better tools produce more work than the existing institution can absorb. The old rhythm of patching was already theatrical: monthly drops, emergency exceptions, half-remembered risk registers. Now the detection side is speeding up while the fixing side remains stubbornly human, bureaucratic, and full of servers that cannot go down.

Google's discovery of an AI-generated exploit earlier this month—the one with docstrings still hanging off it—comes to mind here. That story had a strange comic neatness: the model made the attack possible and also left enough model-shaped residue for defenders to notice. Glasswing is less tidy. It suggests a future where the attacker and defender both have better discovery tools, and the winner is the side with the less exhausted patch pipeline.

IBM's framing is similar but more corporate. In its own Glasswing note, it says exploitation of public-facing applications rose 44 percent last year and that AI is being used for detection, remediation prioritisation, testing, and response. That is the sensible shopping list. Prioritisation matters because ten thousand urgent things are not urgent in any practical sense. They are a map of institutional overload.

The temptation is to call this a capability threshold and stop there. Mythos can find bugs at a scale that changes the economics of vulnerability discovery. Fine. But the more important threshold may be administrative: whether companies can build a patching culture that matches machine-speed finding without collapsing into noise. The model can point at the broken part. Someone still has to own it.
