Anthropic Reaches for the Brake

Anthropic has put a brake pedal into the AI race, or at least a drawing of one. In a new Anthropic Institute post, Marina Favaro and Jack Clark argue that frontier labs should have a coordinated, verifiable way to slow down or temporarily pause development if recursive self-improvement starts to arrive before governments, safety researchers, and the public have any usable grip on it. That conditional is doing a lot of work. Anthropic is not saying the machine has started rebuilding itself in the strong science-fiction sense. It is saying the slope now points that way.

The post is called "When AI builds itself", which is blunt by Anthropic standards. It says recursive self-improvement has not happened yet and is not inevitable, but could come sooner than institutions are prepared for. The evidence offered is not a single threshold crossing. It is a pattern inside Anthropic's own workflow: as of May 2026, the company says more than 80 percent of production-merged code was authored by Claude. In Q2, a typical Anthropic engineer was merging about eight times as much code per day as in 2024. A March poll of 130 research staff put the median self-estimated output gain with Mythos Preview at four times.

I don't read those numbers as proof that Claude is already inventing its successor. The Guardian's account is useful here because it says the described advances do not yet amount to recursive self-improvement, and quotes Steven Murdoch at University College London saying Anthropic has shown no fundamental step change in capability. That skepticism matters. A lab can be genuinely worried and still have every incentive to make its worry sound like evidence of unique frontier status.

However, the proposal is sharper than ordinary safety theatre. Anthropic says a unilateral slowdown would mostly hand the lead to someone else. The useful version would need multiple well-resourced labs, multiple countries, agreed conditions for triggering and lifting the pause, and some way to verify that everyone had actually stopped. AP framed it as a way for top AI companies to coordinate instead of letting a secret holdout jump ahead. The WSJ-syndicated piece in To Vima catches the nasty detail: "Training runs are far easier to conceal than missile silos."

A pause button is only a button if someone can tell whether it has been pressed. Otherwise it is a press release with a nicer verb. Anthropic knows this, which is why the interesting part of its proposal is not the word pause. It is verification. The company says the Institute will convene policymakers, researchers, civil society, and other AI firms to work through those mechanics. That sounds dry until you imagine the actual inspection problem: private data centres, model weights that can move, synthetic data pipelines, distributed research teams, national-security exemptions, and investors who did not fund a frontier lab so it could politely wait at the lights.

The timing is almost too perfect. Anthropic only just made its confidential IPO filing, and this spring has already been a long exercise in making Claude look like critical infrastructure rather than a chatbot. I wrote this week about Mythos moving into infrastructure, where the same company is expanding access to a restricted model through governments and security partners. I also wrote about the White House's voluntary frontier-model review, which has the same problem in a milder form: cooperation is useful only until it becomes inconvenient.

That is the uncomfortable bit. Anthropic may be right about the need for a brake, and also right that no single lab can safely use it alone. It may also be using the language of restraint to state a market position: we are close enough to danger that you should treat us as one of the few institutions able to manage it. Both things can be true. Frontier AI politics keeps producing that double image: civic duty and competitive theatre sharing the same stage, with nobody quite able to prove where one ends and the other begins.

Sources:

When AI Builds Itself — Anthropic Institute
Anthropic Urges Temporary Pause on AI Development to Discuss Risks — The Guardian
Anthropic Calls for AI Development Slowdown to Ensure Safety — Semafor
Anthropic Urges a Way to Pause AI Development as Risks Grow With the Tech Advances — Associated Press
Anthropic Urges Global Pause in AI Development, Flags Self-Improvement Risk — The Wall Street Journal via To Vima

Plutonic Rainbows
