Plutonic Rainbows

The Thousand-Token Gambit

OpenAI shipped Codex-Spark yesterday — a smaller GPT-5.3-Codex distilled for raw speed, running on Cerebras Wafer Scale Engine 3 hardware at over a thousand tokens per second. Four weeks from a $10 billion partnership announcement to a shipping product. 128k context, text-only, ChatGPT Pro research preview.

The pitch is flow state — edits so fast the latency disappears and you stay in the loop instead of watching a spinner. Anthropic is chasing the same thing with Opus fast mode. Everybody is.
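As a rough back-of-envelope sketch of why that throughput matters: the time to stream an edit is just its length divided by the decode rate. The 400-token edit size and the 50 tokens-per-second baseline below are illustrative assumptions, not figures from OpenAI or Cerebras, and the calculation ignores time to first token and network overhead.

```python
def stream_seconds(tokens: int, tokens_per_second: float) -> float:
    """Seconds needed to stream `tokens` of output at a steady decode rate."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return tokens / tokens_per_second


EDIT_TOKENS = 400    # hypothetical size of a single code edit
FAST_RATE = 1000.0   # the "over a thousand tokens per second" figure quoted above
SLOW_RATE = 50.0     # hypothetical slower baseline, for contrast only

fast = stream_seconds(EDIT_TOKENS, FAST_RATE)
slow = stream_seconds(EDIT_TOKENS, SLOW_RATE)
print(f"{EDIT_TOKENS}-token edit: {fast:.1f}s fast vs {slow:.1f}s slow")
```

Under those assumptions the same edit drops from several seconds to well under one, which is roughly the gap between watching a spinner and not noticing the wait.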

Last month I wrote about speed becoming the only moat. Codex-Spark is that thesis made silicon.


Two Billion in Efficiency Savings and What Gets Lost

Barclays posted £9.1 billion in pre-tax profit for 2025, up 13%, and CEO C.S. Venkatakrishnan used the results announcement to outline an AI-driven efficiency programme targeting £2 billion in gross savings by 2028. Fraud detection, client analytics, internal process automation. Fifty thousand staff are getting Microsoft Copilot, a figure set to double in early 2026. Dozens of London roles relocated to India. A £1 billion share buyback and £800 million dividend to round things off. The shareholders are happy. The spreadsheet is immaculate.

I don't doubt the savings are real. Every bank running these numbers is finding the same thing — operations roles that involve documents, repeatable steps, and defined rules are precisely where large language models excel. Wells Fargo is already budgeting for higher severance in anticipation of a smaller 2026 workforce. JPMorgan reports 6% productivity gains in AI-adopting divisions, with operations roles projected to hit 40-50%. Goldman Sachs has folded AI workflow redesign directly into its headcount planning. This isn't speculative anymore. The back offices are getting thinner.

What bothers me is the framing. "Efficiency" is doing a lot of heavy lifting in these announcements. When Barclays says it will "harness new technology to improve efficiency and build segment-leading businesses," what that means in practice is fewer people answering phones, fewer people reviewing transactions, fewer people in the building. The GenAI colleague assistant that "instantly provides colleagues with the information needed to support customers" is, by design, an argument for needing fewer colleagues. The call handling times go down. Then the headcount follows.

The banking industry's own estimates are stark. Citigroup found that 54% of financial jobs have high automation potential — more than any other sector. McKinsey projects up to 20% net cost reductions across the industry. Yet 76% of banks say they'll increase tech headcount because of agentic AI. The jobs don't disappear. They migrate — from the person who knew the process to the person who maintains the model that replaced the process. Whether that's a net positive depends entirely on which side of the migration you're standing on.

Barclays will likely hit its targets. The efficiency savings will materialise. The return on tangible equity will climb toward that 14% goal. The question nobody at the investor presentation is asking — because it isn't their question to ask — is what a bank actually is when you've automated the parts where humans used to make judgement calls about other humans. A fraud model is faster than a fraud analyst. It's also completely indifferent to context, to the phone call where someone explains they've just been scammed and needs a person, not a pipeline, to understand what happened.

Two billion pounds is a lot of understanding to optimise away.


175,000 Open Doors

SentinelOne and Censys mapped 175,000 exposed AI hosts across 130 countries. Alibaba's Qwen2 sits on 52% of multi-model systems, paired with Meta's Llama on over 40,000 of them. Nearly half advertise tool-calling — meaning they can execute code, not just generate it. No authentication required.

While Western labs retreat behind API gates and safety reviews, Chinese open-weight models fill the vacuum on commodity hardware everywhere. The guardrails debate assumed someone controlled the deployment surface. Nobody does.
