
Plutonic Rainbows

Ginny Danbury, 1989

Peter Weir shot the Midsummer Night's Dream sequence of Dead Poets Society on location in Delaware over a month in late 1988. Lara Flynn Boyle was eighteen, cast as Ginny Danbury — Chet's sister — with a second role as Hermia in the staging that anchors the film's middle act. She filmed her Ginny scenes. She filmed Hermia. She went home and waited for the release.

Then she watched the film with her mother, Sally. "Here comes my scene, here comes my scene. No scene, no scene." That's what she told People in 2024. Ginny had been cut. No one called. No one wrote. She found out because the movie was playing on a television in front of her, and she was still sitting there when the credits ran.

The Danbury subplot was a full piece of scaffolding. Ginny was a friend of Chris Noel's, a third young woman in a film that otherwise pushes its women to the edge of the frame. Cut her and you cut the character who was going to give Chris someone besides Knox to talk to, the one who was meant to play Hermia, the one who might have given the film a female friendship. The Midsummer Night's Dream scenes survive because you can't drop a staged Shakespeare sequence without breaking the film's rhythm, so Boyle walks through them wordlessly, a credited extra in what was supposed to be her role.

People didn't ask the counterfactual. If Ginny had stayed at Chris's screen weight, Boyle would have been the breakout young actress of summer 1989. Peter Weir films have a habit of making careers. Ethan Hawke's did. Robert Sean Leonard's too, for the short run he had left. Josh Charles went quiet for a decade and came back to television. Instead, Boyle got the cleanest dispatch Hollywood issues: a credit on a film that contains almost none of her.

She was already familiar with the shape of the edit. Ferris Bueller's Day Off had cut her three years earlier. Dead Poets made her two-for-two.

A year after watching her mother's screen skip over her, she was in Washington state playing Donna Hayward. "Twin Peaks gave me everything I have as an actor," she said. It's hard to picture the timeline where Ginny survives the edit and Lynch still casts her. There aren't many universes where the 1989 breakout of a Robin Williams vehicle signs on to a strange little ABC pilot about a murdered homecoming queen.


Maine Gets There First

On Tuesday the Maine legislature passed LD 307, a temporary ban on new data centers drawing twenty megawatts or more. The House voted 79 to 62, the Senate 21 to 13, both on party lines with a handful of exceptions. Governor Janet Mills has ten days to sign, veto, or let it become law by inaction. If she signs, Maine becomes the first US state to pause construction on what the industry calls hyperscale facilities, and what everyone else has started calling AI data centers.

Twenty megawatts sounds like a lot until you look at what a contemporary data center actually pulls. The Regional Plan Association pegs the current average at around forty. The early ones used two. So the Maine threshold isn't a ceiling on something exotic; it's a line drawn roughly at the median. The bill walks the state back to a posture from fifteen years ago.

The politics are messier than the vote count suggests. Mills is running in a contested Democratic primary for a US Senate seat and has signalled hesitation about signing without a carveout for a proposed 82MW facility in Jay. Tony McDonald, the developer, has argued his project is the wrong target — not a hyperscaler chasing AI training loads, just a large tenant that got caught in a dragnet. He may be right. He may also lose anyway, because the dragnet is the point.

What makes Maine interesting isn't the substance of the bill, which is mild. Eleven other states tried something similar in the last year and stalled or failed outright — Georgia, Maryland, Michigan, New Hampshire, New York, Oklahoma, South Carolina, South Dakota, Vermont, Virginia, Wisconsin, per Gizmodo's count. Maine's version passed partly because Maine barely has any data centers yet — the first large one is still under construction at the old Loring Air Force Base — so the industry hadn't staked out enough ground to defend.

And it passed while the federal direction runs the other way. Trump's December executive order explicitly warned that excessive state regulation thwarts the AI race, with threats to withhold funding from states that restrict growth. That's the same playbook I covered in the broadband money fight — use federal dollars as leverage against state AI laws. Maine will be a test of whether that threat actually bites a small New England state with a pipeline of maybe two data centers.

The Sachs argument, delivered in plain committee language, was that the tradeoffs haven't been shown to benefit ratepayers, water use, or community economic activity. That's a boring sentence until you remember that Maine already ranks fourth in US electricity prices. A hyperscaler sized at forty megawatts is a quiet permanent tax on every household on the grid — which Turley at OpenAI basically conceded when he admitted unlimited AI plans are like unlimited electricity plans.

The moratorium ends November 2027. A new Data Center Coordination Council gets the interval to write policy recommendations. By the time it reports, the federal posture may have shifted again, or it may have hardened. Either way, Maine's bet is that the study period is worth more than the projects it blocks. Given the two projects on the table, that math isn't hard.


Darkness, Translated

The English translation of In Praise of Shadows appeared in 1977, published by Leete's Island Books with a foreword by the architect Charles Moore, who wrote that the essay "could change our lives." By the mid-1980s it had become something very specific: the canonical text Western designers and architects reached for when they wanted theoretical cover for dimmer rooms. This is not a complaint. It is just an accurate description of what happens when an essay written in Japanese in 1933 crosses into a culture that is, at that exact moment, running out of patience with the Modernist interior.

Tanizaki's argument is simple enough in outline. Japanese architecture organises itself around shadow. The deep roof overhang of a traditional house keeps the interior dim. Lacquerware, gold embroidery, cloudy paper — their beauty depends on low light, on the way darkness lets a material breathe without the scrutiny of full exposure. Westerners, he argues, are congenitally committed to brightness: from candle to oil lamp to gaslight to electric light, their progress is a history of abolishing shadow wherever it pools. Japanese culture accommodated the dark. Western culture declared war on it.

What made this legible to an 1980s Western audience was partly timing. The translation arrived just as postmodern architecture was trying to argue its way out of the glass box. Moore himself was building Piazza d'Italia in New Orleans. The appetite for any text that questioned the clean-bright-functional dogma was considerable. Tanizaki, dead since 1965 and safely historical, offered an aesthetic that could be absorbed without political risk — and the quality he described, that particular stillness of a dim room with tatami underfoot and light filtered through shoji, landed precisely where Western designers needed it to land.

The irony that the 1980s West almost entirely missed is geographic. The same decade that architects were annotating their copies of In Praise of Shadows, Tokyo was building the city that inspired Ridley Scott's Blade Runner. Japan had become the global centre of consumer electronics: Sony, Honda, Sharp, the robotics movement, a neon-saturated culture that made the Pacific Rim feel like a preview of everywhere else's future. Tanizaki's 1933 elegy for pre-modern shadow had become, within fifty years, a description of the very country that had most completely abandoned what he mourned. Western readers, discovering the essay in translation, were perfectly positioned not to notice.

There is also a reading problem the 1980s West missed almost entirely. Tanizaki was not writing a sincere treatise. He was writing a zuihitsu — a Japanese prose form that permits fiction, irony, and digression — using a narrator who ventriloquises the tropes of cultural superiority that circulated freely in 1930s nationalist Japan. The essay's reader in 1933 Japan understood this. The Western reader in 1985, lacking that context, took the East/West binary at face value, lifted it into their own anti-modernist argument, and used it as evidence.

This is what the essay became in Western hands: permission. Permission to use wood instead of aluminium. Permission to keep a room dim. Permission to frame slowness and imperfection as values rather than deficits. All of that was probably overdue. I am not sure the fact that Tanizaki was doing something more complicated invalidates it entirely — you can extract something real from a text even when you misread it, and what the West extracted was genuinely useful. But the book it drew from is odder and funnier than the version it invented. Tanizaki admits he cannot give up electricity. He praises a toilet as a place for quiet contemplation and then spends two paragraphs on miso soup. The serene, quietly authoritative In Praise of Shadows that entered Western architectural thinking in the eighties is cleaner than the original, which is messier and considerably more entertaining.


Amodei at the West Wing

On Friday, Dario Amodei walked into the West Wing to meet Susie Wiles, Scott Bessent, and National Cyber Director Sean Cairncross. Asked about it afterwards, Donald Trump told reporters he had "no idea" the meeting had taken place. That is, so far, the only direct presidential comment on the record.

The White House readout was more conventional: "introductory, productive, and constructive," covering "opportunities for collaboration" and "shared approaches and protocols to address the challenges associated with scaling this technology." Both Politico and CNBC confirmed the attendee list before the meeting happened. The story is not that it happened. The story is what the administration is now doing in public about a company it has spent two months calling a national security risk.

Anyone following this has the sequence memorised by now. February: Pete Hegseth tries to force Anthropic to drop its bans on autonomous weapons and mass surveillance. Amodei refuses. The Pentagon declares the company a "supply chain risk", federal agencies get a phase-out order, OpenAI signs the contract instead. March: Anthropic sues the DOD. April 7: Claude Mythos ships under Project Glasswing, restricted to about fifty partners because its zero-day-finding capability is judged too dangerous for general release. April 10: Bessent and Jerome Powell summon five bank CEOs to discuss Mythos. April 16: Bloomberg reports OMB is setting up a framework to let federal agencies use the tool. April 17: the West Wing meeting.

What's new is that the meeting is openly about routing access. The Next Web's read — and CNBC confirms the rough shape — is that any deal exits through civilian agencies and explicitly does not include the Pentagon. That is a peculiar compromise. The blacklist isn't being lifted. The supply-chain designation isn't being rescinded. Hegseth is still, nominally, correct about Anthropic being too dangerous. Treasury, CISA, and the intelligence community are simply going around him, with the Chief of Staff in the room to make it official.

One detail from CNBC is worth keeping. Wiles was previously at Ballard Partners, the lobbying firm Anthropic hired immediately after the February designation. I don't think this is a scandal — hiring lobbyists with access is what companies do when the government cuts them off — but it does change the texture of "introductory, productive, and constructive." Those are words chosen by someone who already knew the room.

The harder question is what Anthropic is actually getting. The civilian-only carve-out leaves the Pentagon's objections technically intact. The lawsuit is still live. The phase-out is still policy. What's on offer is something like conditional rehabilitation: you stay blacklisted where it began, you get sold back to everyone else, and the administration doesn't have to admit the original call was wrong.

Amodei has, in some sense, won. The product the Pentagon banned is the same product Treasury is trying to procure. The company the president said he would never do business with again had its CEO in the West Wing six weeks later. The safety features that got it blacklisted are the same safety features Treasury is asking to audit in the hope of finding flaws it can use.

And asked about any of this, the president, on the record, says he had no idea it was happening. A ten-minute briefing on a frontier AI model should probably have reached him by Friday afternoon. It is not clear that it has. It is not clear that it will.


Montana's Silver Year

On 30 January 1991, Claude Montana showed his haute couture collection for Lanvin in Paris, and somewhere in the middle of it Yasmeen Ghauri walked out in a silver mesh hood and a white bubble skirt. The look now reads like the exact midpoint of a career: still built around the glamazon silhouette that had made him the King of Shoulders, but lighter, younger, flirting with the sixties Space Age revival that the earliest nineties briefly tried to make happen.

This was as good as it was ever going to get for him.

The Lanvin job had arrived in 1990 on the back of a decade nobody else had owned quite as completely. Broad shoulders, sculpted leather, the body treated like architecture. Dynasty borrowed it. Wall Street borrowed it. Working Girl borrowed it. By the end of the eighties, Montana's silhouette wasn't just a look, it was the look: the default shape a woman took when a film or a TV show wanted to signal that she was about to walk into a boardroom and win. Alexander McQueen would later credit him openly. So would Riccardo Tisci, Olivier Theyskens, Anthony Vaccarello at Saint Laurent, and Willy Chavarria. The DNA was load-bearing.

His Lanvin work earned him two consecutive Golden Thimbles (the Dé d'Or, the closest thing French couture has to an Oscar). Critics who'd doubted whether a ready-to-wear designer could handle the discipline of a 19th-century atelier were reportedly astonished. Pale green iridescent trench coats. Patterned dresses with ribbon hats. Embroidered leather panels and, controversially at the time, beaded T-shirts on a couture runway, which was roughly the couture equivalent of serving crisps at a state banquet. He'd smuggled his own vocabulary into the house and made it work.

Then the decade happened around him.

By 1993, Helmut Lang had shown the anti-shoulder collection that basically ended the era. Margiela was deconstructing linings and wearing them as outerwear. Calvin Klein was doing the thing where nothing happens and everyone calls it genius, which my post on his 1980s advertising tried to take seriously on its own terms. Nirvana had put a cardigan on MTV. Kate Moss was on the cover of British Vogue in a slip. The culture's tolerance for theatrical power dressing had collapsed almost overnight, and Montana's response was essentially not to have one. He kept working in his own register. Retailers started dropping him. The House of Montana was in receivership by 1997.

The retrospective question is whether anything from that Spring 1991 collection crossed into the wider decade it opened. The honest answer is: not directly. The nineties that followed belonged to Prada nylon, Miuccia's ugly sandals, Jil Sander trousers, and the kind of Calvin Klein minimalism that Azzedine Alaïa had already quietly predicted three autumns earlier in a roomful of black. Even Mugler, Montana's closest rival in theatrical silhouette, was being pushed toward the costume-piece end of his archive. Valentino had his own reckoning the same year, managing it more gracefully because he never really needed the shoulder to be doing the talking.

What survived, survived filtered. The militarised tailoring and exaggerated shoulder came back through McQueen in 1996, through Balenciaga under Ghesquière, through Tisci at Givenchy in the 2010s, through Saint Laurent under Vaccarello now. The debt is real. It just took twenty-five years to be paid in a currency anyone still recognised as spendable.

By 1994, the same silhouettes that had made him a god in 1988 looked like costume in a room that had decided it wanted clothes. It wasn't that Montana had stopped being good. It was that the weather had turned, and he was still dressed for the last one.


Cromptons of Ramsgate

The cabinet on the end of the row at Skegness was built before I was born. A two-penny coin slot, a sloped tray of copper, a hydraulic shelf shoving a tide of coins toward an edge that never quite spilled. The wood-effect side panel was patched with masking tape where someone had bashed it, probably more than once, and the back of the machine still ran on what looked like the original transformer. A boy in front of me dropped his last coin in. The shelf swept forward. Three coins fell. He cheered.

The Cromptons Penny Falls was first manufactured in 1964 in Ramsgate, with a refined version released by 1966. Decimalisation in 1971 retooled the slot. The euro changed nothing because it never came. The shift from one penny to two pence to whatever fractional unit will replace cash entirely has been, for this object, a series of cosmetic tweaks to a mechanism that was finished sixty years ago. Alan Meades, who wrote a social history of the British amusement arcade, calls them pivotal — the machines that, alongside the fruit machine, kept arcades solvent through the collapse of the seaside holiday and everything that came after it.

What's strange about the coin pusher is not the survival itself but the absence of any pressure to replace it. The software industry I work in cannot tolerate a system that hasn't been rewritten in three years. The financial system cannot tolerate physical currency at all if it can be helped. Yet a sweeping shelf in a Blackpool arcade, manufactured the year of Goldfinger, is still earning its keep. Nobody has built a better coin pusher because nobody needs to. The mechanism is correct. It performs the function exactly. The only thing it had to adapt to was the denomination of the coin.

The wider arcade is more layered than this. Penny pushers share floor space with light-gun shooters from the eighties, crane grabbers whose grip strength is famously calibrated to fail, and pre-decimal "old penny arcades" that have repositioned themselves as heritage attractions, charging entry to mechanical fortune-tellers and execution dioramas built between the wars. The original pioneer of all this was a Leeds mechanic named John Dennison, who started making working models in 1875 and supplied Blackpool Tower with around fifty machines that ran on its upper floors until the late sixties. Three of his daughters — Evelyn, Florence, and Alice — kept the business going. Alice did the mechanics. Most of what they built has been lost.

The point is not that the arcades are sad now. They are not. A wet Tuesday in October at Coral Island, Blackpool, is still a functioning piece of infrastructure for a child with a paper cup of two-pence pieces. The point is that almost nothing else in British public life has been allowed to persist on its own terms this long. Libraries get rebranded. Pools get demolished. Post offices close. The arcade survives because it was never institutionally important enough to be rationalised. Nobody was ever going to commission a five-year strategic review of what coin pushers are for.

Cromptons is still based in Kent, the same county the original prototype came out of. Coin pushers, in slightly varied cabinets, are still being sold. The mechanism is older than most of the people who built the rest of the seaside, and it does not appear to be going anywhere.


Counting R's in Strawberry

Ask a frontier model to count the letter 'r' in "strawberry." Often you'll get two. Sometimes three. Rarely with consistent accuracy across a hundred trials. This isn't a bug somebody hasn't fixed. It's a direct consequence of how text enters the model in the first place, and it explains a dozen other strange behaviours that cluster around counting, spelling, and arithmetic.

Language models don't read text. They read tokens, which are integers that index a lookup table. The algorithm most of them use to build that table is called Byte-Pair Encoding, or BPE. It was originally designed as a data compression scheme. Philip Gage wrote it in 1994 for compressing files. Sennrich, Haddow, and Birch adapted it for machine translation in 2016, and it's been the default across most major model families since.

The idea is simple. Start with individual characters. Find the most frequent adjacent pair in your training corpus. Merge them into a new token. Find the next most frequent pair. Merge. Repeat for tens of thousands of iterations. The result is a vocabulary full of common words and common subword pieces. "The" is one token. "ing" is one token. "Strawberry" might split into ["straw", "berry"]. Rare words fragment into more pieces.
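The loop fits in a few lines of Python. This is a toy sketch, not any production tokenizer: it ignores per-word frequencies and pre-tokenization, but the merge dynamic is the real one.

```python
from collections import Counter

def bpe_train(corpus, num_merges):
    """Toy BPE: repeatedly merge the most frequent adjacent symbol pair."""
    words = [tuple(w) for w in corpus]  # start from individual characters
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for w in words:
            pairs.update(zip(w, w[1:]))  # count every adjacent pair
        if not pairs:
            break  # every word is one token; nothing left to merge
        best = max(pairs, key=pairs.get)
        merges.append(best)
        merged = best[0] + best[1]
        new_words = []
        for w in words:
            out, i = [], 0
            while i < len(w):
                if i + 1 < len(w) and (w[i], w[i + 1]) == best:
                    out.append(merged)  # collapse the pair into one symbol
                    i += 2
                else:
                    out.append(w[i])
                    i += 1
            new_words.append(tuple(out))
        words = new_words
    return merges, words

corpus = ["strawberry", "straw", "berry", "berry", "blueberry"]
merges, tokenized = bpe_train(corpus, num_merges=20)
print(merges[:4])    # the "berry" pieces merge first: frequency wins
print(tokenized[0])  # "strawberry" ends up as a single opaque symbol
```

Run over a real corpus with frequency weighting, the same loop produces exactly the vocabularies where "the" and "ing" are single tokens.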

Once the model is trained, it sees "strawberry" as two integers, not ten characters. No mechanism inside the transformer can reach inside a token to ask how many r's it contains. The letters are sealed inside the token the way pages are sealed inside a book you can only see the cover of. The model has, statistically, learned that strawberry contains three r's. It just hasn't learned it from the token sequence. It's learned it from surrounding text that happened to mention the fact. That knowledge is fragile, and it decays on uncommon words.

The arithmetic failures come from the same place. Numbers don't tokenize uniformly. "480" might be one token. "481" might be two. A four-digit number can split one way and the same digits rearranged can split another. Researchers using arithmetic as a diagnostic have found that when an answer has more digits than either input, accuracy on certain tasks collapses to under 10%. The model isn't bad at maths. It's being handed digit sequences in a shape it wasn't trained to work with.
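A toy tokenizer makes the non-uniformity concrete. The vocabulary below is hypothetical, standing in for a learned merge table in which one frequent number became a single token while its neighbour did not; real BPE inference applies merges in learned order rather than pure longest-match, but the segmentation asymmetry is the same.

```python
def greedy_tokenize(text, vocab):
    """Longest-match segmentation against a fixed vocabulary
    (a stand-in for how BPE-style tokenizers carve up input)."""
    tokens, i = [], 0
    while i < len(text):
        for j in range(len(text), i, -1):
            if text[i:j] in vocab:
                tokens.append(text[i:j])
                i = j
                break
        else:
            raise ValueError(f"no token covers {text[i]!r}")
    return tokens

# Hypothetical table: "480" was frequent enough to merge into one token; "481" wasn't.
vocab = {"480", "48", "4", "8", "0", "1"}
print(greedy_tokenize("480", vocab))  # ['480']      -> one token
print(greedy_tokenize("481", vocab))  # ['48', '1']  -> two tokens for the next integer
```

Any arithmetic circuit the model learns has to cope with digits arriving in chunks whose boundaries shift between neighbouring numbers.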

The fix, in principle, is byte-level tokenization. Every byte becomes a token. No merging, no hidden letters. The tradeoff is sequence length. The same passage takes more tokens, sometimes many more. That means more compute, longer context windows, slower inference. ByT5 went fully byte-level and paid that cost; GPT-2's BPE operates over raw bytes, but it still merges them, which is how it dodges the length penalty. Recent models use hybrid approaches: BPE for efficiency, special handling for digits, sometimes per-character processing inside the chain-of-thought. None of it is free.
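The length cost is easy to quantify, and it is worst outside English, where UTF-8 spends two or three bytes on a single character:

```python
samples = {
    "english":  "strawberry",
    "accented": "crème brûlée",
    "japanese": "陰翳礼讃",  # In Praise of Shadows
}
for name, text in samples.items():
    chars = len(text)
    byte_tokens = len(text.encode("utf-8"))  # one token per byte, no merging
    print(f"{name:9} {chars:2} chars -> {byte_tokens:2} byte-level tokens")
```

A subword tokenizer would typically cover each of these strings in a handful of tokens; pure bytes triple the Japanese sequence before any modelling starts, which is the compute bill the hybrid approaches are trying to dodge.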

What strikes me is how much of the model's apparent cognitive style is downstream of this one preprocessing choice. The blind spot for letters isn't a reasoning failure. It's the data format not including letters, most of the time. Change the tokenizer and you change what the model can notice.

The cost-per-task shift playing out across the industry is partly a reasoning-modes story, but it's also a tokenization story. The reasoning variants often break digits into individual character tokens during arithmetic steps, which inflates token counts for the same sum. The reason your cheap tier struggles with long division is the same reason your expensive tier's bill keeps climbing when you ask it to do the job properly. Both are paying, differently, for the fact that the default encoding hides the thing you want the model to count.


Palette Icon, Canva Engine

On Friday, Anthropic launched Claude Design, the AI design tool I wrote about on Wednesday when it was still a single-source scoop in The Information. The product is now live as a research preview for Pro, Max, Team, and Enterprise subscribers, reachable via a palette icon on Claude.ai's left-hand navigation. It is powered by Opus 4.7. Figma dropped about 7% on the news. The market made the same mistake twice.

What actually shipped is neither of the two shapes I expected.

The question I raised on Wednesday was whether Claude Design would be hosted-only or API-accessible. A hosted product is a controlled experiment. An API-accessible one would reset the wrapper economy — the cohort of Lovable, Bolt, v0, Cursor builders who pipe prompts into frontier models and ship a UI on top of the response.

Anthropic shipped neither version cleanly. The product is hosted only, no API. But it isn't running on design infrastructure Anthropic owns. It runs on Canva's Design Engine. Output is exportable as PDFs, URLs, or PowerPoints, or handed off directly to Canva for drag-and-drop editing. Canva's own Create event was in Los Angeles the same day, where they announced Canva AI 2.0 and called it the biggest product launch in company history. Both launches were timed together, which is not how opportunistic partnerships work.

This is a third shape I didn't account for: a hosted-only product that gets reach through somebody else's installed base. Canva has distribution inside enterprise marketing teams. Anthropic does not. The partnership gives Anthropic that distribution without building it, and gives Canva a direct pipe into the frontier-model side Adobe has been slow to acquire. It's joint, not cannibalistic.

Which partially inverts the thesis. I argued Figma and Adobe were the wrong targets. That's still true; they trade in coordination and indemnified enterprise contracts Claude Design doesn't touch. I argued Lovable, Bolt, and v0 should be the ones panicking. The Register named Lovable explicitly as the shot-across-the-bow target. That half lands as predicted.

What I didn't see coming: Canva is the structural winner. The consumer-grade design generators (Microsoft Designer, Adobe Express's free tier) lose more than Figma ever was going to, because they don't have Opus sitting underneath their prompt box and they don't get the dual announcement billing. Opus 4.7 itself shipped a day earlier, framed in Anthropic's own launch materials as "less broadly capable" than the restricted Mythos. That framing makes sense now. Opus 4.7 is the commercial frontier with Claude Design parked inside it. Mythos is the internal model kept in reserve. The two launches rhyme in a single week because they are solving different problems in the same portfolio.

The interesting question is when the API arrives. A hosted-only Claude Design is a controlled experiment. A Canva-backed Claude Design is a distribution play. A Claude Design with an API, and maybe a Canva-rendered API, would be the real wrapper-layer displacement. None of the coverage I've read this morning says anything about API access. Anthropic's research-preview framing keeps all of that future-tense.

The signal I was waiting for on Wednesday has shifted. It's no longer "what does Anthropic do about the wrapper economy." It's "what does Anthropic do with Canva six months from now."


Cut Along the Dotted Line

The coupon was usually in the bottom corner of the right-hand page, bordered with a dashed line and the instruction to cut here. You cut. You filled in your name and address in capital letters. You walked to the post office with a parent or on your own if you were old enough, stood in the queue, and asked for a postal order for one pound ninety-nine. The woman behind the grille filled in the amount, stamped it, and handed it across with a receipt. You folded the order into an envelope with the coupon, stuck on a second-class stamp, and dropped it into the pillar box on the way home.

Then you waited.

The waiting is the part that has become unrecoverable. Most small ads specified "please allow 28 days for delivery," and that figure was realistic rather than padded. The ad you'd torn out lived in Exchange & Mart, or the back pages of Roy of the Rovers, or somewhere inside Smash Hits. The company processing your order might be one man operating out of a garage. Your postal order had to clear. The catalogue had to be printed. Stock had to be located. The padded envelope came back eventually, franked by a post office hundreds of miles away, the address scrawled in a handwriting you'd never seen.

In the month between sending and arriving, the object lived entirely in your head. X-ray specs that saw through skin. Sea monkeys that performed synchronised dances. A magic set whose tricks the advert had strongly implied would fool everyone. The mental image grew more specific and more extraordinary the longer the padded envelope took. You could not check on its progress. There was no tracking number, no notification of dispatch, no photograph of the warehouse worker who'd packed it. The order entered a system and disappeared from view, and your imagination filled the silence.

When the padded envelope arrived, the object inside could not win. The X-ray specs were pieces of plastic with cardboard lenses that made everything look striped and red. The sea monkeys were brine shrimp that hatched to roughly the size of a comma. The magic set came with trick cards you could see through in good light. You held the thing in your hand and felt the distance between the copy in the advert and the object the padding had protected. Then you played with it for an afternoon and mostly forgot.

Nostalgia is not the right register for any of this. The objects were nearly always a let-down. What has disappeared is not the stuff but the structure of expectation the stuff was suspended inside. A whole month of specific, named waiting, knowing exactly what you'd ordered and unable to retrieve or cancel or check. The desire had time to become baroque before meeting reality.

The small ads themselves have mostly vanished from regional press. Their mail-order equivalents migrated to online storefronts that list, illustrate, price, and review everything in one page without requiring you to tear anything out of anywhere. The padded envelope has been replaced by the brown Amazon box, which arrives on a schedule you can track hour by hour. That the product inside is often the same plastic tat imported from the same factories is beside the point. Next-day delivery cannot sustain the same quality of imagining. The real object arrives before the imagined one has started to grow.

Plastic tricks and broken toys still turn up in charity shops and at car boot sales, slipped from their time but intact. The X-ray specs outlast the magazine that advertised them. Someone ordered them in 1983 and kept them in a drawer. The padded envelope is long gone, but the object still carries the shape of something that was waited for. An artefact that came through the post moves on differently from one that came over a counter. It always contained an interval.

A postal order can still be bought at any British post office. I don't know anyone under forty who has ever held one.


Cheaper Per Token, More Per Task

The sticker price of a frontier model has been falling for eighteen months. The bill I run up using one has not. Both are true, and they are both true for a reason.

GPT-5.4 Standard runs $2.50 per million input tokens and $10 per million on output. Claude 4.6 Sonnet sits at $3 and $15. Gemini 3.1 Pro is $1.25 on input, with the output price landing somewhere between $5 and $12 depending on the table you trust. Eighteen months ago the comparable tier was meaningfully higher. At the budget end the collapse has been more dramatic. GPT-5 Nano is $0.05 in, $0.40 out. Gemini 3.1 Flash-Lite hovers around $0.10 and $0.40. DeepSeek halved its prices in late 2025 and the copycat pressure has not let up. If all you are doing is matching 2024 workloads to 2026 models, you are paying less.

That is the sticker answer. The receipt answer is worse.

The first thing that moved is the premium tier. GPT-5.4 has a High Reasoning mode that runs $10 in, $40 out — 4× the standard tier on both sides, same provider, same parent model, just with the thinking dial turned up. A Claude Opus Fast Mode clears $30 per million input tokens. Long-context windows, the big selling feature of 2025, became a billing surface: GPT-5.4 doubles input pricing beyond 272K tokens and adds 1.5× on output, and Gemini 3.1 Pro doubles input beyond 200K. Anthropic, to its credit, removed its long-context premium on March 13; the full 1M window bills at standard rates now.

The second thing is the token count itself. Gemini 3.1 Pro's chain-of-thought reasoning generates internal tokens billed at output rates, and a simple prompt can consume three to five times more tokens than expected. This is the quiet version of a price hike. You did not pay more per token. You paid for more tokens. Any workflow that shipped in 2024 with a predictable output length is now spending meaningfully more on the same question if the model is thinking before answering. Which it will be, because every provider is pushing you toward the reasoning variants as the default for serious work.
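The arithmetic, using the Gemini 3.1 Pro rates quoted above, looks like this. The request shape (a 2,000-token prompt, an 800-token visible answer) and the $10 output rate are assumptions for illustration, with the inflation pegged at the midpoint of that three-to-five-times range:

```python
def task_cost(in_tokens, out_tokens, in_rate, out_rate):
    """Dollar cost of one request; rates are per million tokens."""
    return (in_tokens * in_rate + out_tokens * out_rate) / 1e6

prompt, answer = 2_000, 800
# Plain mode: you pay for the visible answer only.
plain = task_cost(prompt, answer, in_rate=1.25, out_rate=10)
# Reasoning mode: same visible answer, but 4x the output tokens overall
# because hidden chain-of-thought is billed at the output rate.
reasoning = task_cost(prompt, answer * 4, in_rate=1.25, out_rate=10)
print(f"plain: ${plain:.4f}   reasoning: ${reasoning:.4f}")
```

Per-token price unchanged; per-task cost more than triples.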

Third is where most of the enterprise analysis actually lands: context caching. Both platforms discount cached reads heavily, up to 90% off base rates on repeated context. If your workload has a stable system prompt and repeated document context — customer support, code assistance, document processing — the effective per-million blended rate can compress substantially. If your workload does not — agentic tools that spin up fresh contexts, one-shot research queries — the cache discount saves you nothing. The bill diverges based on access pattern, not sticker price.
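A sketch of how one sticker price diverges by access pattern, assuming the up-to-90% cached-read discount, a Sonnet-style $3 base input rate, and ignoring cache-write premiums:

```python
def blended_input_rate(cached_tokens, fresh_tokens, base_rate, cache_discount=0.90):
    """Effective per-million input rate when part of the context is a cache hit."""
    cost = cached_tokens * base_rate * (1 - cache_discount) + fresh_tokens * base_rate
    return cost / (cached_tokens + fresh_tokens)

# Support-bot shape: a large stable system prompt, a small fresh query each turn.
support = blended_input_rate(cached_tokens=20_000, fresh_tokens=500, base_rate=3.00)
# Agentic shape: every context is new, so nothing is ever read from cache.
agent = blended_input_rate(cached_tokens=0, fresh_tokens=20_500, base_rate=3.00)
print(f"support bot: ${support:.2f}/M   agent: ${agent:.2f}/M")
```

Same sticker price, nearly an order of magnitude apart on the bill.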

Against all of that, the budget tier deserves its own column. A Gemini 3.1 fast variant at $0.075 in, $0.30 out is not a frontier model. It is a utility-grade language model that happens to be smarter than GPT-4 was two years ago. That tier has collapsed so far below the frontier that calling any of this a "price rise" misses the entire rearrangement. For high-volume, low-stakes work, the cost per task has fallen by an order of magnitude. For frontier-quality work on harder tasks, the cost per task has held steady or climbed.

I suspect this is what the providers want. The price floor drops aggressively enough that casual users move on sticker alone. The premium ceiling goes up just as aggressively, and the people who have actual hard problems — research, production coding agents, long-horizon reasoning work — end up on the tier where the margin actually lives. Between the two, the flagship standard tier holds a middling price that lets the pricing page look competitive without giving anything away.

The honest answer, then: per token, no. Per unit of intelligence applied to a specific task, probably yes for anything hard, and flatly yes if you are using the reasoning modes the providers are quietly making default. The budget I set in 2024 for a particular agentic task has to stretch much further on 2026 Opus at $5 and $25, because the agent thinks three times as hard before producing the same output and the output itself is longer. That is not a price rise. It is also not a price cut. It is the industry offering a cheaper ruler and then giving you longer things to measure.

A ruler I keep thinking about: the automated alignment researchers Anthropic ran cost $18,000 in compute for nine Claude instances running for five days. A year ago that per-token figure would have been higher. I doubt the total would have been lower.
