
Plutonic Rainbows

Banal Eccentricity, 1996

The bomb scare came on the final day of Milan Fashion Week, October 1995, just as the editors were heading over to Prada. The headquarters got cleared, the police swept the building, the show went on. Whatever Miuccia put on the runway that afternoon was always going to be the news. What she actually put on the runway was a problem.

Avocado green and sludge brown. Murky 1970s tones a critic later said hovered somewhere between shades of slime and mould. Checked kitchen-tablecloth patterns paired with dirty 1950s florals, hand-drawn in a way that looked like the printer had given up halfway. The shoes were clunky T-bar sandals and defiantly low-heeled sliders, the opposite of the strappy follow-me heels the rest of fashion was selling that season. The collection was called Banal Eccentricity. The press, mostly, called it Ugly Chic.

Robin Givhan ran a piece in the Washington Post the following May titled "Ugly is in." Susannah Frankel later wrote, in the AnOther cover story for S/S17, that the term belle laide could have been invented for Miuccia at that moment. Alexander Fury, in a 2014 essay reissued half a dozen times since, called the brown "faecal." All of these are compliments. Read them in sequence and you start to understand what was happening: a designer had walked onto the most commercial week of the fashion year and committed an act of taste sabotage so calculated that the trade press needed two years to catch up.

The cleverness wasn't the ugliness. The cleverness was that the ugliness was made out of the most expensive materials a luxury house could source. Cashmere, silk, the high-tech nylons Prada had been refining since the 1984 backpack. The kitchen tablecloth was hand-embroidered. The avocado wool was woven to the exact gauge a couture house would demand. None of it looked it. That was the point.

Miuccia had inherited the company from her mother in 1978 and spent the eighties quietly building a reputation for understatement. The black nylon backpack of 1984, the gauzy minimalist suits of the early 1990s, none of that prepared anyone for what S/S 1996 actually did. It threw out the playbook of seduction. It said the female silhouette did not have to flatter, and that taste itself was a kind of laziness. Miu Miu, launched three years earlier and named after the family nickname, had been her sketchbook for this. The mainline collection finally said it out loud.

What followed is the part that's hard to remember now because it became the water everyone swims in. The off-key cool Frankel describes, the ironic 1970s palette, the deliberate awkwardness around proportion and footwear, the willingness to make the model look slightly wrong on purpose, all of that is now a default mode for half the labels showing in Milan and Paris. You see a deliberately bad sandal in a 2026 lookbook and the visual grammar comes from this one show.

Miuccia's ugly-chic vocabulary outlasted the supermodel era it interrupted. It outlasted the boom that bought it. It is still, somehow, the cleverest argument a designer has made against beauty in my lifetime, and the only one that the market eventually agreed with.


Last Drinks at Droitwich

A 106-year-old working men's club in Droitwich shut its doors earlier this month. The committee cited the usual culprits: rising operating costs, building repairs, debt. The same week, in Cleethorpes, another club went down. Monks Road in Lincoln went in 2018 after a century of trade. The Louth Conservative Working Men's Club rebranded in 2023, dropping the "Conservative", dropping "Working Men's", trying to stay alive as Louth Social Club after membership fell from a thousand to three hundred. The Club and Institute Union itself, the federation that has stitched these places together since 1862, has quietly cut "Working Men" from its own name.

Three-quarters of the country's working men's clubs have closed in the last fifty years. In the 1970s the CIU listed about four and a half thousand affiliated clubs and four million members, a tenth of the adult population. The current figure is around eleven hundred, with some recent counts putting it under a thousand.

It would be tidy to blame the 2007 smoking ban, and people do. The ban hurt, of course. But the decline was already a long-running project, sitting underneath the headline cause. The mines went first, then the mills, then the engineering shops that named the clubs they sponsored. Once the works closed, the membership pool dried up, because the membership pool was the works. A Railwaymen's Club without railwaymen is a strange room. The smoking ban only finished a thing that deindustrialisation had already arranged.

The interiors are what people miss without knowing they miss them, and the interiors are what cannot be photographed back into existence. Flock wallpaper. Formica bar tops curving round to a glass-fronted display of crisps and pork scratchings. A concert stage at one end with a curtained backdrop and a small electric organ. A bingo board screwed to the wall behind the bar, numbered cards in a wooden rack. A committee room with leatherette chairs and minutes in a ringbinder. None of this is heritage in any official sense. There is no listing, no fund, no preservation society. When the building goes, the whole grammar of the room goes with it.

What's properly hauntological about the working men's club is that it was the kind of social institution the internet has not replaced and cannot replace, and yet it has not been mourned as a loss. A bounded community, geographically anchored, with a printed membership card that admitted you to two thousand other rooms exactly like this one in towns you'd never visit. You could walk into a CIU club in Wakefield with a card from Workington and be served. The card was a passport into a country that has now closed its borders.

Like the village hall at Balcombe, these were the institutional architecture of working-class self-organisation, built before the welfare state arrived to do some of the same work, and now outliving the world that made them make sense. The buildings persist longer than their function. A shuttered concert room with the bingo board still on the wall is not a ruin yet, only a room waiting to be turned into flats.

Reverend Henry Solly, who founded the CIU in 1862, was a teetotaller who wanted alternatives to the pub. The members, within three years, voted the alcohol back in. That tells you everything about who the institution actually belonged to. It belonged to them, and they ran it, and now it is closing because there are fewer of them left, and the ones who remain are tired.


Atmosphere Without Address

You can't go back to 1996. Not as a place. Not the streets, not the shop windows lit for an evening that ended thirty years ago, not the bottle of perfume still alive on its top notes, not the future still unopened. That door has shut, and the building it belonged to has been refaced. The angle of light on a particular Tuesday afternoon in late September is not retrievable.

And yet certain years don't behave like other years. They don't recede on schedule. They become weather inside you. They come back as a smell, as a light level, as a particular synth pad arriving in a song you didn't think you remembered, as the blue of a 5pm sky in early October, as the texture of a typeface on a magazine spine, as the feeling of being in a room ten minutes before the news arrived that changed everything.

This is what memory does, and it's also what memory refuses to do. It preserves the emotional truth at extraordinary fidelity, the atmosphere of a year, the specific quality of being alive in it. It destroys the access. Atmosphere, not address. You get back the weather, never the doorway.

A bottle, a record, a photograph, a jacket, a magazine. At the time these were just the furniture of life. They didn't have to mean anything, because you lived inside the world that gave them meaning. Now they've become survivors. That's where the wrongness comes from. The object is still here. The world around it has vanished. It sits in the present carrying an atmosphere that no longer has a home.

It looks innocent, but it's also proof of loss. It says: this happened; you were there; you can't go back. That can feel almost accusing. The feeling is a mismatch. The object promises return and can't deliver. It opens the door a crack, lets the air of that time come through, then refuses entry. Familiar and alien at the same time. Not evil, exactly. Charged. A relic with teeth. If these things worked the way you want them to, you'd be a tourist in your own past. They don't work, and you keep them anyway.

Old perfume is especially powerful, because scent bypasses ordinary distance. A smell doesn't feel like a memory; it feels like the past has walked into the room. But when the perfume has darkened and lost its voice, even the key feels corrupted. The thing that was supposed to restore the past now reads as a damaged message from it. It comforts you by proving the past was real, and wounds you by proving it is unreachable. The objects aren't hostile. They've become haunted.

What I notice is that the rituals around them get more elaborate, not less, the further the year recedes. People build little shrines. A shelf of records arranged in a specific order. A folder of images. A playlist with a specific opening track. A bottle kept sealed in a cupboard with a strip of parafilm around the cap. None of this pretends time has reversed. None of it is naive about that. The shrines aren't a trick to get back; they're a way of keeping faith with the person who was there.

That distinction matters. Nostalgia, the cheap version, wants the place back. What I'm describing knows the place is gone and tends to the trace anyway. It's closer to what Mark Fisher meant when he kept returning to traces of futures that didn't arrive. The corridor stays raised across the field even after the line stops running. The earthwork outlasts its purpose by a factor of three. You can't take the train any more, but the embankment is still arrow-straight, and you can stand on it.

Some years resolve into events and pass through. Other years won't. They settle into the body as a quality of air. 1996 is one of those for me, and I think for a lot of people my age, although I won't pretend it's the same year for everyone. The point isn't the calendar. The point is that the year stops being a date and becomes a temperature. A specific way the afternoon used to feel. A small, irrevocable atmospheric reading.

The shrines are the only way to honour it without lying about it. You don't tell yourself the year is still here. You don't tell yourself you can return. You keep the records arranged. You wear the perfume on a Saturday in spring. You play the synth pad. You let the weather come through.

1996 isn't dead. It's folded into you, the way a season is folded into a bulb. It comes up in its own time, not as the year itself, but as the climate the year left behind.

An Inference Company Now

OpenAI shipped GPT-5.5 this morning, and the two numbers that matter are not on the benchmark slide. They are on the pricing page. Five dollars per million input tokens. Thirty per million output. That is steep enough to make you ask who, exactly, this model is for.

The benchmarks are real. Terminal-Bench 2.0 lands at 82.7%, Expert-SWE at 73.1%, both meaningful jumps in the kinds of tasks where a model has to plan, execute, recover from a failed step, and try again. The model ships with a one million token context window, available across the Plus, Pro, Business, and Enterprise tiers via ChatGPT and the API. For coding agents and computer-use workflows, this is a serious upgrade. For the person opening ChatGPT to draft an email, it is not. Gizmodo's headline put it best: "0.1% more excited about the future of ChatGPT".

Sam Altman's own framing was telling. "I personally like it," he posted on X, which is the kind of endorsement you give a sandwich. The line he was pushing harder was about the inference team: "Really excellent work by the inference team to serve this model so efficiently. To a significant degree, we have to become an AI inference company now." That is not an aside. That is OpenAI admitting, in public, that the differentiation game has moved from training to serving. Anyone can train a frontier model now, or close to it. Whether you can run it cheaply at scale, with predictable latency, against contracts that will not bankrupt you when token usage spikes, that is the actual moat.

The pricing makes more sense in that frame. OpenAI is not aiming this at the consumer who wants a slightly better autocomplete. It is aiming at the enterprise customer building agents that consume tokens by the billion, where higher per-token cost is offset by the model burning fewer tokens per task and finishing more reliably. The MLQ writeup notes that 5.5 maintains prior latency while using fewer tokens for efficiency, which is the language of someone who has been told to optimise the unit economics, not the wow factor.

Compare this with what happened the same morning across the Pacific. DeepSeek previewed V4 Pro at $0.145 in and $3.48 out, with a million token context, open weights, running on Huawei Ascend silicon. GPT-5.5 is a better model, almost certainly. It is also roughly thirty-five times more expensive on input and nearly nine times more expensive on output. Whether that gap is defensible depends on whether the agentic tasks OpenAI is targeting actually need GPT-5.5-grade reasoning, or whether V4 Pro is good enough at a fraction of the cost.
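The arithmetic is worth laying out. A quick sketch in Python, using the two price sheets quoted above; the token counts in the per-task comparison are invented placeholders, there only to show the shape of the bet, which is that a pricier model can still win on cost per completed task if it burns fewer tokens and fails less often.

```python
# Prices quoted above, in dollars per million tokens.
GPT_55 = {"in": 5.00, "out": 30.00}
V4_PRO = {"in": 0.145, "out": 3.48}

print(GPT_55["in"] / V4_PRO["in"])    # ~34.5x on input
print(GPT_55["out"] / V4_PRO["out"])  # ~8.6x on output

def task_cost(price, in_tokens, out_tokens):
    """Dollar cost of one agentic task at a given price sheet."""
    return (price["in"] * in_tokens + price["out"] * out_tokens) / 1_000_000

# Hypothetical task: the dearer model plans better and burns fewer tokens.
print(task_cost(GPT_55, 200_000, 20_000))    # $1.60
print(task_cost(V4_PRO, 900_000, 150_000))   # ~$0.65
```

Even on this toy task V4 Pro undercuts it; the ground GPT-5.5 has to defend is the class of tasks where the cheaper model has to be re-run, or never finishes at all.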

The other thing worth noting is the cadence. In the past week alone, OpenAI shipped a new image generator, workspace agents, a personally-identifiable-information redaction model, a Codex update, and now 5.5. The shipping pace I wrote about in March has not slowed; if anything it has accelerated. None of these launches individually feels like an event. Stacked together, they look like a company trying to occupy every adjacent surface before someone else does.

GPT-5.5 is not the model that takes us to AGI, despite Magic Path's Pietro Schirano reposting that exact framing and Altman amplifying it. It is a competent step on a long curve, priced like a strategic asset rather than a consumer product, optimised for a customer profile that is not most of us. The interesting question is whether the inference-company pivot actually works. Models keep getting commoditised. Serving them well, at predictable cost, might not.


V4 on Ascend

DeepSeek previewed V4 this morning, and the interesting part is not the model. It is the stack underneath it.

The headline numbers are respectable. Two variants: V4 Pro at 1.6 trillion parameters and V4 Flash at 284 billion, both mixture-of-experts, both with a 1 million token context window, up from 128,000 in V3. DeepSeek has named its long-context trick "Hybrid Attention Architecture" and claims world-leading cost efficiency at that window size. Pricing is the familiar undercut-the-frontier gambit. Flash is $0.14 per million input tokens and $0.28 per million output, which sits below Haiku 4.5, Gemini 3.1 Flash, and GPT-5.4 Nano. Pro tops out at $0.145 in and $3.48 out, cheaper than Opus 4.7, GPT-5.5, and Gemini 3.1 Pro. On the company's own benchmarks, V4 Pro is claimed to compete with Claude Sonnet 4.5 on agentic coding and approach Opus 4.5. Benchmarks from the vendor are benchmarks from the vendor, but the last time DeepSeek made a claim like this it was broadly true.

None of that is the story.

The story is that Huawei announced, the same morning, that its entire Ascend supernode product line supports V4. Ascend 950 clusters, stitched together with Huawei's Supernode interconnect, running a frontier-class open-weight model. Cambricon pre-announced compatibility within hours. DeepSeek has not disclosed what hardware it used to train V4, and that silence is doing a lot of work, but the inference layer is now explicitly, publicly, and aggressively domestic.

This is what the export controls were meant to prevent, and it is what they arguably accelerated. Nvidia H100s and H200s got harder to ship to China in 2022, harder still in 2023, then harder again. The policy assumption was that constraining the chips would constrain the models. Instead it constrained the supply chain around Huawei, and Beijing poured money into closing the gap. The gap is not closed, Ascend 950 is not an H200, but it is close enough for a 1.6 trillion parameter MoE to run coherently across it, and "close enough" is what matters once a single Chinese lab can ship a competitive open-weight model end-to-end on domestic silicon. That is a political fact, not a technical one. It will be cited in policy papers for the next decade.

The open-weight angle is the other shoe. V4 ships on Hugging Face with weights anyone can download. The American labs kept their frontier models closed for commercial reasons that made perfect sense at the time; the result, a year after R1 rattled Silicon Valley, is that the cheapest capable model in every size class is Chinese, the best open-weight model in every size class is Chinese, and now the hardware story is Chinese too. Anthropic and OpenAI have spent the last six months accusing DeepSeek of distilling their outputs to train on, and that may well be true, but it is also a slightly desperate argument to be making at this stage. The ship has sailed. The ship was built somewhere else.

I was sceptical about how much V4 would move the needle given how long the wait had stretched, and how many Chinese competitors, MiniMax included, had shipped credible frontier work in the meantime. The model itself might end up being an incremental update. The hardware announcement is not incremental. It is the first time a Chinese lab has said, on the day of a major release, that its production inference runs on chips Washington cannot block.

That changes the shape of the next two years of this race, more than any specific benchmark would.


Lady Denman's Kitchen

At Balcombe in West Sussex, set back from the road, there is a hall with a kitchen at one end and a stage at the other. The kitchen was designed to double as a meeting room for the local Women's Institute. The stage was designed for whist drives and amateur dramatics and the reading of parish council minutes. The walls carry murals by Neville Lytton depicting war and peace. The building is called the Victory Hall, and it was paid for by a woman called Lady Denman, who was the first national chairman of the WI, and it is often described, in the history kept by ACRE, as the first of its kind.

Its kind being the purpose-built English village hall. Which is a thing I did not really understand as a category until I started paying attention to the ones I drove past.

There are thousands of them. Most were built in the decade after the First World War, paid for by the grief of a country that had lost 880,000 men in four years and did not know where to put the feeling. A lot of them are war memorials in the strict sense, with a plaque of names by the door, or sometimes the whole building is the memorial, with the names folded into the act of unlocking it on a Tuesday evening for a yoga class. The Historic England account is clean about this: the halls were a way of converting loss into use.

The machinery that built them is still running, which is the odd part. The Development Commission set up a rural building loan scheme in 1924, and that scheme, passed between departments and renamed and reshaped, is now administered by ACRE on behalf of Defra. A village committee somewhere in Rutland asking about a roof grant in 2026 is asking a question first formalised to help villages bury their dead from the Somme. Nobody on either end of the transaction needs to know this for it to remain true.

What I find myself returning to is the specific shape of the buildings. They are almost always a single volume with a small kitchen bolted on, a stage at one end, and a floor that takes chalk marks well. You can fit a badminton court in the main space, and a jumble sale, and a funeral tea, and a parish council. They were designed to be general-purpose in a way that nothing built now is allowed to be.

The thankful villages get a particular mention in the longer histories of this, the ones where every man came back. Some of those halls are peace memorials rather than war ones, built in a register of quiet amazement that the list of names was empty. You can read the plaque if one exists. There is usually no fanfare.

I walked past one the other week, a low pebbledash building with the initials WI picked out in brick above the door, and the noticeboard had a handwritten poster for a whist drive. Thursday, 7:30. Raffle. No internet address. The building was older than anyone who might attend.


Brussels Moves on ChatGPT

Handelsblatt reported this week, and Reuters confirmed through a Commission spokesperson, that Brussels is days away from designating ChatGPT as a Very Large Online Search Engine under the Digital Services Act. If the decision lands as expected, it will be the first time a generative AI product has been pulled into the DSA's most demanding compliance tier, and it will happen because OpenAI's own numbers forced the question.

The trigger is scale, not function. The DSA hands its harshest obligations to any platform or search engine that averages more than 45 million monthly active users in the EU. OpenAI disclosed that ChatGPT's search feature hit 120.4 million EU users over the six months ending September 2025. That's 2.7 times the threshold. OpenAI was required to publish its user numbers anyway, every six months, so the evidence arrived in the Commission's inbox via the company's own transparency reporting. The only remaining question was whether the Commission would treat ChatGPT as a search engine at all, and a spokesperson has already indicated it will be handled "case-by-case."
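For what it's worth, the designation arithmetic is a one-liner; this is just a restatement of the numbers above, nothing more.

```python
avg_monthly_eu_users = 120_400_000  # OpenAI's disclosed six-month average
vlose_threshold = 45_000_000        # the DSA's designation line

print(avg_monthly_eu_users / vlose_threshold)  # ~2.68, the "2.7 times" above
print(avg_monthly_eu_users > vlose_threshold)  # True
```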

Translation: yes, probably.

What follows a VLOSE designation is not trivial. The Commission's own page lays out the schedule plainly. Four months to comply. Mandatory annual risk assessments covering illegal content, fundamental rights, electoral processes, public health, and the protection of minors. Independent audits. A crisis response mechanism. Researcher data access. Supervisory fees calculated as a percentage of EU turnover. The obligations read like the shape of a regulator trying to catch up with ten years of unchecked product design, all at once, pointed at a company that has been public-facing for less than three years.

OpenAI's position is awkward. The company has spent the past year arguing, plausibly, that ChatGPT is not really a search engine, that it retrieves, synthesises, generates, and does several other things besides. The DSA's definitional scaffolding doesn't care. It cares about the search-shaped function and the user count, and OpenAI built the former and reported the latter. The company can contest the designation at the General Court, which is the path VLOP designees have used before, but that doesn't pause the clock. You still comply while you litigate.

The broader pattern is worth naming. Europe's regulatory posture toward American AI firms has stopped being consultative. The AI Act, the DSA, Ireland's Media Commission calculating supervisory fees — this isn't one framework, it's a stacking set of them, and the interaction effects are where the real enforcement pressure will land. A model provider can comply with the AI Act's GPAI rules and still be on the hook for DSA systemic-risk obligations for the consumer product that wraps it. The state-level pressure in the US is crude by comparison, a threat to yank broadband money to keep states from passing their own laws. Brussels just does the work.

There is a version of this story where the designation is a tidy procedural event, OpenAI ships the risk report on time, the audit clears, nothing visibly changes for the EU user. That is probably how the next six months go. But the precedent is the point. Once one generative AI service is inside the VLOSE tent, every other chatbot that reports EU usage becomes a candidate by arithmetic. Gemini already clears the threshold. Claude will, if it hasn't. Perplexity is smaller but arguably more search-shaped than any of them. The Commission has been handed an instrument and a user-count floor, and it knows how to use both.

The regulators caught up faster than I expected. That might be the most interesting part.


A Contractor Had Mythos

Three days after the NSA quietly joined the Mythos preview, an unauthorised group has the model too. Anthropic confirmed on Wednesday it is investigating the incident. The vector, per the reports, was a third-party contractor's environment. A private online forum ended up with access to the system Anthropic had chosen not to release broadly.

This is the thing everyone was worried about, and it arrived on roughly the schedule you'd expect.

The timeline is short and bleak. April 7: Mythos announced, a limited set of vetted partner organisations given keys. April 16: Opus 4.7 lands with Mythos held back as its more capable but gated cousin. April 20: reporters reveal the NSA has access, a detail Anthropic had not disclosed. April 21: TechCrunch runs the breach story. Bloomberg and SiliconAngle follow the same day. By Wednesday morning the former National Cyber Director is telling Fortune that Mythos can hack nearly anything and the country is not ready, and Chubb's chief executive is on an earnings call using the phrase "the arms race is on."

Two weeks from closed preview to unauthorised access. If you sat down to script how a controlled rollout of a frontier offensive-security model would fail, you would write something close to this. Not a direct breach of Anthropic's corp network. Not a jailbreak of the model itself. A vendor relationship. Someone with legitimate keys whose environment turned out to be the weakest link in the chain.

There is a particular irony here that I want to name plainly. Anthropic's official posture is that Mythos exists, in part, to identify the next generation of supply-chain vulnerabilities. The company has been telling the White House and Treasury that frontier models are how the United States gets ahead of its own software fragility. The specific way their own most guarded capability leaked was through the class of risk the model was supposed to find. The fourth party had the keys.

I don't think this ends the Mythos program, for what it's worth. The NSA is presumably still using it. The courts will continue to hear the Pentagon's supply-chain case against Anthropic while the intelligence community continues to consume the product. The lesson the industry will draw is not "don't build Mythos." The lesson will be: tighten the vetted-partner list, redo the vendor attestations, add another audit layer. Business as usual, one notch paranoid.

What the breach actually demonstrates is quieter. A model described as capable of chaining software exploits and discovering flaws at scale is now, in some unknown quantity, outside the boundary of the vetted organisations that were supposed to hold it. Whoever has it does not need to exfiltrate weights or reverse-engineer the system card. They just need API access through someone else's key. That is a fundamentally different threat model from "a secret AI lab builds something scary." It's "a secret AI lab builds something scary and then a mid-tier consulting firm's Okta misconfiguration hands it to a chat room."

Chubb's Greenberg, whatever else you think about insurance executives on earnings calls, picked the right noun. This is an arms race, and the starting gun just went off sideways.


Twenty-Four Points

The £2,924,622 was won on the weekend the country stopped caring. A syndicate of regulars from the Yew Tree pub in Manchester shared the cheque, the largest payout in pools history. The date was November 1994. That same weekend, on a Saturday night televised by Noel Edmonds, the National Lottery launched with seven winners splitting £5,874,778 on its inaugural draw. The pools had peaked and been displaced in the same breath.

Football pools were never really about football. The Treble Chance, introduced by Littlewoods in 1946, asked you to pick eight matches you thought would end as score draws. Score draws counted for the most points; no-score draws less; home and away wins least. You wanted twenty-four. The maths punished favourites and rewarded the kind of ordinary dull Saturday in March when nine matches across the lower divisions all finished 1-1. Most weeks nobody got the maximum. Some weeks the dividend was £2.94 million.
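If you want the scoring as a mechanism rather than a memory, a minimal sketch: the point values here follow the commonly cited late-era scheme (3 for a score draw, 2 for a no-score draw, 1.5 for an away win, 1 for a home win), though the exact weights shifted over the decades.

```python
# Points per result under one common late-era Treble Chance scheme.
POINTS = {"score_draw": 3.0, "no_score_draw": 2.0, "away_win": 1.5, "home_win": 1.0}

def line_points(results):
    """Score one line: the outcomes of the eight matches you selected."""
    return sum(POINTS[r] for r in results)

print(line_points(["score_draw"] * 8))  # 24.0, the maximum the column chases
```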

Ten million people played every week at the peak. Ten million in a country of fifty-eight million. The coupon arrived through the door from a collector you knew by name, or you carried it yourself to the corner shop. You filled it in with a biro, thinking about Crewe versus Bury, and you handed it back with the right coins counted out. Saturday tea-time was a calculation. The teleprinter at five o'clock, the classified results read by James Alexander Gordon in that cadence which rose for the away score and fell for the home, and somewhere in the rhythm a person was working out whether their eight had come in.

The lottery undid all of this in eighteen months. It wasn't that the pools were unfair, or hard to play, or unpopular. They were popular. They had been popular for sixty years. They were displaced by something faster: a six-number ticket bought at the counter of the Spar, drawn on television by a machine, with a jackpot that started at seven figures and rolled over until it hit eight. The pools required you to think about football. The lottery required you to think about nothing.

Vernons closed its pools operation in February 1998. Littlewoods sold out to Sportech in 2000 for £161 million, a number that would have been laughable five years earlier. Ten million players became 830,000 by 2006, then 700,000 by 2007, then something smaller still that nobody publishes loudly. Thousands of jobs went on Merseyside, most of them women who had counted coupons in the great Art Deco building on Edge Lane that Littlewoods opened in 1938 and which has stood derelict since 2003.

The texture you can't recover is the weekly arithmetic. People who would have called themselves bad at maths could in fact do permutation calculations in their head, trading coverage for cost, balancing what their pension would stand against what Saturday might bring. They were running probability models in a ledger in the kitchen drawer. Then a machine drew six balls and the ledger closed.
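That ledger arithmetic is easy to restate in code. A sketch with invented stake values: a "full perm" covers every possible 8-match line from a wider set of selections, so picking ten matches instead of eight multiplies both your coverage and your outlay.

```python
from math import comb

selections = 10
lines = comb(selections, 8)    # every 8-from-10 combination: 45 lines
stake_per_line = 0.01          # illustrative stake, e.g. 1p a line

print(lines)                   # 45
print(lines * stake_per_line)  # 0.45 -- the week's outlay for full coverage
```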

I don't think the lottery was a worse thing. It pays out in ways the pools never quite did: lump sums, instant millionaires, a charitable arm that built half the country's velodromes. But it asks nothing of you. The pools asked you to look at the fixtures.

Yew Tree won the most. Yew Tree won last.


Forty Labelers

Before ChatGPT, there was a paper. March 4, 2022. Ouyang, Wu, Jiang, Almeida, and a cast of co-authors long enough to fill a film's closing credits, posting to arXiv under the title "Training language models to follow instructions with human feedback." Inside the paper sits the specific mechanism that turned a statistical parrot into something you could ask for things.

GPT-3, for all its parameter count, did not follow instructions. It predicted the next token. If you gave it "Summarise this paragraph in one sentence," it would happily extend the paragraph, suggest ten more instructions, or ignore you entirely and generate a shopping list. Prompt engineering was the art of tricking it into the shape of the task. Most people gave up after a few tries.

OpenAI's fix came in three stages. First, supervised fine-tuning. Forty human labelers sat down and wrote, by hand, roughly thirteen thousand demonstrations of the form (prompt, correct response). The model was fine-tuned on these the way you'd fine-tune on any other dataset. This alone got them most of the way there. The SFT model already outperformed vanilla GPT-3 on instruction tasks, and a reasonable person might have called it done.
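As a sketch of what that stage amounts to, assuming a causal language model and PyTorch conventions (the function name and the masking choice are mine, not the paper's): the demonstration is appended to the prompt, and ordinary next-token cross-entropy is applied to the response tokens only.

```python
import torch
import torch.nn.functional as F

def sft_loss(logits, token_ids, prompt_len):
    """Supervised fine-tuning loss on one (prompt, demonstration) pair.

    logits:     (seq_len, vocab) model predictions for each position
    token_ids:  (seq_len,) the concatenated prompt + labeler-written response
    prompt_len: number of prompt tokens, excluded from the loss
    """
    pred = logits[:-1]               # position t predicts token t+1
    target = token_ids[1:].clone()
    target[: prompt_len - 1] = -100  # mask prompt tokens (ignored by the loss)
    return F.cross_entropy(pred, target, ignore_index=-100)
```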

They didn't. The second stage was a reward model. Same labelers, different task: presented with a prompt and several model outputs, rank them from best to worst. That preference data trained a separate model whose only job was to predict, given a candidate response, how much a human would like it. A critic, in the old-fashioned sense. It has no opinions of its own, only an internalised sense of what the labelers collectively preferred.
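The loss for that stage is compact enough to quote in full. A sketch, with illustrative names: the paper trains on every pair a ranking implies, and each (better, worse) pair contributes -log sigmoid(r_better - r_worse), pushing the preferred output's score above the rejected one's.

```python
import torch
import torch.nn.functional as F
from itertools import combinations

def ranking_loss(scores_best_to_worst):
    """Reward-model loss over one ranked set of K outputs for a prompt.

    scores_best_to_worst: 1-D tensor of reward-model scores, ordered
    best-to-worst by the human ranking.
    """
    k = len(scores_best_to_worst)
    losses = [
        -F.logsigmoid(scores_best_to_worst[i] - scores_best_to_worst[j])
        for i, j in combinations(range(k), 2)  # i ranked above j
    ]
    return torch.stack(losses).mean()
```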

Third stage, the reinforcement learning itself. They took the SFT model, let it generate responses to new prompts, scored each response with the reward model, and used Proximal Policy Optimization to shift the weights so that higher-reward tokens became more likely. The critic graded, PPO updated. Round and round. The original pretraining objective got mixed back in (they called this PPO-ptx) to stop the model from forgetting how to write English while chasing the reward.
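Stripped of PPO's clipping machinery, the scalar being maximised looks something like the sketch below. The names are mine; beta weights the KL penalty that keeps the policy near the SFT model, and gamma weights the mixed-in pretraining term the paper calls ptx.

```python
def rl_objective(reward, logp_policy, logp_sft, pretrain_logprob, beta, gamma):
    """Per-sample PPO-ptx objective, sketched (inputs are torch tensors).

    reward:           scalar reward-model score for this (prompt, response)
    logp_policy:      per-token log-probs of the response under the policy
    logp_sft:         per-token log-probs under the frozen SFT model
    pretrain_logprob: log-likelihood of a pretraining batch under the policy
    """
    kl_penalty = beta * (logp_policy - logp_sft).sum()
    return reward - kl_penalty + gamma * pretrain_logprob
```

PPO's updates then make responses that score higher under this quantity more likely, which is the "round and round" above.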

The headline result: a 1.3 billion parameter InstructGPT was preferred by labelers over the 175 billion parameter GPT-3 it started from. A model a hundred times smaller, judged better, because it had been shown what better looked like. Size still mattered. But the gap between "big" and "useful" turned out to be bridgeable by thirteen thousand demonstrations and a ranking tool.

What the paper doesn't advertise is what the technique inherits. Reinforcement learning from human feedback had been kicking around since Christiano et al. in 2017, where it taught agents to perform tasks in simulated environments and Atari games by eliciting human preferences rather than writing down a reward function. Teaching a model to be helpful is, structurally, the same problem: you cannot write the reward function, so you collect it from humans and train a model to stand in for their judgement. What changed was the scale of the demonstration set and the object being trained.

Every model you talk to that acts like an assistant is, underneath, some descendant of this pipeline. The chain-of-thought monitoring that Anthropic relies on to catch deception is a shadow cast by this exact mechanism. The model learned to produce reasoning the reward model liked. Whether that reasoning is faithful to the computation underneath is a question the 2022 paper did not ask. Four years later, it's the question everyone is asking.
