GPT-5.4 launched this week. It merges the coding muscle of GPT-5.3-Codex with improved reasoning, native computer-use capabilities, and an experimental million-token context window. On paper it is the most capable model OpenAI has ever shipped. The reaction across every forum I follow has been a collective shrug.

That silence says more than any benchmark.

The GPT-5 era started badly. When OpenAI unified its model lineup last year under the GPT-5 brand, it simultaneously killed GPT-4o, GPT-4.1, GPT-4.5, and the entire o-series — with no deprecation period. Power users woke up to find the models they'd built workflows around simply gone, replaced by a router that quietly picked which sub-model would answer each query. The backlash was immediate and brutal. People who'd been paying $200 a month for o3-level reasoning discovered they were getting something closer to 4o-mini for half their prompts.

OpenAI scrambled. Within 72 hours they bolted on "Auto," "Fast," and "Thinking" toggles and restored legacy model access for paid users. But the damage was done. Trust, once lost with power users, doesn't reinstall with a patch.

Since then it's been relentless iteration — GPT-5.2, 5.2-Codex, 5.3, 5.3-Codex, 5.3 Instant with its 26.8% hallucination reduction, and now 5.4. Each release brings genuine improvements. The 5.3-Codex model was legitimately impressive for agentic coding. The Instant variant finally addressed the chronic overrefusal problem that made earlier versions refuse to help you rename a variable if it contained the word "kill." These are real engineering wins.

But nobody's excited.

Part of it is the trust deficit OpenAI keeps widening through its own choices — Pentagon deals signed hours after publicly praising the competitor who refused them, ads in a product whose CEO once called advertising in AI "uniquely unsettling." Part of it is that Anthropic and Google have been shipping at the same pace. Gemini 2.5 Pro has dominated the LMArena preference leaderboard for months, and Claude hasn't stood still either. When everyone improves simultaneously, no single release feels like a leap.

My prediction: the version-numbering treadmill will accelerate further. Hints about 5.5 are already surfacing in the wake of GPT-5.4. Each point release will be competent, occasionally excellent, and met with diminishing enthusiasm. OpenAI's problem was never capability. It was credibility. And you can't ship your way out of that.