On Saturday, DeepSeek announced that the 75% discount on its flagship V4-Pro model is no longer a discount. It's the price. The promotion was due to expire on 31 May; instead the company locked the new rates in indefinitely. Output tokens now cost $0.87 per million. Cached input sits at $0.003625. The standard input rate is $0.435.

For context: GPT-5.5 charges $5 per million input and $30 per million output. Claude Opus 4.7 is $5 in, $25 out. Decoder ran the comparison and put V4-Pro at roughly 34 times cheaper than GPT-5.5 on output, and about 52 times cheaper once you cross GPT-5.5's long-context tier above 272K tokens. Those gaps aren't margin; they are a different business model wearing the same product shape.
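
The multiples above can be checked against the list prices quoted in this piece. The following is a minimal Python sketch; the dictionary keys and the `price_multiple` helper are illustrative names, not anything from a vendor SDK. The long-context figure it prints is only the output rate that the 52x multiple would imply, not a rate stated here.

```python
# Per-million-token list prices in USD, as quoted in this article.
# The model keys are illustrative labels, not official API identifiers.
PRICES = {
    "deepseek-v4-pro": {"input": 0.435, "output": 0.87},
    "gpt-5.5": {"input": 5.00, "output": 30.00},
    "claude-opus-4.7": {"input": 5.00, "output": 25.00},
}


def price_multiple(model: str, kind: str, baseline: str = "deepseek-v4-pro") -> float:
    """Return how many times more `model` charges than `baseline` for `kind` tokens."""
    return PRICES[model][kind] / PRICES[baseline][kind]


def implied_rate(multiple: float, kind: str = "output", baseline: str = "deepseek-v4-pro") -> float:
    """Back out the per-million rate that a quoted multiple would imply (an inference, not a list price)."""
    return multiple * PRICES[baseline][kind]


if __name__ == "__main__":
    for model in ("gpt-5.5", "claude-opus-4.7"):
        for kind in ("input", "output"):
            print(f"{model} {kind}: {price_multiple(model, kind):.1f}x V4-Pro")
    # Rate implied by the "about 52 times" long-context comparison.
    print(f"implied long-context output rate: ${implied_rate(52):.2f} per million")
```

On these numbers the gap is much wider on output than on input, so the multiple a given buyer sees depends on how output-heavy the workload is.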

The framing of "price war" has been used for every Chinese model release since the original V3 in late 2024, and it has become a tired phrase. What is actually new here is the supply story. V4-Pro is the first Chinese frontier model that runs natively on Huawei Ascend 950 silicon rather than Nvidia. DeepSeek told customers at launch a month ago that prices would ease once Ascend 950 supernodes started arriving in volume, and warned that until then the Pro tier could cost up to twelve times more than the lighter Flash model because of compute constraints. Saturday's lock-in is the company saying the supply problem is solved well enough to commit to the new floor.

That is the part worth paying attention to. The token price is a headline; the chip pivot is the structural fact. US export controls on the most advanced Nvidia parts pushed Chinese buyers toward Huawei. A second layer of restrictions on chipmaking equipment slowed Huawei's own ramp. Both pressures are still in place. What changed is that DeepSeek has decided the Ascend pipeline is reliable enough to price against, which is a different kind of bet than running benchmarks on borrowed hardware.

The interesting question is what this means for buyers who do not live inside Anthropic's or OpenAI's stack. The flagship tax was already collapsing inside the Western labs; you could get 95% of the quality for a third of the cost by dropping from full GPT to a mini variant. DeepSeek's move is a more aggressive version of the same trick, only the output price drops to roughly two to three cents on the dollar, and the savings hold whichever tier of Western model you benchmark against. For a CTO running document analysis or codebase review across a million tokens of context, the math is no longer close.
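
To make that math concrete, here is a rough Python sketch of the cost of one long-context job at the base list prices quoted earlier. The job shape (one million input tokens, 20,000 output tokens) is a hypothetical chosen for illustration, and `Rates` and `job_cost` are made-up names. The sketch ignores cached-input pricing and GPT-5.5's long-context tier above 272K tokens, whose rates are not given here, so it likely understates the GPT-5.5 bill for a job this size.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    """Per-million-token list prices in USD."""
    input_per_m: float
    output_per_m: float


def job_cost(rates: Rates, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one request at flat per-million rates (no caching, no tier surcharges)."""
    return (input_tokens * rates.input_per_m + output_tokens * rates.output_per_m) / 1_000_000


# Base rates as quoted in this article.
V4_PRO = Rates(input_per_m=0.435, output_per_m=0.87)
GPT_5_5 = Rates(input_per_m=5.00, output_per_m=30.00)

# Hypothetical document-analysis job: large context in, short report out.
INPUT_TOKENS = 1_000_000
OUTPUT_TOKENS = 20_000

if __name__ == "__main__":
    v4 = job_cost(V4_PRO, INPUT_TOKENS, OUTPUT_TOKENS)
    gpt = job_cost(GPT_5_5, INPUT_TOKENS, OUTPUT_TOKENS)
    print(f"V4-Pro:  ${v4:.4f} per job")
    print(f"GPT-5.5: ${gpt:.4f} per job")
    print(f"ratio:   {gpt / v4:.1f}x")
```

Because this job is dominated by input tokens, the blended multiple at base rates lands closer to the input-price gap than to the headline output-price gap. Changing the two token counts shows how the ratio shifts with the input-to-output mix.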

The complications are real. Training-data provenance for V4 is opaque, Anthropic has openly accused DeepSeek of distillation against earlier Claude generations, and routing enterprise traffic through a Chinese API still trips most large companies' procurement processes. Whether that gets resolved through audited deployments, private hosting, or just slow erosion of caution is the actual question. Reuters reported DeepSeek is chasing a $45 billion valuation off this strategy. The playbook is Amazon Retail circa 2002: give up margin, take the demand, build the moat. The new wrinkle is that the moat is silicon made in Shenzhen.

There's a smaller observation underneath all this, which is that the phrase "frontier model" is starting to do too much work. V4-Pro is not the most intelligent model in the world. It is the cheapest model that is intelligent enough for almost everything, and that has become a separate axis of competition entirely.
