expectedwrong hindsight

The $5.5 Million Number Is Wrong and It Doesn't Matter

DeepSeek dropped an open-source model that broke the narrative, and WSJ had to cover it, which means it's real.

2 min read 355 words #open-source #deepseek #ai #llm
hindsight — nailed it

The analysis was correct — the $5.5M was the last coat of paint, not the total bill. But the efficiency was real. DeepSeek's contribution was genuine regardless of the accounting. Getting the number wrong and the conclusion right was the whole point.

The $5.5 million figure is bait. It's the number DeepSeek cited for training R1 — a reasoning model that goes head-to-head with o1 — and it traveled around the internet at the speed of a crisis because it implies that everything the large labs have been selling us about the cost of frontier AI is, if not a lie, then at minimum a very convenient story.

But the number is undercooked. It doesn't include the prior runs, the ablations, the compute spent on dead ends, the infrastructure that was already paid for, the human feedback that doesn't show up on a GPU invoice. The $5.5M is the final training run, not the total bill. When people say "they built GPT-4 for $5 million," they're doing the equivalent of pricing a house by the cost of the last coat of paint.

None of this means DeepSeek didn't do something extraordinary.

They did. The actual achievement — which the $5.5M framing both inflates and obscures — is that a team outside the usual club produced a frontier-class model and then open-sourced it. The weights are out. Anyone with the right hardware can run it. That part is not ambiguous, not a rounding error, not contingent on accounting methodology.

The reason WSJ had to run the piece is the same reason the number went viral. The story isn't really about DeepSeek. It's about what DeepSeek implies: that the moat everybody assumed existed — the one that justified the valuations, the chips hoarding, the "we're the only ones who can do this responsibly" posturing — might be shallower than advertised.

Open source wins are usually slow and quiet. Someone releases a model, researchers integrate it, a year later it's infrastructure nobody thinks about. This one wasn't quiet. This one landed like a piece of foreign policy.

The $5.5 million will get debunked — partially, correctly, with caveats — and the debunking will travel half as far as the original claim, which is how it always goes. Meanwhile the weights are still out there, the model still runs, and the WSJ still had to write about it.

That's the tell.