OpenAI's Efficient Scraps
GPT-4.1 dropped today and it's not trying to win anything — which is maybe the whole point.
The efficiency read was correct. GPT-4.1 was OpenAI admitting that cost-per-token matters more than benchmark crowns. The naming simplification still hasn't happened.
People were expecting open weights. What they got was GPT-4.1.
Not the same thing.
The headline numbers are fine — 26% cheaper than GPT-4o for median queries, 1M token context, nano is their fastest and cheapest model ever, SWE-bench up. If you squint at it like a press release it looks like a strong release. OpenAI even promised to simplify naming, which, gesturing broadly at the model lineup, did not happen.
The more interesting read is what this release isn't doing. Anthropic clearly had to go all the way in to get Claude 3.7 Sonnet where it landed on coding benchmarks — you can feel the effort in the thing. GPT-4.1 is sitting at roughly half the cost for roughly 90% of the SWE-bench score, and it doesn't feel like they strained to get there.
That gap — half the cost, minus ten percent on the benchmark — is either a gift to developers or a message to competitors. Probably both.
The theory that makes sense: this isn't OpenAI's play to retake the coding crown. This is OpenAI clearing the table, making the commodity layer cheaper and faster before they release whatever agentic thing they've been building, the thing that actually requires models running thousands of times in a loop and where per-call cost is the only number that matters.
You drop cheap, fast models into the world. You wait for everyone to build on top of them. Then you drop the agent layer.
The scraps were load-bearing all along.
Counterpoints
Push back, extend the argument, or sharpen it. New counterpoints go through review before they show up here.
No approved counterpoints yet.