OpenAI Just Handed Me a Delete Key
GPT-4 Turbo, the Assistants API, and the quiet death of a lot of code I wrote.
The 128k context was real. The Assistants API was not. OpenAI deprecated it in favor of the Responses API — the whole stateful-thread abstraction got rethought. The delete key moment was genuine but the thing it deleted got deleted too. GPT-4 Turbo itself became a footnote once GPT-4o shipped.
Sam Altman just told a room full of developers to call him directly on his cell phone, which is either the most confident thing a CEO has ever said into a microphone or a threat.
It was a livestream. I watched the whole thing. Here is what happened.
GPT-4 Turbo with 128k context (which OpenAI pegs at more than 300 pages of text, plus your feelings about them) is now the model, and it is 2.75x cheaper than the GPT-4 it replaces. Not 3x, not 2x, specifically 2.75x, a number that feels like it was produced by a CFO having a minor episode. Its training data runs through April 2023, which means it doesn't know about itself, a fact I find comforting for reasons I can't fully articulate.
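For the curious, here is a back-of-the-envelope sketch of where a number like 2.75x can come from. The per-token prices and the 9:1 input-to-output mix below are assumptions for illustration, not figures quoted in this post: if input tokens got 3x cheaper and output tokens got 2x cheaper, a blended figure lands somewhere in between, depending on the mix.

```python
# Assumed launch list prices in USD per 1M tokens (not stated in the post).
GPT4 = {"input": 30.0, "output": 60.0}
GPT4_TURBO = {"input": 10.0, "output": 30.0}


def blended_ratio(input_tokens: float, output_tokens: float) -> float:
    """How many times cheaper the new model is for a given token mix."""
    old_cost = input_tokens * GPT4["input"] + output_tokens * GPT4["output"]
    new_cost = input_tokens * GPT4_TURBO["input"] + output_tokens * GPT4_TURBO["output"]
    return old_cost / new_cost


# A hypothetical workload with nine input tokens for every output token.
print(blended_ratio(9, 1))  # 2.75
```

Under those assumptions the CFO is off the hook: an all-input workload gives 3x, an all-output workload gives 2x, and a prompt-heavy mix sits at 2.75x.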
JSON mode. A seed parameter for reproducible outputs. Parallel function calls in a single request. These are the kinds of features that don't make headlines but make grown engineers cry quietly into their laptops, because they represent months of workarounds — the if "```json" in response guards, the retry loops, the elaborate prompt engineering that exists solely to coax a language model into behaving like a function — suddenly becoming dead code.
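For anyone who never had to write one, this is roughly what that soon-to-be-dead code looks like. It is a minimal sketch, not anyone's production code: `parse_model_json`, `call_with_retries`, and the `ask_model` callable are hypothetical names standing in for whatever wrapper you had around your model calls.

```python
import json

# Built at runtime so the example can live inside a fenced block itself.
FENCE = "`" * 3


def parse_model_json(response: str):
    """Parse JSON from a model reply that may be wrapped in a markdown fence."""
    text = response.strip()
    if FENCE in text:
        # Keep only what follows the first fence, up to the next one.
        text = text.split(FENCE, 2)[1]
        # Drop a leading language tag such as "json".
        if text.lstrip().lower().startswith("json"):
            text = text.lstrip()[4:]
    return json.loads(text.strip())


def call_with_retries(ask_model, prompt: str, attempts: int = 3):
    """Ask again until the reply parses; ask_model is a hypothetical callable."""
    last_error = None
    for _ in range(attempts):
        try:
            return parse_model_json(ask_model(prompt))
        except json.JSONDecodeError as err:
            last_error = err
    raise ValueError(f"no valid JSON after {attempts} attempts") from last_error
```

With JSON mode the model is supposed to return parseable JSON on its own, so in principle the fence-stripping branch goes away and a plain `json.loads` is enough. Whether you also delete the retry loop is between you and your pager.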
Speaking of which.
They announced the Assistants API — retrieval, code interpreter, function calling, all of it, baked in, managed by OpenAI, no infrastructure required. RAG plus custom agent scaffolding as a platform primitive. My immediate reaction was that I could delete a significant portion of a codebase I've been nursing for months, which is either a great product announcement or a personal tragedy, depending on how you look at sunk costs.
DALL-E 3 is in the API now too. This is going to make the doomsday machine significantly more threatening at the next opportunity. I will not elaborate.
The big new thing they're calling GPTs — their branding for agents, lowercase-a, the kind of agents you assemble with natural language rather than Python — is genuinely interesting and also a little terrifying in the way that all "program with natural language" pitches are terrifying, which is that they're right. They're solving it at the right layer. The demo was a travel assistant that Altman built onstage, and it worked, and the audience applauded, and somewhere in the world a developer is calculating how many lines of code it replaces.
The answer, conservatively, is a lot.
Custom models exist for enterprises willing to pay the kind of money where Sam Altman gives you his cell number. If you have to ask, and you do, you cannot afford it.
One note: as of the end of the keynote, none of the new APIs were actually live. I checked. Multiple times. Someone at OpenAI needed to flip a switch and had not yet done so. The announcements were real, the pricing was real, the demos were real; the infrastructure was on a one-hour delay, which is its own kind of metaphor for the whole enterprise.
The switch did eventually flip. I am now staring at an Assistants API reference and a folder full of code I wrote last month that appears to have aged out of relevance in real time, which is the fastest I have ever experienced technical debt.
Two and three-quarter times cheaper.