Batman and Robin Are the Same Person Now
GPT-5 solved something Opus and Sonnet couldn't, and I'm not sure what to do with that.
The leapfrogging between models continues, and with each release a new tool earns your respect all over again.
First time running GPT-5 in Codex. First time. And it immediately walks into a UI problem that had beaten two Claude models and just — solves it.
I've been living in Sonnet and Opus long enough that I'd stopped thinking of them as things that could fail. They fail. This problem was particularly tricky — not "the model is confused" tricky but "this requires holding a specific kind of spatial reasoning across a weird interaction state" tricky — and both of them came up short. GPT-5 did not.
There's a specific sensation when a new tool earns your respect and it feels almost exactly like the sensation of losing it for something else. I'm not saying Sonnet is Robin now. I'm saying I genuinely don't know who's Batman.
The honest answer is probably that neither of them is Batman. They're different animals who occasionally occupy the same niche, and depending on what you're hunting, you want one or the other, and sometimes you won't know which until one of them fails you in production at midnight.
I'm going to let GPT-5 brew for a while. Let it run. See what it does unsupervised. This is how you learn what something actually is — not the benchmark, not the release post, but the specific weird Tuesday problem it either cracks or doesn't.
The batting order just got complicated.