Sam Altman Has A Soft Spot For GPT-2
A mystery model is beating everything in the LMSYS arena and OpenAI's CEO is doing his best impression of someone who knows nothing about it.
It was GPT-4o being tested before launch. The 'not trying very hard to be plausible' deniability observation was exactly right. The 'im-also-a-good-gpt2-chatbot' name was peak OpenAI humor.
There's a model on LMSYS Chatbot Arena called gpt2-chatbot. It is not GPT-2. GPT-2 was released in 2019, has 1.5 billion parameters, and cannot write working code. gpt2-chatbot is doing things GPT-2 cannot do, which is everything.
It appeared in the arena around May 1st. Nobody announced it. LMSYS didn't explain it. It just showed up and started beating GPT-4 and Claude Opus in head-to-head votes, casually, like this was a normal thing to happen.
Now there are two of them. gpt2-chatbot and im-also-a-good-gpt2-chatbot, which is a name that was clearly written by someone who finds this extremely funny.
Sam Altman today tweeted that he has, quote, a soft spot for gpt2. This is the kind of plausible deniability that isn't trying very hard to be plausible. He said nothing else. He doesn't need to. The whole industry already knows what this is — a capability demonstration dressed up as a retro joke, parked in a public benchmark so the numbers can speak before the press release does.
The release is coming. Could be days. Whatever it is, they wanted people to see it win first.
There's something almost charming about this as a strategy: laundering a flagship model through a chatbot named after a five-year-old architecture so the leaderboard scores are real and uncontestable before anyone starts arguing about vibes. The arena doesn't know who built the model. It just knows which one people preferred.
Turns out people prefer the one that's actually good. Groundbreaking stuff.