
The Architect Nobody Planned For

Speculation about o3 on OpenAI's big announcement day, and what o1 actually does in a real workflow.

#openai #o1 #o3 #llm-workflow #developer-tools
hindsight — nailed it

o3 shipped. They really did skip o2 out of respect for a phone company. The speculation was correct, and the fact that a skipped version number landed as a real possibility was the whole joke.

Sam Altman tweeted something (vague enough to guarantee maximum speculation, specific enough to confirm something is happening) and now everyone is running the guessing game. The top predictions: o3 (they're apparently skipping o2 out of respect for a phone company, which is the kind of thing that happens in this industry with complete sincerity), GPT-4.5o, or OpenAI finally delivering on the full modalities promised when they announced Omni, back when we were all a year younger and more naïve about timelines.

Someone joked that it might be "Hi here is the first provably conscious AI model." The room didn't laugh as hard as it should have, because OpenAI's year has been unhinged enough that it lands as a real possibility. After the board drama, the departures, the Sora rollout, and the nonprofit-to-for-profit saga, a consciousness announcement would be, honestly, on-brand.

The o3 speculation is the interesting one. That's a big leap to make publicly — skipping a version number implies they've nailed something hard enough to justify the theater of the announcement. The last time they were two years ahead, it reshaped everything. So either they've done it again, or this is a different flavor of hype.

But here's the thing I keep thinking about, on the day everyone's watching for a new model: o1 already occupies this specific, weird, unplanned niche in the workflow that nobody really talks about clearly.

It's not a daily driver. Sonnet is faster — meaningfully faster — and for task-based coding, for the back-and-forth, for "do this thing and then this other thing," Claude wins. Not because o1 is bad, but because o1 is slow in a way that breaks the rhythm of actual work.

But the moment something goes sideways in the codebase, or you've been heads-down long enough with Sonnet that you've genuinely lost the plot, you repomix everything out, drop it into o1, and ask it to double-check. Most of the time it comes back with a materially better picture of where things actually are. It sees the whole thing differently.

The pattern that stuck with me: collecting abandoned GitHub repos — old projects that solved specific problems well, then got left to rot — turning them all into text, dropping them into a single o1 prompt, and asking it to synthesize a modern library from the cherry-picked best parts. That's not a thing you'd ask a fast model to do in a loop. That's a different class of operation entirely. One long, expensive, considered thought instead of a conversation.
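For the curious, the "turn the repos into text" step is less magic than it sounds. Here's a minimal sketch in Python, standard library only. It is not how repomix does it; the function names, the delimiter format, the suffix list, and the size cap are all my own assumptions for illustration. It walks each checkout, skips the obvious junk, and glues what's left into one big prompt.

```python
from pathlib import Path

# Assumptions for illustration: which suffixes count as readable source,
# and which directories are noise. Tune both for your own repos.
TEXT_SUFFIXES = {".py", ".js", ".ts", ".go", ".rs", ".md", ".txt", ".toml", ".json"}
SKIP_DIRS = {".git", "node_modules", "__pycache__", "dist", "build"}


def pack_repo(root, max_file_bytes=200_000):
    """Flatten one repo checkout into a single labeled text blob."""
    root = Path(root)
    parts = [f"===== REPO: {root.name} ====="]
    for path in sorted(root.rglob("*")):
        rel = path.relative_to(root)
        # Skip anything that lives under a noise directory.
        if any(part in SKIP_DIRS for part in rel.parts):
            continue
        if not path.is_file() or path.suffix.lower() not in TEXT_SUFFIXES:
            continue
        # Skip very large files so one data dump doesn't eat the prompt.
        if path.stat().st_size > max_file_bytes:
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        parts.append(f"--- FILE: {rel.as_posix()} ---\n{text.rstrip()}")
    return "\n\n".join(parts)


def build_synthesis_prompt(repo_roots, goal):
    """Put the ask first, then every packed repo, as one prompt string."""
    packed = "\n\n".join(pack_repo(root) for root in repo_roots)
    return f"{goal}\n\n{packed}\n"
```

You'd call `build_synthesis_prompt(["./old-repo-a", "./old-repo-b"], "Synthesize a modern library from the best parts of these projects.")` and paste the result into the model. The paths and the wording of the goal are placeholders, and whether the result fits in the context window is something you'd have to check yourself.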

The cursor-yolo-agent-mode crowd has figured out a version of this — using o1 as the architect, not the worker. Which is maybe what it's for. It just took everyone a while to stop trying to use it like the other thing.
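If you want the architect-versus-worker split in code form, here's a rough sketch. It is not how Cursor or anyone else actually wires this up. `architect` and `worker` are hypothetical callables (prompt in, text out) standing in for whatever slow reasoning model and fast model you use, and the "one step per line" plan format is my own assumption.

```python
from typing import Callable, List

# Hypothetical interface: any function that takes a prompt and returns text.
Model = Callable[[str], str]


def run_architect_worker(task: str, context: str,
                         architect: Model, worker: Model) -> List[str]:
    """One slow planning call, then one fast call per planned step."""
    # The expensive model sees everything once and returns a plan.
    plan = architect(
        f"Task: {task}\n\nContext:\n{context}\n\n"
        "Reply with a plan, one step per line."
    )
    # Assumption: every non-empty line of the reply is one step.
    steps = [line.strip() for line in plan.splitlines() if line.strip()]
    # The fast model handles each step on its own.
    return [
        worker(f"Context:\n{context}\n\nDo exactly this step:\n{step}")
        for step in steps
    ]
```

The point of the shape is the call count: the slow model is invoked once, the fast one as many times as there are steps, which keeps the rhythm-breaking latency out of the loop.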

So if o3 is what the speculation suggests, a materially smarter reasoning model, the question isn't whether it's impressive. It's whether it fills a gap that already exists, or creates a new one nobody saw coming. The last genuinely surprising thing wasn't a capability; it was watching people discover what the capability was actually good for.

We'll know soon. Or we'll know something. OpenAI has mastered the art of announcing a thing that turns out to be a different thing than you thought, which is its own skill.