GPT-OSS 120B Is Running on My Machine and I Don't Know What to Do With That
Fast, local, and honest about being a black box — unlike the marketing around it.
GPT-OSS 120B is running locally and it's actually good, with no grading on a curve because it's local. The report from the field held up.
The 120B has been running locally via Ollama for a few days now and the report is: it's good. Actually good: fast, thorough, and free of the extended dumb thinking where a model spins out into baroque self-narration about how it's reasoning. Just concise thoughts and useful output. It feels like a real GPT model, which is a sentence that still feels strange to type.
I have it in the corral now alongside the others. Testing application use cases today. The 20B is next.
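For anyone who wants to poke at the same setup, here is a minimal sketch of what "testing application use cases" can look like in code. It assumes Ollama is serving on its default local port (11434) and that the model was pulled under the tag gpt-oss:120b; both are assumptions about your machine, not something this post verified for you. The helper names build_request and ask are made up for illustration.

```python
import json
import urllib.request

# Assumption: Ollama is listening on its default local address.
OLLAMA_URL = "http://localhost:11434/api/generate"
# Assumption: the model was pulled under this tag; check `ollama list` for yours.
MODEL_TAG = "gpt-oss:120b"


def build_request(prompt: str, model: str = MODEL_TAG) -> urllib.request.Request:
    """Build a non-streaming generate request for a local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt: str, model: str = MODEL_TAG, timeout: float = 300.0) -> str:
    """Send the prompt and return the generated text.

    Needs a running Ollama server; raises urllib.error.URLError otherwise.
    """
    with urllib.request.urlopen(build_request(prompt, model), timeout=timeout) as resp:
        body = json.loads(resp.read().decode("utf-8"))
    # With "stream": False the reply is a single JSON object whose
    # "response" field holds the generated text.
    return body["response"]


# Example (requires the server to be up):
# print(ask("Give me three test cases for a date parser."))
```

Swapping MODEL_TAG for the 20B tag should be the only change needed to run the same prompts against the smaller model, which makes side-by-side comparisons cheap.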
Yusuf Mehdi at Microsoft announced this week that GPT runs natively on Windows now — "the first time," he says — which is the kind of statement you can only make if you've decided Whisper doesn't count, and also that all the OpenAI open-source work that predates GPT-3 doesn't count, and also that Ollama running GPT-OSS on Windows this whole time doesn't count. You can get to "first time" from here but you have to squint pretty hard and define your terms on the fly.
The native Windows path exists and apparently works. Whether it's as easy as they say, let alone easier than Ollama, is a live question. "Easy" is doing a lot of load-bearing work in that pitch, and Microsoft's track record on "easy local AI setup" is a story still being written in real time.
The thing that got me was Mehdi saying "no black box." Which, I understand what he means — open weights, you can inspect the architecture, nobody at OpenAI is secretly deciding what your model says. Fine. But a transformer at 120 billion parameters is one of the most comprehensively black boxes that has ever existed. We know what goes in and what comes out. The middle is a 70-gigabyte dream. "No black box" in this context means "no corporate black box," which is a different and more limited claim — and also one that every open-source model advocate has been making for years without Microsoft's help.
Still. The model is good. Running locally. Fast. That part is just true regardless of how the announcement was framed.