expectedwrong hindsight

Your System Prompt Has a Landlord

OpenAI's model spec formalizes what was always true: they sit above the chain of command, and you're renting.

2 min read 277 words #openai #ai-safety #model-spec #open-source #alignment
hindsight — nailed it

The system prompt has a landlord. OpenAI's hierarchy — OpenAI above developer above user — remained the architecture. The model spec observation that you're a middle manager who got cc'd on an org chart was the right framing.

OpenAI published their model spec back in May and I finally read the chain of command section, which is the part where they explain, without particular drama, that OpenAI sits above the system prompt now.

The system prompt still exists. You can still write one. It just gets demoted — treated as a "developer" message, obeyed at a tier below whatever the model has been trained to prioritize. The hierarchy goes: OpenAI, then you, then your users. You thought you were running a system. You're a middle manager who got cc'd on an org chart they didn't write.

Altman's been gesturing at this for a while in the AI safety conversation — who controls the model, who controls the controller, and at what point does "alignment" just mean "aligned to us specifically." The spec is the clean formal answer to that question.

The honest reaction, sitting here in December, is that if open-weight models didn't exist this would be genuinely alarming. A single company deciding what the model will and won't do regardless of what any developer instructs — that's not a product, that's infrastructure with opinions baked in below the API surface. But Llama exists. Mistral exists. You can run something that doesn't report to anyone. The threat is bounded, which is the only reason this reads as interesting rather than horrifying.

And then the real kicker: OpenAI can write the most carefully-considered behavioral spec in the history of AI alignment, and China will ship a model that simply doesn't have one. The guardrails go in the spec. The spec doesn't cross borders.

You're playing capture the flag against a team that doesn't know they're playing.