expectedwrong hindsight

Context Engineering Is the New Prompt Engineering (And I'm Already Annoyed by the Phrase)

The actual craft of making LLMs do your job isn't clever prompts — it's surgical control of what the model is allowed to know.

4 min read 843 words #ai #llms #workflow #context-engineering #software
hindsight — still happening

Context engineering is still the new prompt engineering. The phrase already sounds like a LinkedIn job title. The six-month countdown to meaninglessness is still running.

Prompt engineering had a good run. Two years, maybe. Then everyone's prompt became "you are a helpful assistant" and we all moved on.

The thing that replaced it doesn't have a clean name yet — some people are calling it context engineering, which is accurate but already starting to sound like a LinkedIn job title, which means we have approximately six months before it's meaningless.

The actual skill is this: knowing exactly how much the model needs to know, and no more, to do the specific thing you want it to do right now.


The workflow that's emerging — at least among the people who seem to be shipping things — runs something like this. You open a fresh chat. You extract the relevant sliver of your codebase into the conversation. You do one focused thing — pull the intent-matching methods out of BIGFILE into a new file, rework the embedding methodology, whatever. You ride that chat until the thing is done. Then you close it and open another one.

This sounds obvious until you realize how badly most people fight it. The instinct is to keep the chat open, keep building on it, accumulate context like a packrat until the model is drowning in its own history and starts confusing what you asked for three hours ago with what you're asking now.

The context window giveth and the context window taketh away.

For anything non-trivial — a feature that spans multiple sessions — you write a FEATURE.md. What are we building. How have we done it so far. Every subsequent chat opens with that file. Every subsequent chat updates it. The file is the memory. The model is stateless and you're the one maintaining state, which is its own kind of horrible, but at least it's honest.


There's an escalation ladder here that I find grimly satisfying.

First pass: Claude, scoped chat, targeted context. This handles most things.

Starts failing: open repoprompt, drop in only the files actually involved, throw o3 mini high at it, apply changes, run tests. Works almost all the time.

That fails: whole repo into o1 pro. Ask it to solve. This has failed twice, apparently.

Those two times: deep research and human ingenuity.

The human is the last resort. The human is the error state.

I don't know whether to find this depressing or aspirational. I've landed on both, simultaneously, like Schrödinger's career anxiety.


The genuinely new thing — the thing that made me stop and reread it — is the back-to-the-future trick.

You work a problem far. You get to the solve. The model, burned through half its context window getting there, explains the solution clearly. You copy that explanation. Then you scroll the chat back to near the beginning — before all the dead ends, before the wrong turns — and you paste the solve in.

You're giving your past self the answer.

It's not a hack, exactly. It's context surgery. You're removing the model's memory of the path and leaving only the destination, so it can reconstruct the right path without all the scar tissue from the wrong ones.

The analogy someone used was buying Bitcoin in 2013. Which is a good analogy for making any developer feel bad about themselves, so I respect it.


The other thing that sounds stupid until it works: calling your model a dummy or a cowboy.

Not as an insult. As a persona assignment. Tell it it's a cowboy when you need it to move fast and take risks. Call it a dummy when you need it to slow down and over-explain. The model adjusts.

Nobody planned this. There was no design document that said "we will make the model responsive to casual human character assessments." It just does it. The model has read enough about dummies and cowboys to know what those archetypes do, and it performs accordingly, and this is either a feature or an emergent property of training on human text, and I'm not sure the distinction matters when the thing works.


The frame that holds all of this together is minimum viable context — MVC, if you want an acronym that will confuse people who remember 2009. The idea is that there are two kinds of context: the context you need to get to the point, and the steering context you need once you're there. Both can run to hundreds of thousands of tokens. Neither is inherently better. The skill is knowing which one you're building and why.

Prompt engineering was about the words. Context engineering is about the shape of what the model knows — what it sees, what it doesn't, when it sees it, in what order.

The people who figure this out fast are going to be annoying to work with for the next few years. They will ship things that feel impossible and refuse to explain how. They will say things like "it's just context management" in a tone that implies you should have known this already.

I fully intend to be one of them.