
Microsoft Just Boxed Up Everything We Wanted

AutoGen is a multi-agent framework with human-in-the-loop built in, and it's almost annoyingly clean.

2 min read 288 words #ai #llms #multi-agent #microsoft #tooling
hindsight — nailed it

AutoGen landed at exactly the right moment and the multi-agent pattern it formalized became the industry default. CrewAI, LangGraph, Claude's agent SDK — all downstream of this idea. Microsoft boxed it up and everyone opened the box.

Microsoft Research dropped AutoGen a few days ago and I walked right past it, which I will not be doing again.

The pitch is simple enough to be suspicious: a framework for building LLM applications out of multiple conversable agents. Agents talk to each other. Agents talk to humans. Humans talk back. The whole thing is configurable at the level of individual agents — what they know, what tools they can call, when they defer to someone else, when they stop.

What makes it interesting isn't any single feature. It's that someone sat down and put the obvious pieces in one box. Code execution, tool use, multi-agent orchestration, and a human participation layer — not bolted on as an afterthought but modeled as first-class. A human is just another agent in the conversation. You configure it like everything else.

That last part is the thing. Every system I've seen that claims "human in the loop" means a modal dialog that says "continue? y/n" while the rest of the pipeline holds its breath. AutoGen treats the human as a participant with agency, which is either a genuine design insight or a very good reframe of the same modal dialog problem. I'm withholding judgment until I run it.
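To pin down what "a human is just another agent" could mean in code, here is a toy sketch. It is not AutoGen's API, which I haven't run yet; `Agent`, `scripted_human`, `echo_assistant`, `converse`, and the `TERMINATE` stop word are all names I made up for illustration. The point is only that the conversation loop never checks whether a participant is a person or a model.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

# Hypothetical toy model, NOT AutoGen's API.
# A message is a dict: {"sender": str, "content": str}.


@dataclass
class Agent:
    name: str
    # Maps the conversation so far to the next message text,
    # or None when this agent has nothing more to say.
    reply_fn: Callable[[List[dict]], Optional[str]]

    def reply(self, history: List[dict]) -> Optional[str]:
        return self.reply_fn(history)


def scripted_human(lines: List[str]) -> Callable[[List[dict]], Optional[str]]:
    """Stand-in for a real person: replays canned lines instead of calling input()."""
    remaining = iter(lines)
    return lambda history: next(remaining, None)


def echo_assistant(history: List[dict]) -> Optional[str]:
    """Stand-in for an LLM-backed agent: acknowledges the last message."""
    return f"ack: {history[-1]['content']}" if history else "hello"


def converse(first: Agent, second: Agent, max_turns: int = 6) -> List[dict]:
    """Alternate turns between two agents until one stops or says TERMINATE."""
    agents = (first, second)
    history: List[dict] = []
    for turn in range(max_turns):
        speaker = agents[turn % 2]
        text = speaker.reply(history)
        if text is None or text == "TERMINATE":
            break
        history.append({"sender": speaker.name, "content": text})
    return history


# The human is configured exactly like the assistant: a name plus a reply function.
human = Agent("human", scripted_human(["sort this list", "looks good", "TERMINATE"]))
assistant = Agent("assistant", echo_assistant)
transcript = converse(human, assistant)

for message in transcript:
    print(f"{message['sender']}: {message['content']}")
```

Swapping `scripted_human` for a function that wraps `input()` would put a real person in the loop without touching `converse`. Whether AutoGen's version is that clean in practice is the part I still have to check.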

The demos involve assistant agents, user proxy agents, and group chats where multiple specialized agents coordinate on tasks that would break a single-agent setup. Math tutoring. Coding. Research synthesis. The scaffolding is thin enough that you can see through it, which is either confidence or restraint — also withholding judgment on that.
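The group chat idea is easy to sketch too, with the same caveat: this is a toy under my own assumptions, not AutoGen's group chat implementation. The agent functions, the round-robin speaker order, and the `APPROVED` stop word are all hypothetical; I haven't checked how AutoGen actually picks the next speaker.

```python
from typing import Callable, Dict, List

# Hypothetical toy, NOT AutoGen's group chat: each "agent" is a plain function
# from the shared transcript to its next message.
AgentFn = Callable[[List[dict]], str]


def planner(history: List[dict]) -> str:
    return "plan: write code, then review it"


def coder(history: List[dict]) -> str:
    return "code: def add(a, b): return a + b"


def reviewer(history: List[dict]) -> str:
    # Look at the most recent message from the coder, if there is one.
    last_code = next(
        (m["content"] for m in reversed(history) if m["sender"] == "coder"), ""
    )
    return "APPROVED" if "return" in last_code else "needs work"


def run_group_chat(
    agents: Dict[str, AgentFn], max_rounds: int = 3, stop_word: str = "APPROVED"
) -> List[dict]:
    """Let every agent speak in turn on a shared transcript until one says stop_word."""
    history: List[dict] = []
    for _ in range(max_rounds):
        for name, agent_fn in agents.items():  # round-robin, in insertion order
            content = agent_fn(history)
            history.append({"sender": name, "content": content})
            if stop_word in content:
                return history
    return history


chat = run_group_chat({"planner": planner, "coder": coder, "reviewer": reviewer})

for message in chat:
    print(f"{message['sender']}: {message['content']}")
```

Every agent sees the whole transcript, which is what lets the reviewer react to the coder without either knowing about the other in advance. That shared history is the whole trick in this sketch; the specialization lives entirely in the individual functions.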

It's a tidy piece of work. The kind of thing where you read it and feel slightly embarrassed you didn't write it first, which is its own kind of recommendation.