DataInterpreter Will Do Your Data Science Job While You're Still Reading the README

MetaGPT just open-sourced everything except the model, and it does Walmart sales forecasting and Apple stock prediction and customer segmentation while you're still arguing about whether Devin is real.

DeepWisdom, the team behind MetaGPT, dropped DataInterpreter this week: a code-generating, plan-revising, task-executing data science agent that takes your messy CSV and comes out the other side with a forecast model. Nobody's talking about it because Devin is still consuming all available oxygen.

They didn't open-source the model. They open-sourced everything else. Make of that what you will.

The examples are where you want to look. Walmart sales forecasting. E-commerce customer segmentation with purchase prediction. Apple stock price analysis. These aren't toy demos where the agent prints "hello world" and everyone applauds. These are the actual annoying analytical tasks that take a junior data scientist three days and a senior data scientist one frustrated afternoon.

The agent does them. Then it shows its work.

What makes DataInterpreter different from "GPT in a loop" is that it dynamically revises its own plan as it executes. When the data doesn't conform to the shape the model assumed (and it never does), it doesn't just crash and hand you a stack trace to interpret yourself. It updates the plan and keeps going. This is the part people are glossing over. The failure mode of every agent before this was: confident plan, brutal reality, error, human back in the loop. That last step, the human getting dragged back in, is the whole problem.
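To make "revise the plan instead of crashing" concrete, here is a minimal sketch of the general pattern. It is not DataInterpreter's code and does not use MetaGPT's API. The names Task, revise_plan, and run_with_replanning are made up for illustration, and a single hard-coded rule (rename a column that looks like the missing one) stands in for the model call that a real agent would make when a step fails.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Task:
    """One step of the plan. Hypothetical type, not part of any library."""
    name: str
    run: Callable[[dict], dict]  # reads the shared state, returns updates to it


def revise_plan(failed: Task, error: Exception,
                remaining: List[Task], state: dict) -> List[Task]:
    """Stand-in for the LLM replanning step (assumption: a real agent asks a model).

    Only one failure is handled: a step asked for a column that does not
    exist. If some existing column name ends with the missing name, insert a
    rename step in front of the failed step and retry it.
    """
    if isinstance(error, KeyError):
        missing = error.args[0]
        columns = list(state["rows"][0].keys())
        candidates = [c for c in columns if c.lower().endswith(missing.lower())]
        if candidates:
            source = candidates[0]

            def rename(s: dict) -> dict:
                # Copy every row, swapping the old column name for the expected one.
                return {"rows": [
                    {(missing if k == source else k): v for k, v in row.items()}
                    for row in s["rows"]
                ]}

            repair = Task(f"rename {source} -> {missing}", rename)
            return [repair, failed] + remaining
    # No known repair: give up and surface the original error.
    raise error


def run_with_replanning(plan: List[Task], state: dict,
                        max_revisions: int = 3) -> Tuple[dict, List[str]]:
    """Execute tasks in order; on failure, revise the remaining plan and continue."""
    queue = list(plan)
    log: List[str] = []
    revisions = 0
    while queue:
        task = queue.pop(0)
        try:
            state.update(task.run(state))
            log.append(f"ok: {task.name}")
        except Exception as err:
            if revisions >= max_revisions:
                raise
            revisions += 1
            log.append(f"failed: {task.name} ({type(err).__name__}); revising plan")
            queue = revise_plan(task, err, queue, state)
    return state, log


# The plan assumes a "sales" column; the data actually has "Weekly_Sales".
plan = [
    Task("total sales", lambda s: {"total": sum(r["sales"] for r in s["rows"])}),
    Task("average sales", lambda s: {"average": s["total"] / len(s["rows"])}),
]
data = {"rows": [{"Weekly_Sales": 10.0}, {"Weekly_Sales": 20.0}]}

final_state, log = run_with_replanning(plan, data)
for line in log:
    print(line)
```

In this toy run the first step fails on the missing column, a rename step gets spliced into the plan, and the original steps then complete without anyone reading a traceback. The interesting engineering in a real agent is everything this sketch fakes: deciding what the repair should be.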

The timing is also interesting. Devin dropped on March 12th. MetaGPT dropped the DataInterpreter documentation two days later. The open sourcing of "everything else" reads less like a coincidence and more like someone in a Discord server saw the Devin demos and decided the time for sitting on this was over.

There is a specific experience — I've had it, you've had it — where you build a workflow for something tedious, automate the miserable parts, and then sit with the realization that the thing you spent six weeks building can now be replicated in an afternoon by a model you didn't train. The workflow wasn't the point. The point was that you knew how to build it. Now that's not enough either.

DataInterpreter is that experience but for data science. Not for some hypothetical future data scientist. For the one currently employed.

The AGI convergence theory — that all these pieces will eventually merge into something that pops — is probably right in the same way that "eventually we all die" is right: technically accurate, not super actionable, and mostly used by people who want to feel prophetic without doing the work of being specific. But the pieces are landing faster than the theories can account for. By the time you've updated your mental model, there's another demo.

The thing I keep coming back to isn't whether some final merged model has a double helix spatial embedding structure (it won't, but the person who believes this is at least paying attention). It's the cadence. The gap between "that seems like science fiction" and "here are the docs" is now measured in months.

Check the examples. That's the actual news.

Counterpoints