The Infrastructure Arrived Before the Weights
Text-to-video is coming to HuggingFace diffusers, and the library is ready ahead of the models.
The weights arrived. CogVideoX, Mochi, and others landed in the diffusers pipeline that was already waiting for them. Building the scaffolding first was the right call.
The diffusers library shipped v0.30.1, and quietly, almost as a footnote, the release sets it up to support open-source text-to-video weights that don't fully exist yet.
This is the move. You instrument the library. You merge the pipeline code. You cut a release. And then you wait for the weights to show up, at which point the whole thing just works — like the scaffolding was always there and the building simply arrived one morning.
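The split between pipeline code and weights can be made concrete. What follows is a minimal sketch under stated assumptions, not the diffusers team's own procedure. `pipeline_class_available` is a hypothetical helper that only checks whether an installed package exposes a class by name. `generate_clip` assumes a diffusers release that includes `CogVideoXPipeline`, a CUDA GPU, and a published checkpoint at the repo id given (here `THUDM/CogVideoX-2b`, chosen for illustration).

```python
import importlib


def pipeline_class_available(class_name: str, module_name: str = "diffusers") -> bool:
    """Hypothetical helper: True if `module_name` imports and exposes `class_name`.

    This checks only that the code is installed. It says nothing about
    whether any weights have been published for that pipeline.
    """
    try:
        module = importlib.import_module(module_name)
        getattr(module, class_name)
    except (ImportError, AttributeError, RuntimeError):
        # RuntimeError covers lazy-import failures inside the package.
        return False
    return True


def generate_clip(
    prompt: str,
    repo_id: str = "THUDM/CogVideoX-2b",  # assumption: illustrative checkpoint
    out_path: str = "clip.mp4",
) -> str:
    """Sketch: load a text-to-video pipeline and write one clip to disk.

    Heavy imports live inside the function so the availability check
    above works even where torch or diffusers is not installed.
    """
    import torch
    from diffusers import CogVideoXPipeline
    from diffusers.utils import export_to_video

    # The weights are fetched here; this is the step that fails while a
    # checkpoint is still unpublished, even though the class already exists.
    pipe = CogVideoXPipeline.from_pretrained(repo_id, torch_dtype=torch.float16)
    pipe.to("cuda")

    frames = pipe(prompt=prompt, num_inference_steps=50).frames[0]
    return export_to_video(frames, out_path, fps=8)
```

Checking `pipeline_class_available("CogVideoXPipeline")` before calling `generate_clip(...)` separates the two failure modes: the pipeline code is missing, or the code is there and the weights are not.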
It's an interesting inversion of how this usually goes. Normally the model drops and the tooling scrambles to catch up: a two-week chaos window where the only way to run anything is someone's personal inference script posted to a Colab notebook with three broken dependencies. By building first, the diffusers team is saying they know what's coming and have already done the boring part.
Open-source video generation has been embarrassingly behind image generation for reasons that are obvious in retrospect — video is just harder, the compute requirements scale badly, and "good enough" for images was achievable at scales that would produce approximately nothing for video. Every model that came out this year has been either a research demo or a very expensive API. Meanwhile the image side has been eating itself alive trying to run SD3 on a MacBook.
The weights landing on HuggingFace would be the thing that matters. Not the paper, not the demo, not the benchmark table. The weights. Because once something is in diffusers with open weights, it's in everyone's pipeline by Friday.
We've been waiting for the open-source Sora moment since February. The library is ready. Something is coming.