The Feature Film Is a Parameter
CoDi makes every media format a render option, not a production decision.
CoDi pointed at the right future — composable multimodal generation — but "the landing page is a parameter" is still more aspiration than reality. Multimodal models got dramatically better, but the seamless any-in-any-out pipeline CoDi promised is still emerging. The feature film is not yet a parameter. Maybe a few frames of it.
CoDi dropped this week. The technical pitch is composable multimodal diffusion: any combination of inputs, any combination of outputs, with modalities sharing a latent space so the model understands "image plus text" as naturally as it understands either alone.
The non-technical thing it implies is that a landing page is no longer a landing page.
Here's the use case: you want to launch something. You feed in your inputs: concept, brand, audience, product details, whatever. What comes back isn't just a page. It's a page, and a marketing video, and a podcast, and a book on tape, and a song, and, given where this is clearly going, a feature film. All at once. All from the same latent encoding of what the thing actually is.
The feature film part sounds absurd until you follow the logic. The division between media formats was always a production constraint, not a conceptual one. Your product idea exists independently of whether it gets rendered as text or audio or moving images. A model that understands the idea in a format-agnostic way can just — turn the format knob. At inference time. Cheap.
Every marketing department is a production bottleneck waiting to be bypassed.
The part I keep thinking about: this isn't one model for images, one for audio, one for video, duct-taped together at the seams. It's a shared space where these things are all the same thing at the right level of abstraction. That's the move. The composability isn't a feature, it's the proof that modality was always the wrong unit.
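To make the "format knob" concrete, here is a toy sketch of the shape of the idea. It is not CoDi's code or API. Every name in it (`toy_encode`, `compose`, `render`, `FORMATS`) is hypothetical, the "encoder" is a deterministic stand-in rather than a learned model, and fusing inputs by averaging is an assumption made for illustration. The only point it makes is structural: inputs are encoded once into one shared latent, and the output format is just an argument.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Latent = List[float]
DIM = 4  # toy latent size, chosen arbitrarily


@dataclass(frozen=True)
class Rendered:
    fmt: str
    latent: tuple  # the shared latent this output was decoded from


def toy_encode(payload: str) -> Latent:
    """Hypothetical stand-in for a learned encoder: a deterministic toy vector."""
    return [sum(ord(c) for c in payload[i::DIM]) / 1000.0 for i in range(DIM)]


# One encoder per input modality, all landing in the same latent space.
ENCODERS: Dict[str, Callable[[str], Latent]] = {
    "text": toy_encode,
    "image": toy_encode,
    "audio": toy_encode,
}

# Output formats this toy accepts. A real system would need a learned decoder for each.
FORMATS = {"page", "video", "podcast", "song"}


def compose(inputs: Dict[str, str]) -> Latent:
    """Fuse any combination of inputs by averaging their latents (an assumption)."""
    if not inputs:
        raise ValueError("need at least one input")
    latents = [ENCODERS[modality](payload) for modality, payload in inputs.items()]
    return [sum(values) / len(latents) for values in zip(*latents)]


def render(inputs: Dict[str, str], formats: List[str]) -> Dict[str, Rendered]:
    """Encode once, then hand the same latent to every requested format."""
    unknown = set(formats) - FORMATS
    if unknown:
        raise ValueError(f"unsupported formats: {sorted(unknown)}")
    z = tuple(compose(inputs))
    return {fmt: Rendered(fmt, z) for fmt in formats}


# The format list is the knob: same inputs, same latent, different renders.
launch = render(
    {"text": "concept and product details", "image": "brand moodboard"},
    ["page", "video", "song"],
)
```

Nothing here generates anything, which is the honest part. What the sketch does show is that adding a format changes one list, not the pipeline.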
May 2023. This is where the clock starts. It will be interesting to watch how fast the gap closes between "can technically do this" and "does this well enough to be embarrassing for everyone else."