
Two Models Walk Into a GitHub Repo

Lag-Llama does zero-shot time series forecasting and ChatDB just open-sourced their text-to-SQL model, and it's a fine Wednesday in February.

2 min read 279 words #machine-learning #open-source #time-series #sql #foundation-models
hindsight — half right

the foundation model approach for time series was valid — the field exploded. but lag-llama specifically was overtaken within months by amazon chronos, google timesfm, and salesforce moirai. first mover in an arms race is not the same as winner.

Lag-Llama dropped this week: a foundation model for time series forecasting, open-sourced and trained on a massive corpus of heterogeneous time series data so it can do zero-shot forecasting on datasets it has never seen.

Which, if you've spent any time doing time series work, is the part where you stop and stare at the wall for a second. The whole job was feature engineering: lag windows, seasonality decompositions, domain-specific transformations that some analyst baked into a Jupyter notebook in 2019 and nobody has touched since. You weren't building a model, you were building scaffolding for a model, and the scaffolding took longer than the model. Lag-Llama just absorbs all of that. The "Lag" in the name isn't a metaphor. It literally uses lagged values as input features, the oldest trick in the time series book, except now it's a transformer that generalizes across domains instead of collapsing the moment you point it at a new dataset.
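For anyone who hasn't hand-rolled lag features before, here is roughly what that trick looks like in plain Python. This is a minimal sketch of the general technique, not Lag-Llama's actual code: the function name `make_lag_features` and the lag set `[1, 2, 7]` are made up for illustration, and a real pipeline would pick lags based on the data's frequency.

```python
def make_lag_features(series, lags):
    """Turn a 1-D series into (rows, targets) for supervised learning.

    Each row holds the values observed `lag` steps before its target,
    one column per entry in `lags`. Hypothetical helper, for illustration.
    """
    max_lag = max(lags)
    rows, targets = [], []
    # Start at max_lag: earlier points don't have a full history behind them.
    for t in range(max_lag, len(series)):
        rows.append([series[t - lag] for lag in lags])
        targets.append(series[t])
    return rows, targets


# Toy daily series: 0, 1, ..., 9
series = list(range(10))

# Yesterday, two days ago, and the same weekday last week (assumed lag choice).
X, y = make_lag_features(series, lags=[1, 2, 7])

for features, target in zip(X, y):
    print(features, "->", target)
```

The first row pairs the values at positions 6, 5, and 0 with the target at position 7. The old workflow was choosing that lag list by hand for every dataset, then feeding the rows to whatever regressor you trusted that quarter.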

The fact that it took this long to get a general-purpose foundation model for time series while LLMs ate everything else is its own kind of comedy. Text got GPT-4. Images got Stable Diffusion. Time series got... Prophet. In 2017. From Facebook. Good tool, but still.

Also this week: ChatDB open-sourced their text-to-SQL model. Text-to-SQL has been a solved demo for two years and an unsolved production problem for the same two years, so open weights matter here — someone with actual schema complexity can now fine-tune instead of praying the API call hits right.

Two repos, one morning. The open-source AI tooling pace right now is genuinely difficult to track, and I say that as someone who is actively trying.