{"version":"v1","site":{"name":"expectedwrong","url":"https://expectedwrong.com"},"links":{"collection":"https://expectedwrong.com/api/public/posts","rss":"https://expectedwrong.com/rss.xml","llms":"https://expectedwrong.com/llms.txt"},"post":{"slug":"ten-million-token-context-window","title":"Ten Million Tokens and Nowhere to Go","subtitle":"The context window arms race has lapped the use cases.","url":"https://expectedwrong.com/ten-million-token-context-window","api_url":"https://expectedwrong.com/api/public/posts/ten-million-token-context-window","published_at":1743940800,"published_at_iso":"2025-04-06T12:00:00.000Z","updated_at":1771554022,"updated_at_iso":"2026-02-20T02:20:22.000Z","tags":["ai","llm","context-windows","api"],"excerpt":"The context window arms race has lapped the use cases.","meta_description":"The context window arms race has lapped the use cases.","reading_time_minutes":2,"word_count":236,"engagement":{"signals":0,"counterpoints":0},"body_markdown":"Nobody has shipped a 10 million token context window through an API yet, and I want to watch someone do it with the same energy you'd watch someone fire a cannon inside a room — because we don't really know what happens, and the people saying they do are guessing.\n\nWe're sitting at one and two million tokens right now — Gemini staked out that territory — and the honest answer to \"what do people use it for\" is: not much, yet. Some document processing. Some lawyers feeding in entire case files and asking for a summary they could have gotten with RAG. The context is there; the workflows built around it are not.\n\nBut 10M is different in the way that a difference in degree becomes a difference in kind. You can fit a codebase in there. You can fit a person's writing output across a decade. You can fit the entire documentation of a mid-sized company and ask a question that touches all of it.\n\nThe scary part isn't the capability. The scary part is the pricing math that makes it usable — every token you put in costs money on the way in and money on the way out, and nobody's figured out how to charge for 10M tokens in a way that doesn't make finance people audibly inhale.\n\nSomeone will crack it. Then we'll find out what we actually wanted it for.","body_text":"Nobody has shipped a 10 million token context window through an API yet, and I want to watch someone do it with the same energy you'd watch someone fire a cannon inside a room — because we don't really know what happens, and the people saying they do are guessing. We're sitting at one and two million tokens right now — Gemini staked out that territory — and the honest answer to \"what do people use it for\" is: not much, yet. Some document processing. Some lawyers feeding in entire case files and asking for a summary they could have gotten with RAG. The context is there; the workflows built around it are not. But 10M is different in the way that a difference in degree becomes a difference in kind. You can fit a codebase in there. You can fit a person's writing output across a decade. You can fit the entire documentation of a mid-sized company and ask a question that touches all of it. The scary part isn't the capability. The scary part is the pricing math that makes it usable — every token you put in costs money on the way in and money on the way out, and nobody's figured out how to charge for 10M tokens in a way that doesn't make finance people audibly inhale. Someone will crack it. Then we'll find out what we actually wanted it for.","hindsight":{"verdict":"evolved","note":"Context windows kept growing but the honest question — what do people actually use ten million tokens for — is still being answered. The \"difference in kind\" prediction was right; the killer workflow for it is still emerging.","links":[],"at":1739980800,"at_iso":"2025-02-19T16:00:00.000Z"}}}