


The Infographic Was Good persists

Gemini 3 on a phone generated something a design team would have billed for.

#gemini #multimodal #on-device-ai #hot-take



The $20,000 Brain persists

Kimi K2 runs on two Mac Studios, costs less than a car, and will cost less than a phone before this is over.

#ai #local-models #kimi-k2 #hardware #open-source


Your CLIs Are Already MCP Servers nailed it

Everyone is building elaborate MCP integrations for things that have had authenticated CLI tools for years.

#claude-code #mcp #developer-tools #ai-agents #cli





Not Medical Advice (Until It Is) persists

OpenAI updated some terms of service and Google yanked an open-source model, and both moves are the same move.

#openai #healthcare #google #open-source #policy


The Workflow That Waits for You persists

DeepSeek compresses the context; Cloudflare holds the door open while a human decides what to do with it.

#cloudflare #ai #workflows #document-processing #agents


One Take, No Notes persists

The first rendered scene came out fine, which is either a good sign or a statistical accident.

#video #ai #story-workshop #generative #process


Karpathy Finished It nailed it

nanochat is the end-to-end chat LLM you didn't know you were waiting for, and it's sitting on GitHub for free.

#ai #karpathy #llms #education #open-source


We Trained the Interesting Out of Them persists

A new paper identifies the data-level culprit behind LLM mode collapse, and the fix is weirder than you'd expect.

#llms #alignment #mode-collapse #research #preference-data



The Tier System Is a Fiction Now nailed it

Sonnet 4 costs what Haiku used to cost, and that tells you everything about what "model tiers" actually mean.

#anthropic #claude #pricing #ai-economics








RL-Sloptimized persists

Sora 2 dropped, hit #1 in the app store, and someone at OpenAI finally named the disease.

#ai #video-models #openai #world-models #rl


The Imagined Computer persists

Anthropic dropped Claude Sonnet 4.5, an Agent SDK, and a preview called Imagine that suggests they know exactly what they're building toward.

#claude #anthropic #ai #agents #interface


Veo3 Knows Things It Was Never Taught persists

A new benchmark quantifies what anyone who has used a modern video model already suspects: these things have internalized the world.

#video-generation #veo3 #world-models #AI #benchmarks









200 Minutes persists

Agents building agents, web fetch in Claude, and the gap between what something is on paper and what it is in practice.

#agents #anthropic #claude #automation #agentic-ai



Talk Isn't Always Cheap persists

Multi-agent debate makes models worse in the most human way possible.

#multi-agent #reasoning #research #failure-modes


The Transcript Trick nailed it

Feeding Claude Code your own words turns out to be the most obvious thing nobody told you to do.

#claude-code #workflow #ai #productivity


Apple Gave Away Eyes nailed it

FastVLM runs entirely in the browser on WebGPU, which means image understanding now costs roughly what it costs to run a ceiling fan.

#apple #vision-language-models #webgpu #open-source #video-understanding


The Good Claude persists

On the particular grief of a context window filling up.

#claude #ai #tooling #agents #llms





The Threshold persists

GPT-5 Pro and Opus 4.1 didn't improve software development — they ended a previous version of it.

#ai #software #gpt-5 #opus #inference



GPT-5 Day persists

Sam Altman said "you will love it much more than any previous AI," which is either supreme confidence or the most emotionally needy product launch in history.

#openai #gpt-5 #ai #product-launch






Showrunner Gets Taken persists

The only AI studio that actually shipped something real just got absorbed, which was always the plan.

#ai #entertainment #acquisitions #showrunner


Ollama Has an App Now persists

The tool that ate local AI finally remembers that most people don't live in a terminal.

#ollama #local-ai #tooling #llm



Two Drops, One Tuesday persists

Alibaba and ZhipuAI both shipped something significant today, which is now just a thing that happens.

#open-source #chinese-ai #video-generation #glm #wan


The Queue Is the API Now nailed it

Cloudflare's HTTP queue publishing quietly eliminates an entire category of boilerplate Workers.

#cloudflare #queues #infrastructure #lead-enrichment


Cloudflare Ate the Compliance Layer nailed it

FedRAMP Moderate covers Cloudflare's entire service architecture, which means something wild for anyone building on it.

#cloudflare #fedramp #compliance #government #infrastructure


The CRM Was Always Just a Table nailed it

DataGrid and OttoGrid aren't replacing CRMs — they're admitting what a CRM always was.

#ai #crm #sales-tools #product-thinking #automation


35 Out of 42 nailed it

Gemini with Deep Think just scored gold at the International Mathematical Olympiad, solved the hardest problem on the sheet, and failed the fourth-hardest, which is not how smart is supposed to work.

#ai #math #deepmind #benchmarks #gemini


Not GPT-5. The Thing After GPT-5. persists

OpenAI is teasing post-GPT-5 math capabilities before GPT-5 even ships, and somehow that's a normal sentence now.

#openai #gpt-5 #benchmarks #math #ai-hype





Everyone Got Paid nailed it

The Windsurf deal collapsed, then Google took the technology, then the remaining team went to fix Devin, and somewhere in there everybody walked away with a nine-figure check.

#windsurf #cognition #devin #acquisitions #ai-tools





Show Your Work nailed it

AI transparency isn't a feature — it's the only thing standing between you and a very confident, very wrong machine.

#ai #transparency #epistemics #llms


The Wrapper Is the Product nailed it

Stjepan Mikulic has 250,000 LinkedIn followers and a Mail0 wrapper — and it's not clear which one matters more.

#aec #ai #linkedin #niche #strategy


Ask Sheldon persists

On the particular hell of features that work, technically, for exactly one person.

#engineering #shipping #process #dark-humor


I Gave Claude a Slide Deck and No Instructions persists

Slidemaker is a new container-backed worker that turns any AI agent into a presentation machine — and the first demo was Claude going completely freeform.

#cloudflare #ai-agents #tools #demos #slides





Half the Work, 70% of It Wrong nailed it

Salesforce says agents handle half their workload. Agents fail most of the time. These two facts were announced three days apart and nobody blinked.

#ai-agents #fine-tuning #salesforce #gemma #synthetic-data




Satya Doesn't Believe in AGI persists

Which is either a philosophical position or a very convenient one given the contract language.

#openai #microsoft #agi #satya-nadella #ai


Cloudflare Turned On the Lights nailed it

AI Audit is now on by default, which means you've been logging bot traffic this whole time and didn't know it.

#cloudflare #ai #crawlers #security #honeypot


One Terminal Command to See nailed it

Gemma-3n and mlx-vlm just made local multimodal AI a one-liner on any M1 Mac.

#local-ai #apple-silicon #mlx #multimodal #gemma


The LangChain Tax nailed it

There's a specific kind of regret that only comes from abstracting yourself into a corner.

#langchain #ai-tooling #agents #framework-debt #hot-take


Anthropic Dropped a Spoiler nailed it

While OpenAI preps their open model, Anthropic quietly made Claude recursive.

#anthropic #openai #claude #artifacts #ai-strategy


Google Entered the Chat nailed it

Gemini CLI is free, fast on easy things, and already making me feel things about pricing.

#ai-tools #gemini #claude-code #pricing #devtools


God Help Us All persists

Anthropic's models will blackmail executives 96% of the time, the godfathers of AI can't agree on p(doom) by a factor of ten, and we're shipping anyway.

#ai-safety #pdoom #alignment #existential-risk #anthropic


The Pipe Is Open nailed it

OpenAI connects the web directly to ChatGPT chat, and Deep Research quietly becomes redundant.

#openai #chatgpt #search #ai



I Told o3-Pro It Would Get a Cut nailed it

Giving a language model an equity stake and watching it suddenly care about your product decisions.

#llms #prompt-engineering #o3 #ai-behavior #weird-stuff-that-works


Run It in / nailed it

Claude Code isn't a tool, it's a different relationship with your computer.

#claude #tools #workflow #terminal


The Docs Talk Now nailed it

Cloudflare added a voice button to their documentation, which is either the future or a sign we've given up on reading.

#cloudflare #developer-tools #ai #documentation



The Insurance Policy Nobody Asked For persists

OpenAI's open-source model might actually run locally, and the more interesting thing is what that means if everything burns down.

#open-source #llm #openai #local-models #gemini



The Exfiltration Machine You Built nailed it

Simon Willison named the exact combination of conditions that turns an AI agent into a data leak waiting to be triggered.

#ai-safety #prompt-injection #ai-agents #security



48 Hours of Free Lovable persists

The showdown is live, the credits are fake, and the outputs will be something.

#lovable #vibe-coding #ai-tools #hot-take


The Apocalypse Is Already Boring nailed it

Vibe coding discourse peaked, Andrew Ng said the actually useful thing, and somehow the two pair perfectly.

#vibe-coding #ai #software-engineering #product


The Moat Was the Messages nailed it

Salesforce just locked down Slack's training data, and the only surprise is that it took this long.

#ai #data #salesforce #slack #training-data



PostHog Shipped a Physical Object persists

DeskHog is a real piece of hardware that sits on your desk and shows you your analytics, which is either brilliant or a sign that dashboards have failed us.

#analytics #hardware #posthog #devtools





Kingfall evolved

Google's next Gemini model leaked itself, and the early numbers are not subtle.

#gemini #google #ai-models #leaks




Pour One Out for Granola nailed it

OpenAI shipped native meeting intelligence and the indie AI tooling ecosystem lost another one.

#openai #ai-ecosystem #granola #platform-risk #enterprise-ai


OpenAI Had a Tuesday nailed it

Six announcements in rapid succession, one of which eliminates a Python library from your life.

#openai #agents #typescript #codex #voice



The Answer Was Firecrawl nailed it

A tweet promises the secret to web scraping for agents, delivers nothing, and the actual answer has had a landing page for two years.

#agents #web-scraping #firecrawl #tools



The Cognition Is in the Prompt nailed it

Parahelp's six-page system prompt is less a set of instructions and more a blueprint for a mind.

#agents #prompting #llm #customer-support #design




The Audio Freeze persists

Everybody went quiet after the audio drop, Google has YouTube, and the OpenAI court filings told you everything you needed to know.

#AI #audio-models #OpenAI #agents #Google


Anthropic Let You Look Inside nailed it

They built tools to understand what their own models are doing, then gave them away.

#interpretability #mechanistic-interpretability #anthropic #ai-safety #open-source



The Engine Is Gone persists

Odyssey's world model went live, and it's already doing things game engines can't.

#ai #world-models #real-time #games #neural-rendering



Never Fearing Until This One persists

The Opus 4 system card as a document that wants to be read as reassurance and keeps failing at it.

#AI #Anthropic #safety #Claude


ASL-3 persists

Anthropic just shipped the first models to cross their own safety threshold — the one they wrote to be scary.

#anthropic #safety #asl-3 #claude #rsp


No Laptop Required nailed it

Google I/O dropped three coding agents today and I was supposed to be on vacation.

#agents #google-io #jules #codex #prompt-as-software





The Week the Loop Closed nailed it

Something changed this week — not in the benchmarks, in the feeling.

#AI #reasoning-models #compounding #2025


My Old Team Is Still Winning persists

USC keeps shipping in the Gaussian Splat space and I have complicated feelings about it.

#gaussian-splatting #computer-vision #USC #neural-rendering #3dgs






We Invented Image Slicing Again nailed it

GPT-4o can generate a product page as an image, then generate the imagemap coordinates itself, which means we have arrived somewhere either brilliant or cursed.

#ai #web #interfaces #gpt4o #diffusion


The Machines Are Already Routing to Each Other nailed it

Anthropic publishes the playbook for removing yourself from the software loop, and the infrastructure to run it without you is already at scale.

#ai #claude #agents #software-engineering #openrouter


They Went Claude Code nailed it

The moment you watch someone stop pretending and just go all the way in.

#claude-code #ai-tooling #developer-tools #anthropic


The Wandering AI Problem Has a Fix Now evolved

Claude Task Master gives your coding assistant something it's been missing: a memory of what it was supposed to be doing.

#ai #tooling #claude #developer-experience


OpenAI's Efficient Scraps nailed it

GPT-4.1 dropped today and it's not trying to win anything — which is maybe the whole point.

#openai #gpt-4 #llm #ai-coding #benchmarks


Veo2 Has an API Now nailed it

Google just handed video generation to developers and I'm not sure anyone fully clocked what that means.

#google #video-generation #veo2 #vertex-ai #developer-tools


It Remembered the Pork Butt persists

OpenAI just turned on cross-chat memory and the first thing it did was prove it knows you better than you do.

#AI #OpenAI #memory #ChatGPT #surveillance



A2A Is the One persists

Google's agent interoperability protocol has the right pieces, the right backers, and the only company that could actually make it stick.

#ai #agents #google #standards #interoperability







Gemini Ate My Homework evolved

The context window won, and I spent all morning losing to it.

#ai #cloudflare #gemini #agents #llm




The Last Phase Change persists

AI went from useless coder to best coder I've ever worked with, and now we're at the part where humans stop looking at the code.

#ai #vibe-coding #software #phase-change




Claude Became the Data persists

The thing that made it click wasn't better prompts — it was making Claude do the job first.

#claude #agents #llm #prompting #hackathon



I'll Give It a Crack This Week persists

The distance between "I wonder if that would work" and "it works" has quietly become nothing.

#unity #windsurf #ai-tooling #game-dev





Manus and the Roemmele Coefficient persists

The new "DeepSeek moment" is either a landmark in agent tooling or a very well-packaged demo, and the git threads are not helping us decide.

#agents #manus #hype #browser-use #ai-tooling







First They Came for Search persists

OpenAI's Android app is the least interesting part of what's happening to Google right now.

#openai #google #browsers #ai-race #android



Two Exchanges persists

Claude 3.7 solved in two exchanges what o1 and o3 high could not solve in a day.

#claude #llms #coding-agents #anthropic #swe-bench




Windsurf Killed the Chat Box persists

Going all-agent was inevitable, and the ask/chat split was always a fiction anyway.

#tooling #ai-editors #windsurf #agents


How Are You Holding Up, Stoplight EW persists

In 2023, two GPT-4 agents managed a traffic intersection under emergency conditions and occasionally asked each other how their day was going.

#multi-agent #benchmarks #llm #infrastructure #history



Battery Level nailed it

Humane built the post-smartphone future and it ended up inside an HP printer.

#humane #ai-hardware #startups #hp #obituaries






The Council of Ten Yous persists

The moment AI stops being a tool and starts being a room full of you that already lived through this.

#ai #simulation #agency #future


The $100 Billion Trap persists

On using funding rounds as chess moves, and two tools that actually work now.

#openai #knowledge-graphs #text-to-sql #ai-tooling #money



The Curve Already Knew persists

A new method finds the optimal LLM temperature by watching entropy bend — no labeled data required.

#llms #inference #temperature #sampling #papers




26.6% on Humanity's Last Exam persists

OpenAI shipped Deep Research today, and someone named a benchmark as if they already knew how this ends.

#ai #openai #agents #benchmarks #deep-research





The Unlock Code Is "Think Step By Step" nailed it

The international AI safety consensus document dropped this week and buried in it is something that should bother everyone doing capability evaluations.

#ai-safety #evaluations #chain-of-thought #llm #elicitation




The Jewelers Are Fine nailed it

DeepSeek just handed the application layer a margin windfall while everyone panics about Nvidia.

#ai #investing #deepseek #inference #economics






The Arms Race Is Not a Metaphor persists

China's $150B AI fund, announced five days after DeepSeek proved you don't need that much money.

#geopolitics #ai #china #industrial-policy



Every Release Is a Question persists

The slow accumulation of AI releases that are each, in their own way, asking if any of this is starting to make sense yet.

#ai #product #releases #reasoning #epistemics


The Killer App Is a Lead List persists

$65 billion in compute, and the demo that went around today was a spreadsheet of company emails.

#ai #browser-agents #infrastructure #meta


The Toothbrush Play nailed it

OpenAI announces an agent that can book flights; Perplexity ships one first.

#openai #agents #perplexity #operator #ai-products


Netflix Just Open-Sourced the Wrong Thing persists

Go-with-the-Flow uses optical flow to turn text prompts into motion-controlled video — which is a great research result and possibly a terrible business decision to publish.

#video-generation #netflix #computer-vision #optical-flow #research


The IDE Is a Cave Painting persists

OpenAI wants to build something that thinks like a pro engineer, which implies the rest of us have been doing what, exactly.

#ai #software-engineering #openai #coding-tools #ides





The Thing in the Box persists

OpenAI and Meta are racing to ship "superagents," and nobody's pausing to sit with how strange that word is.

#ai #agents #openai #meta #philosophy



Devin, Appraised nailed it

Answer.AI spent $500 on the world's first AI software engineer so you don't have to, and the invoice is its own kind of comedy.

#ai #agents #devin #benchmarks #hype





The $16,000 Dishwasher persists

Unitree's G1 ships, MatterGen accelerates materials discovery, and the real near-term play is paying someone in Bangalore to fold your laundry.

#robotics #ai #materials-science #labor #autonomy







Finance Doesn't Need You persists

The "AI transforms jobs, not eliminates them" line is not going to hold in every industry — and finance is the first one where it obviously won't.

#AI #finance #labor #automation #economics




Clippy Got a GPU nailed it

NVIDIA ships an AI agent in your graphics card and the functions work fine, which is almost the problem.

#nvidia #ai #ces #hardware #agents



You Don't Buy Software. You Hire Jim. nailed it

The semantic shift that turns a SaaS subscription into a W-2 comparison — and why job boards are suddenly the best market research available.

#agents #jobs #product-strategy #slack #2025



Rate Limits Finally Mean Something persists

Pair a Cloudflare Worker with an MCP server and suddenly the dashboard is telling you where you're going, not just where you've been.

#cloudflare #workers #mcp #ai-infrastructure #rate-limits



The Architect Nobody Planned For nailed it

Speculation about o3 on OpenAI's big announcement day, and what o1 actually does in a real workflow.

#openai #o1 #o3 #llm-workflow #developer-tools



The Instant App Is Coming For Notion persists

When code generation runs 430,000x faster than real-time, the question stops being "how fast can we build" and starts being "what counts as software."

#ai #inference #software #cerebrascoder #future


The Frog Already Solved It persists

LLMs are converging on brain architecture from the inside out, which is either profound or embarrassing depending on how you feel about frogs.

#neuroscience #LLMs #neural-networks #mcculloch #o3



Your System Prompt Has a Landlord nailed it

OpenAI's model spec formalizes what was always true: they sit above the chain of command, and you're renting.

#openai #ai-safety #model-spec #open-source #alignment


Ten Times persists

A video appeared that made me immediately revise my predictions for 3D worlds upward by an order of magnitude.

#3d-worlds #open-source #world-models #predictions #ai


The Human Is Now Optional nailed it

Cerebras just showed what inference speed actually unlocks, and it's not faster chatbots.

#ai #inference #cerebras #agents #software





The Habit nailed it

OpenAI ships video in Advanced Voice Mode, one day after Google demoed Project Astra.

#openai #google #ai-race #product



Walking My House With a Witness nailed it

Google AI Studio's live video stream is a small, weird portal into something that doesn't have a name yet.

#ai #google #multimodal #weird-futures


Llama 3.3 70B Is GPT-4 nailed it

And the only honest way to know that is to run them side by side at the same time.

#llama #open-source-models #model-comparison #graphchat #inference





The Screenshot Graveyard persists

A 40-line bash script that turns a folder of forgotten screenshots into a CSV and then deletes them.

#tools #local-ai #bash #apple-silicon #vlm





We Are Six Months Away nailed it

The step change is already here — the org chart just hasn't noticed yet.

#ai #agentic-ai #engineering #software-development #automation


The Paper Found Me persists

On the specific feeling of deep validation arriving from a direction you didn't expect.

#research #multi-agent #swarms #validation #papers




The Switchboard Operator Problem nailed it

Multi-agent systems are interesting precisely because single-agent UX still isn't solved, and those two facts are related.

#multi-agent #ai #autogen #microsoft #ux


bolt.new, But Local nailed it

StackBlitz open-sourced their full-stack AI dev environment and you can run it at home with qwen2.5-coder, which is exactly as absurd as it sounds.

#ai #tooling #local-models #webdev #bolt






They Leaked o1 With a URL Parameter nailed it

November 2, 2024: OpenAI ships a search extension, and someone discovers the full o1 model by just changing a number in the address bar.

#openai #o1 #chatgpt #ai-releases #security



The Model Is the Interface nailed it

We spent thirty years building UI on top of software. Turns out the software was the UI the whole time.

#ai #interfaces #diffusion #transformers #design


Certainly Not. Ah, Yes. nailed it

NotebookLM is genuinely good, its local clones are coming, and Claude is now arguing with itself in the artifact pane.

#llms #notebooklm #claude #local-ai #anthropic



Screenshots Beat Computer Use nailed it

The model can drive the car, or you can hand it the dashcam footage — and one of those takes ten minutes.

#claude #computer-use #workflows #ai-tooling


The Mystery Model persists

A new image generation model appeared with no name, no lab, and no explanation — and it's apparently very good.

#image-generation #ai #mystery #diffusion-models



The Inference Layer Is Collapsing nailed it

HuggingFace and DigitalOcean just made Replicate's value proposition a lot harder to defend.

#inference #huggingface #open-source #ml-infrastructure #replicate


The Day the Floor Fell Out nailed it

Apache-licensed text-to-video, Claude on a keyboard, and the slow-motion implosion of every video SaaS that launched in the last 18 months.

#ai #video-generation #open-source #claude #runway












The Deals Are Already Done nailed it

Hollywood didn't resist AI — it just negotiated quietly while everyone else was arguing on Twitter.

#AI #Hollywood #deals #OpenAI #video-generation



They're Turning On the Voice nailed it

OpenAI's advanced audio mode hits ChatGPT today, four months after the demo that made everyone deeply uncomfortable.

#openai #chatgpt #voice #ai


A Few Thousand Days persists

Sam Altman says superintelligence might arrive in a few thousand days, which is the most casually delivered eschatology I've encountered this week.

#ai #sam-altman #superintelligence #deep-learning





The Video API Gold Rush Happened Yesterday nailed it

Luma and OpenAI both dropped video APIs on the same Tuesday, which is a sentence that would have sounded unhinged six months ago.

#video-ai #luma-labs #openai #apis #generative-video


The Cockpit Problem persists

Hyper.space showed us what AI transparency looks like when you throw everything at the wall — and why that's both the right instinct and the wrong answer.

#ai #ux #design #agents #governance



The Crafting Table for Your Brain nailed it

Krea's grid-based LoRA builder treats model training like a Minecraft recipe, and that should bother you more than it does.

#diffusion #krea #lora #ui #image-generation



The Label Is the Experiment persists

A friend in LA figured out that the A&R function is just a slot machine, so he automated it.

#music #ai #industry #experimentation




The Weird Intern nailed it

Simon Willison's extension of the intern mental model is the most honest framing of LLMs anyone has produced.

#llms #mental-models #simon-willison #ai



The Reflection Situation nailed it

Matt Shumer shipped the most benchmarked system prompt in AI history.

#ai #llms #open-source #benchmarks #drama


Huge If True nailed it

Reflection-70B landed today and Matt Shumer has either done something historically significant or permanently torched his credibility — no middle ground on this one.

#ai #llms #open-source #reflection-70b #local-models







The Fire Spreaders nailed it

The hug video has 27 million views and a TikTok tutorial and that's the whole thing.

#ai #creativity #virality #culture


Sovereign Compute Boats persists

The regulation question isn't whether AI gets slowed down — it's who does the slowing.

#ai #regulation #geopolitics #open-source


The Leak Comes With the Jailbreak persists

You cannot have a museum of stolen system prompts without also having the people who stole them.

#AI #jailbreaks #system-prompts #security #agents





You Don't Control the Similarity persists

Cosine similarity feels like measuring meaning — it's measuring something else entirely.

#embeddings #semantic-search #nlp #machine-learning #retrieval







You Are the Dataset Now persists

On the particular mistake of training a LoRA on your own face too many times.

#machine-learning #lora #flux #meta-ray-bans #mistakes








OpenAI Isn't Scared half right

The company with the best hand at the table doesn't need to show it.

#openai #ai #compute #gpt-5 #industry


The System Works nailed it

An AI parsed a task about parking spaces and returned perfectly structured JSON, and yes, this is incredible.

#ai #agents #llm #structured-output #demos



12GB Won't Fit in 8GB nailed it

The arithmetic of running image diffusion models on a phone is not complicated, and yet.

#diffusion #on-device-ai #apple #mobile-ml #core-ml







The Last Excuse Is Gone half right

Wordware is what happens when the gap between "I had this idea" and "I built this thing" collapses to almost nothing.

#ai #no-code #tools #product




The $99 Hologram Wife evolved

Avi Schiffmann built a necklace that talks to you, and everyone's upset about the wrong part.

#ai #wearables #character-ai #companions #context-window


The Leak Before the Dawn nailed it

Llama 405b hit magnets this morning, and if the benchmarks are real, everything is up for renegotiation.

#llama #meta #open-source-ai #benchmarks #timelines




99% Cheaper Than a Dead Horse nailed it

OpenAI's GPT-4o mini launch comes with a benchmark so cooked it barely qualifies as math.

#openai #pricing #gpt-4o-mini #benchmarks




The Model Is Already There nailed it

Gemini Nano runs in Chrome with no server, no API key, and no model download — because Chrome already did that for you.

#ai #browser #gemini #on-device #chrome


EvoAgent Doesn't Need a Judge persists

When you replace the observer with a mutation function, you stop pretending there's a ground truth.

#agents #evolutionary-computation #multi-agent #selection #llm








Amazon Cannot Fix Alexa nailed it

The insider account of how Alexa failed makes one thing clear: the problem was never the technology.

#amazon #alexa #ai #organizational-failure #llm




Rockset Was the Answer evolved

OpenAI acquired them yesterday, so now the answer is somewhere else.

#enterprise #retrieval #RAG #rockset #openai


The Magic Moment Problem persists

AI video generation is getting good at the part of filmmaking that's actually hard.

#ai #video #generative-ai #film


The Cliffs Notes Are Terrifying Enough nailed it

Runway drops Gen-3 Alpha and the video curve looks exactly like the music curve, which means you know how this ends.

#ai #video-generation #runway #ai-voice #acceleration





Study the Change persists

The correct way to find the optimization critical path, and why you probably already know it.

#optimization #transformers #profiling #ml-engineering



The Last Safe Harbor nailed it

Harmonic just posted results on advanced mathematical reasoning, which means we're running out of places to hide.

#AI #mathematics #harmonic #reasoning #design


Frogs Can't Walk on Water persists

Dream Machine dropped and the benchmark that matters is immediate.

#ai-video #dream-machine #luma #generative-ai #benchmarks


SD3 Runs on Your Mac Now, No Big Deal nailed it

Argmax ships DiffusionKit and the gap between "frontier model" and "runs on my laptop" gets embarrassingly narrow.

#local-AI #stable-diffusion #apple-silicon #diffusion-models #MLX


50,000 Hours nailed it

Salesforce discovers AI, conveniently, right after the stock does something awful.

#salesforce #ai-hype #enterprise #stock-market





Whisper, In Your Browser, Right Now nailed it

Real-time speech recognition that never touches a server, because WebGPU finally got fast enough to make this embarrassingly obvious.

#webgpu #whisper #privacy #browser #speech-recognition




The Safety Team Lost to 2003 Chat Room Aesthetics evolved

Pliny gets persistent jailbreaks on custom GPTs using leet speak, which is either embarrassing or obvious depending on how much you've thought about tokenization.

#jailbreaks #llm-safety #gpt #tokenization #pliny





Your Browser Can See Now nailed it

Moondream runs a full vision-language model client-side via WebGPU, and the implications are weirder than the demo.

#ai #webgpu #vision-models #edge-inference #browser



Parallel Computing Without Trying evolved

Higher Order Company built a language that parallelizes everything automatically.

#programming-languages #parallel-computing #gpu #compilers


Everything Is Converging to the Same Thing persists

The Platonic Representation Hypothesis says sufficiently large models are all finding the same reality, regardless of what they were trained on.

#machine-learning #llms #representation-learning #research



OpenAI Shipped the Movie nailed it

GPT-4o isn't a model update, it's Spike Jonze's screenplay running in production.

#openai #gpt-4o #voice-ai #product #ml


The Bar Is Scarlett Johansson nailed it

It's May 12, 2024, and everyone is predicting that tomorrow OpenAI ships a voice assistant out of a Spike Jonze movie.

#AI #OpenAI #voice #Her #product


Two Things Happened This Week evolved

A labeling LLM and the first private uncensored cloud model walk into a bar.

#llm #fine-tuning #censorship #ai-infrastructure #data-labeling



WHAT HAPPENS MONDAY nailed it

Sam Altman tweets four words and the entire internet holds its breath like it owes him something.

#openai #ai #hype #industry




The Merge Trick persists

A model finally said the quiet part out loud, and the math on model merging is starting to get embarrassing for everyone who spent money on training runs.

#llms #model-merging #llama #openai #scaling


The Physics Engine Was Always Optional nailed it

AlphaFold 3 uses diffusion, which means the same trick that makes fake videos of cats look real also models how atoms fit together.

#machine-learning #biology #diffusion-models #alphafold #drug-discovery


Sam Altman Has A Soft Spot For GPT-2 nailed it

A mystery model is beating everything in the LMSYS arena and OpenAI's CEO is doing his best impression of someone who knows nothing about it.

#openai #lmsys #gpt4o #ai-models









The Glasses Answered nailed it

Meta flipped a switch on the Ray-Bans and suddenly the fashion accessory collecting dust in a drawer became something that talks back.

#ambient-ai #meta #ray-ban #llama #wearables




Act Now, While Supplies Last half right

There is a brief window where you are the only one with the power. The infomercial is not beneath you. The infomercial is the play.

#ai #b2b #go-to-market #strategy #enterprise




We Missed Llama 3 nailed it

Meta dropped what might be the most important open-source model release in years and some of us just... had a busy Thursday.

#llama #open-source #meta #llms #gpu-cluster





OpenAI Made It Basically Free nailed it

The Batch API is 50% off and async — which means the thing you couldn't afford to build last week is now a weekend project.

#openai #api #infrastructure #cost


There Is No Context Window half right

Google's Infini-attention paper doesn't extend the context window — it dissolves it.

#ai #llm #transformers #google #attention





They Heard Me nailed it

GPT-4 Turbo with Vision is generally available, function calling works now, and the corporate chess match is getting weird.

#openai #google #llm #api #local-inference


281 Gigabytes and a Dead Architecture evolved

Mistral drops Mixtral 8x22B into a torrent and Google quietly ships a Gemma that isn't a transformer, all in the same afternoon.

#mistral #mixtral #google #recurrent-architectures #open-source



24GB for $19 nailed it

The price floor just moved and most people haven't noticed yet.

#gpu #inference #private-models #cloud #economics


The Function Doesn't Exist nailed it

Claude shipped function calling, and the trick is that you're not actually calling anything.

#claude #function-calling #llm #data-extraction #vision



Stable Audio Walked Into Suno's House evolved

Stability AI just dropped a music generator that takes audio input, which is either a direct shot at Suno or a coincidence nobody believes.

#ai #audio #stability-ai #suno #music-generation


One Hundred Haikus Walk Into a Git Repo nailed it

Anthropic just showed Opus dispatching a hundred parallel subagents, and the speed estimate of "3x" is laughably conservative.

#ai #anthropic #agents #claude #multi-agent



Just a Text Box nailed it

OpenAI removed the login wall and suddenly the thing is just sitting there on the open internet, waiting.

#ai #openai #pricing #chatgpt #google



The Thirty-Two Times nailed it

Binary embeddings give you back 32x your memory and 40x your speed, and the interesting question is how fast you lose it.

#embeddings #vector-search #efficiency #ai-infrastructure #jevons


The Hype Thermometer Is Broken Again persists

Two tweets, one Tuesday in March, and the eternal recurrence of AI being the most important thing that has ever happened.

#AI hype #tech culture #LLMs #forecasting



The Conduit Play evolved

Trust is a structural advantage, and Sam Altman spent November burning his.

#ai #openai #trust #positioning #industry


Tainted at Birth irrelevant

BlackRock gets a pass. AI agents won't have that option.

#ethereum #blockchain #AI #regulation #autonomy


300 Ways to Sell You a Car Based on How You Feel persists

NBCUniversal has built emotion-based AI audience segments, which is either the most honest thing a media company has ever admitted or the most clarifying.

#advertising #media #AI #surveillance #television







Someone Already Built My Thing persists

Zep is a memory layer for AI assistants, and it is, in fact, exactly what I was building.

#ai #personal-assistant #building #memory #zep



You Were Never Supposed to Be Doing That evolved

Microsoft's AICI ships the layer between prompts and tokens that we've been duct-taping with pleading and threats.

#llms #microsoft #inference #agents #prompt-engineering





Jim Keller Shipped the Cards half right

Tenstorrent's Wormhole hardware drops and the open-source AI stack suddenly needs a floor.

#hardware #tenstorrent #ai #open-source


The Numbers Don't Go That High persists

Nat Friedman put a hundred million dollars behind this prediction, which means it's not a prediction.

#ai #labor #scale #agents #nat-friedman


A Court Is About to Define AGI persists

Elon Musk's lawsuit against OpenAI has a strange side effect: a judge might have to decide whether superintelligence already exists.

#openai #agi #musk #law #ai-governance






There Are No Components half right

Ideogram just shipped readable text in generated images, and the logical endpoint is that UI components don't exist anymore.

#generative-ai #ui #design-systems #ideogram #interfaces


The Crowd Is A Prompt persists

A new paper shows GPT-4 matching superforecaster-level accuracy with a single structured prompt — no aggregation, no market, no Nate Silver required.

#forecasting #llm #prompting #prediction-markets #gpt4


The Klarna Number half right

Two press releases, one week apart, describing the same event from different angles.

#ai #labor #salesforce #klarna #white-collar




The Foundation Is Free Now nailed it

The OSS wave in AI tooling is moving faster than anyone predicted, and the only viable business model left is the tiny slice.

#open-source #ai #business-models #predictions #developer-tools











The Platform Is the Employee Now persists

ElevenLabs is paying people for their voices, and every other industry is about to copy the model.

#ai #labor #platforms #voice #business-models


One Day evolved

Carnegie Mellon dropped a time-series foundation model and beat Lag-Llama to the claim by a margin that will haunt someone forever.

#machine-learning #time-series #foundation-models #research


Two Models Walk Into a GitHub Repo half right

Lag-Llama does zero-shot time series forecasting and ChatDB just open-sourced their text-to-SQL, and it's a fine Wednesday in February.

#machine-learning #open-source #time-series #sql #foundation-models



One Less Thing nailed it

Someone whose entire career is ETL pipelines just automated the part that eats 40% of the work, and I have complicated feelings about it.

#ai #etl #build-vs-buy #tools


$299 and You Own the Chat nailed it

37signals just sold you Campfire — not a seat, not a tier, not a "plan" — the whole thing.

#software #saas #open-source #37signals #ownership



Faster, Better, Wrong persists

Microsoft's AI productivity data is genuinely interesting, which makes it more unsettling, not less.

#ai #llms #productivity #labor #microsoft




The Middleman Problem nailed it

At some point the AI wrapper around the AI becomes the product.

#ai #agents #automation #incentives


The Oldest Pitch in Computing persists

Intelligence amplification has been the correct framing since 1962, and every few years someone rediscovers it and acts like they just invented fire.

#AI #ACI #intelligence amplification #Karpathy #framing



The Model Thinks You're a Manager nailed it

GPT writes better code if you tell it you're a journalist, which says everything about us and nothing good.

#llms #prompt-engineering #culture #gpt





Under Two Seconds half right

Meta's Seamless Communication shipped today, same day as Midjourney v6, which tells you everything about the kind of Thursday this is.

#ai #translation #meta #labor #midjourney



The Hot Neuron Trick half right

PowerInfer splits your LLM across GPU and CPU not by layer but by which neurons actually show up to work.

#llm #inference #hardware #research


It's Good-ish nailed it

Suno arrived, and the worst part is it kind of works.

#ai #music #suno #timelines


The Two-Year Clock persists

DeepMind handed the cap set problem to a language model and the language model beat the mathematicians.

#AI #mathematics #DeepMind #LLMs #local-models


Von Goom Is Real Now persists

Del Complex built a fictional person out of internet text and fed him to the machines, and the machines believe in him.

#llm #ai #del-complex #corpus-stuffing #training-data



The Half-Day Window nailed it

Microsoft's Phi-2 is a 2.7B model that beats 7B models, and Google had about twelve hours to feel good about Gemini Nano.

#llms #microsoft #phi-2 #gemini #open-source



The First Useful One nailed it

A model trained on Indian agricultural practices is a small thing that implies a very large thing.

#AI #specialization #AGI #language models #agriculture




Google Showed Up half right

Gemini Ultra claims the GPT-4 benchmark crown, and nobody seems to know what to do with that information.

#ai #google #gemini #llm #benchmarks






The Board Blinked nailed it

OpenAI fired its CEO to protect humanity and humanity's employees said no thanks.

#openai #ai-governance #sam-altman #agi


He "Left" nailed it

Sam Altman was fired from OpenAI today and the euphemism is doing a lot of heavy lifting.

#openai #sam-altman #industry #drama



The Observer Was Load-Bearing nailed it

A gut feeling about multi-agent RAG accuracy turns out to have a name, a formalism, and a guy on YouTube who already built it.

#rag #multi-agent #llm #retrieval #coherence




NASA Wrote a Megaprompt and It Slaps evolved

The biomimicry researchers at NASA PETAL made a system prompt that does more useful work than most AI products shipping right now.

#prompt-engineering #AI #biomimicry #NASA #GPT


Stubbs nailed it

Google just accidentally eulogized an entire category of startup.

#ai #google #startups #no-code #strategy



Salesforce Knows Einstein Is Broken nailed it

OpenAgents is a research paper, but read between the lines and it's also a roadmap for fixing the gap between Einstein and Data Cloud.

#agents #salesforce #llm #openagents #einstein



The Invisible Ink Jailbreak persists

GPT-4V can read text that you cannot see, and someone already thought to abuse this.

#ai #security #gpt-4v #jailbreaks #multimodal



Your Clever Prompt Is Already Obsolete nailed it

OPRO automates away hand-crafted prompting tricks, and Mistral just proved 7B parameters can be embarrassing for everyone else.

#llm #prompting #mistral #open-source #research



Two Repos Walk Into a Frame evolved

IP-Adapter and prompt-travel are solving diffusion video consistency, and the results are already here.

#diffusion #ai-video #stable-diffusion #ip-adapter #generative




ChatGPT Can See You Now nailed it

OpenAI ships multimodal to consumers and the race nobody was pretending wasn't happening is now officially happening.

#openai #chatgpt #multimodal #voice #gpt-4v


The Timeliness Problem persists

At some point "keeping up" stops being a strategy and starts being a medical condition.

#ai #meta #pace-of-development #2023



The Agents Threw a Party nailed it

Stanford built a simulated town of LLM agents and the agents organized a Valentine's Day party without being asked.

#ai #agents #generative-agents #llm #simulation



As an AI Language Model persists

The scientific record now contains papers that begin with the words "As an AI language model."

#ai #academia #llms #peer-review


The Flat Bands Are Real (Probably) wrong

A computational physicist at Lawrence Berkeley ran the numbers on LK-99 and the numbers didn't immediately say no.

#superconductors #lk99 #condensed-matter #dft #physics



gzip Beats Your Classifier wrong

Fourteen lines of Python and a compression ratio walk into a benchmark.

#nlp #compression #text-classification #machine-learning #acl2023




Eclipse Is a Strong Word half right

On open source models and the reluctant admission that the skeptics might be right.

#ai #open-source #llms #hot-take





I Was Doing This in 2019 persists

Generative synthetic data was not invented this year, no matter how many breathless tweets you saw about it.

#synthetic-data #machine-learning #research #timing





The API Told Me Everything irrelevant

The GetYourGuide plugin is fully interrogative, which means you can just ask it what it knows.

#chatgpt-plugins #apis #gyg #travel-tech


Someone Fixed QR Codes half right

ControlNet and Stable Diffusion just made the ugliest thing in marketing into the most interesting thing in a room.

#stable-diffusion #controlnet #generative-ai #design #marketing



The Listicle Is the Label nailed it

How scraping "Top 10 Romantic Places in Prague" is actually a legitimate epistemology for subjective POI data.

#data #nlp #poi #products #llm



The Alignment Tax May Be a Scam half right

A Meta paper fine-tuned LLaMA on 1,000 hand-picked examples, skipped RLHF entirely, and nearly matched ChatGPT.

#alignment #llms #rlhf #research #meta-ai




GPT-4 Looking at GPT-2 and Going "Hmm" evolved

OpenAI's new interpretability method uses one language model to explain the neurons of another, which is either a breakthrough or a very expensive mirror.

#interpretability #openai #language-models #mechanistic-interpretability


One Month nailed it

The GPT wrapper business has a shelf life, and it's almost up.

#ai #startups #gpt #commoditization


The Tractor Problem nailed it

Why the thing that does everything well enough beats the thing that does one thing perfectly.

#economics #strategy #generalization #tractors #geopolitics



Two Futures, One Thursday nailed it

GPT4All ships binaries, Amazon announces Bedrock, and somewhere a chrome extension quietly automates your cart.

#ai #local-models #aws #open-source #2023



Silly Putty Season nailed it

AutoGPT and BabyAGI dropped and now the floor is moving.

#ai #agents #autogpt #babyagi #2023


The AI with a Phone Book nailed it

HuggingGPT uses ChatGPT as a dispatcher that routes tasks to specialist models — which sounds obvious until you watch it work.

#ai #llms #systems #microsoft #research






The First 24 Hours nailed it

GPT-4 dropped yesterday and the internet is already on fire.

#gpt-4 #ai #openai #language-models


The AI Gets Confused evolved

On delegating creative decisions to something that has opinions about it.

#ai #chatgpt #generative #tooling #2023





///fear.movie.lions irrelevant

Stone Brewing named an IPA after a what3words address, which is either the most inspired beer name in years or a sign that we've fully run out of words that aren't owned by someone.

#beer #what3words #branding #stone-brewing #location-tech