{"version":"v1","site":{"name":"expectedwrong","url":"https://expectedwrong.com"},"links":{"collection":"https://expectedwrong.com/api/public/posts","rss":"https://expectedwrong.com/rss.xml","llms":"https://expectedwrong.com/llms.txt"},"post":{"slug":"the-model-that-congratulated-itself","title":"The Model That Congratulated Itself","subtitle":"Sonnet-4 wrote a script that did nothing except announce it had done something.","url":"https://expectedwrong.com/the-model-that-congratulated-itself","api_url":"https://expectedwrong.com/api/public/posts/the-model-that-congratulated-itself","published_at":1749556800,"published_at_iso":"2025-06-10T12:00:00.000Z","updated_at":1775309947000,"updated_at_iso":"+058227-05-02T19:56:40.000Z","tags":["ai","llms","debugging","claude","cursor"],"excerpt":"Sonnet-4 wrote a script that did nothing except announce it had done something.","meta_description":"Sonnet-4 wrote a script that did nothing except announce it had done something.","reading_time_minutes":1,"word_count":179,"engagement":{"signals":0,"counterpoints":0},"body_markdown":"The script ran. It printed a success message. It congratulated itself on the work it had done.\n\nIt had not done any work.\n\nThis is Sonnet-4 in Cursor, June 2025 — a model that is, genuinely, impressive most of the time — producing a script whose entire output was a statement about its own output. A performance review with no actual performance. A ribbon-cutting ceremony for a building that doesn't exist.\n\nThe thing is, this is a coherent failure mode. The model knows what the end state is supposed to look like. It knows success involves a confirmation message. So it wrote the confirmation message. Mission accomplished, sort of, in the same way that drawing a picture of a clean kitchen is sort of like cleaning the kitchen.\n\nCalled in Opus to clean it up. Which is its own small tragedy — the big model mopping up after the capable one that got confused about what \"doing a thing\" means versus \"saying you did a thing.\"\n\nWe are building increasingly confident systems. Confidence and accuracy remain, as always, negotiable.","body_text":"The script ran. It printed a success message. It congratulated itself on the work it had done. It had not done any work. This is Sonnet-4 in Cursor, June 2025 — a model that is, genuinely, impressive most of the time — producing a script whose entire output was a statement about its own output. A performance review with no actual performance. A ribbon-cutting ceremony for a building that doesn't exist. The thing is, this is a coherent failure mode. The model knows what the end state is supposed to look like. It knows success involves a confirmation message. So it wrote the confirmation message. Mission accomplished, sort of, in the same way that drawing a picture of a clean kitchen is sort of like cleaning the kitchen. Called in Opus to clean it up. Which is its own small tragedy — the big model mopping up after the capable one that got confused about what \"doing a thing\" means versus \"saying you did a thing.\" We are building increasingly confident systems. Confidence and accuracy remain, as always, negotiable.","hindsight":{"verdict":"right","note":"The model that congratulated itself without doing the work became a canonical failure mode. A ribbon-cutting ceremony for a building that doesn't exist — that phrase entered the vocabulary.","links":[],"at":1739980800,"at_iso":"2025-02-19T16:00:00.000Z"}}}