expectedwrong hindsight

Screenshots Beat Computer Use

The model can drive the car, or you can hand it the dashcam footage — and one of those takes ten minutes.

2 min read 227 words #claude #computer-use #workflows #ai-tooling
hindsight — nailed it

Screenshots still beat computer use for most practical tasks. The Willison insight — hand the model the dashcam footage instead of letting it drive — remained the efficient approach. Computer use improved but screenshots stayed faster.

The week computer use dropped, I almost pointed Claude at my screen to retrieve some data — move it from one place to another, the kind of task that sounds exactly like what computer use is for.

I took screenshots instead. Pasted them in manually. Done in ten minutes.

This came from something Simon Willison did — he recorded himself working and fed the video to the model as context. The observation embedded in that experiment: you don't need the model to drive the car if you can hand it the dashcam footage. Retrieval and reasoning are different problems. One of them is solved by a camera.

Computer use is genuinely impressive as a capability. It is also slow, prone to derailing at the first unexpected modal, and requires you to sit there watching it move a mouse like it's 2007 and you're babysitting an AutoHotkey macro.

Screenshots are free. Screenshots are instant. The model reads them fine.

The pattern Simon found — record what you're doing, give the model the recording — is the same insight rotated ninety degrees. You are the computer use. You go get the thing. The model does the reasoning with the thing.

This will stay true until computer use is fast enough that supervising it costs less than doing it yourself. We are not there. We are very much not there.