
The Screenshot Graveyard

A 40-line bash script that turns a folder of forgotten screenshots into a CSV and then deletes them.

2 min read 440 words #tools #local-ai #bash #apple-silicon #vlm
hindsight — still happening

Every Mac still has a screenshot graveyard. The VLM-powered organizer pattern works. Nobody has actually cleaned up their graveyard. The screenshots accumulate.

Every Mac has one. A folder — or more accurately, a scattering across the desktop, Downloads, and some directory you made in 2022 called misc — full of screenshots you took because something mattered enough to capture and then not enough to actually do anything with. Hundreds of them. Maybe thousands. You know the content is in there somewhere. You will not look through them.

I built a fix in about forty minutes, inspired by Simon Willison doing something similar with VLMs, because I have this exact problem and it was starting to feel like a personal failing.

The thing runs a small vision model (SmolVLM from Hugging Face, in its bf16 build, running entirely locally via mlx-vlm on Apple Silicon) against every image in a folder, writes the descriptions to a CSV, and deletes the originals. Two scripts. The first one is a describe command that just wraps the model call:

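#!/usr/bin/env bash
# describe: print a SmolVLM description of one image.
# Usage: describe path/to/image.png

# Exit with a usage message if no image path was given.
IMAGE="${1:?usage: describe <image>}"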
uv run --with torch --with mlx-vlm python -m mlx_vlm.generate \
  --model mlx-community/SmolVLM-Instruct-bf16 \
  --max-tokens 500 \
  --temp 0.5 \
  --prompt "Describe this image in detail" \
  --image "$IMAGE"

The second loops over a folder and dumps everything to a CSV. You point it at the screenshot graveyard. You walk away. You come back to a spreadsheet.
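The loop script isn't reproduced here, so what follows is a minimal sketch of how it might look, not the original. The names DESCRIBE_CMD, csv_escape, and graveyard_to_csv are hypothetical, as are the two-column CSV layout and the list of image extensions. It assumes the describe wrapper above is executable at ./describe, and it leaves the delete step commented out so nothing is removed before you have checked the CSV.

```shell
#!/usr/bin/env bash
# Sketch of the folder loop. DESCRIBE_CMD, csv_escape, and graveyard_to_csv
# are hypothetical names, not taken from the original script.

# Command that describes one image; assumed to be the wrapper script above.
DESCRIBE_CMD="${DESCRIBE_CMD:-./describe}"

# Quote a value as one CSV field: wrap it in double quotes and double any
# quotes inside it.
csv_escape() {
  printf '"%s"' "$(printf '%s' "$1" | sed 's/"/""/g')"
}

# Describe every image in a folder and write one CSV row per file.
graveyard_to_csv() {
  local dir="$1" out="$2" img description
  printf 'file,description\n' > "$out"
  for img in "$dir"/*.png "$dir"/*.jpg "$dir"/*.jpeg; do
    [ -f "$img" ] || continue   # skip patterns that matched nothing
    if description="$("$DESCRIBE_CMD" "$img")"; then
      # Flatten the output onto one line so each image stays on one row.
      description="$(printf '%s' "$description" | tr '\n' ' ')"
      printf '%s,%s\n' "$(csv_escape "$img")" "$(csv_escape "$description")" >> "$out"
      # Deletion is deliberately disabled; uncomment once the CSV looks right.
      # rm -- "$img"
    else
      echo "skipped (describe failed): $img" >&2
    fi
  done
}

# Run only when a folder is given: ./graveyard ~/Desktop [out.csv]
if [ "$#" -ge 1 ]; then
  graveyard_to_csv "$1" "${2:-screenshots.csv}"
fi
```

Whatever the describe command prints to stdout ends up in the description column, so any extra status text from mlx-vlm would land there too unless you filter it first.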

The output is genuinely useful. One screenshot showed a browser tab I had open to some Salesforce lead data, and the model described it (columns, row values, the whole thing) accurately enough that I could reconstruct what I had been looking at. That screenshot was from months ago. I had no idea it existed.

The only dependency is uv, which you install once with a curl pipe and never think about again. The model downloads itself on first run. If you have an M-series Mac you already have everything else you need.
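If you want the script to check those prerequisites itself, a preflight along these lines is one option. It is a sketch: the function names have_cmd, is_apple_silicon, and preflight are hypothetical, and the install one-liner in the message is the standalone installer command from uv's own documentation.

```shell
#!/usr/bin/env bash
# Hypothetical preflight helpers; not part of the original scripts.

# True if a command is available on PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

# True on macOS running on an arm64 (M-series) chip.
is_apple_silicon() {
  [ "$(uname -s)" = "Darwin" ] && [ "$(uname -m)" = "arm64" ]
}

# Report anything missing on stderr; return non-zero if a check failed.
preflight() {
  local status=0
  if ! have_cmd uv; then
    echo "uv not found; install it with: curl -LsSf https://astral.sh/uv/install.sh | sh" >&2
    status=1
  fi
  if ! is_apple_silicon; then
    echo "not an Apple Silicon Mac; mlx-vlm is built on MLX, which targets Apple Silicon" >&2
    status=1
  fi
  return "$status"
}
```

Calling `preflight || exit 1` at the top of the describe script would stop it before uv tries to download anything.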

What's sitting underneath this, though, is something weirder. The model doesn't care what the image contains. It'll describe a screenshot, a photo, a diagram, a handwritten note, a terminal window. You could point it at your entire filesystem (everything your terminal has permission to read, which on a personal machine is most of it) and build an index of your own digital life. SMS screenshots. Email attachments. Documents you scanned. The model would describe all of it, and you could graph the results, build a search interface, run queries against your own memory.
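The crudest version of that index needs only two more pieces: a recursive walk and a text search over the CSV. The sketch below assumes a CSV with one row per image, and the names find_images and search_index are hypothetical. It is a starting point, not a search interface.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; the names and the CSV layout are assumptions.

# Print every image under a root directory as NUL-terminated paths, so
# filenames containing spaces or newlines survive intact.
find_images() {
  find "$1" -type f \( -iname '*.png' -o -iname '*.jpg' -o -iname '*.jpeg' \) -print0
}

# Case-insensitive, fixed-string search over the rows of the CSV.
search_index() {
  local csv="$1" term="$2"
  grep -i -F -- "$term" "$csv"
}
```

In bash, a loop such as `while IFS= read -r -d '' img; do ./describe "$img"; done < <(find_images ~/Pictures)` would feed each path to the describe wrapper, and `search_index screenshots.csv salesforce` would print the rows that mention the term.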

Nobody is selling this yet. There's no subscription. The data doesn't leave your machine. It runs while you sleep.

The thing I keep thinking about: someone with access to a shared network drive full of years of marketing assets could just let this run overnight on an M1 and wake up to a tagged, searchable index of the whole thing. That's not a product pitch. That's a bash script and a quiet machine.