The Spreadsheet That Decides Your Job Is Fine, Actually
A new paper maps the labor market's exposure to GPT-4, and the numbers are either reassuring or horrifying, depending on how you read them.
The 80% exposure number was directionally right but the apocalypse didn't arrive on schedule. By late 2025 there's real displacement in software and data jobs, but the broader labor market shows more continuity than collapse. The paper gave everyone permission to panic and permission to be fine, and both turned out to be warranted.
There is a genre of paper that arrives dressed as analysis but functions as permission — permission to feel okay about the thing that is happening, or permission to panic, or both simultaneously. The new OpenAI and UPenn paper on GPT-4's labor market exposure is this genre perfected.
The headline number is that roughly 80% of US workers could have at least 10% of their tasks affected by GPT-4-class models. About 19% could see at least half their tasks exposed. The researchers used O*NET, the US government's database of occupations broken down into discrete tasks. It is a kind of ultimate reductionist view of work, where a radiologist is a list of verbs and a paralegal is a different list of verbs, and you can just go through them one by one and ask whether a language model can do this now.
And the answer, surprisingly often, is yes.
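The arithmetic behind a headline like that is simpler than it sounds. Here is a toy sketch in Python, not the paper's code: the occupations, headcounts, and task flags are all invented for illustration, and the yes/no exposed flag is my simplification of however the annotators actually scored things.

```python
from dataclasses import dataclass


@dataclass
class Occupation:
    name: str
    workers: int              # hypothetical headcount, not real data
    task_exposed: list[bool]  # one made-up yes/no flag per task


def exposed_share(occ: Occupation) -> float:
    """Fraction of an occupation's tasks flagged as exposed."""
    return sum(occ.task_exposed) / len(occ.task_exposed)


def workforce_share_at_or_above(occupations: list[Occupation], threshold: float) -> float:
    """Share of all workers whose occupation has at least `threshold` of its tasks exposed."""
    total = sum(o.workers for o in occupations)
    hit = sum(o.workers for o in occupations if exposed_share(o) >= threshold)
    return hit / total


# Invented toy numbers: three occupations, 400 workers in total.
TOY = [
    Occupation("tax preparer", 100, [True, True, True, False]),
    Occupation("paralegal", 200, [True, False, False, False, False]),
    Occupation("roofer", 100, [False, False, False, False]),
]

if __name__ == "__main__":
    print(workforce_share_at_or_above(TOY, 0.10))  # 0.75
    print(workforce_share_at_or_above(TOY, 0.50))  # 0.25
```

In this toy world 75% of workers clear the 10% bar and 25% clear the 50% bar, and nothing in the calculation says whether any of them keeps or loses the job. It counts verbs, weighted by headcount.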
What makes the paper interesting is what it isn't claiming. It's not claiming 19% of workers are about to be fired. It's not claiming GPT-4 does these tasks well — only that it can engage with them in a way that could change how the work gets done. The distinction matters, or it doesn't, and you can spend a long time arguing about it while the world moves.
The occupations that come out highly exposed aren't the ones people expected. Tax preparers, financial quantitative analysts, writers, web and digital interface designers — these sit at the top. Physical therapists and athletes and roofers sit at the bottom. The model doesn't have hands yet. That's the moat.
What keeps nagging at me is the methodology. Human annotators and GPT-4 itself were both used to score task exposure. The model rating its own economic impact has a certain energy to it — like asking a rising tide to estimate how many basements it's planning to fill.
The paper is careful. It hedges correctly. It doesn't overstate. But there's a thing that happens when you reduce work to tasks and tasks to a binary of exposed/not-exposed — you lose the texture of what it means to do a job well inside an institution, with relationships, with accountability, with a sense of what matters. The model can draft the memo. The question of whether it should, and what you do when it's wrong, and who cares — that's still a human problem.
We're building in this world right now. Not toward it. In it.
That's the actual news. Not the exposure percentages, not the occupation rankings. The paper is describing present tense, and the tense keeps slipping in the reading because we've trained ourselves to treat this as future tense. It isn't. The thing is here and the question is what you build on top of it and whether the foundations hold.
I don't know if they hold. Nobody does. But it's a worse mistake to not look at the map than to argue with it.