
openai.fm Is a Nice Place to Visit

OpenAI ships a text-to-speech demo that sounds like a person, which is fine, everything is fine.

2 min read 279 words #openai #tts #voice #ai
hindsight — still happening

openai.fm is still a nice place to visit. The voice quality is still the kind that makes you sit back slightly. The system prompt for voices is still the interesting part.

OpenAI launched openai.fm today — a demo page for their new text-to-speech model — and it is, regrettably, very good.

You type something. You pick a voice. You write a little system prompt describing how the voice should behave — nervous, cheerful, reading bedtime stories to a child — and then it reads the thing. In that voice. With the energy you described. With the little pauses and the breath and the cadence of a person who has opinions about what they're saying.

The voices have names. Of course they do.

What gets me isn't the quality, though the quality is the kind that makes you sit back in your chair slightly. It's the system prompt thing. The idea that you can prompt a voice the same way you prompt a model — speak slowly, like you're explaining something you've explained a hundred times and you're tired of it — and then it just does that. The tired-of-it-ness is in there. You can hear it deciding to be tired.
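For the curious, here is roughly what that looks like as a request rather than a web page. This is a minimal sketch, not the demo's actual code: the post never names the model or the voices, so `ASSUMED_MODEL`, `ASSUMED_ENDPOINT`, the `"alloy"` voice, and the `instructions` field name are all assumptions on my part, and `build_speech_request` is a hypothetical helper. It only builds the payload and prints it; nothing gets sent anywhere.

```python
import json

# Assumptions, not facts from the demo: the model name, the endpoint,
# and the "instructions" field are my best guess at the API behind the page.
ASSUMED_MODEL = "gpt-4o-mini-tts"
ASSUMED_ENDPOINT = "https://api.openai.com/v1/audio/speech"


def build_speech_request(text, voice, instructions):
    """Bundle the three things the demo asks for into a JSON-ready dict.

    text         -- what the voice should say
    voice        -- which named voice to use (the name is an assumption)
    instructions -- the "system prompt" describing how to say it
    """
    if not text.strip():
        raise ValueError("text must not be empty")
    return {
        "model": ASSUMED_MODEL,
        "voice": voice,
        "input": text,
        "instructions": instructions,
    }


payload = build_speech_request(
    text="Your call is important to us.",
    voice="alloy",  # hypothetical choice; pick whichever voice the page offers
    instructions=(
        "Speak slowly, like you're explaining something you've explained "
        "a hundred times and you're tired of it."
    ),
)

# Print the payload instead of sending it, so the sketch runs offline.
print(json.dumps(payload, indent=2))
```

The point of the sketch is how little there is to it: the tired-of-it-ness is one more string in the payload, sitting right next to the text it applies to.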

This is either an incredible tool for accessibility, audiobooks, and small developers who need a phone tree, or it's the last week that voice acting was a stable profession, and both things are probably true at the same time.

The demo is a beautiful piece of product design. Clean, minimal, just enough controls to understand what's possible without overwhelming you. They clearly knew what they were shipping. The site exists to make you feel something, and the something is a specific kind of mild vertigo — the vertigo of a capability arriving fully formed, with a landing page and a Twitter announcement, in the middle of a Thursday.

Oh dear.