Superhuman Speech

Zach Ocean

Sometimes it feels like superhuman text-to-speech (and generative audio in general) is solved.

But it’s not. Even though TTS models are close to or even better than average humans at tasks like audiobook narration, they’re still way worse than the best humans in most speech domains.

What follows is a collection of links that I believe represent the highest level of human speech generation.

I made this list off the top of my head in just a few minutes, so please send me any others that come to mind.

Also, please send me examples of AI-generated outputs or demos that approach or exceed these in terms of quality. Contact me here.

Monologue

Coffee Is For Closers

Drill Sergeant - Full Metal Jacket

Patrice O’Neal Breaks Down Radiohead’s Creep

Eric Thomas (the Hip Hop Preacher)

Raw

Produced

Dialogue

Jim Carrey on David Letterman

Royale With Cheese

There Will Be Blood - ending

My Cousin Vinny - Final Court Scene

Musical Vocals

Twinz