Tatsumoto's guide to learning Japanese. How to use Free Software to learn Japanese, and more.

Is listening to TTS (text-to-speech) bad?

December 05, 2022 — Tatsumoto Ren

No TTS can compete with a native speaker. There's so much professionally voiced content in Japanese that going out of your way to use TTS doesn't make sense. With anime alone, you can listen to native speech all the time. Content that isn't voiced can be set aside until you know enough Japanese to read it without help.


The robot voice doesn't sound like real Japanese. In particular, it makes a lot of pitch accent mistakes. Even setting pitch accent aside, computer-generated audio is still a poor imitation of native speech. You never want to feed your brain toxic input.

At the word level, the pitch accent data may be wrong or outdated, or a word may have multiple accents. When the pitch accent depends on the usage, the algorithm often can't pick the right one.
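To make the ambiguity concrete, here is a minimal Python sketch of a dictionary-based accent lookup. Everything in it is an assumption made for illustration: the `ACCENTS` table, the `pick_accent` and `is_ambiguous` helpers, and the part-of-speech labels are hypothetical and do not describe how any particular TTS engine works. The accent numbers are downstep positions (0 means heiban), and the two sample entries should be checked against a proper accent dictionary before you rely on them.

```python
# Toy accent table (hypothetical). Keys are words, values map a usage
# to the downstep position (0 = heiban, no downstep).
ACCENTS = {
    "一番": {"noun": 2, "adverb": 0},  # accent depends on usage
    "箸": {"noun": 1},                 # single accent, no ambiguity
}


def is_ambiguous(word):
    """Return True if the table lists more than one accent for the word."""
    return len(set(ACCENTS.get(word, {}).values())) > 1


def pick_accent(word, usage=None):
    """Pick an accent for a word, optionally given its usage.

    Without usage information, this naive lookup falls back to the
    first listed accent, which is wrong whenever the other usage
    was intended.
    """
    candidates = ACCENTS.get(word)
    if candidates is None:
        return None  # word is missing from the table
    if usage in candidates:
        return candidates[usage]
    return next(iter(candidates.values()))  # blind guess


print(is_ambiguous("一番"))           # the word has two possible accents
print(pick_accent("一番"))            # guess made without context
print(pick_accent("一番", "adverb"))  # correct only if usage is known
```

The lookup itself is trivial. The hard part is deciding the usage from the surrounding text, and that is exactly where a synthesizer can go wrong.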

At the sentence level, text-to-speech is even less accurate because there are rules that modify the pitch accents of words in a sentence. TTS engines don't necessarily apply these rules correctly.

Always listen to real native audio. For example, instead of generating text-to-speech audio for a book, download an audiobook. Instead of adding TTS audio to your Anki cards, copy pronunciations from Qolibri, Forvo, or other audio banks that provide native recordings. Better yet, mine sentences from movies and TV shows more often. They come with native audio built in.

Tags: faq