Fix for: Decoder stopped with `max_decoder_steps` 500
Intro
Text-to-speech (TTS) is developing rapidly and the field is enormous fun. One of the leading projects is the erstwhile Mozilla TTS, now Coqui.ai TTS.
I have only about two days of experience in the field, but I find the models based on Double Decoder Consistency (DDC) to deliver the best results, blowing contenders such as FastPitch and Microsoft's FastSpeech out of the water, from what I have heard.
Spoiling the fun a bit: the excellent model "tts_models/en/ljspeech/tacotron2-DDC" (and possibly other DDC models) has an issue with truncating long sentences, giving the error "Decoder stopped with max_decoder_steps 500".
The solution is to proceed as described here: https://github.com/thorstenMueller/deep-learning-german-tts/issues/22
So open the model's config.json file, search for "mixed_precision": false, and add a line directly below it reading "max_decoder_steps": 2000,. It shouldn't matter where the line goes, but apparently it can't be added just anywhere, so keep it right under "mixed_precision". After that, long sentences come out crisp and clear.
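The config.json in question is the copy downloaded alongside the model checkpoint. After the edit, the relevant excerpt looks like this (2000 is the value suggested in the linked issue; presumably it can be raised further if very long sentences are still cut off):

```json
    "mixed_precision": false,
    "max_decoder_steps": 2000,
```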
Hope it helps.
Posted: 31 October 2021