I have a question about the quality of Fastspeech2 output compared to Glow TTS. I am currently using mel spectrograms generated by Glow TTS with a HifiGan vocoder, and the quality is good, but there is room for improvement in prosody. Tacotron2 does better in this regard, but it has a high inference time and performs poorly as the input sentence length increases. Fastspeech2's inference is faster than that of Glow TTS, but the TTS model's share of total inference time is small compared to the time taken by the vocoder, so speed is not my main concern. I am more interested in whether Fastspeech2 would improve quality in terms of the intonation, pauses, and stress of the output sentences. Has anyone here trained both Glow TTS and Fastspeech2 and compared them?