T2S - number conversion does not work #125

Open
ivan-homoliak-sutd opened this issue Oct 12, 2023 · 1 comment

Comments

@ivan-homoliak-sutd

As I can see in the WIS log, there is a number conversion step, but it only works when the number stands on its own, i.e. when it is not attached to other text and not joined to a word by a dash.

Examples that do not work: 30-minutes, GPT4.
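
To make the two failing patterns concrete, here is a minimal sketch of a preprocessing step that separates digit runs from adjacent letters and dashes before spelling them out. It is not the WIS implementation; `number_to_words` and `normalize_numbers` are hypothetical names, and the speller only covers 0 to 999 (larger values are read digit by digit).

```python
import re

ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty",
        "sixty", "seventy", "eighty", "ninety"]


def number_to_words(n: int) -> str:
    """Spell out 0..999; larger values fall back to digit-by-digit reading."""
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return TENS[tens] + ("-" + ONES[ones] if ones else "")
    if n < 1000:
        hundreds, rest = divmod(n, 100)
        tail = " " + number_to_words(rest) if rest else ""
        return ONES[hundreds] + " hundred" + tail
    return " ".join(ONES[int(d)] for d in str(n))


def normalize_numbers(text: str) -> str:
    """Hypothetical helper: detach numbers from words, then spell them out."""
    # "GPT4" -> "GPT 4": insert a space at every letter/digit boundary.
    text = re.sub(r"(?<=[A-Za-z])(?=\d)|(?<=\d)(?=[A-Za-z])", " ", text)
    # "30-minutes" -> "30 minutes": a dash joining a digit and a letter
    # becomes a space.
    text = re.sub(r"(?<=\d)-(?=[A-Za-z])|(?<=[A-Za-z])-(?=\d)", " ", text)
    # Replace each remaining digit run with its spoken form.
    return re.sub(r"\d+", lambda m: number_to_words(int(m.group())), text)


print(normalize_numbers("30-minutes"))
print(normalize_numbers("GPT4"))
```

Under these assumptions the two examples become "thirty minutes" and "GPT four". Whether the TTS model then pronounces "GPT" letter by letter is a separate question that this sketch does not address.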

@nikito
Contributor

nikito commented Oct 12, 2023

I assume you are referring to the preprocessing done in WIS around the TTS functionality? That logic is fairly rudimentary and covers basic number cases (a temperature, for instance, or the value of a numeric entity), but it wasn't fleshed out for more complex cases. The TTS itself isn't the most sophisticated (we are currently using SpeechT5), and we are aware that it has trouble with numbers like the ones in your examples, among others. I've tried other TTS engines and noticed that they all have their quirks (for instance, I tried your cases above on Coqui, and while it handled 30-Minutes just fine, it pronounced GPT4 as "upt" and didn't pronounce the number at all).
In my own case I have been a bit more explicit with how I generate the text being fed into TTS (for instance I have converters for time that change the text into more natural spoken text, such as changing 02:05PM into "two oh five in the afternoon"). I believe in the future we may look into using some NLU/preprocessing to help reduce these issues. :)
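
A time converter of the kind described above could look roughly like the following sketch. It is not the converter from this comment or from WIS; `convert_times` and the helper names are hypothetical, and the choice of "morning" for AM, "afternoon" for PM hours before six, and "evening" otherwise is an assumption made only for illustration.

```python
import re

SMALL = ["zero", "one", "two", "three", "four", "five", "six", "seven",
         "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
         "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = {2: "twenty", 3: "thirty", 4: "forty", 5: "fifty"}

# Matches 12-hour clock times such as "02:05PM", "7:00 pm" or "11:30am".
TIME_RE = re.compile(r"\b(\d{1,2}):(\d{2})\s*([AaPp][Mm])\b")


def _minute_words(minute: int) -> str:
    """Spoken form of the minute part (0..59)."""
    if minute == 0:
        return "o'clock"
    if minute < 10:
        return "oh " + SMALL[minute]
    if minute < 20:
        return SMALL[minute]
    tens, ones = divmod(minute, 10)
    return TENS[tens] + ("-" + SMALL[ones] if ones else "")


def _spoken_time(match: re.Match) -> str:
    hour, minute = int(match.group(1)), int(match.group(2))
    period = match.group(3).upper()
    # Leave anything that is not a valid 12-hour time untouched.
    if not (1 <= hour <= 12 and minute <= 59):
        return match.group(0)
    if period == "AM":
        part_of_day = "in the morning"
    elif hour % 12 < 6:  # assumption: 12 PM to 5:59 PM counts as afternoon
        part_of_day = "in the afternoon"
    else:
        part_of_day = "in the evening"
    return f"{SMALL[hour]} {_minute_words(minute)} {part_of_day}"


def convert_times(text: str) -> str:
    """Hypothetical helper: rewrite clock times into spoken text for TTS."""
    return TIME_RE.sub(_spoken_time, text)


print(convert_times("02:05PM"))
```

With these rules, `02:05PM` comes out as "two oh five in the afternoon", matching the example in the comment. Running such a converter before any generic number conversion keeps the colon-separated digits from being read as two unrelated numbers.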
