
Enabling multiple languages doesn't display text in the language that is spoken #30

Open
Rohith-Altrai opened this issue Mar 12, 2024 · 3 comments

Comments

@Rohith-Altrai

I tried using different languages. For some of them the app showed the text in the language that was spoken, but for others it displayed the text in English.

@Rohith-Altrai Rohith-Altrai changed the title How to create a filters_vocab_multilingual.bin for a medium whisper model Enabling multiple languages doesn't display text in the language that is spoken Mar 12, 2024
@nyadla-sys
Owner

@Rohith-Altrai Whisper TFLite multilanguage support needs more attention; it requires a separate TFLite file based on the language selection.

@Rohith-Altrai
Author

Rohith-Altrai commented Mar 14, 2024

@Rohith-Altrai Whisper TFLite multilanguage support needs more attention; it requires a separate TFLite file based on the language selection.

Can I ask why it is translating some sentences? And can you give me some instructions on how to make it work with a specific language, so that the transcript text is in the same language that was spoken?

Note: I am totally new to AI and a beginner in mobile development

Thank you for the help

@woheller69

woheller69 commented Jan 2, 2025

Whisper TFLite multilanguage support needs more attention; it requires a separate TFLite file based on the language selection.

Does this mean we would need a separate tflite file for each language?

https://github.com/woheller69/whisperIME

I built a new Android app that also implements an Android InputMethod, which makes Whisper voice input available to every app via e.g. HeliBoard. I use the multilingual Whisper small TFLite model, which works quite well. But sometimes it translates instead of transcribing, even though it detected the right language. Is there no way to pass the language token and/or the transcribe/translate token to the model when running inference?

Is it possible when creating a model with several signatures?
Something like this? (not tested):

def serving_korean(self, input_features):
    outputs = self.model.generate(
        input_features,
        max_new_tokens=500,  # change as needed
        return_dict_in_generate=True,
        forced_decoder_ids=[[1, 50264], [2, 50359], [3, 50363]],  # 50264: Korean | 50359: transcribe | 50363: notimestamps
    )
    return {"sequences": outputs["sequences"]}

def serving_german(self, input_features):
    outputs = self.model.generate(
        input_features,
        max_new_tokens=500,  # change as needed
        return_dict_in_generate=True,
        forced_decoder_ids=[[1, 50261], [2, 50359], [3, 50363]],  # 50261: German | 50359: transcribe | 50363: notimestamps
    )
    return {"sequences": outputs["sequences"]}

generate_model = GenerateModel(model=model)

saved_model_dir = 'content/small_saved'
tflite_model_path = 'whisper-small.tflite'

tf.saved_model.save(
    generate_model,
    saved_model_dir,
    signatures={
        "serving_korean": generate_model.serving_korean,
        "serving_german": generate_model.serving_german,
    },
)
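For reference, the forced_decoder_ids lists used above can be composed from the multilingual Whisper special-token IDs: the decoder prompt pins the language token at position 1, the task token at position 2, and the no-timestamps token at position 3. A minimal pure-Python sketch (the helper name and the language table are my own, not part of any model API; token IDs are from the multilingual Whisper vocabulary):

```python
# Subset of special-token IDs from the multilingual Whisper vocabulary.
LANG_TOKENS = {"en": 50259, "zh": 50260, "de": 50261, "es": 50262, "ru": 50263, "ko": 50264}
TRANSLATE, TRANSCRIBE, NO_TIMESTAMPS = 50358, 50359, 50363

def forced_decoder_ids(lang, task="transcribe"):
    """Build the [position, token_id] pairs that pin language and task."""
    task_token = TRANSCRIBE if task == "transcribe" else TRANSLATE
    return [[1, LANG_TOKENS[lang]], [2, task_token], [3, NO_TIMESTAMPS]]

print(forced_decoder_ids("ko"))  # [[1, 50264], [2, 50359], [3, 50363]]
print(forced_decoder_ids("de"))  # [[1, 50261], [2, 50359], [3, 50363]]
```

With a helper like this, adding another language signature is just a matter of looking up its token ID instead of hard-coding the list in each serving function.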
