
Enabling multiple languages doesn't display text in the language that is spoken #30

Open
Rohith-Altrai opened this issue Mar 12, 2024 · 3 comments

Comments

@Rohith-Altrai

I tried using different languages. For some of them the app showed the text in the language that was spoken, but for others it displayed the text in English.

@Rohith-Altrai Rohith-Altrai changed the title How to create a filters_vocab_multilingual.bin for a medium whisper model Enabling multiple languages doesn't display text in the language that is spoken Mar 12, 2024
@nyadla-sys
Owner

@Rohith-Altrai Whisper TFLite multilanguage support needs more attention; it requires a separate TFLite file based on the language selection.

@Rohith-Altrai
Author

Rohith-Altrai commented Mar 14, 2024

@Rohith-Altrai Whisper TFLite multilanguage support needs more attention; it requires a separate TFLite file based on the language selection.

Can I ask why it is translating some sentences? And can you give me some instructions on how to make it work with a specific language, so that the transcript text is in the same language that was spoken?

Note: I am totally new to AI and a beginner in mobile development

Thank you for the help

@woheller69

woheller69 commented Jan 2, 2025

Whisper TFLite multilanguage support needs more attention; it requires a separate TFLite file based on the language selection.

Does this mean we would need a separate tflite file for each language?

https://github.com/woheller69/whisperIME

I built a new Android app that also implements an Android InputMethod, which makes Whisper voice input available to every app via e.g. HeliBoard. I use the multilingual Whisper small TFLite model, which works quite well. But sometimes it translates instead of transcribing, even though it detected the right language. Is there no way to pass the language token and/or the transcribe/translate token to the model when running inference?

Is it possible when creating a model with several signatures?
Something like this? (not tested):

def serving_korean(self, input_features):
    outputs = self.model.generate(
        input_features,
        max_new_tokens=500,  # change as needed
        return_dict_in_generate=True,
        forced_decoder_ids=[[1, 50264], [2, 50359], [3, 50363]],  # 50264: Korean | 50359: transcribe | 50363: notimestamps
    )
    return {"sequences": outputs["sequences"]}

def serving_german(self, input_features):
    outputs = self.model.generate(
        input_features,
        max_new_tokens=500,  # change as needed
        return_dict_in_generate=True,
        forced_decoder_ids=[[1, 50261], [2, 50359], [3, 50363]],  # 50261: German | 50359: transcribe | 50363: notimestamps
    )
    return {"sequences": outputs["sequences"]}

generate_model = GenerateModel(model=model)

saved_model_dir = 'content/small_saved'
tflite_model_path = 'whisper-small.tflite'

tf.saved_model.save(
    generate_model,
    saved_model_dir,
    signatures={
        "serving_korean": generate_model.serving_korean,
        "serving_german": generate_model.serving_german,
    },
)
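For reference, the forced_decoder_ids lists used above can be composed from the multilingual Whisper special-token IDs: the decoder prompt pins the language token at position 1, the task token at position 2, and the no-timestamps token at position 3. A minimal pure-Python sketch (the helper name and the language table are my own, not part of any model API; token IDs are from the multilingual Whisper vocabulary):

```python
# Subset of special-token IDs from the multilingual Whisper vocabulary.
LANG_TOKENS = {"en": 50259, "zh": 50260, "de": 50261, "es": 50262, "ru": 50263, "ko": 50264}
TRANSLATE, TRANSCRIBE, NO_TIMESTAMPS = 50358, 50359, 50363

def forced_decoder_ids(lang, task="transcribe"):
    """Build the [position, token_id] pairs that pin language and task."""
    task_token = TRANSCRIBE if task == "transcribe" else TRANSLATE
    return [[1, LANG_TOKENS[lang]], [2, task_token], [3, NO_TIMESTAMPS]]

print(forced_decoder_ids("ko"))  # [[1, 50264], [2, 50359], [3, 50363]]
print(forced_decoder_ids("de"))  # [[1, 50261], [2, 50359], [3, 50363]]
```

With a helper like this, adding another language signature is just a matter of looking up its token ID instead of hard-coding the list in each serving function.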
