Add support for distil-large-v3 #755
Conversation
Nice to see some camaraderie.
cc @trungkienbkhn - this one's ready for review!
Hey @trungkienbkhn! The model is now live under https://huggingface.co/distil-whisper/distil-large-v3-ct2 Feel free to merge this PR at your convenience to enable Faster-Whisper support! Also cc @nguyendc-systran @metame-none @Purfview
@sanchit-gandhi, thanks for your contribution. However, when I added the word_timestamps option:

```python
segments, info = model.transcribe("audio.mp3", condition_on_previous_text=False, language="en", word_timestamps=True)
```

I ran into an issue. This is caused by wrong alignment_heads in the conversion. So we should use the default alignment_heads for the fw-distil-large-v3 model, same as the fw-distil-large-v2 model. For more details on this logic, you can refer to the implementation in ctranslate2.
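For reference, a converted CTranslate2 Whisper model stores its alignment heads as [layer, head] pairs in the model's config.json; a quick way to inspect them (the directory name below is an assumption for illustration) is:

```python
import json

# Inspect the alignment heads shipped with a converted model; the directory
# name here is an assumption, not the exact path of the released conversion.
with open("faster-distil-whisper-large-v3/config.json") as f:
    config = json.load(f)

# Each entry is a [layer, head] pair naming a decoder cross-attention head
# used for word-level timestamp alignment.
print(config.get("alignment_heads"))
```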
@trungkienbkhn
Oh yeah, almost forgot - wouldn't hosting the model under the Systran org be necessary, because the faster-whisper library automatically points to the Systran repository unless another one is explicitly specified?
Brilliant idea! That way all the faster-whisper checkpoints remain in one org.
Alignment heads updated according to those in the fw-distil-large-v2 conversion, as suggested above. Note that we'd get a better alignment by inspecting the alignment from the DTW algorithm, e.g. as done here. However, the heuristic of using the last half of the layers for alignment by default should suffice for a first version.
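As a rough illustration of that heuristic (a sketch of the behaviour described in this thread, not the exact ctranslate2 code), the default alignment heads amount to every attention head in the last half of the decoder layers:

```python
# Minimal sketch of the default heuristic discussed above: with no
# model-specific alignment heads, take every attention head in the last
# half of the decoder layers.
def default_alignment_heads(num_decoder_layers: int, num_heads: int) -> list[list[int]]:
    return [
        [layer, head]
        for layer in range(num_decoder_layers // 2, num_decoder_layers)
        for head in range(num_heads)
    ]

# distil-large-v3 has 2 decoder layers and 20 attention heads, so this
# yields all 20 heads of the final decoder layer.
print(default_alignment_heads(2, 20))
```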
FYI, we have released a new ct2 conversion model (using float16) for distil-large-v3: https://huggingface.co/Systran/faster-distil-whisper-large-v3 |
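For anyone wanting to reproduce such a conversion, something along these lines with the CTranslate2 Python API should work (the output directory and copied files here are assumptions, not the exact command used for the release):

```python
import ctranslate2

# Sketch of a float16 CTranslate2 conversion of distil-large-v3; output
# directory and copy_files list are illustrative assumptions.
converter = ctranslate2.converters.TransformersConverter(
    "distil-whisper/distil-large-v3",
    copy_files=["tokenizer.json", "preprocessor_config.json"],
)
converter.convert("faster-distil-whisper-large-v3", quantization="float16")
```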
Awesome, thanks @trungkienbkhn. I've updated this PR to use these fp16 weights. |
Feel free to merge this PR at your convenience - it would be awesome to unblock faster-whisper for the distil-whisper community. |
The latest Distil-Whisper model, distil-large-v3, is intrinsically designed to work with the OpenAI sequential long-form transcription algorithm.
This PR adds support for this checkpoint.
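Once merged, usage would presumably look like standard faster-whisper (the "distil-large-v3" identifier below assumes the Systran conversion referenced above is wired in as the default):

```python
from faster_whisper import WhisperModel

# Hypothetical usage after this PR is merged; "distil-large-v3" is assumed
# to resolve to the Systran/faster-distil-whisper-large-v3 conversion.
model = WhisperModel("distil-large-v3")
segments, info = model.transcribe("audio.mp3", language="en")
for segment in segments:
    print(f"[{segment.start:.2f} -> {segment.end:.2f}] {segment.text}")
```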