Integrate faster version of whisper (batched faster whisper) to Aana SDK #41
As per discussion, we will create a separate endpoint for batched faster-whisper. We could even consider it as a separate target in the future.
Comments from Jilt — below is the benchmarking result:

Steps:
Feature Summary
Integrate the faster, batched version of Whisper into the Aana SDK.
Justification/Rationale
This feature enables a faster version of Whisper that uses VAD (voice activity detection) and batching to improve throughput by approximately 4x.
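The speedup comes from two steps: VAD discards non-speech audio, and the remaining speech segments are grouped into batches so the model processes many segments per forward pass. A minimal, illustrative sketch of that flow (not the actual faster-whisper internals; the helper names are made up for illustration):

```python
from dataclasses import dataclass
from typing import Iterator


@dataclass
class Segment:
    start: float  # seconds
    end: float


def vad_segments(speech_mask: list[bool], frame_s: float = 0.5) -> list[Segment]:
    """Collapse a per-frame speech/non-speech mask into speech segments.

    In a real pipeline the mask would come from a VAD model; here it is
    just a toy input so the batching logic can be shown end to end.
    """
    segments: list[Segment] = []
    start = None
    # Append a trailing False so a segment that runs to the end is closed.
    for i, is_speech in enumerate(speech_mask + [False]):
        if is_speech and start is None:
            start = i
        elif not is_speech and start is not None:
            segments.append(Segment(start * frame_s, i * frame_s))
            start = None
    return segments


def batches(items: list[Segment], batch_size: int) -> Iterator[list[Segment]]:
    """Group speech segments into fixed-size batches for inference."""
    for i in range(0, len(items), batch_size):
        yield items[i : i + batch_size]


# Toy mask: speech, silence, speech, silence, speech.
mask = [True, True, False, False, True, False, True, True, True, False]
segs = vad_segments(mask)                  # silence frames are dropped entirely
batched = list(batches(segs, batch_size=2))
```

Because silence never reaches the model and the speech segments are inferred in batches rather than one at a time, the wall-clock cost per audio hour drops substantially.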
Proposed Implementation (if any)
There are two options for the implementation:

1. A separate endpoint for the batched whisper.
2. A flag/parameter on the existing endpoint to enable batched inference, with a trade-off on WER. Some parameters the user is usually familiar with are not supported, e.g. `without_timestamps` (no word-level timestamps).
VAD would be introduced as a separate deployment.
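Option 2 could look roughly like the sketch below: a single transcription endpoint with a `batched` flag that rejects parameters the batched pipeline does not support. All names here are hypothetical, not the actual Aana SDK API:

```python
from typing import Any

# Hypothetical set of parameters the batched pipeline does not honour
# (e.g. word-level timestamps, per the WER/feature trade-off above).
UNSUPPORTED_IN_BATCHED = {"word_timestamps"}


def transcribe(audio: bytes, batched: bool = False, **params: Any) -> dict:
    """Single endpoint: `batched=True` routes to the batched pipeline.

    Unsupported parameters fail fast with a clear error instead of being
    silently ignored, so callers learn about the trade-off explicitly.
    """
    if batched:
        bad = UNSUPPORTED_IN_BATCHED & params.keys()
        if bad:
            raise ValueError(f"Not supported in batched mode: {sorted(bad)}")
        return {"pipeline": "batched", "params": params}
    return {"pipeline": "sequential", "params": params}
```

The design choice here is explicit validation: keeping one endpoint avoids duplicating the API surface (option 1), at the cost of a parameter space that varies with the flag.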