-
This means that the model is overflowing from VRAM into system RAM, which causes huge slowdowns.
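A rough back-of-the-envelope check makes the overflow plausible: the published parameter counts for the Whisper family are about 39M (tiny), 74M (base), 244M (small), 769M (medium), and 1550M (large), so the large models' weights alone approach 3 GiB in float16 and 6 GiB in float32, before the CUDA context, activations, and decoding cache are counted. A small sketch of the arithmetic (figures are approximate):

```python
# Approximate parameter counts for Whisper model sizes (published figures).
PARAMS = {
    "tiny": 39e6,
    "base": 74e6,
    "small": 244e6,
    "medium": 769e6,
    "large-v2": 1550e6,
}
BYTES_PER_PARAM = {"float16": 2, "float32": 4}

def weight_gb(model: str, dtype: str) -> float:
    """Size of the model weights alone, in GiB (excludes activations/cache)."""
    return PARAMS[model] * BYTES_PER_PARAM[dtype] / 1024**3

for m in PARAMS:
    print(f"{m:>9}: {weight_gb(m, 'float16'):5.2f} GiB fp16, "
          f"{weight_gb(m, 'float32'):5.2f} GiB fp32")
```

On a 6 GB card like the GTX 1660 Ti, large-v2's weights plus runtime overhead can easily exceed available VRAM, at which point the driver spills to system RAM and throughput collapses, while tiny through medium fit comfortably.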
-
I've been using faster-whisper for a while, and finally figured out how to use CUDA. Performance is much better (CUDA vs. CPU, same data file) with the tiny, base, small, and medium models, but much worse with large, large-v1, and large-v2.
The following data is for a 3:20 audio file. Times are in seconds to generate all segments.
So the smaller models show a 2-5x speedup with CUDA over CPU, while the large models take roughly 3x longer on CUDA than on CPU. I've tested this with multiple files and get essentially the same results.
I'm using Windows 11, an NVIDIA GTX 1660 Ti video card (it's old, but it's what I have), Python 3.11, and both faster-whisper 1.1.1 and 0.9.0.
So is this due to my old video card? Am I missing something fundamental? Do others see this too? Any thoughts or suggestions would be greatly appreciated.
Here's my code:
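(The original snippet did not survive this extract; what follows is a representative minimal faster-whisper transcription script, not the poster's actual code. The model name, audio file, and `compute_type` are placeholders.)

```python
from faster_whisper import WhisperModel

# Placeholder settings: model size, audio path, and compute_type are assumptions.
# On a 6 GB card, compute_type="int8_float16" or "int8" reduces memory use
# and may keep the large models from spilling out of VRAM.
model = WhisperModel("large-v2", device="cuda", compute_type="float16")

# transcribe() returns a lazy generator of segments plus audio info.
segments, info = model.transcribe("audio.mp3", beam_size=5)

print(f"Detected language: {info.language} (p={info.language_probability:.2f})")
for segment in segments:
    print(f"[{segment.start:6.2f} -> {segment.end:6.2f}] {segment.text}")
```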