When "chunk_size" is set to 30 it skips chunks #574

kulyasov-aleksey · 2023-11-13T08:34:55Z

whisperx 2.wav --model large-v2

Incorrect output:

0 => array:4 [▼
"start" => 0.009
"end" => 0.469
"text" => " million dollars."
"words" => array:3 [▶]
]
Skipped: nothing is transcribed between 0.469 s and 20.396 s
1 => array:4 [▼
"start" => 20.396
"end" => 23.658
"text" => " You know, and a lot of rich people, there's nothing that I buy that brings me happiness."
"words" => array:17 [▶]
]

whisperx 2.wav --model large-v2 --chunk_size 20

Correct Output:

0 => array:4 [▼
"start" => 0.009
"end" => 0.55
"text" => " million dollars."
"words" => array:2 [▶]
]
1 => array:4 [▼
"start" => 0.59
"end" => 4.473
"text" => "You know, a lot of people equate money to happiness and it's not, it's more freedom."
"words" => array:16 [▶]
]

This is quite strange behaviour, and I can't work out why it happens. Note that the sentence near the beginning of the audio and the one at around 20 seconds start very similarly. With "--chunk_size 30" (the default), roughly the first 20 seconds are skipped. Setting "--chunk_size 20" or "--chunk_size 10" helps. Why does this happen? Is this normal whisperx behaviour? What do you think?
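To make this kind of skip easier to spot across many files, here is a minimal sketch that scans a list of segments for large jumps between one segment's "end" and the next segment's "start". It only assumes segments shaped like the output above (dicts with "start" and "end" in seconds). The helper name `find_gaps` and the `min_gap` threshold are hypothetical and not part of whisperx, and a reported gap may also be genuine silence rather than dropped speech.

```python
from typing import Dict, List, Tuple


def find_gaps(segments: List[Dict[str, float]], min_gap: float = 5.0) -> List[Tuple[float, float]]:
    """Return (gap_start, gap_end) pairs where consecutive segments are
    separated by at least `min_gap` seconds.

    `find_gaps` and `min_gap` are illustrative names, not a whisperx API.
    """
    gaps = []
    # Pair each segment with the one that follows it.
    for prev, nxt in zip(segments, segments[1:]):
        if nxt["start"] - prev["end"] >= min_gap:
            gaps.append((prev["end"], nxt["start"]))
    return gaps


# Timestamps copied from the two outputs in this report.
segments_chunk30 = [
    {"start": 0.009, "end": 0.469},
    {"start": 20.396, "end": 23.658},
]
segments_chunk20 = [
    {"start": 0.009, "end": 0.55},
    {"start": 0.59, "end": 4.473},
]

print(find_gaps(segments_chunk30))  # [(0.469, 20.396)]
print(find_gaps(segments_chunk20))  # []
```

Running the same check on the output of each "--chunk_size" value shows whether the gap appears only with the default setting.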

pneuly mentioned this issue Aug 21, 2024

WhisperX missing audio part while original whisper and fast whisper working fine #828

Open
