Feature request - [X]Make VADIterator work like get_speech_timestamps function #405

Simon-chai · 2023-12-13T00:56:45Z

🚀 Feature

When we use the get_speech_timestamps function, we can assign parameters such as min_speech_duration_ms. As I see it, the two are actually doing the same thing. Does that mean VADIterator could, in theory, work just the same as get_speech_timestamps?

Motivation

When I run audio stream detection with VADIterator (using code like the example provided by silero-vad), I find that outputs with end - start <= 0.1 s are mostly just environmental noise, and I think it would be better to filter them out inside VADIterator.
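Until something like this exists inside VADIterator, the filtering can be done outside it. The following is only a minimal sketch, assuming the iterator produces `None`, `{'start': t}` or `{'end': t}` per chunk (times in seconds), as in the silero-vad streaming example. The helper name `filter_short_segments` and the `min_speech_s` parameter are hypothetical and not part of silero-vad.

```python
from typing import Dict, Iterable, Iterator, Optional, Tuple


def filter_short_segments(
    events: Iterable[Optional[Dict[str, float]]],
    min_speech_s: float = 0.1,
) -> Iterator[Tuple[float, float]]:
    """Pair 'start'/'end' events and drop segments not longer than min_speech_s.

    Hypothetical helper: `events` stands in for the per-chunk outputs of a
    VADIterator-style object called with times reported in seconds.
    """
    start = None  # start time of the currently open segment, if any
    for event in events:
        if not event:
            continue  # no state change on this chunk
        if "start" in event:
            start = event["start"]
        elif "end" in event and start is not None:
            end = event["end"]
            if end - start > min_speech_s:
                yield (start, end)  # keep only sufficiently long segments
            start = None


# Made-up event stream: one 0.05 s blip and one 1.5 s utterance.
events = [None, {"start": 1.0}, {"end": 1.05}, None, {"start": 2.0}, None, {"end": 3.5}]
segments = list(filter_short_segments(events))
```

The trade-off is that a segment is only reported once its 'end' has arrived, so the consumer no longer sees a 'start' in real time.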

Pitch

When I instantiate VADIterator, I would like to be able to pass the same parameters as for the get_speech_timestamps function, so that I can keep the results under control.

Alternatives

Additional context

snakers4 · 2024-03-26T07:04:59Z

> VADIterator can work just the same as get_speech_timestamps function in theory

This is not possible, because get_speech_timestamps "looks into the future" to improve the results.

Simon-chai · 2024-11-12T19:05:30Z

Recently I tried to solve this. Here is what I found:

1. You don't need to "look into the future" to limit the maximum speech length, but you have to accept the consequence of more fragmented speech. For example, when you limit the maximum speech length to 5 seconds, a 6-second utterance will most likely be cut into a 5 s segment and a 1 s segment.
2. Limiting the minimum speech length is kind of meaningless: whether or not we can "look into the future", we can't truly enforce it. Instead, we can make the minimum speech length more controllable through the min_silence_duration_ms parameter.
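The first point can be illustrated with a small state machine. This is only a sketch under stated assumptions, not the implementation mentioned in this thread and not silero-vad code: it works on per-chunk boolean speech flags instead of model probabilities, and the names `split_long_speech`, `chunk_s` and `max_speech_s` are hypothetical.

```python
from typing import List, Sequence, Tuple


def split_long_speech(
    flags: Sequence[bool], chunk_s: float, max_speech_s: float
) -> List[Tuple[float, float]]:
    """Turn per-chunk speech flags into (start, end) segments in seconds.

    An open segment is closed either when a non-speech chunk arrives or when
    it has reached max_speech_s, using only chunks seen so far (no lookahead).
    A segment still open when the input ends is not reported here.
    """
    segments = []
    start = None  # start time of the open segment, if any
    for i, is_speech in enumerate(flags):
        now = i * chunk_s  # time at the beginning of chunk i
        if start is not None and (not is_speech or now - start >= max_speech_s):
            segments.append((start, now))  # natural end or forced split
            start = None
        if is_speech and start is None:
            start = now  # open a new segment (also right after a forced split)
    return segments


# 6 s of speech in 0.5 s chunks, followed by 1 s of silence, limit of 5 s.
flags = [True] * 12 + [False] * 2
result = split_long_speech(flags, chunk_s=0.5, max_speech_s=5.0)
```

With these inputs the 6-second utterance comes out as a 5 s segment followed by a 1 s segment, which is the fragmentation described above.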

By the way, I think get_speech_timestamps may fail to trigger the final 'end' if the last few chunks of audio are all speech. Since in that case we can "look into the future", we can force an 'end' to be triggered.
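For a streaming consumer, forcing the final 'end' could look like the sketch below. It assumes VADIterator-style events (`None`, `{'start': t}` or `{'end': t}`) and a known stream end time. The helper `close_open_segment` is hypothetical, and whether get_speech_timestamps really misses the final 'end' is not verified here.

```python
from typing import Dict, Iterable, List, Optional


def close_open_segment(
    events: Iterable[Optional[Dict[str, float]]], stream_end_s: float
) -> List[Dict[str, float]]:
    """Drop empty events and append a forced 'end' if speech is still open.

    Hypothetical helper: if the last real event is a 'start', the stream ended
    in the middle of speech, so an 'end' at stream_end_s is added.
    """
    out = [event for event in events if event]
    if out and "start" in out[-1]:
        out.append({"end": stream_end_s})
    return out


# Made-up stream that ends while speech is still ongoing.
events = [{"start": 0.5}, None, {"end": 1.2}, {"start": 2.0}, None]
closed = close_open_segment(events, stream_end_s=3.0)
```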

I simply implemented a maximum speech length limit on VADIterator, inspired by get_speech_timestamps, which makes VADIterator work more like get_speech_timestamps to some degree. I will post the code after it is fully tested; maybe someone will be interested.

varrerohit · 2025-01-29T15:06:25Z

Did you manage to implement this code?

Simon-chai added the enhancement label Dec 13, 2023

Simon-chai assigned snakers4 Dec 13, 2023

snakers4 closed this as not planned Mar 26, 2024
