Force silence using built-in VAD during encoding #82

Open
vadimkantorov opened this issue Oct 25, 2023 · 2 comments
vadimkantorov commented Oct 25, 2023

Hi!

Is there an option for opusenc to set frames with no detected speech to exact zero during encoding (using the built-in VAD)? And is it possible to control the VAD thresholds from the opusenc frontend?

Do encoded frames contain a silence bit? If so, those frames could be skipped during decoding, since they would not contribute to the recognized text further down a speech recognition pipeline. I've read the RFC and it seems so! This is also relevant for stereo and multi-channel call recordings, where most decoded frames are silent because only one person is speaking at a time.

Thank you :)
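
For reference, a minimal Python sketch of what can be read from a packet without decoding it, based on RFC 6716. The function names (`parse_toc`, `is_empty_frame`, `silk_vad_flag`) are hypothetical and not part of opus-tools or libopus. It assumes each input is one complete single-stream Opus packet (for example, one already demuxed from an Ogg Opus file using channel mapping family 0); multistream packets use a different framing and are not handled. The VAD check relies on the RFC's note that the SILK header bits sit in the most significant bits of the first compressed byte, and is deliberately limited to the simplest case: mono, one frame per packet, frame duration of 20 ms or less.

```python
def parse_toc(packet: bytes) -> dict:
    """Decode the TOC byte (RFC 6716, section 3.1) of a single-stream packet."""
    if not packet:
        raise ValueError("empty packet")
    toc = packet[0]
    config = toc >> 3            # top 5 bits: mode, bandwidth, frame duration
    stereo = bool(toc & 0x04)    # 1 bit: mono/stereo
    code = toc & 0x03            # 2 bits: frame-count code
    if config < 12:
        mode, frame_ms = "SILK", (10, 20, 40, 60)[config % 4]
    elif config < 16:
        mode, frame_ms = "hybrid", (10, 20)[config % 2]
    else:
        mode, frame_ms = "CELT", (2.5, 5, 10, 20)[config % 4]
    return {"config": config, "mode": mode, "frame_ms": frame_ms,
            "stereo": stereo, "code": code}


def is_empty_frame(packet: bytes) -> bool:
    """True for a code-0 packet that is only a TOC byte (zero-length frame).

    RFC 6716 treats a zero-length frame as "no frame available", which is
    what an encoder emits when it chooses not to transmit (DTX).
    """
    return len(packet) == 1 and (packet[0] & 0x03) == 0


def silk_vad_flag(packet: bytes):
    """Return the SILK VAD header bit, or None if this sketch cannot tell.

    Only handles mono, code-0 (single frame) SILK or hybrid packets with a
    frame duration of 20 ms or less, where the header holds one VAD bit.
    """
    info = parse_toc(packet)
    if info["mode"] == "CELT" or info["stereo"] or info["code"] != 0:
        return None
    if info["frame_ms"] > 20 or len(packet) < 2:
        return None
    return bool(packet[1] & 0x80)  # MSB of the first compressed byte
```

CELT-only frames carry no such header bit, so for those the sketch returns `None` rather than guessing.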


vadimkantorov commented Dec 5, 2023

The new --channels discrete option can be used to force uncoupled encoding, but it would still be nice to have a mode that encodes long stretches of effective silence (e.g. one silent person listening to another person speaking) as storage-efficient silent frames, selected either by the VAD or by a user-supplied threshold.
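
Until such a mode exists, one possible workaround is to gate the audio before it reaches the encoder. The sketch below is a plain RMS energy gate, not the encoder's built-in VAD, and it is far cruder than a real speech detector. The function name `zero_quiet_frames`, the 960-sample frame length (20 ms at 48 kHz), and the threshold value are all assumptions chosen for illustration. It operates on one channel of 16-bit PCM, so each channel of a call recording would be gated separately. Whether the encoder then spends noticeably fewer bits on the zeroed frames depends on its rate-control settings.

```python
import math
from array import array


def zero_quiet_frames(pcm: array, frame_len: int = 960,
                      rms_threshold: float = 100.0) -> array:
    """Return a copy of mono 16-bit PCM with low-energy frames set to zero.

    frame_len and rms_threshold are illustrative defaults, not values
    taken from opusenc or libopus.
    """
    out = array("h", pcm)
    for start in range(0, len(out), frame_len):
        frame = out[start:start + frame_len]
        # Root-mean-square level of this frame (the last one may be short).
        rms = math.sqrt(sum(s * s for s in frame) / len(frame))
        if rms < rms_threshold:
            out[start:start + len(frame)] = array("h", [0] * len(frame))
    return out


# Usage: one loud frame followed by one near-silent frame.
pcm = array("h", [1000] * 960 + [5] * 960)
gated = zero_quiet_frames(pcm)
```

A hard gate like this can clip the onsets and tails of words, so a real pipeline would likely add some hangover time around detected activity.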
