Force silence using built-in VAD during encoding #82

vadimkantorov · 2023-10-25T18:16:40Z

Hi!

Is there an opusenc option to reset frames without detected speech to exact zero during encoding, leveraging the built-in VAD? Is it possible to control the VAD thresholds from the opusenc frontend?

Do encoded frames contain a silence bit? If so, these frames could be skipped during decoding, since they would not contribute to the text recognized by speech recognition further down the pipeline. I've read the RFC and it seems so! This is also relevant for stereo and multi-channel call recordings, where most frames are silent during decoding when only one person is speaking.

Thank you :)
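To make the "skip silent frames during decoding" idea concrete, here is a minimal Python sketch of how a decoder-side filter might look. It assumes the raw Opus packets have already been extracted from the container (that step is not shown), and it relies on two things: the TOC byte layout from RFC 6716 (5 bits of config, 1 stereo bit, 2 bits of frame-count code), and the libopus note that an encoded packet of 2 bytes or less does not need to be transmitted when DTX is active. The function names `parse_toc` and `looks_like_dtx` are hypothetical, and the size check is only a heuristic, not a dedicated silence bit.

```python
def parse_toc(packet: bytes) -> dict:
    """Split the first byte of an Opus packet into its TOC fields (RFC 6716, section 3.1)."""
    if not packet:
        raise ValueError("empty packet")
    toc = packet[0]
    config = toc >> 3  # top 5 bits: mode, bandwidth and frame size
    if config < 12:
        mode = "SILK"
    elif config < 16:
        mode = "Hybrid"
    else:
        mode = "CELT"
    return {
        "config": config,
        "mode": mode,
        "stereo": bool((toc >> 2) & 1),  # 1 bit: mono/stereo
        "frame_count_code": toc & 0x3,   # low 2 bits: frames-per-packet code
    }


def looks_like_dtx(packet: bytes) -> bool:
    """Heuristic (assumption): treat packets of at most 2 bytes as DTX/silence."""
    return len(packet) <= 2


def speech_packets(packets):
    """Keep only the packets worth decoding for downstream speech recognition."""
    return [p for p in packets if not looks_like_dtx(p)]
```

Whether such tiny packets show up at all depends on DTX being enabled in the encoder, so this only helps for streams encoded that way.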


vadimkantorov · 2023-12-05T18:14:02Z

The new --channels discrete option can be used to force uncoupled encoding, but it would still be nice to have a mode for highly storage-efficient encoding of long stretches of effective silence (detected by VAD or by a user-supplied threshold), e.g. one silent person listening to another person speaking.
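As a rough illustration of the "passing some threshold" variant, the Python sketch below zeroes quiet frames in 16-bit PCM before the audio is handed to the encoder. This is a plain RMS energy gate, not the encoder's built-in VAD, and it is not an existing opusenc feature. The function name `zero_quiet_frames`, the 960-sample frame (20 ms at 48 kHz) and the -60 dBFS default are all assumptions chosen for the example.

```python
import math
from array import array


def zero_quiet_frames(samples, frame_size=960, threshold_dbfs=-60.0):
    """Return a copy of 16-bit PCM samples with low-energy frames set to exact zero.

    samples: iterable of ints in the int16 range (one channel).
    frame_size: samples per frame; 960 is 20 ms at 48 kHz (assumed).
    threshold_dbfs: frames whose RMS level is below this are zeroed (assumed default).
    """
    out = array("h", samples)
    full_scale = 32768.0
    for start in range(0, len(out), frame_size):
        frame = out[start:start + frame_size]
        rms = math.sqrt(sum(s * s for s in frame) / len(frame))
        level = 20.0 * math.log10(rms / full_scale) if rms > 0 else float("-inf")
        if level < threshold_dbfs:
            # Overwrite the whole frame with digital silence.
            for i in range(start, start + len(frame)):
                out[i] = 0
    return out
```

For multi-channel call recordings, this would be applied to each channel separately, so that the listening side becomes exact digital silence while the speaking side is untouched.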

vadimkantorov · 2023-12-06T23:25:52Z

Related on DTX:

Option for opusenc to omit DTX/silence frames from output completely and provide options for controlling aggressivity of silence detection #89
