Support for multi-channel audio data #8728

Closed
yunbin opened this issue Mar 22, 2024 · 3 comments
Labels
feature request/PR for a new feature

Comments


yunbin commented Mar 22, 2024

Describe the bug

NeMo training and decoding scripts do not support multi-channel audio data.

Steps/Code to reproduce bug

There is no way to specify which channel to use for each audio file, i.e. per line, in the train.manifest.json or test.manifest.json file.

I was able to run ./examples/asr/speech_to_text_eval.py with "channel_selector=" to set the channel for all audio in a manifest.json file, but I can't find a way to specify it for each audio file inside the manifest.json file.

Expected behavior

Could the NeMo team add this feature so that a diverse set of multi-channel training and testing audio data can be used, with data from different channels mixed within a single manifest.json file?
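
To make the request concrete, here is a rough sketch of the behavior I have in mind. It is not NeMo code: the per-line `channel_selector` key and the helper names `iter_manifest` and `select_channel` are assumptions made up for this example. Only `audio_filepath` is an existing manifest field. Each manifest line may carry its own selector, and lines without one fall back to a global default.

```python
import json

import numpy as np


def select_channel(samples, selector):
    """Pick one channel from audio shaped (num_samples, num_channels).

    selector: None (return the audio unchanged), an int channel index,
    or the string "average" (mean over channels).
    """
    samples = np.asarray(samples, dtype=np.float32)
    if samples.ndim == 1 or selector is None:
        return samples
    if selector == "average":
        return samples.mean(axis=1)
    if isinstance(selector, int):
        if not 0 <= selector < samples.shape[1]:
            raise ValueError(f"channel {selector} out of range")
        return samples[:, selector]
    raise ValueError(f"unsupported channel selector: {selector!r}")


def iter_manifest(lines, default_selector=None):
    """Yield (audio_filepath, selector) for each JSON line of a manifest.

    The "channel_selector" key is hypothetical; entries that omit it
    use default_selector instead.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue
        entry = json.loads(line)
        yield entry["audio_filepath"], entry.get("channel_selector", default_selector)


# Hypothetical manifest mixing per-file channel choices.
manifest = [
    '{"audio_filepath": "a.wav", "text": "hello", "channel_selector": 1}',
    '{"audio_filepath": "b.wav", "text": "world", "channel_selector": "average"}',
    '{"audio_filepath": "c.wav", "text": "again"}',
]

# Stand-in for decoded two-channel audio (two samples, two channels).
stereo = np.array([[0.0, 1.0], [2.0, 3.0]])

for path, selector in iter_manifest(manifest, default_selector=0):
    mono = select_channel(stereo, selector)
    print(path, selector, mono.shape)
```

With something like this, a.wav would use channel 1, b.wav the channel average, and c.wav the global default, all from the same manifest.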

Environment overview (please complete the following information)

NeMo was installed by pip in a conda environment. It works for single channel audio data.

@yunbin yunbin added the bug Something isn't working label Mar 22, 2024
@anteju anteju added the feature request/PR for a new feature label Mar 22, 2024
@anteju anteju self-assigned this Mar 22, 2024
@anteju anteju removed the bug Something isn't working label Mar 22, 2024
Collaborator

anteju commented Mar 22, 2024

Thanks @yunbin, we'll take a look at adding this functionality over the next couple of weeks. We'll post updates here.

Author

yunbin commented Apr 4, 2024

@anteju Any update on getting the channel feature implemented in NeMo training scripts?

Collaborator

anteju commented Apr 23, 2024

@yunbin, please check #9007.
The change there enables using the channel selector from the manifest for the nvidia/canary-1b model.

3 participants