LazyFilter issues #608

Closed
KajiMaCN opened this issue Oct 9, 2022 · 4 comments

@KajiMaCN
Contributor

KajiMaCN commented Oct 9, 2022

I'm at it again, this time with a problem arising from data preparation. I'm running prepare.sh in aishell and it produces the following error:

2022-10-09 10:58:56 (prepare.sh:122:main) Stage 4: Compute fbank for musan
/mnt/data/Anaconda3/envs/icefall/lib/python3.8/site-packages/torchaudio/backend/utils.py:53: UserWarning: "sox" backend is being deprecated. The default backend will be changed to "sox_io" backend in 0.8.0 and "sox" backend will be removed in 0.9.0. Please migrate to "sox_io" backend. Please refer to pytorch/audio#903 for the detail.
warnings.warn(
2022-10-09 10:58:57,185 INFO [compute_fbank_musan.py:78] Extracting features for Musan
/mnt/data/Anaconda3/envs/icefall/lib/python3.8/site-packages/lhotse/lazy.py:357: UserWarning: A lambda was passed to LazyFilter: it may prevent you from forking this process. If you experience issues with num_workers > 0 in torch.utils.data.DataLoader, try passing a regular function instead.
warnings.warn(
Traceback (most recent call last):
File "./local/compute_fbank_musan.py", line 109, in <module>
compute_fbank_musan()
File "./local/compute_fbank_musan.py", line 85, in compute_fbank_musan
CutSet.from_manifests(
File "/mnt/data/Anaconda3/envs/icefall/lib/python3.8/site-packages/lhotse/cut/set.py", line 1455, in compute_and_store_features
cut_sets = self.split(num_jobs, shuffle=True)
File "/mnt/data/Anaconda3/envs/icefall/lib/python3.8/site-packages/lhotse/cut/set.py", line 540, in split
for subset in split_sequence(
File "/mnt/data/Anaconda3/envs/icefall/lib/python3.8/site-packages/lhotse/utils.py", line 331, in split_sequence
seq = list(seq)
File "/mnt/data/Anaconda3/envs/icefall/lib/python3.8/site-packages/lhotse/cut/set.py", line 1988, in __len__
return len(self.cuts)
File "/mnt/data/Anaconda3/envs/icefall/lib/python3.8/site-packages/lhotse/lazy.py", line 370, in __len__
raise NotImplementedError(
NotImplementedError: LazyFilter does not support __len__ because it would require iterating over the whole iterator, which is not possible in a lazy fashion. If you really need to know the length, convert to eager mode first using .to_eager(). Note that this will require loading the whole iterator into memory.
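
The last frames of the traceback can be reproduced without lhotse. The sketch below is a simplified stand-in, not lhotse's actual implementation: `ToyLazyFilter` is a hypothetical class that filters items on the fly and, like the `LazyFilter` in the error message, refuses to report a length.

```python
class ToyLazyFilter:
    """Hypothetical stand-in for a lazy filter; not lhotse's real class."""

    def __init__(self, iterable, predicate):
        self.iterable = iterable
        self.predicate = predicate

    def __iter__(self):
        # Items are tested one at a time, only when they are requested.
        return (item for item in self.iterable if self.predicate(item))

    def __len__(self):
        # The length is unknown until every item has been examined.
        raise NotImplementedError("length requires a full pass over the data")


durations = [2.0, 7.5, 12.0, 4.9]
lazy = ToyLazyFilter(durations, lambda d: d > 5)

try:
    len(lazy)
    len_supported = True
except NotImplementedError:
    len_supported = False

# Iterating still works; collecting the items yields an ordinary, sized list.
kept = [d for d in lazy]
print(len_supported, kept, len(kept))
```

In the traceback, `split_sequence` calls `list(seq)` on the filtered cut set, which ends up in `__len__` and fails in the same way.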

@csukuangfj
Collaborator

csukuangfj commented Oct 9, 2022

Same issue as
lhotse-speech/lhotse#835

@pzelasko
Could you have a look at it?


@KajiMaCN
You can fix it in either of the following ways:
(1) Use an earlier version of lhotse, i.e., v1.7
(2) Change

.filter(lambda c: c.duration > 5)

to

 .filter(lambda c: c.duration > 5).to_eager()
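
For illustration, the same lazy-versus-eager distinction exists in plain Python: the built-in `filter()` returns a lazy iterator with no length, and `list()` materializes it. The sketch below is only an analogy for option (2), not lhotse code. `Cut`, `longer_than_5s`, and `split_round_robin` are hypothetical names, and the round-robin split is an assumption rather than how lhotse divides work among jobs.

```python
from typing import List, NamedTuple


class Cut(NamedTuple):
    """Hypothetical stand-in for a cut; only the duration matters here."""
    id: str
    duration: float


def longer_than_5s(cut: Cut) -> bool:
    # A named function used in place of the lambda.
    return cut.duration > 5


def split_round_robin(items: List[Cut], num_jobs: int) -> List[List[Cut]]:
    # Slicing needs a fully materialized list, not a lazy iterator.
    return [items[i::num_jobs] for i in range(num_jobs)]


cuts = [Cut("a", 3.0), Cut("b", 6.0), Cut("c", 9.5), Cut("d", 20.0)]

lazy = filter(longer_than_5s, cuts)  # lazy: len(lazy) would raise TypeError
eager = list(lazy)                   # eager: every kept item is now in memory

parts = split_round_robin(eager, num_jobs=2)
print(len(eager), [[c.id for c in part] for part in parts])
```

As the error message notes, materializing loads the whole filtered collection into memory, so this trades memory for the ability to count and split the items.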

@pzelasko
Collaborator

Yeah, I'll look into the remaining lazy/eager-related conflicts in Lhotse methods.

@csukuangfj
Collaborator

Should be fixed by lhotse-speech/lhotse#844

@KajiMaCN
Contributor Author

> Should be fixed by lhotse-speech/lhotse#844

Thanks!
