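Installation following the docs completes without any errors, but running the speech recognition command fails; reinstalling several times gives the same result #3692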

LjPro opened this issue Mar 3, 2024 · 6 comments


LjPro commented Mar 3, 2024

General Question

pip list:
`Package Version

absl-py 2.1.0
aiohttp 3.9.3
aiosignal 1.3.1
annotated-types 0.6.0
antlr4-python3-runtime 4.9.3
anyio 4.3.0
astor 0.8.1
asttokens 2.4.1
async-timeout 4.0.3
attrs 23.2.0
audioread 3.0.1
Babel 2.14.0
bce-python-sdk 0.9.4
blinker 1.7.0
bokeh 3.3.4
boltons 23.1.1
Bottleneck 1.3.8
braceexpand 0.1.7
certifi 2024.2.2
cffi 1.16.0
charset-normalizer 3.3.2
click 8.1.7
colorama 0.4.6
coloredlogs 15.0.1
colorlog 6.8.2
contourpy 1.2.0
cycler 0.12.1
Cython 3.0.8
datasets 2.18.0
decorator 5.1.1
dill 0.3.4
Distance 0.1.3
editdistance 0.8.1
einops 0.7.0
exceptiongroup 1.2.0
executing 2.0.1
fastapi 0.110.0
filelock 3.13.1
Flask 3.0.2
flask-babel 4.0.0
flatbuffers 23.5.26
fonttools 4.49.0
frozenlist 1.4.1
fsspec 2024.2.0
ftfy 6.1.3
future 1.0.0
g2p-en 2.1.0
h11 0.14.0
h5py 3.10.0
httpcore 1.0.4
httpx 0.27.0
huggingface-hub 0.21.3
humanfriendly 10.0
HyperPyYAML 1.2.2
idna 3.6
inflect 7.0.0
intervaltree 3.1.0
ipython 8.22.1
itsdangerous 2.1.2
jedi 0.19.1
jieba 0.42.1
Jinja2 3.1.3
joblib 1.3.2
jsonlines 4.0.0
kaldiio 2.18.0
kiwisolver 1.4.5
librosa 0.8.1
llvmlite 0.42.0
loguru 0.7.2
lxml 5.1.0
markdown-it-py 3.0.0
MarkupSafe 2.1.5
matplotlib 3.8.3
matplotlib-inline 0.1.6
mdurl 0.1.2
mido 1.3.2
mock 5.1.0
mpmath 1.3.0
multidict 6.0.5
nara-wpe 0.0.9
nltk 3.8.1
note-seq 0.0.3
numba 0.59.0
numpy 1.23.5
omegaconf 2.3.0
onnx 1.15.0
onnxruntime 1.17.1
OpenCC 1.1.7
opencc-python-reimplemented 0.1.7
opt-einsum 3.3.0
packaging 23.2
paddle2onnx 1.0.6
paddleaudio 1.1.0
paddlefsl 1.1.0
paddlenlp 2.6.1
paddlepaddle-gpu 2.6.0
paddlesde 0.2.5
paddleslim 2.6.0
paddlespeech 0.0.0
paddlespeech-feat 0.1.0
pandas 2.2.1
parameterized 0.9.0
parso 0.8.3
pathos 0.2.8
pattern-singleton 1.2.0
pillow 10.2.0
pip 23.3.1
platformdirs 4.2.0
pooch 1.8.1
portalocker 2.8.2
pox 0.3.4
ppdiffusers 0.19.4
praatio 5.1.1
pretty-midi 0.2.10
prettytable 3.10.0
prompt-toolkit 3.0.43
protobuf 3.20.2
psutil 5.9.8
pure-eval 0.2.2
pyarrow 15.0.0
pyarrow-hotfix 0.6
pybind11 2.11.1
pycparser 2.21
pycryptodome 3.20.0
pydantic 2.6.3
pydantic_core 2.16.3
pydub 0.25.1
Pygments 2.17.2
pygtrie 2.5.0
pyparsing 3.1.1
pypinyin 0.44.0
pypinyin-dict 0.7.0
pyreadline3 3.4.1
pytest-runner 6.0.1
python-dateutil 2.9.0.post0
pytz 2024.1
pywin32 306
pyworld 0.3.4
PyYAML 6.0.1
pyzmq 25.1.2
rarfile 4.1
regex 2023.12.25
requests 2.31.0
requests-mock 1.11.0
resampy 0.4.2
rich 13.7.1
ruamel.yaml 0.18.6
ruamel.yaml.clib 0.2.8
sacrebleu 2.4.0
safetensors 0.4.2
scikit-learn 1.4.1.post1
scipy 1.12.0
sentencepiece 0.2.0
seqeval 1.2.2
setuptools 68.2.2
six 1.16.0
sniffio 1.3.1
sortedcontainers 2.4.0
soundfile 0.12.1
stack-data 0.6.3
starlette 0.36.3
swig 4.2.1
sympy 1.12
tabulate 0.9.0
TextGrid 1.6.1
threadpoolctl 3.3.0
timer 0.2.2
ToJyutping 0.2.1
tornado 6.4
tqdm 4.66.2
traitlets 5.14.1
trampoline 0.1.2
typeguard 2.13.3
typer 0.9.0
typing_extensions 4.10.0
tzdata 2024.1
urllib3 1.26.18
uvicorn 0.27.1
visualdl 2.5.3
wcwidth 0.2.13
webrtcvad 2.0.10
websockets 12.0
Werkzeug 3.0.1
wheel 0.41.2
win32-setctime 1.1.0
xxhash 3.4.1
xyzservices 2023.10.1
yacs 0.1.8
yarl 1.9.4
zhon 2.0.2`

Run in PowerShell: paddlespeech asr --lang zh --input zh.wav
`(paddle_test) PS E:\AI_WorkSpace> paddlespeech asr --lang zh --input zh.wav
C:\Users\an.conda\envs\paddle_test\lib\site-packages\ UserWarning: paddleaudio C++ extension is not available.
warnings.warn("paddleaudio C++ extension is not available.")
C:\Users\an.conda\envs\paddle_test\lib\ UserWarning: Setuptools is replacing distutils.
warnings.warn("Setuptools is replacing distutils.")
2024-03-03 10:39:49.952 | INFO | paddlespeech.s2t.modules.ctc::45 - paddlespeech_ctcdecoders not installed!
W0303 10:39:49.955922 31980] Please NOTE: device: 0, GPU Compute Capability: 8.6, Driver API Version: 12.3, Runtime API Version: 11.8
W0303 10:39:49.970871 31980] device: 0, cuDNN Version: 8.9.
2024-03-03 10:39:50.355 | INFO | paddlespeech.s2t.modules.embedding:init:153 - max len: 5000
[2024-03-03 10:39:52,263] [ ERROR] - (InvalidArgument) Broadcast dimension mismatch. Operands could not be broadcast together with the shape of X = [1, 1, 0, 498] and the shape of Y = [1, 123, 123]. Received [498] in X is not equal to [123] in Y at i:3.
[Hint: Expected x_dims_array[i] == y_dims_array[i] || x_dims_array[i] <= 1 || y_dims_array[i] <= 1 == true, but received x_dims_array[i] == y_dims_array[i] || x_dims_array[i] <= 1 || y_dims_array[i] <= 1:0 != true:1.] (at ..\paddle/phi/kernels/funcs/common_shape.h:86)
Traceback (most recent call last):
File "C:\Users\an.conda\envs\paddle_test\lib\site-packages\paddlespeech\cli\asr\", line 314, in infer
result_transcripts = self.model.decode(
File "C:\Users\an.conda\envs\paddle_test\lib\site-packages\", line 232, in fun
return caller(func, *(extras + args), **kw)
File "C:\Users\an.conda\envs\paddle_test\lib\site-packages\paddle\base\dygraph\", line 352, in _decorate_function
return func(*args, **kwargs)
File "C:\Users\an.conda\envs\paddle_test\lib\site-packages\paddlespeech\s2t\models\u2\", line 818, in decode
hyp = self.attention_rescoring(
File "C:\Users\an.conda\envs\paddle_test\lib\site-packages\paddlespeech\s2t\models\u2\", line 543, in attention_rescoring
hyps, encoder_out = self._ctc_prefix_beam_search(
File "C:\Users\an.conda\envs\paddle_test\lib\site-packages\paddlespeech\s2t\models\u2\", line 424, in _ctc_prefix_beam_search
encoder_out, encoder_mask = self._forward_encoder(
File "C:\Users\an.conda\envs\paddle_test\lib\site-packages\paddlespeech\s2t\models\u2\", line 229, in _forward_encoder
encoder_out, encoder_mask = self.encoder(
File "C:\Users\an.conda\envs\paddle_test\lib\site-packages\paddle\nn\layer\", line 1429, in call
return self.forward(*inputs, **kwargs)
File "C:\Users\an.conda\envs\paddle_test\lib\site-packages\paddlespeech\s2t\modules\", line 184, in forward
chunk_masks = add_optional_chunk_mask(
File "C:\Users\an.conda\envs\paddle_test\lib\site-packages\paddlespeech\s2t\modules\", line 202, in add_optional_chunk_mask
chunk_masks = masks.logical_and(chunk_masks) # (B, L, L)
File "C:\Users\an.conda\envs\paddle_test\lib\site-packages\paddle\tensor\", line 143, in logical_and
return _C_ops.logical_and(x, y)
ValueError: (InvalidArgument) Broadcast dimension mismatch. Operands could not be broadcast together with the shape of X = [1, 1, 0, 498] and the shape of Y = [1, 123, 123]. Received [498] in X is not equal to [123] in Y at i:3.
[Hint: Expected x_dims_array[i] == y_dims_array[i] || x_dims_array[i] <= 1 || y_dims_array[i] <= 1 == true, but received x_dims_array[i] == y_dims_array[i] || x_dims_array[i] <= 1 || y_dims_array[i] <= 1:0 != true:1.] (at ..\paddle/phi/kernels/funcs/common_shape.h:86)

KeyError: 'result'`
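
For readers unfamiliar with the rule quoted in the `[Hint: ...]` line, here is a minimal sketch in plain Python of the per-axis check it describes: two dimensions are compatible when they are equal or when either one is `<= 1`. The function name `broadcast_mismatches` is hypothetical, and the sketch assumes the shorter shape is left-padded with 1s before comparison (the usual broadcasting convention, consistent with the `i:3` index in the error). It does not call Paddle and says nothing about why the mask has these shapes.

```python
def broadcast_mismatches(x_shape, y_shape):
    """Return the axis indices where two shapes cannot be broadcast together.

    Hypothetical helper for illustration only; it mirrors the condition in the
    error hint: dims match if they are equal or if either one is <= 1.
    """
    rank = max(len(x_shape), len(y_shape))
    # Assumption: the shorter shape is left-padded with 1s to the same rank.
    x = [1] * (rank - len(x_shape)) + list(x_shape)
    y = [1] * (rank - len(y_shape)) + list(y_shape)
    return [
        i
        for i, (a, b) in enumerate(zip(x, y))
        if not (a == b or a <= 1 or b <= 1)
    ]


# Shapes copied from the error message above.
x_shape = [1, 1, 0, 498]
y_shape = [1, 123, 123]
print(broadcast_mismatches(x_shape, y_shape))  # [3]
```

Only axis 3 (498 vs. 123) fails the check. The zero-sized axis 2 of X passes because of the `<= 1` clause, which is why the message reports the mismatch at `i:3`.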

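Same here. I tried various versions; installation succeeds, but I end up with this same error: Broadcast dimension mismatch.

Hello developer, thank you for your interest in the PaddleSpeech open source project. We are sorry for the poor development experience. Maintainer capacity for the open source project is currently limited; we suggest referring to #3246.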

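You can check #3697 to see whether it helps. At the bottom of that issue I posted a blog link with a detailed installation walkthrough and fixes for some of the errors. Your problem is mainly a version issue.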
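
Since the suggestion here is that versions are the main cause, a quick way to collect the relevant ones for comparison against a known-good setup is sketched below. It uses only the standard-library `importlib.metadata`. The package names in `PACKAGES` are taken from the `pip list` output in this issue; which ones actually matter, and which version combinations work, is not stated in this thread, so treat the selection as an assumption.

```python
from importlib import metadata

# Assumption: these are the packages worth comparing; names are copied from
# the pip list in this issue, not from any official compatibility table.
PACKAGES = [
    "paddlepaddle-gpu",
    "paddlespeech",
    "paddleaudio",
    "paddlenlp",
    "numpy",
    "librosa",
]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(PACKAGES).items():
        print(f"{name}: {version or 'not installed'}")
```

Run it inside the same conda environment that runs `paddlespeech`, so the versions reported are the ones the CLI actually loads.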

18721688783 commented Mar 22, 2024 via email

hbjhyhb commented Jun 25, 2024


