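Installation following the documentation completes without any errors, but running the speech recognition command fails; reinstalling several times gives the same result #3692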
Comments
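Same here. I tried various versions; installation succeeds every time, but I end up with this same error: Broadcast dimension mismatch.
Hello developer, thank you for your interest in the PaddleSpeech open-source project, and sorry for the poor development experience. Maintainer capacity for this open-source project is currently limited; we suggest referring to #3246.
You can refer to #3697 and see whether it helps. At the bottom I posted a blog link with a detailed installation walkthrough and fixes for some of the errors. Your problem is mainly a version issue.
You can refer to #3697 and see whether it helps. At the bottom I posted a blog link with a detailed installation walkthrough and fixes for some of the errors.
Hello, the problem is described in the screenshot above. Thank you.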
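The package version management in this project is truly a mess. The dependency-matching problem could have been solved in one go in requirement.txt, yet the package version numbers are simply not written there, which needlessly makes things hard for everyone. I give up.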
General Question
pip list:
`Package Version
absl-py 2.1.0
aiohttp 3.9.3
aiosignal 1.3.1
annotated-types 0.6.0
antlr4-python3-runtime 4.9.3
anyio 4.3.0
astor 0.8.1
asttokens 2.4.1
async-timeout 4.0.3
attrs 23.2.0
audioread 3.0.1
Babel 2.14.0
bce-python-sdk 0.9.4
blinker 1.7.0
bokeh 3.3.4
boltons 23.1.1
Bottleneck 1.3.8
braceexpand 0.1.7
certifi 2024.2.2
cffi 1.16.0
charset-normalizer 3.3.2
click 8.1.7
colorama 0.4.6
coloredlogs 15.0.1
colorlog 6.8.2
contourpy 1.2.0
cycler 0.12.1
Cython 3.0.8
datasets 2.18.0
decorator 5.1.1
dill 0.3.4
Distance 0.1.3
editdistance 0.8.1
einops 0.7.0
exceptiongroup 1.2.0
executing 2.0.1
fastapi 0.110.0
filelock 3.13.1
Flask 3.0.2
flask-babel 4.0.0
flatbuffers 23.5.26
fonttools 4.49.0
frozenlist 1.4.1
fsspec 2024.2.0
ftfy 6.1.3
future 1.0.0
g2p-en 2.1.0
g2pM 0.1.2.5
h11 0.14.0
h5py 3.10.0
httpcore 1.0.4
httpx 0.27.0
huggingface-hub 0.21.3
humanfriendly 10.0
HyperPyYAML 1.2.2
idna 3.6
inflect 7.0.0
intervaltree 3.1.0
ipython 8.22.1
itsdangerous 2.1.2
jedi 0.19.1
jieba 0.42.1
Jinja2 3.1.3
joblib 1.3.2
jsonlines 4.0.0
kaldiio 2.18.0
kiwisolver 1.4.5
librosa 0.8.1
llvmlite 0.42.0
loguru 0.7.2
lxml 5.1.0
markdown-it-py 3.0.0
MarkupSafe 2.1.5
matplotlib 3.8.3
matplotlib-inline 0.1.6
mdurl 0.1.2
mido 1.3.2
mock 5.1.0
mpmath 1.3.0
multidict 6.0.5
multiprocess 0.70.12.2
nara-wpe 0.0.9
nltk 3.8.1
note-seq 0.0.3
numba 0.59.0
numpy 1.23.5
omegaconf 2.3.0
onnx 1.15.0
onnxruntime 1.17.1
OpenCC 1.1.7
opencc-python-reimplemented 0.1.7
opencv-python 4.6.0.66
opt-einsum 3.3.0
packaging 23.2
paddle2onnx 1.0.6
paddleaudio 1.1.0
paddlefsl 1.1.0
paddlenlp 2.6.1
paddlepaddle-gpu 2.6.0
paddlesde 0.2.5
paddleslim 2.6.0
paddlespeech 0.0.0
paddlespeech-feat 0.1.0
pandas 2.2.1
parameterized 0.9.0
parso 0.8.3
pathos 0.2.8
pattern-singleton 1.2.0
pillow 10.2.0
pip 23.3.1
platformdirs 4.2.0
pooch 1.8.1
portalocker 2.8.2
pox 0.3.4
ppdiffusers 0.19.4
ppft 1.7.6.8
praatio 5.1.1
pretty-midi 0.2.10
prettytable 3.10.0
prompt-toolkit 3.0.43
protobuf 3.20.2
psutil 5.9.8
pure-eval 0.2.2
pyarrow 15.0.0
pyarrow-hotfix 0.6
pybind11 2.11.1
pycparser 2.21
pycryptodome 3.20.0
pydantic 2.6.3
pydantic_core 2.16.3
pydub 0.25.1
Pygments 2.17.2
pygtrie 2.5.0
pyparsing 3.1.1
pypinyin 0.44.0
pypinyin-dict 0.7.0
pyreadline3 3.4.1
pytest-runner 6.0.1
python-dateutil 2.9.0.post0
pytz 2024.1
pywin32 306
pyworld 0.3.4
PyYAML 6.0.1
pyzmq 25.1.2
rarfile 4.1
regex 2023.12.25
requests 2.31.0
requests-mock 1.11.0
resampy 0.4.2
rich 13.7.1
ruamel.yaml 0.18.6
ruamel.yaml.clib 0.2.8
sacrebleu 2.4.0
safetensors 0.4.2
scikit-learn 1.4.1.post1
scipy 1.12.0
sentencepiece 0.2.0
seqeval 1.2.2
setuptools 68.2.2
six 1.16.0
sniffio 1.3.1
sortedcontainers 2.4.0
soundfile 0.12.1
stack-data 0.6.3
starlette 0.36.3
swig 4.2.1
sympy 1.12
tabulate 0.9.0
TextGrid 1.6.1
threadpoolctl 3.3.0
timer 0.2.2
ToJyutping 0.2.1
tornado 6.4
tqdm 4.66.2
traitlets 5.14.1
trampoline 0.1.2
typeguard 2.13.3
typer 0.9.0
typing_extensions 4.10.0
tzdata 2024.1
urllib3 1.26.18
uvicorn 0.27.1
visualdl 2.5.3
wcwidth 0.2.13
webrtcvad 2.0.10
websockets 12.0
Werkzeug 3.0.1
wheel 0.41.2
win32-setctime 1.1.0
xxhash 3.4.1
xyzservices 2023.10.1
yacs 0.1.8
yarl 1.9.4
zhon 2.0.2`
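Since the replies point to a version mismatch, it can help to report only the packages most relevant to the failure instead of the full list. The following is a minimal sketch using the standard-library `importlib.metadata`; the helper name `installed_versions` and the choice of packages in `PACKAGES` are illustrative assumptions, not something prescribed by PaddleSpeech.

```python
from importlib import metadata

# Hypothetical selection of packages worth reporting for this issue;
# the names are taken from the pip list above.
PACKAGES = ["paddlepaddle-gpu", "paddlespeech", "paddleaudio", "numpy", "librosa"]


def installed_versions(names):
    """Return {distribution name: installed version, or None if not installed}."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Print one line per package so the result can be pasted into an issue.
for name, version in installed_versions(PACKAGES).items():
    print(f"{name}: {version if version is not None else 'not installed'}")
```

Run it inside the same conda environment as the failing command, so that the reported versions are the ones actually in use.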
Run in PowerShell: paddlespeech asr --lang zh --input zh.wav
`(paddle_test) PS E:\AI_WorkSpace> paddlespeech asr --lang zh --input zh.wav
C:\Users\an.conda\envs\paddle_test\lib\site-packages\paddleaudio_extension.py:141: UserWarning: paddleaudio C++ extension is not available.
warnings.warn("paddleaudio C++ extension is not available.")
C:\Users\an.conda\envs\paddle_test\lib\site-packages_distutils_hack_init_.py:33: UserWarning: Setuptools is replacing distutils.
warnings.warn("Setuptools is replacing distutils.")
2024-03-03 10:39:49.952 | INFO | paddlespeech.s2t.modules.ctc::45 - paddlespeech_ctcdecoders not installed!
W0303 10:39:49.955922 31980 gpu_resources.cc:119] Please NOTE: device: 0, GPU Compute Capability: 8.6, Driver API Version: 12.3, Runtime API Version: 11.8
W0303 10:39:49.970871 31980 gpu_resources.cc:164] device: 0, cuDNN Version: 8.9.
2024-03-03 10:39:50.355 | INFO | paddlespeech.s2t.modules.embedding:init:153 - max len: 5000
[2024-03-03 10:39:52,263] [ ERROR] - (InvalidArgument) Broadcast dimension mismatch. Operands could not be broadcast together with the shape of X = [1, 1, 0, 498] and the shape of Y = [1, 123, 123]. Received [498] in X is not equal to [123] in Y at i:3.
[Hint: Expected x_dims_array[i] == y_dims_array[i] || x_dims_array[i] <= 1 || y_dims_array[i] <= 1 == true, but received x_dims_array[i] == y_dims_array[i] || x_dims_array[i] <= 1 || y_dims_array[i] <= 1:0 != true:1.] (at ..\paddle/phi/kernels/funcs/common_shape.h:86)
Traceback (most recent call last):
File "C:\Users\an.conda\envs\paddle_test\lib\site-packages\paddlespeech\cli\asr\infer.py", line 314, in infer
result_transcripts = self.model.decode(
File "C:\Users\an.conda\envs\paddle_test\lib\site-packages\decorator.py", line 232, in fun
return caller(func, *(extras + args), **kw)
File "C:\Users\an.conda\envs\paddle_test\lib\site-packages\paddle\base\dygraph\base.py", line 352, in _decorate_function
return func(*args, **kwargs)
File "C:\Users\an.conda\envs\paddle_test\lib\site-packages\paddlespeech\s2t\models\u2\u2.py", line 818, in decode
hyp = self.attention_rescoring(
File "C:\Users\an.conda\envs\paddle_test\lib\site-packages\paddlespeech\s2t\models\u2\u2.py", line 543, in attention_rescoring
hyps, encoder_out = self._ctc_prefix_beam_search(
File "C:\Users\an.conda\envs\paddle_test\lib\site-packages\paddlespeech\s2t\models\u2\u2.py", line 424, in _ctc_prefix_beam_search
encoder_out, encoder_mask = self._forward_encoder(
File "C:\Users\an.conda\envs\paddle_test\lib\site-packages\paddlespeech\s2t\models\u2\u2.py", line 229, in _forward_encoder
encoder_out, encoder_mask = self.encoder(
File "C:\Users\an.conda\envs\paddle_test\lib\site-packages\paddle\nn\layer\layers.py", line 1429, in call
return self.forward(*inputs, **kwargs)
File "C:\Users\an.conda\envs\paddle_test\lib\site-packages\paddlespeech\s2t\modules\encoder.py", line 184, in forward
chunk_masks = add_optional_chunk_mask(
File "C:\Users\an.conda\envs\paddle_test\lib\site-packages\paddlespeech\s2t\modules\mask.py", line 202, in add_optional_chunk_mask
chunk_masks = masks.logical_and(chunk_masks) # (B, L, L)
File "C:\Users\an.conda\envs\paddle_test\lib\site-packages\paddle\tensor\logic.py", line 143, in logical_and
return _C_ops.logical_and(x, y)
ValueError: (InvalidArgument) Broadcast dimension mismatch. Operands could not be broadcast together with the shape of X = [1, 1, 0, 498] and the shape of Y = [1, 123, 123]. Received [498] in X is not equal to [123] in Y at i:3.
[Hint: Expected x_dims_array[i] == y_dims_array[i] || x_dims_array[i] <= 1 || y_dims_array[i] <= 1 == true, but received x_dims_array[i] == y_dims_array[i] || x_dims_array[i] <= 1 || y_dims_array[i] <= 1:0 != true:1.] (at ..\paddle/phi/kernels/funcs/common_shape.h:86)
KeyError: 'result'`
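To make the error message easier to read, here is a small sketch of the per-axis condition quoted in the hint (`x_dims_array[i] == y_dims_array[i] || x_dims_array[i] <= 1 || y_dims_array[i] <= 1`), applied to the two shapes from the log. The function name `first_broadcast_mismatch` is hypothetical, and left-padding the shorter shape with 1s is an assumption about how the shapes are aligned; it is plain Python and does not call Paddle.

```python
def first_broadcast_mismatch(x_shape, y_shape):
    """Return the first axis index where the shapes fail the check, else None.

    Assumption: the shorter shape is left-padded with 1s so that both
    shapes are compared from the trailing axis, as in usual broadcasting.
    """
    rank = max(len(x_shape), len(y_shape))
    x = [1] * (rank - len(x_shape)) + list(x_shape)
    y = [1] * (rank - len(y_shape)) + list(y_shape)
    for i, (xd, yd) in enumerate(zip(x, y)):
        # Same condition as the hint text in the traceback.
        if not (xd == yd or xd <= 1 or yd <= 1):
            return i
    return None


# Shapes copied from the error message above.
x_shape = [1, 1, 0, 498]
y_shape = [1, 123, 123]
print(first_broadcast_mismatch(x_shape, y_shape))  # 3
```

Under these assumptions the first failing axis is 3 (498 versus 123), which agrees with the `at i:3` in the log. Axis 2 of X has size 0, which passes the `<= 1` check and so is not the axis that gets reported.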