Garbled output when fine-tuning Whisper v3 #13
Comments
I have the same issue. You need to start thinking about what kind of data (or augmentation) would better represent your real data. |
Have you checked that the training data (text) is encoded in UTF-8? |
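One way to run the UTF-8 check suggested above is a strict decode of the transcript file, line by line. This is only a minimal sketch: the function name `find_bad_utf8_lines` is hypothetical, and it assumes the transcripts live in a plain text file (for example a JSON-lines manifest) rather than in some other container. It also flags lines that already contain the U+FFFD replacement character, which usually means the text was damaged by an earlier lossy conversion.

```python
def find_bad_utf8_lines(path):
    """Return (line_number, reason) pairs for lines that are not clean UTF-8.

    Hypothetical helper: reads the file as raw bytes and decodes each line
    strictly, so invalid byte sequences are reported instead of being
    silently replaced.
    """
    problems = []
    with open(path, "rb") as f:
        for lineno, raw in enumerate(f, start=1):
            try:
                text = raw.decode("utf-8")  # strict decoding is the default
            except UnicodeDecodeError as exc:
                problems.append((lineno, f"invalid UTF-8: {exc.reason}"))
                continue
            # U+FFFD in the source text means it was already damaged upstream.
            if "\ufffd" in text:
                problems.append((lineno, "contains U+FFFD replacement character"))
    return problems
```

An empty result means every line decoded cleanly, in which case the encoding of the transcripts is unlikely to be the cause of the garbled output.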
I have checked and the data is fine; it is UTF-8 encoded. I rented an A800 GPU with 80 GB of memory and tried to fine-tune all parameters using the script 'finetune_all.py', but the output still contained garbled characters. May I ask how you prepare the data on your end? Could you share your training parameters and environment information?
|
Use this project, which is actively maintained: https://github.com/yeyupiaoling/Whisper-Finetune I have successfully trained on the wenet-speech dataset with it without getting garbled output. |
Thank you for your guidance. I fine-tuned on the wenet-speech dataset with a learning rate of 1e-5 and 200 hours of data; the audio clips are between 20 and 30 seconds long. The output is still garbled. Is it a problem with my environment? Can you give me some training advice? Which fine-tuning script from the https://github.com/yeyupiaoling/Whisper-Finetune project did you use? |
Try writing the audio you feed to the model out to a file and listening to it, so you know what kind of data the model is actually getting.
Another thing is timestamps: they will easily break the model if you don't handle them properly. |
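The "write it out and listen" check can be sketched with the standard library alone. This is an illustration, not the code from either fine-tuning project: `dump_audio_to_wav` is a hypothetical helper, and it assumes you can get at the mono waveform as float samples in [-1.0, 1.0] just before feature extraction, at Whisper's expected sample rate of 16 kHz.

```python
import sys
import wave
from array import array


def dump_audio_to_wav(samples, path, sample_rate=16000):
    """Write mono float samples in [-1.0, 1.0] to a 16-bit PCM WAV file.

    Hypothetical helper for inspecting training inputs by ear.
    Returns the clip duration in seconds.
    """
    # Clip to the valid range, then scale to signed 16-bit integers.
    pcm = array("h", (int(max(-1.0, min(1.0, s)) * 32767) for s in samples))
    if sys.byteorder == "big":
        pcm.byteswap()  # WAV PCM data is little-endian
    with wave.open(path, "wb") as wf:
        wf.setnchannels(1)   # mono
        wf.setsampwidth(2)   # 2 bytes = 16-bit samples
        wf.setframerate(sample_rate)
        wf.writeframes(pcm.tobytes())
    return len(pcm) / sample_rate
```

Dumping a handful of random training examples this way and playing them back quickly shows whether the clips are silent, truncated, resampled incorrectly, or paired with the wrong transcript. The returned duration is also a cheap check that nothing exceeds Whisper's 30-second input window.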
On my side I am using two 4090 GPUs, and the experimental data is the aishell1 dataset with punctuation.
The environment is as follows:
Package Version Editable project location
accelerate 0.28.0
aiohappyeyeballs 2.4.4
aiohttp 3.11.10
aiosignal 1.3.1
annotated-types 0.7.0
anyio 4.7.0
async-timeout 5.0.1
attrs 24.2.0
audioread 3.0.1
av 14.0.1
bitsandbytes 0.41.3
Brotli 1.0.9
certifi 2024.8.30
cffi 1.17.1
charset-normalizer 3.3.2
click 8.1.7
coloredlogs 15.0.1
ctranslate2 4.5.0
dataclasses 0.6
datasets 3.2.0
decorator 5.1.1
dill 0.3.8
evaluate 0.4.3
exceptiongroup 1.2.2
fastapi 0.115.6
faster-whisper 1.1.0
filelock 3.13.1
flatbuffers 24.3.25
frozenlist 1.5.0
fsspec 2024.9.0
gmpy2 2.1.2
h11 0.14.0
huggingface-hub 0.26.5
humanfriendly 10.0
idna 3.10
Jinja2 3.1.4
jiwer 3.0.5
joblib 1.4.2
lazy_loader 0.4
librosa 0.10.2.post1
llvmlite 0.43.0
MarkupSafe 3.0.2
mkl_fft 1.3.11
mkl_random 1.2.8
mkl-service 2.4.0
mpmath 1.3.0
msgpack 1.1.0
multidict 6.1.0
multiprocess 0.70.16
networkx 3.2.1
numba 0.60.0
numpy 2.0.1
nvidia-cublas-cu12 12.4.5.8
nvidia-cuda-cupti-cu12 12.4.127
nvidia-cuda-nvrtc-cu12 12.4.127
nvidia-cuda-runtime-cu12 12.4.127
nvidia-cudnn-cu12 9.1.0.70
nvidia-cufft-cu12 11.2.1.3
nvidia-curand-cu12 10.3.5.147
nvidia-cusolver-cu12 11.6.1.9
nvidia-cusparse-cu12 12.3.1.170
nvidia-nccl-cu12 2.21.5
nvidia-nvjitlink-cu12 12.4.127
nvidia-nvtx-cu12 12.4.127
onnxruntime 1.16.3
packaging 24.2
pandas 2.2.3
peft 0.7.0 # my own local path; installed from source
pillow 11.0.0
pip 24.2
platformdirs 4.3.6
pooch 1.8.2
propcache 0.2.1
protobuf 5.29.1
psutil 6.1.0
pyarrow 18.1.0
pycparser 2.22
pydantic 2.10.3
pydantic_core 2.27.1
pydub 0.25.1
PySocks 1.7.1
python-dateutil 2.9.0.post0
pytz 2024.2
PyYAML 6.0.2
RapidFuzz 3.10.1
regex 2024.11.6
requests 2.32.3
safetensors 0.4.5
scikit-learn 1.5.2
scipy 1.13.1
setuptools 75.1.0
six 1.17.0
sniffio 1.3.1
SoundCard 0.4.3
soundfile 0.12.1
soxr 0.5.0.post1
starlette 0.41.3
sympy 1.13.1
tensorboardX 2.6.2.2
threadpoolctl 3.5.0
tokenizers 0.21.0
torch 2.5.1
torchaudio 2.5.1
torchvision 0.20.1
tqdm 4.67.1
transformers 4.47.0
triton 3.1.0
typing_extensions 4.12.2
tzdata 2024.2
urllib3 2.2.3
uvicorn 0.32.1
wheel 0.44.0
xxhash 3.5.0
yarl 1.18.3
zhconv 1.4.3
The experimental results are as follows:
Could you offer some suggestions? Thank you for your help.