
Hello, when running fine-tuning/run_classifier.py for fine-tuning I get run_classifier.py: error: unrecognized arguments: --vocab_path models/encryptd_vocab.txt. I checked run_classifier.py and could not find a definition of the vocab_path argument. How can this be fixed? Thanks. #100

Open
fjlinww opened this issue Dec 22, 2024 · 12 comments
Labels: bug (Something isn't working), help wanted (Extra attention is needed)

Comments

fjlinww commented Dec 22, 2024

python3 fine-tuning/run_classifier.py --pretrained_model_path models/pre-trained_model.bin
--vocab_path models/encryptd_vocab.txt
--train_path datasets/fine-tuning_dataset/cstnet-tls1.3/packet/train_dataset.tsv
--dev_path datasets/fine-tuning_dataset/cstnet-tls1.3/packet/valid_dataset.tsv
--test_path datasets/fine-tuning_dataset/cstnet-tls1.3/packet/test_dataset.tsv
--epochs_num 10 --batch_size 32 --embedding word pos seg
--encoder transformer --mask fully_visible
--seq_length 128 --learning_rate 2e-5
usage: run_classifier.py [-h] [--pretrained_model_path PRETRAINED_MODEL_PATH] [--output_model_path OUTPUT_MODEL_PATH] --train_path TRAIN_PATH --dev_path DEV_PATH [--test_path TEST_PATH]
[--config_path CONFIG_PATH] [--embedding {word,pos,seg,sinusoidalpos,dual} [{word,pos,seg,sinusoidalpos,dual} ...]]
[--tgt_embedding {word,pos,seg,sinusoidalpos,dual} [{word,pos,seg,sinusoidalpos,dual} ...]] [--max_seq_length MAX_SEQ_LENGTH] [--relative_position_embedding] [--share_embedding]
[--remove_embedding_layernorm] [--factorized_embedding_parameterization] [--encoder {transformer,rnn,lstm,gru,birnn,bilstm,bigru,gatedcnn,dual}] [--decoder {None,transformer}]
[--mask {fully_visible,causal,causal_with_prefix}] [--layernorm_positioning {pre,post}] [--feed_forward {dense,gated}] [--relative_attention_buckets_num RELATIVE_ATTENTION_BUCKETS_NUM]
[--remove_attention_scale] [--remove_transformer_bias] [--layernorm {normal,t5}] [--bidirectional] [--parameter_sharing] [--has_residual_attention] [--has_lmtarget_bias]
[--target {sp,lm,mlm,bilm,cls} [{sp,lm,mlm,bilm,cls} ...]] [--tie_weights] [--pooling {mean,max,first,last}] [--prefix_lm_loss] [--learning_rate LEARNING_RATE] [--warmup WARMUP]
[--lr_decay LR_DECAY] [--optimizer {adamw,adafactor}] [--scheduler {linear,cosine,cosine_with_restarts,polynomial,constant,constant_with_warmup,inverse_sqrt,tri_stage}]
[--batch_size BATCH_SIZE] [--seq_length SEQ_LENGTH] [--dropout DROPOUT] [--epochs_num EPOCHS_NUM] [--report_steps REPORT_STEPS] [--seed SEED] [--log_path LOG_PATH]
[--log_level {ERROR,INFO,DEBUG,NOTSET}] [--log_file_level {ERROR,INFO,DEBUG,NOTSET}] [--pooling-type {mean,max,first,last}] [--tokenizer {bert,char,space}] [--soft_targets]
[--soft_alpha SOFT_ALPHA]
run_classifier.py: error: unrecognized arguments: --vocab_path models/encryptd_vocab.txt

fjlinww changed the title from "run_classifier.py: error: unrecognized arguments: --vocab_path models/encryptd_vocab.txt" to the current title Dec 22, 2024
linwhitehat added the help wanted label Dec 23, 2024
linwhitehat (Owner) commented

Hello, based on the error you reported, this looks like an argument-parsing problem. You could try the following command:

python3 fine-tuning/run_classifier.py --pretrained_model_path models/pre-trained_model.bin \
                                   --vocab_path models/encryptd_vocab.txt \
                                   --train_path datasets/cstnet-tls1.3/packet/train_dataset.tsv \
                                   --dev_path datasets/cstnet-tls1.3/packet/valid_dataset.tsv \
                                   --test_path datasets/cstnet-tls1.3/packet/test_dataset.tsv \
                                   --epochs_num 10 --batch_size 32 --embedding word_pos_seg \
                                   --encoder transformer --mask fully_visible \
                                   --seq_length 128 --learning_rate 2e-5

The relevant command-line arguments are documented in the usage notes: using-et-bert


fjlinww commented Dec 23, 2024

Thanks for the reply! Here is how the error came about:

  1. Running the fine-tuning command from the repository's README first raises
    File "/home/fjlinww/ET-BERT/fine-tuning/run_classifier.py", line 10, in <module>
    from uer.layers import *
    ModuleNotFoundError: No module named 'uer'
    I worked around this with export PYTHONPATH=$PYTHONPATH:/home/fjlinww/ET-BERT/uer

  2. Continuing, it then reports run_classifier.py: error: argument --embedding: invalid choice: 'word_pos_seg' (choose from 'word', 'pos', 'seg', 'sinusoidalpos', 'dual')
    This argument is registered by finetune_opts(parser) in the main function; finetune_opts in uer/opts.py defines
    parser.add_argument("--embedding", choices=["word", "pos", "seg", "sinusoidalpos", "dual"], default="word", nargs='+',
    so I changed the option to --embedding word pos seg

  3. Continuing again, it reports run_classifier.py: error: unrecognized arguments: --vocab_path models/encryptd_vocab.txt
    This argument should likewise be registered in the main function; tokenizer_opts in uer/opts.py defines
    parser.add_argument("--vocab_path", default=None, type=str, help="Path of the vocabulary file.")
    but the current run_classifier.py has no call like tokenizer_opts(parser), so I am not sure whether one needs to be added for the argument to be parsed (see the sketch below).
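A rough sketch of that idea, assuming the current uer/opts.py still exposes tokenizer_opts as quoted above (names and placement are illustrative, not the repository's actual code):

# Hypothetical change in fine-tuning/run_classifier.py's main():
import argparse
from uer.opts import finetune_opts, tokenizer_opts

parser = argparse.ArgumentParser()
finetune_opts(parser)    # registers --pretrained_model_path, --train_path, --embedding, ...
tokenizer_opts(parser)   # would also register --vocab_path and --tokenizer
args = parser.parse_args()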

linwhitehat (Owner) commented

Sorry about this error; we now roughly understand the situation. It is probably related to the recent update of the uer files: some of the argument handling was not kept in sync. If you need a fix urgently, you could try rolling the files under uer back to an earlier version.
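If you do try the rollback, a rough git sketch (the commit hash is a placeholder; pick one from the history of the uer directory):

# List earlier commits that touched uer/ and restore the files from one of them.
git log --oneline -- uer/
git checkout <older-commit-hash> -- uer/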


fjlinww commented Dec 24, 2024

I cloned the latest code, and at the moment I don't see any branch other than main.

linwhitehat (Owner) commented

Since we currently do not have spare resources to validate the uer code, we have already rolled uer back to the older version; you can replace the corresponding files in your copy with the updated repository contents. When we later move to the new version of uer, we will test it and update the remaining related files and code.

linwhitehat added the bug label Dec 24, 2024

fjlinww commented Dec 24, 2024

Thanks! I re-cloned and the previous errors are gone.
In https://github.com/linwhitehat/ET-BERT?tab=readme-ov-file#using-et-bert, --pretrained_model_path models/pre-trained_model.bin should be changed to --pretrained_model_path models/pretrained_model.bin to match the command you provided.

linwhitehat (Owner) commented

OK, it has been corrected.


fjlinww commented Dec 25, 2024

Hello, I ran into a new problem while fine-tuning:
~/ET-BERT$ python3 fine-tuning/run_classifier.py --pretrained_model_path models/pretrained_model.bin
--vocab_path models/encryptd_vocab.txt
--train_path datasets/fine-tuning_dataset/cstnet-tls1.3/packet/train_dataset.tsv
--dev_path datasets/fine-tuning_dataset/cstnet-tls1.3/packet/valid_dataset.tsv
--test_path datasets/fine-tuning_dataset/cstnet-tls1.3/packet/test_dataset.tsv
--epochs_num 10 --batch_size 32 --embedding word_pos_seg
--encoder transformer --mask fully_visible
--seq_length 128 --learning_rate 2e-5
/home/fjlinww/ET-BERT/fine-tuning/run_classifier.py:90: FutureWarning: You are using torch.load with weights_only=False (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for weights_only will be flipped to True. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via torch.serialization.add_safe_globals. We recommend you start setting weights_only=True for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
model.load_state_dict(torch.load(args.pretrained_model_path, map_location=map_location), strict=False)
Batch size: 32
The number of training instances: 465367
2 GPUs are available. Let's use them.
Start training.
0%| | 0/10 [00:00<?, ?it/s]

The progress stays at 0. My environment is 2x A6000 GPUs, and storage is sufficient.
(screenshot)

Since I only have 2 GPUs, I modified fine-tuning/run_classifier.py accordingly:

(screenshot)

But the progress stays at 0; what might be the cause? The fine-tuning dataset is https://drive.google.com/drive/folders/1KlZatGoNm-4qu04z0LfrTpZr2oDaHfzr

weiyuhao2021 commented

You could try adding
os.environ["CUDA_VISIBLE_DEVICES"] = "0"
in the script, or exporting the variable in the shell.
Sometimes a hang like this is caused by a bug that simply does not get reported when running on multiple GPUs in parallel.
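A minimal placement sketch of that suggestion (the device id "0" is only an example; the variable needs to be set before torch initializes CUDA):

import os
os.environ["CUDA_VISIBLE_DEVICES"] = "0"  # make only one GPU visible to the script
import torch  # import torch after the variable is set so CUDA sees a single device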


LeiPudd commented Jan 10, 2025

Hello, I cloned the latest code, but I still get the error ModuleNotFoundError: No module named 'uer'. How should I adjust things?

weiyuhao2021 commented

You probably need to add the uer project used by this repository to the Python interpreter's path.
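For reference, an earlier comment in this thread did this with export PYTHONPATH=$PYTHONPATH:/home/fjlinww/ET-BERT/uer (adjust the path to your own clone). A rough Python-side alternative, assuming the uer package sits in the root of the ET-BERT checkout, is to extend sys.path at the top of fine-tuning/run_classifier.py before the uer imports:

import os
import sys
# Add the ET-BERT checkout (the directory containing the 'uer' package) to the import path.
sys.path.insert(0, os.path.abspath(os.path.join(os.path.dirname(__file__), "..")))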

Chen9crane commented

Hello, may I ask which version you re-cloned? After cloning the current latest version, I still run into uer-related errors when fine-tuning.
