Here is my log:
(eole) PS E:\AI\NLP\EOLE> eole train -config en_zh.yaml
[2025-01-04 12:38:31,154 INFO] Missing transforms field for corpus_1 data, set to default: [].
[2025-01-04 12:38:31,154 INFO] Missing transforms field for valid data, set to default: [].
[2025-01-04 12:38:31,154 INFO] Parsed 2 corpora from -data.
[2025-01-04 12:38:31,156 INFO] Get special vocabs from Transforms: {'src': [], 'tgt': []}.
[2025-01-04 12:38:31,762 INFO] Transforms applied: []
[2025-01-04 12:38:31,770 INFO] The first 10 tokens of the vocabs are:['<unk>', '<blank>', '<s>', '</s>', 'the\t308938\r', 'to\t163517\r', 'of\t163299\r', 'and\t146616\r', 'in\t106330\r', 'a\t102887\r']
[2025-01-04 12:38:31,770 INFO] The decoder start token is: <s>
[2025-01-04 12:38:31,770 INFO] bos_token token is: <s> id: [2]
[2025-01-04 12:38:31,772 INFO] eos_token token is: </s> id: [3]
[2025-01-04 12:38:31,772 INFO] pad_token token is: <blank> id: [1]
[2025-01-04 12:38:31,773 INFO] unk_token token is: <unk> id: [0]
[2025-01-04 12:38:31,774 INFO] Building model...
[2025-01-04 12:38:32,084 INFO] EncoderDecoderModel(
  (encoder): TransformerEncoder(
    (transformer_layers): ModuleList(
      (0-1): 2 x TransformerEncoderLayer(
        (input_layernorm): LayerNorm((512,), eps=1e-06, elementwise_affine=True)
        (self_attn): SelfMHA(
          (linear_keys): Linear(in_features=512, out_features=512, bias=False)
          (linear_values): Linear(in_features=512, out_features=512, bias=False)
          (linear_query): Linear(in_features=512, out_features=512, bias=False)
          (softmax): Softmax(dim=-1)
          (dropout): Dropout(p=0.1, inplace=False)
          (final_linear): Linear(in_features=512, out_features=512, bias=False)
        )
        (dropout): Dropout(p=0.3, inplace=False)
        (post_attention_layernorm): LayerNorm((512,), eps=1e-06, elementwise_affine=True)
        (mlp): MLP(
          (gate_up_proj): Linear(in_features=512, out_features=2048, bias=False)
          (down_proj): Linear(in_features=2048, out_features=512, bias=False)
          (dropout_1): Dropout(p=0.3, inplace=False)
          (dropout_2): Dropout(p=0.3, inplace=False)
        )
      )
    )
    (layer_norm): LayerNorm((512,), eps=1e-06, elementwise_affine=True)
  )
  (decoder): TransformerDecoder(
    (transformer_layers): ModuleList(
      (0-1): 2 x TransformerDecoderLayer(
        (input_layernorm): LayerNorm((512,), eps=1e-06, elementwise_affine=True)
        (self_attn): SelfMHA(
          (linear_keys): Linear(in_features=512, out_features=512, bias=False)
          (linear_values): Linear(in_features=512, out_features=512, bias=False)
          (linear_query): Linear(in_features=512, out_features=512, bias=False)
          (softmax): Softmax(dim=-1)
          (dropout): Dropout(p=0.1, inplace=False)
          (final_linear): Linear(in_features=512, out_features=512, bias=False)
        )
        (dropout): Dropout(p=0.3, inplace=False)
        (post_attention_layernorm): LayerNorm((512,), eps=1e-06, elementwise_affine=True)
        (mlp): MLP(
          (gate_up_proj): Linear(in_features=512, out_features=2048, bias=False)
          (down_proj): Linear(in_features=2048, out_features=512, bias=False)
          (dropout_1): Dropout(p=0.3, inplace=False)
          (dropout_2): Dropout(p=0.3, inplace=False)
        )
        (precontext_layernorm): LayerNorm((512,), eps=1e-06, elementwise_affine=True)
        (context_attn): ContextMHA(
          (linear_keys): Linear(in_features=512, out_features=512, bias=False)
          (linear_values): Linear(in_features=512, out_features=512, bias=False)
          (linear_query): Linear(in_features=512, out_features=512, bias=False)
          (softmax): Softmax(dim=-1)
          (dropout): Dropout(p=0.1, inplace=False)
          (final_linear): Linear(in_features=512, out_features=512, bias=False)
        )
      )
    )
    (layer_norm): LayerNorm((512,), eps=1e-06, elementwise_affine=True)
  )
  (src_emb): Embeddings(
    (embeddings): Embedding(32760, 512, padding_idx=1)
    (dropout): Dropout(p=0.3, inplace=False)
    (pe): PositionalEncoding()
  )
  (tgt_emb): Embeddings(
    (embeddings): Embedding(32768, 512, padding_idx=1)
    (dropout): Dropout(p=0.3, inplace=False)
    (pe): PositionalEncoding()
  )
  (generator): Linear(in_features=512, out_features=32768, bias=True)
)
[2025-01-04 12:38:32,085 INFO] embeddings: 33550336
[2025-01-04 12:38:32,085 INFO] encoder: 6296576
[2025-01-04 12:38:32,085 INFO] decoder: 8395776
[2025-01-04 12:38:32,086 INFO] generator: 16809984
[2025-01-04 12:38:32,086 INFO] other: 0
[2025-01-04 12:38:32,086 INFO] * number of parameters: 65052672
[2025-01-04 12:38:32,086 INFO] Trainable parameters = {'torch.float32': 65052672}
[2025-01-04 12:38:32,087 INFO] Non trainable parameters = {}
[2025-01-04 12:38:32,087 INFO] * src vocab size = 32760
[2025-01-04 12:38:32,087 INFO] * tgt vocab size = 32768
[2025-01-04 12:38:32,089 INFO] Starting training on GPU: [0]
[2025-01-04 12:38:32,089 INFO] Start training loop and validate every 500 steps...
[2025-01-04 12:38:32,089 INFO] Scoring with: None
[2025-01-04 12:38:34,975 INFO] Weighted corpora loaded so far: * corpus_1: 1
[2025-01-04 12:38:37,847 INFO] Weighted corpora loaded so far: * corpus_1: 1
[2025-01-04 12:38:38,808 INFO] Step 50/ 1000; acc: 60.0; ppl: 4215.71; xent: 8.35; aux: 0.000; lr: 1.00e+00; sents: 3104; bsz: 1454/ 179/62; 10824/1330 tok/s; 7 sec;
[2025-01-04 12:38:39,481 INFO] Step 100/ 1000; acc: 67.4; ppl: 66.66; xent: 4.20; aux: 0.000; lr: 1.00e+00; sents: 3104; bsz: 1330/ 166/62; 98699/12342 tok/s; 7 sec;
[2025-01-04 12:38:40,144 INFO] Step 150/ 1000; acc: 62.3; ppl: 84.85; xent: 4.44; aux: 0.000; lr: 1.00e+00; sents: 3152; bsz: 1375/ 174/63; 103715/13117 tok/s; 8 sec;
[2025-01-04 12:38:40,828 INFO] Step 200/ 1000; acc: 66.9; ppl: 50.40; xent: 3.92; aux: 0.000; lr: 1.00e+00; sents: 3104; bsz: 1474/ 182/62; 107841/13329 tok/s; 9 sec;
[2025-01-04 12:38:41,492 INFO] Step 250/ 1000; acc: 66.9; ppl: 23.07; xent: 3.14; aux: 0.000; lr: 1.00e+00; sents: 3152; bsz: 1331/ 171/63; 100208/12857 tok/s; 9 sec;
[2025-01-04 12:38:42,161 INFO] Step 300/ 1000; acc: 67.3; ppl: 19.81; xent: 2.99; aux: 0.000; lr: 1.00e+00; sents: 3152; bsz: 1305/ 170/63; 97668/12754 tok/s; 10 sec;
[2025-01-04 12:38:42,821 INFO] Step 350/ 1000; acc: 65.0; ppl: 21.15; xent: 3.05; aux: 0.000; lr: 1.00e+00; sents: 3104; bsz: 1406/ 178/62; 106431/13447 tok/s; 11 sec;
[2025-01-04 12:38:43,473 INFO] Step 400/ 1000; acc: 64.1; ppl: 22.80; xent: 3.13; aux: 0.000; lr: 1.00e+00; sents: 3152; bsz: 1331/ 174/63; 102137/13355 tok/s; 11 sec;
[2025-01-04 12:38:44,144 INFO] Step 450/ 1000; acc: 63.1; ppl: 17.22; xent: 2.85; aux: 0.000; lr: 1.00e+00; sents: 3104; bsz: 1440/ 182/62; 107283/13569 tok/s; 12 sec;
[2025-01-04 12:38:44,832 INFO] Step 500/ 1000; acc: 64.6; ppl: 30.48; xent: 3.42; aux: 0.000; lr: 1.00e+00; sents: 3104; bsz: 1385/ 177/62; 100644/12860 tok/s; 13 sec;
[2025-01-04 12:39:00,069 INFO] valid stats calculation took: 15.235411167144775 s.
[2025-01-04 12:39:00,070 INFO] Train perplexity: 52.1136
[2025-01-04 12:39:00,070 INFO] Train accuracy: 64.7299
[2025-01-04 12:39:00,071 INFO] Sentences processed: 31232
[2025-01-04 12:39:00,071 INFO] Average bsz: 1383/ 175/62
[2025-01-04 12:39:00,071 INFO] Validation perplexity: 213.562
[2025-01-04 12:39:00,071 INFO] Validation accuracy: 71.4428
[2025-01-04 12:39:00,072 INFO] Saving optimizer and weights to step_500, and symlink to en-zh/run/model
[2025-01-04 12:39:00,281 INFO] Saving transforms artifacts, if any, to en-zh/run/model\step_500
[2025-01-04 12:39:00,282 INFO] Saving config and vocab to en-zh/run/model
Traceback (most recent call last):
  File "\\?\C:\ProgramData\anaconda3\envs\eole\Scripts\eole-script.py", line 33, in <module>
    sys.exit(load_entry_point('eole', 'console_scripts', 'eole')())
  File "e:\ai\nlp\eole\eole\eole\bin\main.py", line 39, in main
    bin_cls.run(args)
  File "e:\ai\nlp\eole\eole\eole\bin\run\train.py", line 70, in run
    train(config)
  File "e:\ai\nlp\eole\eole\eole\bin\run\train.py", line 57, in train
    train_process(config, device_id=0)
  File "e:\ai\nlp\eole\eole\eole\train_single.py", line 244, in main
    trainer.train(
  File "e:\ai\nlp\eole\eole\eole\trainer.py", line 375, in train
    self.model_saver.save(step, moving_average=self.moving_average)
  File "e:\ai\nlp\eole\eole\eole\models\model_saver.py", line 359, in save
    self._save(step)
  File "e:\ai\nlp\eole\eole\eole\models\model_saver.py", line 336, in _save
    self._save_vocab()
  File "e:\ai\nlp\eole\eole\eole\models\model_saver.py", line 280, in _save_vocab
    json.dump(vocab_data, f, indent=2, ensure_ascii=False)
  File "C:\ProgramData\anaconda3\envs\eole\lib\json\__init__.py", line 180, in dump
    fp.write(chunk)
UnicodeEncodeError: 'gbk' codec can't encode character '\u011f' in position 12: illegal multibyte sequence
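For what it's worth, the crash looks like a file-encoding problem rather than a training problem: `json.dump(vocab_data, f, indent=2, ensure_ascii=False)` writes non-ASCII vocab entries (here `'ğ'`, i.e. `\u011f`) into a file handle that was apparently opened with the Windows default code page, which is GBK on my system and cannot represent that character. Below is a minimal sketch reproducing the error outside of eole; the file name and vocab contents are made up for illustration, and this is not eole's actual `_save_vocab` code.

```python
import json

# Hypothetical vocab entry containing 'ğ' (U+011F), the character named in the traceback.
vocab_data = {"src": ["<unk>", "<blank>", "<s>", "</s>", "ğ"]}

# On a Windows install whose ANSI code page is GBK (cp936), open() without an explicit
# encoding uses that codec, so writing 'ğ' raises the same UnicodeEncodeError as the log.
try:
    with open("vocab.json", "w") as f:
        json.dump(vocab_data, f, indent=2, ensure_ascii=False)
except UnicodeEncodeError as err:
    print(err)  # 'gbk' codec can't encode character '\u011f' ...

# Passing the encoding explicitly writes the file correctly regardless of the locale.
with open("vocab.json", "w", encoding="utf-8") as f:
    json.dump(vocab_data, f, indent=2, ensure_ascii=False)
```

If that is indeed the cause, opening the vocab file with `encoding="utf-8"` in `model_saver.py` (or running with Python's UTF-8 mode, e.g. `PYTHONUTF8=1`) should avoid the crash; I may be missing a reason the default encoding is used, though.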