Adapting LM_scoring.py, PPL differs from tensorboard #2
Conversation
Two suspicions:
- I will first check whether the config used to compute the loss on tensorboard is the same as the one used by the script.
- Filtertoolong is applied for valid but not in the script; I forced it in the inference config, but the PPL did not go down.

When I comment out these 3 settings in the inference config file, nothing happens, the PPL is still at 22.34 (edit: perfectly normal, scoring isn't inferring).
Directly in LossCompute, I tried to overwrite the differing values; the PPL is still at 22.34.
Commits 1a90719 to 61f79da
The LM_scoring.py script did not return any output when used in a stand-alone way (which is normal), but there was also an error when launching it the standard way:
`python3 eole/eole/bin/main.py tools LM_scoring -config inference.yaml`
(`config.update(stuff_to_update)` adds `{'bin': 'tools', 'sub_bin': 'LM_scoring', 'config': 'inference.yaml'}`, which is not needed for inference.)
I removed the old argument-handling logic; every argument now lives in the yaml config file, and the only argument needed is the config.
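For clarity, here is a toy reproduction of the `config.update(stuff_to_update)` behaviour described above; the inference keys below are made up, only the routing dict is quoted from the actual call:

```python
# Toy reproduction of the issue: the CLI dispatcher merged its routing
# arguments into the loaded config before running the tool.
config = {"model_path": "wiki103_lm/", "src": "valid.txt"}  # hypothetical inference settings
stuff_to_update = {"bin": "tools", "sub_bin": "LM_scoring", "config": "inference.yaml"}
config.update(stuff_to_update)
print(config)  # now carries 'bin', 'sub_bin', 'config', none of which are inference options
```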
Moreover, I adapted the loading of the model, which is now handled via `BaseModel.load_test_model(config, 0)`, and I also retrieve the corresponding padding token index using `padding_idx = vocabs["tgt"].tokens_to_ids[pad_token]`.
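As a self-contained sketch of that lookup (the `_Vocab` stand-in and the `<blank>` pad string are my assumptions, not eole code; only the `tokens_to_ids` pattern is taken from the patch):

```python
# Minimal stand-in for the target vocab returned alongside the model.
class _Vocab:
    def __init__(self, tokens):
        self.tokens_to_ids = {tok: i for i, tok in enumerate(tokens)}

pad_token = "<blank>"  # assumed padding token string
vocabs = {"tgt": _Vocab([pad_token, "the", "cat", "sat"])}
padding_idx = vocabs["tgt"].tokens_to_ids[pad_token]  # -> 0, used to ignore padding in the loss
```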
Regarding the `CrossEntropyLoss` torch module initialization, it now requires a vocab, and `lambda_coverage` and `lambda_align` now live elsewhere in the config. I also make sure transforms are correctly set up and that `config.tgt` is duplicated from `config.src` if absent.
There are light adaptations to the loss computation, because it now returns `estims`. A heavier adaptation is that the expected tensor format changed: we now have 2 dimensions instead of 3.
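To make the shape change concrete, here is a hedged sketch; the sizes, variable names and `reduction="sum"` choice are illustrative guesses on my side, not the actual eole internals:

```python
import torch
import torch.nn as nn

# The criterion ignores padded positions and sums the NLL over real tokens.
vocab_size, padding_idx = 1000, 1
criterion = nn.CrossEntropyLoss(ignore_index=padding_idx, reduction="sum")

# Scores now come in 2 dimensions, (batch * seq_len, vocab),
# instead of the former 3-dimensional (batch, seq_len, vocab) layout.
scores = torch.randn(8 * 50, vocab_size)
target = torch.randint(0, vocab_size, (8 * 50,))
loss = criterion(scores, target)  # scalar, summed over non-padded tokens
```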
To print the loss, I now have to compute `cumul_loss / cumul_length.item()` instead of directly referring to `cumul_loss`.
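Illustratively (the numbers and tensors below are placeholders, not real measurements), the printed loss/PPL boil down to an average NLL per scored token:

```python
import math
import torch

# Placeholder accumulators standing in for the sums built over the test set.
cumul_loss = torch.tensor(3110.0)   # summed loss over all scored tokens
cumul_length = torch.tensor(1000)   # number of scored tokens

avg_loss = (cumul_loss / cumul_length.item()).item()
print(f"loss: {avg_loss:.4f}  ppl: {math.exp(avg_loss):.2f}")  # loss: 3.1100  ppl: 22.42
```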
For my tests, I'm using an LM built with the wiki-103 recipe (https://github.com/eole-nlp/eole/blob/main/recipes/wiki_103/wiki_103.yaml), where I only changed the training data, and I observe differences in perplexity:
- tensorboard: 20.9643 (3.0428 loss)
- lm_scoring: 22.34 (3.11 loss)
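As a quick sanity check that both rows follow PPL = exp(loss) and that the gap is real rather than a rounding artifact:

```python
import math

print(math.exp(3.0428))          # ~20.96, matches the tensorboard PPL
print(math.log(22.34))           # ~3.106, i.e. the lm_scoring loss before rounding to 3.11
print(math.log(22.34) - 3.0428)  # ~0.06 nats of extra loss to account for
```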