Adapting LM_scoring.py, PPL differs from tensorboard #2

Open
wants to merge 5 commits into base: main

Conversation


@anthdr anthdr commented Jan 16, 2025

The LM_scoring.py script did not return any output when run stand-alone (which is expected), but it also raised an error when launched the standard way:
python3 eole/eole/bin/main.py tools LM_scoring -config inference.yaml

Traceback (most recent call last):
  File "/nas-labs/MT/pytorchwork/Recipes/toy-antoine/eole/eole/bin/main.py", line 38, in <module>
    main()
  File "/nas-labs/MT/pytorchwork/Recipes/toy-antoine/eole/eole/bin/main.py", line 34, in main
    bin_cls.run(args)
  File "/nas-labs/MT/pytorchwork/Recipes/toy-antoine/eole/eole/bin/tools/LM_scoring.py", line 68, in run
    config = PredictConfig(**config)
  File "/usr/local/lib/python3.10/dist-packages/pydantic/main.py", line 214, in __init__
    validated_self = self.__pydantic_validator__.validate_python(data, self_instance=self)
pydantic_core._pydantic_core.ValidationError: 3 validation errors for PredictConfig
bin
  Extra inputs are not permitted [type=extra_forbidden, input_value='tools', input_type=str]
    For further information visit https://errors.pydantic.dev/2.10/v/extra_forbidden
sub_bin
  Extra inputs are not permitted [type=extra_forbidden, input_value='LM_scoring', input_type=str]
    For further information visit https://errors.pydantic.dev/2.10/v/extra_forbidden
config
  Extra inputs are not permitted [type=extra_forbidden, input_value='inference.yaml', input_type=str]
    For further information visit https://errors.pydantic.dev/2.10/v/extra_forbidden

(config.update(stuff_to_update) adds {'bin': 'tools', 'sub_bin': 'LM_scoring', 'config': 'inference.yaml'}, which is not needed for inference.)
I removed the old argument-handling logic: every argument now lives in the YAML config file, and the only argument needed is the config itself.
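As a rough sketch of the standalone loading path (the import path for PredictConfig is assumed here, and this is not the exact PR code):

import yaml
from eole.config.run import PredictConfig  # assumed import path

def load_predict_config(path):
    # read the inference YAML and drop the keys the bin dispatcher would add,
    # since PredictConfig forbids extra inputs
    with open(path) as f:
        config = yaml.safe_load(f)
    for cli_key in ("bin", "sub_bin", "config"):
        config.pop(cli_key, None)
    return PredictConfig(**config)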

Moreover, I adapted the model loading, which is now handled via BaseModel.load_test_model(config, 0), and I also retrieve the corresponding padding token index using
padding_idx = vocabs["tgt"].tokens_to_ids[pad_token]
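Roughly, the loading now looks like the sketch below; the exact return values of load_test_model, the import paths, and the source of pad_token are assumptions here, not verified code:

from eole.models.model import BaseModel   # assumed import path
from eole.constants import DefaultTokens  # assumed source of the pad token

# assumed return values; the exact signature may differ
vocabs, model, model_opt = BaseModel.load_test_model(config, 0)  # 0 assumed to be the device id
pad_token = DefaultTokens.PAD
padding_idx = vocabs["tgt"].tokens_to_ids[pad_token]  # index of the padding token in the target vocab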

Regarding the torch CrossEntropyLoss initialization: the loss setup now requires a vocab, and lambda_coverage and lambda_align now live elsewhere in the config.
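For reference, a padding-aware criterion in plain torch would look like the sketch below; the actual eole LossCompute additionally takes the vocab and reads lambda_coverage / lambda_align from the config, so this is only an illustration:

import torch.nn as nn

# ignore padding positions; reduction="sum" matches the behaviour referenced
# later in eole/utils/loss.py, so the result must be divided by the token count
criterion = nn.CrossEntropyLoss(ignore_index=padding_idx, reduction="sum")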

I also make sure the transforms are correctly set up and that config.tgt is duplicated from config.src when absent.
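A minimal sketch of that duplication (attribute names assumed):

# LM scoring only has a source side, so reuse it as the target when tgt is missing
if getattr(config, "tgt", None) is None:
    config.tgt = config.src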

There are light adaptations to the loss computation, because it now also returns estims.
A heavier adaptation is that the expected tensor format changed: we now have 2 dimensions instead of 3.

To print the loss, I now have to compute cumul_loss / cumul_length.item() instead of reporting cumul_loss directly.

For my tests, I'm using an LM built with the wiki-103 recipe (https://github.com/eole-nlp/eole/blob/main/recipes/wiki_103/wiki_103.yaml); I only changed the training data. I observe differences in perplexity:
tensorboard: 20.9643 (3.0428 loss)
lm_scoring: 22.34 (3.11 loss)
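As a sanity check on the relation between these numbers (illustrative sketch, not the script's exact code), the perplexity is just the exponential of the token-averaged loss:

import math

avg_loss = cumul_loss / cumul_length.item()  # cumul_length is a tensor, hence .item()
ppl = math.exp(avg_loss)
# exp(3.0428) ≈ 20.96 (tensorboard) and exp(3.11) ≈ 22.4 (this script, up to rounding)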


anthdr commented Jan 16, 2025

Two suspicions:
either the way the loss is computed is not the same:
https://github.com/eole-nlp/eole/blob/d2992b7595e5c0ded0252fb908e434f557313401/eole/utils/loss.py#L105-L109 (reduction="sum", but this is unlikely)
or the padding masking differs (see the toy sketch below).
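Toy sketch of the second suspicion: with reduction="sum", the averaged loss depends on whether padding positions are excluded from the token count (pure torch, just for illustration):

import torch
import torch.nn as nn

logits = torch.randn(6, 100)               # 6 positions, vocab of 100
target = torch.tensor([5, 7, 1, 1, 3, 9])  # pretend index 1 is the padding token

crit = nn.CrossEntropyLoss(ignore_index=1, reduction="sum")
loss_sum = crit(logits, target)
print(loss_sum / (target != 1).sum())  # averaged over real tokens only
print(loss_sum / target.numel())       # averaged over all positions, including padding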

I will first check whether the config used to compute the loss shown on tensorboard is the same as the one used by the script.


anthdr commented Jan 16, 2025

Filtertoolong is applied to the validation set but not in the script; I forced it in the inference config (below), but the ppl did not go down.

transforms: [onmt_tokenize, filtertoolong]
transforms_configs:
  onmt_tokenize:
    src_subword_type: bpe
    src_subword_model: train/subwords.bpe
    src_onmttok_kwargs: {"mode": "aggressive", "joiner_annotate": True, "preserve_placeholders": True, "case_markup": True, "soft_case_regions": True, "preserve_segmented_tokens": True}
  filtertoolong:
    src_seq_length: 512
    tgt_seq_length: 512
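For context, filtertoolong simply drops examples longer than the configured limits, so leaving it out means the script also scores long sentences that validation never sees. Rough sketch of the assumed behaviour (keep_example is a hypothetical helper, not an eole function):

def keep_example(src_tokens, tgt_tokens, src_seq_length=512, tgt_seq_length=512):
    # drop any example whose tokenized length exceeds the configured limits
    return len(src_tokens) <= src_seq_length and len(tgt_tokens) <= tgt_seq_length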


anthdr commented Jan 16, 2025

When I comment out these 3 settings in the inference config file, nothing changes; ppl is still 22.34:

n_best: 3
top_p: 0.9
beam_size: 10

edit: this is perfectly normal, since scoring doesn't run inference.


anthdr commented Jan 16, 2025

Directly in LossCompute, I tried to override the values that differ; ppl is still 22.34:

self.lambda_coverage = 0
self.lm_prior_lambda = 0.0
self.lm_prior_tau = 1.0

@anthdr anthdr force-pushed the fix_LM_scoring_script branch from 1a90719 to 61f79da on January 16, 2025 14:09
@anthdr anthdr changed the title from "initial commit, working but ppl not exact" to "Adapting LM_scoring.py, PPL differs from tensorboard" on Jan 16, 2025