
Make it possible to customize generation config for Trainer's training loop evaluation #24427

Closed
antonioalegria opened this issue Jun 22, 2023 · 11 comments

Comments

@antonioalegria

Feature request

When using predict_with_generate to compute generation-based metrics during the evaluation that happens inside training, it would be good if the model's generation config were used, and/or if we could pass the intended generation config into the train method so that it can be forwarded to evaluate.

As it is, generation during evaluation uses only the default parameters.

Motivation

The way GenerationConfig is used today is pretty inconsistent and muddled, IMO. You can set it at the model level, but it is only used sometimes. You can pass it directly to evaluate, predict, or generate, but it's not clear whether you should pass it as kwargs or as a full GenerationConfig.

It would be great to clean this up so that there is one clear, consistent, Pythonic way to use it. My suggestion would be to set it at the Trainer level and allow overriding it in the evaluate, predict, and generate methods with a simple generation_config: GenerationConfig parameter.
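For illustration, a rough sketch of the API shape this request is asking for; the generation_config parameters shown on the trainer and on evaluate/predict are the proposal here, not the current signatures:

from transformers import GenerationConfig

# One config object, built once...
gen_config = GenerationConfig(max_new_tokens=128, num_beams=4)

# ...set at the Trainer level and overridable per call (proposed API, shown as comments):
# trainer = Seq2SeqTrainer(model=model, args=args, generation_config=gen_config, ...)
# trainer.evaluate(eval_dataset, generation_config=gen_config)
# trainer.predict(test_dataset, generation_config=gen_config)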

Your contribution

Happy to discuss different possibilities and see where I could help.

@sgugger
Collaborator

sgugger commented Jun 22, 2023

You can customize the model's generation_config as you want, and it will be the one used. cc @gante to make sure I'm not saying something untrue.
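A minimal sketch of what that looks like, assuming t5-small as a stand-in checkpoint:

from transformers import AutoModelForSeq2SeqLM, GenerationConfig

model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")
# A freshly constructed GenerationConfig attached to the model; generate() calls
# made by the trainer should then use these settings.
model.generation_config = GenerationConfig(max_new_tokens=64, num_beams=4)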

@antonioalegria
Author

antonioalegria commented Jun 22, 2023

Maybe it's because I was setting the GenerationConfig on the model without setting _from_model_config = False, which triggers the following code and resets the config:

if generation_config is None:
    # legacy: users may modify the model configuration to control generation -- update the generation config
    # model attribute accordingly, if it was created from the model config
    if self.generation_config._from_model_config:
        new_generation_config = GenerationConfig.from_model_config(self.config)
        if new_generation_config != self.generation_config:
            warnings.warn(
                "You have modified the pretrained model configuration to control generation. This is a"
                " deprecated strategy to control generation and will be removed soon, in a future version."
                " Please use a generation configuration file (see"
                " https://huggingface.co/docs/transformers/main_classes/text_generation)"
            )
            self.generation_config = new_generation_config
    generation_config = self.generation_config
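Based on that snippet, one way to avoid tripping the reset (a sketch, with t5-small as a placeholder checkpoint; option 2 relies on the private flag discussed here):

from transformers import AutoModelForSeq2SeqLM, GenerationConfig

model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

# Option 1: assign a brand-new config, which is not flagged as being
# derived from the model config, so the branch above is never taken.
model.generation_config = GenerationConfig(max_new_tokens=64)

# Option 2: keep the existing object, but mark it as user-owned
# (this is the private flag mentioned in this thread).
model.generation_config.max_new_tokens = 64
model.generation_config._from_model_config = False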

@antonioalegria
Author

antonioalegria commented Jun 22, 2023

Yeah, that was it. It's a bit confusing, though, because that message says the strategy is deprecated but doesn't say it will revert the config.

So maybe the docs should be improved to spell out where the GenerationConfig is used and with what priority between the default one, the model's, the gen_kwargs, etc.

For example, Seq2SeqTrainer.evaluate and predict contain this code, which doesn't take into account whether a generation config is set on the model:

gen_kwargs = gen_kwargs.copy()
if gen_kwargs.get("max_length") is None and gen_kwargs.get("max_new_tokens") is None:
    gen_kwargs["max_length"] = self.args.generation_max_length
gen_kwargs["num_beams"] = (
    gen_kwargs["num_beams"] if gen_kwargs.get("num_beams") is not None else self.args.generation_num_beams
)

Furthermore, Seq2SeqTrainer.prediction_step contains this code, which again ignores the model's generation config and falls back to gen_kwargs (which aren't passed in the training loop):

gen_kwargs = self._gen_kwargs.copy()
if gen_kwargs.get("max_length") is None and gen_kwargs.get("max_new_tokens") is None:
    gen_kwargs["max_length"] = self.model.config.max_length
gen_kwargs["num_beams"] = (
    gen_kwargs["num_beams"] if gen_kwargs.get("num_beams") is not None else self.model.config.num_beams
)
default_synced_gpus = True if is_deepspeed_zero3_enabled() else False
gen_kwargs["synced_gpus"] = (
    gen_kwargs["synced_gpus"] if gen_kwargs.get("synced_gpus") is not None else default_synced_gpus
)

If you do have a GenerationConfig set on the model, or passed as the generation_config parameter to evaluate/predict, and you set max_new_tokens in it, you will get the warning "Both max_new_tokens and max_length seem to have been set." It was the code above that set max_length, because it never saw the passed GenerationConfig.

So it seems that if I set a GenerationConfig on the model with max_new_tokens, I will always get this warning, because the training loop doesn't pass anything directly to evaluate/predict.
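For contrast, a sketch of the precedence being argued for here (explicit gen_kwargs first, then the model's generation_config, then the TrainingArguments); this is illustrative only, not the library's actual code:

def resolve_generation_length(gen_kwargs, model, args):
    # Hypothetical helper: explicit kwargs > model.generation_config > TrainingArguments.
    if gen_kwargs.get("max_length") is not None or gen_kwargs.get("max_new_tokens") is not None:
        return gen_kwargs  # the caller was explicit, leave it alone
    gen_config = getattr(model, "generation_config", None)
    if gen_config is not None and gen_config.max_new_tokens is not None:
        return gen_kwargs  # don't inject max_length; the model's config already decides
    if args.generation_max_length is not None:
        gen_kwargs["max_length"] = args.generation_max_length
    return gen_kwargs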

Let me know if I should close this.

@antonioalegria
Author

Also, the shape[1] of outputs.predictions coming out of Seq2SeqTrainer.predict is 20; it does not respect the max_new_tokens passed in the generation config.
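For what it's worth, 20 happens to be the default max_length on a bare GenerationConfig, which would explain the observed shape if the passed config is being ignored:

from transformers import GenerationConfig

print(GenerationConfig().max_length)      # 20 by default
print(GenerationConfig().max_new_tokens)  # None by default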

@gante
Member

gante commented Jun 24, 2023

Hi @antonioalegria 👋

We do support parameterizing Seq2SeqTrainer with a GenerationConfig object. You seem to be hitting an issue due to _from_model_config being True, which simply means that you've issued a sequence of commands that we did not account for :)

Most of the issues you described are intentional temporary workarounds -- when a new feature is introduced, we have to ensure the library goes through a deprecation cycle in which the old way of doing things takes precedence. That's why you see strange patterns like the use of _from_model_config, or "duplicated" pieces of code controlling the same thing. Given the large number of possibilities within transformers, sometimes we simply miss a few combinations.

That being said, let's start with the basics (which you haven't done 😉 ): what version of transformers are you using, and how can I reproduce your issue?

@antonioalegria
Author

Hi, I have removed _from_model_config from the duplicated and altered config, and the first issue no longer happens.

I still get those "Both max_new_tokens and max_length seem to have been set." warnings though.

transformers: 4.30.2

To reproduce, you just need to call Seq2SeqTrainer.evaluate with generation_config=your_gen_config, or set generation_config on the model. In either case, set max_new_tokens.
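Concretely, a minimal sketch of the two call paths described above; trainer (a Seq2SeqTrainer built with predict_with_generate=True) and test_dataset are assumed to exist already:

from transformers import GenerationConfig

gen_config = GenerationConfig(max_new_tokens=64)

# Path 1: pass the config explicitly to the evaluation call.
metrics = trainer.evaluate(generation_config=gen_config)

# Path 2: set it on the model and rely on it being picked up.
trainer.model.generation_config = gen_config
predictions = trainer.predict(test_dataset)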

You will then see those warnings, which shouldn't happen.

Let me know if you'd like me to provide a running script.

@n-splv

n-splv commented Jul 17, 2023

So what is the correct way to parameterize generation (e.g. to use contrastive search) during model training? The documentation doesn't cover this.
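For reference, contrastive search is switched on by a positive penalty_alpha together with top_k > 1 in the generation config, so one way to wire it into training-loop evaluation (a sketch, with t5-small as a placeholder checkpoint) would be:

from transformers import AutoModelForSeq2SeqLM, GenerationConfig

model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")  # placeholder checkpoint
# penalty_alpha > 0 and top_k > 1 select contrastive search in generate().
model.generation_config = GenerationConfig(penalty_alpha=0.6, top_k=4, max_new_tokens=64)
# Pass this model to a Seq2SeqTrainer with predict_with_generate=True so the
# evaluations run inside the training loop generate with these settings.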

@github-actions

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

@n-splv

n-splv commented Sep 4, 2023

@gante @sgugger sorry to bother you, but it seems like the issue with GenerationConfig has not been resolved. Please take a look at this code:

my_config = GenerationConfig.from_pretrained(CHECKPOINT_DIR)
my_config.forced_bos_token_id = tokenizer.encode("[")
my_config.forced_eos_token_id = tokenizer.encode("]")
my_config.max_new_tokens = 1000
my_config.max_length = 1000

args = Seq2SeqTrainingArguments(
    ...
    predict_with_generate=True,
    generation_config=my_config,
)

trainer = Seq2SeqTrainer(
    model_init=model_init,
    args=args,
    train_dataset=tokenized_dataset["train"],
    eval_dataset=tokenized_dataset["test"],
    data_collator=data_collator,
    tokenizer=tokenizer,
    compute_metrics=compute_metrics
)

I check the config before starting the training, and everything looks fine:

>>> trainer.model.generation_config
GenerationConfig {
  "decoder_start_token_id": 0,
  "eos_token_id": 2,
  "forced_bos_token_id": [
    63
  ],
  "forced_eos_token_id": [
    65
  ],
  "max_length": 1000,
  "max_new_tokens": 1000,
  "pad_token_id": 0,
  "transformers_version": "4.32.1"
}

Then, during the evaluation phase of training, I print the generated sentences and notice that the model doesn't follow my generation config: the forced start and end tokens are ignored, and the number of generated tokens is 20 for every sample. I then stop training, run the previous command again, and to my surprise see the following:

>>> trainer.model.generation_config
GenerationConfig {
  "decoder_start_token_id": 0,
  "eos_token_id": 2,
  "pad_token_id": 0,
  "transformers_version": "4.32.1"
}

Why has the config been reset and how can I avoid it? It seems like a bug to me.

@antonioalegria FYI

@n-splv

n-splv commented Sep 4, 2023

Update: the source of the problem was my model_init() function, which is called at the start of the training process. It returns a new instance of the model with the default generation config, so the custom one just gets lost.

I managed to achieve the desired behavior by modifying my model_init() function like this:

def model_init():
    model = T5ForConditionalGeneration.from_pretrained(MODEL_NAME)
    model.resize_token_embeddings(len(tokenizer), pad_to_multiple_of=16)
    model.generation_config = my_config
    return model

However, now at each evaluation I get this annoying warning:

Both `max_new_tokens` (=1000) and `max_length`(=20) seem to have been set. `max_new_tokens` will take precedence. Please refer to the documentation for more information.

At this point I'm not sure where the max_length = 20 comes from, but luckily it doesn't matter, so for now I just silenced it:

logging.getLogger("transformers.generation.utils").setLevel(logging.ERROR) 

It seems like it would be more convenient if the explicitly passed GenerationConfig were automatically applied to the model returned by model_init(); otherwise it's kind of pointless to use both arguments together. What do you think?
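In the meantime, a sketch of a manual workaround along those lines, reusing MODEL_NAME, tokenizer, and the args object from the snippets above (all assumed to be defined as earlier):

from transformers import T5ForConditionalGeneration

def model_init():
    model = T5ForConditionalGeneration.from_pretrained(MODEL_NAME)
    model.resize_token_embeddings(len(tokenizer), pad_to_multiple_of=16)
    # Read the config from the training arguments instead of referencing
    # my_config directly, so there is a single source of truth.
    model.generation_config = args.generation_config
    return model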

@gante
Member

gante commented Sep 5, 2023

Hey @nick-maykr 👋 Thank you for raising the issue!

I've just opened a PR that tackles this issue (fine-tuning from older models often ignoring generation_config changes); it should no longer happen after the PR gets merged 👉 #25962

As for the max_length warning: the seq2seq trainer was doing something dodgy regarding max_length; I'm fixing it. EDIT 👉 #25987

Let me know if you come across further issues. As I wrote above, we changed how we control generation in late 2022, and maintaining backward compatibility on top of the new features is challenging :)
