TF - Fix interchangeable past/past_key_values and revert output variable name in GPT2 #16332
Conversation
@patrickvonplaten hold your review, this change is not conflicting with the

The documentation is not available anymore as the PR was closed or merged.

@patrickvonplaten now it is properly fixed -- please check the updated description at the top :) Meanwhile, the scope increased a bit, so I'm tagging a 2nd reviewer (@Rocketknight1)
```diff
@@ -423,13 +424,13 @@ def input_processing(func, config, input_ids, **kwargs):
         )
         output["past_key_values"] = kwargs["kwargs_call"].pop("decoder_cached_states")

-    if "past" in kwargs["kwargs_call"] and "past_key_values" in kwargs:
+    if "past" in kwargs["kwargs_call"] and "past_key_values" in parameter_names:
```
This was the root cause of the problem in the decorator -- previously, this function was called inside `call`, where `kwargs` contained all keyword arguments (at the very least, with their default values). The decorator now calls this before `call` and, because it does not have default values, `kwargs` was empty. This meant that the `past` <> `past_key_values` magic, needed for gpt2 + encoder_decoder, was not happening when the decorator was applied on gpt2.
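To make that concrete, here is a minimal, hypothetical sketch of the swapping behavior described above (the helper name `swap_past_kwargs` is invented for illustration; the real logic lives inside `input_processing` and, after this fix, checks the function signature rather than `kwargs`):

```python
import inspect


def swap_past_kwargs(func, kwargs):
    # Illustrative stand-in for the `past` <> `past_key_values` magic: rename
    # the keyword when the caller and the signature disagree on its spelling.
    parameter_names = list(inspect.signature(func).parameters.keys())
    if "past" in kwargs and "past_key_values" in parameter_names:
        kwargs["past_key_values"] = kwargs.pop("past")
    elif "past_key_values" in kwargs and "past" in parameter_names:
        kwargs["past"] = kwargs.pop("past_key_values")
    return kwargs


def call(input_ids, past=None):  # e.g. TF GPT2's `call` expects `past`
    return input_ids, past


# An encoder_decoder wrapper hands over `past_key_values`; it must land in
# `past`. Checking the signature (instead of `kwargs`, which is empty before
# `call`) is what makes this work under the decorator.
print(swap_past_kwargs(call, {"input_ids": [0], "past_key_values": "cache"}))
# {'input_ids': [0], 'past': 'cache'}
```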
Makes sense!
```diff
@@ -694,14 +694,17 @@ def prepare_inputs_for_generation(
     ):
         decoder_inputs = self.decoder.prepare_inputs_for_generation(input_ids, past=past)
         decoder_attention_mask = decoder_inputs["attention_mask"] if "attention_mask" in decoder_inputs else None
+        past_key_values = decoder_inputs.get("past_key_values")
+        if past_key_values is None:
+            past_key_values = decoder_inputs.get("past")  # e.g. on TF GPT2
```
Nice!
```diff
@@ -878,7 +878,7 @@ def prepare_inputs_for_generation(self, inputs, past=None, use_cache=None, use_x
             "input_ids": inputs,
             "attention_mask": attention_mask,
             "position_ids": position_ids,
-            "past_key_values": past,
+            "past": past,
```
Thanks for reverting this
Thanks for fixing - I'm not too familiar with the changes in `modeling_tf_utils.py`, so if possible it'd be nice if someone else could take a look here. I think Sylvain also tries to avoid
```diff
-    parameter_names = list(signature.keys())
+    parameter_names_list = list(signature.keys())
+    parameter_names = set(parameter_names_list)
```
Is it possible for the signature to have duplicate keys, or is this just to make `if x in parameter_names` faster?
Yeah, I don't see why we need two of those. Creating them is probably slower than the lookup in the list (models usually have ~10 arguments).
Haha yeah, I went overboard with this one -- with the number of lookups we do per call, it is faster to create the set, but we're talking about microseconds (I went on to check it with `timeit`). Clearly not worth the extra code.
Reverting.
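For reference, a quick sketch of the kind of `timeit` check mentioned above (the sizes, names, and numbers here are illustrative, not taken from the PR):

```python
import timeit

# Hypothetical signature with ~10 parameters, as in a typical model
parameter_names_list = [f"arg_{i}" for i in range(10)]
parameter_names_set = set(parameter_names_list)

n = 1_000_000
list_time = timeit.timeit("'arg_9' in parameter_names_list", globals=globals(), number=n)
set_time = timeit.timeit("'arg_9' in parameter_names_set", globals=globals(), number=n)

# Set lookups are O(1) vs. the list's O(n), but over a handful of lookups per
# call the difference amounts to microseconds.
print(f"list: {list_time:.3f}s, set: {set_time:.3f}s over {n} lookups")
```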
Overall, this looks like a clean fix to the GPT2 workaround, so I'm happy to approve it. I'd really like to get rid of these old non-standard arguments next time we can make a breaking change, though!
Thanks for fixing this, @gante!
@Rocketknight1 me too 🙈 that function is a mess
…ble name in GPT2 (#16332)

* revert tf gpt2
* add test for unpack_inputs and fix test case
* add changes to vision encoder decoder
Context

From the discussion in #16311 (the PR that applies `@unpack_inputs` to TF `gpt2`): in the generate refactor, TF `gpt2` got an updated `prepare_inputs_for_generation()`, where its output `past` got renamed to `past_key_values` (i.e. as in FLAX/PT). Patrick suggested reverting it, since this prepared input could be used externally.

What did I find while working on this PR?
Reverting as suggested above makes TF `gpt2` fail tests related to `encoder_decoder`, which got an updated `prepare_inputs_for_generation()` in the same PR that expects a `past_key_values` (and not a `past`).

Meanwhile, I've also noticed a related bug in the new `@unpack_inputs` decorator, where it was not preserving a previous behavior -- when the model received a `past_key_values` but expected a `past` input (and vice-versa), it automatically swapped the keyword. This feature was the key enabler behind `encoder_decoder` + `gpt2`, as `encoder_decoder` was throwing out `past_key_values` prepared inputs that were caught by `gpt2`'s `past` argument.

So, what's in this PR?
This PR fixes the two issues above, which are needed for proper behavior in all combinations of inputs to TF `gpt2` after the introduction of the decorator:

* Fixes the `@unpack_inputs` decorator and adds tests to ensure we don't regress on some key properties of our TF input handling. After this PR, `gpt2` preserves its ability to receive `past` (and `past_key_values`, if through `encoder_decoder`-like models), with and without the decorator. A sketch of the preserved behavior follows this list.
* Reverts `past_key_values` into `past` wherever the change was introduced in #15944 (TF generate refactor - past without encoder outputs), and makes the necessary changes in `encoder_decoder`-like models.
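For illustration, a hedged sketch of the behavior this PR preserves, using the public transformers API (the checkpoint name and token ids are placeholders, and the exact argument names depend on the transformers version this PR targets):

```python
import tensorflow as tf
from transformers import TFGPT2LMHeadModel

model = TFGPT2LMHeadModel.from_pretrained("gpt2")
input_ids = tf.constant([[464, 3290]])  # arbitrary token ids

# The first forward pass returns the cache under `past_key_values`
outputs = model(input_ids, use_cache=True)
cache = outputs.past_key_values

# On the next step, both spellings should be accepted: `past` is TF GPT2's own
# argument, and `past_key_values` is what encoder_decoder-like models pass
next_token = tf.constant([[502]])
out_with_past = model(next_token, past=cache)
out_with_pkv = model(next_token, past_key_values=cache)
```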