
Fixing tests for Perceiver #14739

Merged (11 commits) on Dec 14, 2021
Conversation

@Narsil (Contributor) commented on Dec 13, 2021

What does this PR do?

  • Do not run the image-classification pipeline (_CHECKPOINT_FOR_DOC uses the
    language checkpoint, which cannot load a FeatureExtractor, so the current
    logic fails).
  • Add a safeguard to not run tests when tokenizer_class or
    feature_extractor_class is defined but cannot be loaded.
    This happens for Perceiver with the "FastTokenizer" (which doesn't exist,
    so it is None) and the FeatureExtractor (which does exist but cannot be
    loaded because the checkpoint doesn't define one, which is reasonable for
    said checkpoint).
  • Added a get_vocab function to PerceiverTokenizer, since it is used by the
    fill-mask pipeline when the targets argument is used to narrow the
    subset of possible values.
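For context, get_vocab on a bytes-level tokenizer can be very small, since the vocabulary is essentially the 256 byte values plus the special tokens. A hypothetical sketch (class name, token layout, and offsets are illustrative, not the actual PerceiverTokenizer implementation):

```python
# Illustrative bytes-level tokenizer: the vocab is the special tokens
# followed by one entry per possible byte value. This is a sketch of the
# idea only, not the real PerceiverTokenizer code.
class ByteLevelTokenizer:
    def __init__(self, special_tokens=("[PAD]", "[BOS]", "[EOS]", "[MASK]")):
        self.special_tokens = list(special_tokens)
        # byte ids are shifted past the special tokens
        self.offset = len(self.special_tokens)

    def get_vocab(self):
        # special tokens occupy ids 0..offset-1
        vocab = {tok: i for i, tok in enumerate(self.special_tokens)}
        # one entry per byte, mapped to byte_value + offset
        vocab.update({chr(b): b + self.offset for b in range(256)})
        return vocab
```

Pipelines like fill-mask can then look up targets in this mapping without the tokenizer carrying any vocabulary file.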

Fixes # (issue)

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a GitHub issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.


@Narsil Narsil requested a review from NielsRogge December 13, 2021 09:31
@NielsRogge (Contributor) commented on Dec 13, 2021

Do not run image-classification pipeline (_CHECKPOINT_FOR_DOC uses the checkpoint for
language, which cannot load a FeatureExtractor so current logic fails).

Ok but the Perceiver has 3 variants (PerceiverForImageClassificationLearned, PerceiverForImageClassificationFourier, PerceiverForImageClassificationConvProcessing) that should work with the image classification pipeline. So with the current logic, we can't test them?

These 3 checkpoints each have a feature extractor defined (I've uploaded the preprocessor_config.json to the hub for these checkpoints).

@Narsil (Contributor, Author) commented on Dec 13, 2021

@NielsRogge , I added some slow tests for now to make sure we can run the pipeline.

Ideally we would be able to run run_pipeline_test, but the configs are different in each case.
At least for now we have some proof that it works.

@Narsil (Contributor, Author) commented on Dec 13, 2021

And re-added the fast tests too.

It works by using update_config_with_model_class in the model_tester. Not sure it's the best way, but there's definitely a dependency between the ModelClass and the desired config.d_model.
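The dependency described above can be sketched as follows. This is a hypothetical illustration only: the function body and the d_model values are placeholders, not the actual tester code or the real Perceiver configuration values.

```python
# Illustrative sketch: the desired config.d_model depends on which
# Perceiver model class is under test, so the tester adjusts the config
# per class. The numbers below are placeholders, not the real values.
def update_config_with_model_class(config, model_class_name):
    if "ImageClassification" in model_class_name:
        # image-classification variants use a different width than the
        # language checkpoint (placeholder value)
        config["d_model"] = 512
    else:
        # placeholder value for the language-style variants
        config["d_model"] = 768
    return config
```

Making the adjustment explicit in one place keeps the per-class coupling out of each individual test.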

@Narsil Narsil requested a review from LysandreJik December 13, 2021 12:02
@LysandreJik (Member) left a comment

Thanks for looking into it, @Narsil. I'll take a closer look in a bit, will skip the tests in the meantime as all PRs rebasing off of master are red right now.

feature_extractor = None
try:
    feature_extractor = feature_extractor_class()
except Exception:
A Member commented:

Can this be more defined than just Exception?

@Narsil (Contributor, Author) commented on Dec 13, 2021

Well, I have no idea what could crash on arbitrary initialization.

If we want to keep the magic, being very tolerant here seems OK to me.
The code could fail because feature_extractor_class requires some argument to be defined, but also because of any exception that could be raised during __init__, so I don't see a way to be exhaustive here.
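The safeguard being discussed can be sketched as a small helper. This is an assumed simplification of the logic in the diff above, with a hypothetical function name, not the actual test code:

```python
# Sketch of the safeguard under discussion: try to instantiate the
# feature extractor with default arguments and treat any failure as
# "not loadable", since we cannot enumerate every exception an
# arbitrary __init__ might raise.
def load_feature_extractor_or_none(feature_extractor_class):
    if feature_extractor_class is None:
        # no class defined at all (e.g. no FastTokenizer equivalent)
        return None
    try:
        return feature_extractor_class()
    except Exception:
        # instantiation with defaults failed; skip rather than error
        return None
```

The broad except is deliberate: a missing required argument raises TypeError, but __init__ may raise anything, so narrowing the clause would just reintroduce the failures the safeguard is meant to absorb.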

A Contributor commented:

What's the reason this feature_extractor_class argument is added?

@Narsil (Contributor, Author) replied:

Because we need to initialize the feature extractor.

It does not work with from_pretrained above because the checkpoint is "deepmind/language-perceiver" (it is retrieved from _CHECKPOINT_FOR_DOC).

But we do have the class, and feature extractors usually don't carry a lot of information (unlike, say, vocab.txt for tokenizers), so trying to instantiate with default arguments makes sense to me. Otherwise we would need another indirection to make from_pretrained work here.

A Contributor replied:

Ok I see, makes sense!

@Narsil force-pushed the fix_tests_perceiver_pipeline branch from 9e73769 to 2491418 on December 13, 2021 at 13:16
@LysandreJik (Member) commented:

For readers: merged #14745 to skip Perceiver tests while we work on this PR.

@Narsil force-pushed the fix_tests_perceiver_pipeline branch from 2f3fc71 to a916234 on December 13, 2021 at 16:37
@LysandreJik (Member) left a comment

Now this looks great! Thanks to both you and @NielsRogge for working on this!

@LysandreJik (Member) commented:

Could you please rebase on master and ensure everything is green before merging? Thanks!

@Narsil force-pushed the fix_tests_perceiver_pipeline branch from a916234 to e81db3a on December 13, 2021 at 17:58
Comment on lines +229 to +235
else:
    # Remove the non defined tokenizers
    # ByT5 and Perceiver are bytes-level and don't define
    # FastTokenizer, we can just ignore those.
    tokenizer_classes = [
        tokenizer_class for tokenizer_class in tokenizer_classes if tokenizer_class is not None
    ]
A Contributor commented:

There's a third bytes-level tokenizer, which is CANINE. Should this be added here too?

@Narsil (Contributor, Author) replied:

It's a comment, I don't mind if it's not exhaustive. I think it sufficiently clarifies the purpose of this line. Fine adding it, though.

@NielsRogge (Contributor) left a comment

Left a few comments.


@Narsil (Contributor, Author) commented on Dec 14, 2021

OK, the moon-landing failing tests are back up!

@Narsil Narsil merged commit 546a91a into huggingface:master Dec 14, 2021
@Narsil Narsil deleted the fix_tests_perceiver_pipeline branch December 14, 2021 08:43