
Add type hints for PyTorch BERT. #23541

Closed
wants to merge 8 commits

Conversation


@coledie commented May 20, 2023

What does this PR do?

Add type hints for PyTorch BERT.

Fixes #16059 for PyTorch BERT
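
The typing work covers the BERT tokenizer and model signatures. As a minimal sketch of the style of annotation this PR adds (a simplified stand-in, not the exact diff), here is a typed version of the tokenizer method that builds sequences with special tokens:

```python
from typing import List, Optional

def build_inputs_with_special_tokens(
    self, token_ids_0: List[int], token_ids_1: Optional[List[int]] = None
) -> List[int]:
    """Add [CLS]/[SEP] special tokens around one or two sequences.

    Sketch only: in transformers this is a method on BertTokenizer and
    reads cls_token_id / sep_token_id from the tokenizer instance.
    """
    cls, sep = [self.cls_token_id], [self.sep_token_id]
    if token_ids_1 is None:
        return cls + token_ids_0 + sep
    return cls + token_ids_0 + sep + token_ids_1 + sep
```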

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a GitHub issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.

Member

@Rocketknight1 left a comment


This looks really good, thank you! I have one question about the strip_accents argument, but other than that this looks perfect.

Review threads (outdated, resolved):
  • src/transformers/models/bert/tokenization_bert.py (4 threads)
  • src/transformers/models/bert/tokenization_bert_fast.py (1 thread)
@Rocketknight1
Member

Looks good! The last thing you'll need to do is pip install transformers[quality] followed by make style in the transformers directory. This runs our code formatting tools to make sure everything follows our style guidelines. Once you do that and commit any changes, the tests should pass!

If you run make style and you get an error, it may indicate some part of your code that has issues that our code style tools can't correct - if that happens, take a look and try to see what's wrong, and reply here if you can't figure it out!

@coledie
Author

coledie commented May 24, 2023

It looks like I've got one last error that I cannot figure out; the relevant line of code appears in several other files without issue. Could you help with this?

@coledie
Author

coledie commented Jun 2, 2023

@Rocketknight1

@Rocketknight1
Member

Ah, thanks for the ping! Investigating now

@Rocketknight1
Member

Rocketknight1 commented Jun 2, 2023

I've investigated and there's an issue in our copy checking code, specifically the is_copy_consistent function. This isn't your fault, and I'll need to file a PR to fix it!

(For internal transformers reference): The issue is triggered when a function is copied from with a single-line header, and there is a change that causes its header to now be multi-line (e.g. adding type hints and causing black to wrap the line). The is_copy_consistent function builds a replacement function from the first line of the target function followed by the subsequent lines of the original function, which creates a mangled header if the original header has changed to multi-line but the target has not:

def build_inputs_with_special_tokens(self, token_ids_0, token_ids_1=None):
        self, token_ids_0: List[int], token_ids_1: Optional[List[int]] = None
    ) -> List[int]:
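
The splice can be reproduced in a few lines. Here is a minimal model of the behavior (an assumption-level sketch, not the actual is_copy_consistent implementation), using the headers from the example above:

```python
# Origin function: header reflowed to multiple lines after type hints
# were added and black wrapped it.
origin = [
    "def build_inputs_with_special_tokens(",
    "    self, token_ids_0: List[int], token_ids_1: Optional[List[int]] = None",
    ") -> List[int]:",
    "    ...",
]
# Target copy: still has the old single-line header.
target = [
    "def build_inputs_with_special_tokens(self, token_ids_0, token_ids_1=None):",
    "    ...",
]

# The check splices the target's first line onto the origin's remaining
# lines, producing the mangled header shown above.
replacement = [target[0]] + origin[1:]
print("\n".join(replacement))
```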

@sgugger do you want me to make a PR to fix it?

@sgugger
Collaborator

sgugger commented Jun 2, 2023

Lol no, fixing this is really not a priority. The copies can be manually updated.
The only situation this can appear is this one, and it's rare enough that we can deal with it I think.

@Rocketknight1
Member

Understood! @coledie you might have to do some manual copying to make this work, in that case. Search the repository for the string # Copied from transformers.models.bert.tokenization_bert_fast.BertTokenizerFast.build_inputs_with_special_tokens. This will locate functions that are copied from the BERT function and that our repository tools keep in sync with it. If you manually copy your new build_inputs_with_special_tokens header over the existing headers in those functions and then rerun make fixup or make fix-copies, everything should work and the CI should pass.

If you have any issues, let me know and I can make the changes for you!
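
The search step above can also be scripted. A rough sketch (assumptions: run from the transformers repo root with library code under src/; find_copies is a hypothetical helper, not a repo tool):

```python
from pathlib import Path

# Marker used by the repo's copy checker to keep functions in sync
# with the BERT original.
MARKER = (
    "# Copied from transformers.models.bert.tokenization_bert_fast."
    "BertTokenizerFast.build_inputs_with_special_tokens"
)

def find_copies(root: str, marker: str = MARKER) -> list[Path]:
    """Return every .py file under `root` that contains the copy marker."""
    return [
        path
        for path in Path(root).rglob("*.py")
        if marker in path.read_text(encoding="utf-8")
    ]
```

Calling find_copies("src") would list each file whose header needs the manual copy; after pasting the new header over each one, make fixup or make fix-copies should bring everything back in sync.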

@coledie
Author

coledie commented Jun 14, 2023

@Rocketknight1 Looks like it is working since test failures are unrelated?

@huggingface huggingface deleted a comment from github-actions bot Jul 10, 2023
@github-actions

github-actions bot commented Aug 3, 2023

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

@github-actions github-actions bot closed this Aug 11, 2023