Add LayoutLMForQuestionAnswering model #18407
```python
@@ -1314,8 +1314,9 @@ def forward(
        >>> end_scores = outputs.end_logits
        >>> start, end = word_ids[start_scores.argmax(-1)], word_ids[end_scores.argmax(-1)]
        >>> print(" ".join(words[start:end+1]))
        M. Hamann P. Harper, P. Martinez
        ```"""

        return_dict = return_dict if return_dict is not None else self.config.use_return_dict
```

Reviewer: Great example! Would like to see something similar for the other models (if we don't have any fine-tuned ones, then feel free to just verify an expected shape)

Author: Could we address this separately? I'm not extremely familiar with all of the different models, so I'd prefer to separate it from this particular effort.

Reviewer: Ok, makes sense.
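The span-decoding step in the example above, taking the argmax of the start/end logits and mapping token positions back to word indices via `word_ids`, can be sketched in isolation. This is a standalone illustration with made-up NumPy stand-ins for the model outputs, not the model's real logits:

```python
import numpy as np

# Words from the document and the tokenizer's token-to-word mapping
# (None marks special tokens such as [CLS]/[SEP]).
words = ["M.", "Hamann", "P.", "Harper,", "P.", "Martinez"]
word_ids = [None, 0, 1, 2, 3, 4, 5, None]

# Dummy scores standing in for outputs.start_logits / outputs.end_logits.
start_scores = np.array([0.1, 2.5, 0.3, 0.2, 0.1, 0.1, 0.1, 0.0])
end_scores = np.array([0.0, 0.1, 0.2, 0.1, 0.1, 0.3, 2.5, 0.1])

# argmax gives a token position; word_ids maps it back to a word index.
start = word_ids[start_scores.argmax(-1)]
end = word_ids[end_scores.argmax(-1)]
answer = " ".join(words[start : end + 1])
print(answer)  # -> M. Hamann P. Harper, P. Martinez
```

With real logits the same two lookups recover the answer span as whole words rather than subword tokens.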
```python
@@ -963,6 +963,7 @@ def call(
        ...     )

        >>> last_hidden_states = outputs.last_hidden_state
        ```"""

        outputs = self.layoutlm(
```

Reviewer: Can you add an expected output shape here?

Author: same comment as above (prefer to do that in a separate change)
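The reviewer's fallback suggestion of verifying an expected shape when no fine-tuned checkpoint is available might look like the sketch below. The tensor here is a zero-filled dummy; the dimensions are illustrative (LayoutLM-base uses a hidden size of 768):

```python
import numpy as np

# Dummy stand-in for outputs.last_hidden_state with shape
# (batch_size, sequence_length, hidden_size).
batch_size, seq_len, hidden_size = 1, 8, 768
last_hidden_states = np.zeros((batch_size, seq_len, hidden_size))

# In a doctest this would be written as:
# >>> list(last_hidden_states.shape)
# [1, 8, 768]
shape = list(last_hidden_states.shape)
print(shape)  # -> [1, 8, 768]
```

A shape check like this catches wiring mistakes without requiring a checkpoint that produces meaningful values.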
```python
@@ -1094,6 +1095,7 @@ def call(
        ... )

        >>> loss = outputs.loss
        ```"""

        outputs = self.layoutlm(
```
```python
@@ -1218,6 +1220,7 @@ def call(
        >>> loss = outputs.loss
        >>> logits = outputs.logits
        ```"""

        outputs = self.layoutlm(
```
```python
@@ -1347,6 +1350,7 @@ def call(
        >>> loss = outputs.loss
        >>> logits = outputs.logits
        ```"""

        outputs = self.layoutlm(
```
```diff
@@ -1452,8 +1456,8 @@ def call(
         >>> from transformers import AutoTokenizer, TFLayoutLMForQuestionAnswering
         >>> from datasets import load_dataset

-        >>> tokenizer = AutoTokenizer.from_pretrained("microsoft/layoutlm-base-uncased")
-        >>> model = TFLayoutLMForQuestionAnswering.from_pretrained("microsoft/layoutlm-base-uncased")
+        >>> tokenizer = AutoTokenizer.from_pretrained("impira/layoutlm-document-qa", add_prefix_space=True)
+        >>> model = TFLayoutLMForQuestionAnswering.from_pretrained("impira/layoutlm-document-qa", from_pt=True)

         >>> dataset = load_dataset("nielsr/funsd", split="train")
         >>> example = dataset[0]
@@ -1474,10 +1478,15 @@ def call(
         ...     bbox.append([0] * 4)
         >>> encoding["bbox"] = tf.convert_to_tensor([bbox])

         >>> word_ids = encoding.word_ids(0)
         >>> outputs = model(**encoding)
         >>> loss = outputs.loss
         >>> start_scores = outputs.start_logits
         >>> end_scores = outputs.end_logits
+        >>> start, end = word_ids[tf.math.argmax(start_scores, -1)[0]], word_ids[tf.math.argmax(end_scores, -1)[0]]
+        >>> print(" ".join(words[start:end+1]))
+        M. Hamann P. Harper, P. Martinez

         ```"""

         outputs = self.layoutlm(
```
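One detail worth noting in the TF example above: the logits are batched with shape `(1, seq_len)`, so the example indexes `tf.math.argmax(start_scores, -1)[0]`, whereas the earlier PyTorch example used unbatched `argmax(-1)` directly. A NumPy stand-in for that batched indexing (the scores below are dummy values, not real model outputs):

```python
import numpy as np

words = ["M.", "Hamann", "P.", "Harper,", "P.", "Martinez"]
word_ids = [None, 0, 1, 2, 3, 4, 5, None]

# Batched dummy logits of shape (1, seq_len), mimicking TF model outputs.
start_scores = np.array([[0.0, 3.0, 0.1, 0.1, 0.1, 0.1, 0.1, 0.0]])
end_scores = np.array([[0.0, 0.1, 0.1, 0.1, 0.1, 0.1, 3.0, 0.0]])

# argmax over the last axis keeps the batch axis, so index [0] first.
start = word_ids[int(start_scores.argmax(-1)[0])]
end = word_ids[int(end_scores.argmax(-1)[0])]
answer = " ".join(words[start : end + 1])
print(answer)  # -> M. Hamann P. Harper, P. Martinez
```

Dropping the `[0]` would index `word_ids` with a length-1 array and fail, which is why the batched and unbatched examples read differently.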
Reviewer: To confirm the code examples work as expected, it would be great to add LayoutLM (v1) to the doc tests. Details here: https://github.com/huggingface/transformers/tree/main/docs#testing-documentation-examples

Author: Done