FUNSD predictor #11

Closed
yudezhi123456 opened this issue Oct 13, 2023 · 11 comments
Labels: good question

Comments

yudezhi123456 commented Oct 13, 2023
I trained FUNSD-entity to get the best model, ran eval, and obtained entity-labeling results consistent with your paper. However, the entity-linking results 'Pair_F1_MACRO' and 'Pair_F1_MICRO' both reach 1.0. I wonder why? That is the right metric, isn't it? Below are screenshots of my yaml file, along with the experimental results.

[screenshots: yaml configuration and evaluation results]

NormXU (Owner) commented Oct 13, 2023

@yudezhi123456 Yes, it is a metric reflecting whether the model can correctly predict the connections between cells. A score of 1.0 indicates that all connections have been predicted correctly.
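For intuition, here is a minimal sketch of how such a pair metric can be computed (toy values and names, not the repository's code): enumerate the candidate edges, mark each one as linked (1) or not linked (0) in both ground truth and prediction, then take the F1 score over those binary labels.

```python
from sklearn.metrics import f1_score

candidate_edges = [(0, 1), (0, 2), (1, 2), (2, 3)]  # hypothetical cell pairs
gt_links = {(0, 1), (2, 3)}                         # hypothetical ground-truth links
pred_links = {(0, 1), (1, 2)}                       # hypothetical predicted links

y_true = [1 if e in gt_links else 0 for e in candidate_edges]   # [1, 0, 0, 1]
y_pred = [1 if e in pred_links else 0 for e in candidate_edges]  # [1, 0, 1, 0]

# The prediction misses (2, 3) and adds (1, 2), so the score is below 1.0
print(f1_score(y_true, y_pred, average='macro'))  # 0.5
```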

yudezhi123456 (Author) commented:


Thank you very much for your answer. There's a sentence in your paper: "In the entity task, our model shown in Tab. 2 achieved an F1 score of 0.80575 for entity-labeling and 0.77031 for entity-linking, using 32.98 million parameters."
I would like to know whether the entity-labeling F1 score is reported as Node_F1_MACRO or Node_F1_MICRO, and which metric the entity-linking result, the 0.77031 mentioned in the paper, comes from.

NormXU (Owner) commented Oct 13, 2023

@yudezhi123456 We report Node_F1_MACRO as the primary metric for entity labeling. For entity linking, we use Pair_F1_MACRO as the metric in our paper.

You can find the implementation details here:

Evaluating entity-labeling (Node_F1):

```python
import numpy as np
from sklearn.metrics import f1_score

# Per-class accuracy over the node (entity-labeling) predictions
sums = len(self.pred_labels_cls)
correct = 0
correct_map = {}
for class_num in range(self.num_classes):
    class_num_label = np.array(self.pred_labels_cls)[np.where(np.array(self.gt_labels_cls) == class_num)]
    total_num = len(class_num_label)
    if total_num != 0:
        correct_num = len(np.where(class_num_label == class_num)[0])
        correct_map[class_num] = (correct_num, total_num, correct_num / total_num)
        correct += correct_num
F1_MACRO, F1_MICRO = (f1_score(self.gt_labels_cls, self.pred_labels_cls, average='macro'),
                      f1_score(self.gt_labels_cls, self.pred_labels_cls, average='micro'))
```

Evaluating entity-linking (Pair_F1):

```python
# Binary edge (entity-linking) predictions: label 0 = invalid link, label 1 = valid link
sums = len(self.pred_labels_cell)
correct = sums - len(np.where(np.array(self.gt_labels_cell) + np.array(self.pred_labels_cell) == 1)[0])
correct_map = {}
for class_num in range(2):
    class_num_label = np.array(self.pred_labels_cell)[np.where(np.array(self.gt_labels_cell) == class_num)]
    total_num = len(class_num_label)
    if total_num != 0:
        correct_num = len(np.where(class_num_label == class_num)[0])
        correct_map[class_num] = (correct_num, total_num, correct_num / total_num)
F1_MACRO, F1_MICRO = (f1_score(self.gt_labels_cell, self.pred_labels_cell, average='macro'),
                      f1_score(self.gt_labels_cell, self.pred_labels_cell, average='micro'))
```
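On the MACRO vs. MICRO question above: macro-averaging computes F1 per class and averages the classes with equal weight, while micro-averaging pools all individual decisions (for single-label classification it equals accuracy). A quick illustration with made-up labels:

```python
from sklearn.metrics import f1_score

# Imbalanced toy labels: 8 negatives, 2 positives; two mistakes
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 0, 1, 1, 0]

print(f1_score(y_true, y_pred, average='macro'))  # 0.6875: the rare class weighs equally
print(f1_score(y_true, y_pred, average='micro'))  # 0.8: same as accuracy here
```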

yudezhi123456 (Author) commented:


Thank you very much for your answer. I understand what you mean and found the corresponding primary metric.
But after thinking about it all night, I still don't know which metric the entity-linking score of 0.77031 in the paper comes from. As you said, Pair_F1_MACRO is the metric reported in the paper, but after training on the FUNSD dataset and evaluating on the test set, I get Pair_F1_MACRO = 1.0, which is inconsistent with the 0.77031 in your paper. Could it be that I'm testing on a different dataset than yours? I'm testing on the FUNSD testing data, which is also generated by preprocess_data.py.
I'm not sure whether I've made myself clear. If you have time, I hope you can resolve my confusion. Thank you very much.

NormXU (Owner) commented Oct 14, 2023

@weishu27 Could you please check it out and answer this question?

NormXU (Owner) commented Oct 14, 2023

@yudezhi123456 It's quite weird that you achieved Pair_F1_MACRO = 1.0. Could you please double-check that there is no data leakage from the testing dataset into the training dataset?

Moreover, I notice from your log:

[screenshot: training log]

It shows that your dataset only contains edges with label=0. There could be a bug in your data processing, as there should be edges with both label=0 and label=1: label 0 represents an invalid connection, while label 1 denotes a valid connection.
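To see why an all-negative edge set trivially scores 1.0: if the ground truth contains only label 0, a model that never predicts a link is scored as perfect on the single class that is present. A small sketch with toy arrays (not the repo's variables):

```python
from collections import Counter
from sklearn.metrics import f1_score

gt_edges = [0] * 100    # buggy data: no positive (label=1) edges at all
pred_edges = [0] * 100  # the model trivially predicts "no link" everywhere

print(Counter(gt_edges))                                # Counter({0: 100})
print(f1_score(gt_edges, pred_edges, average='macro'))  # 1.0
```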

yudezhi123456 (Author) commented:


I redownloaded the FUNSD dataset and examined the data and the test_convert_Funsd2Graph function, but still couldn't find the problem. The training set consists of 149 samples and the test set of 50 samples. Below is the structure of my dataset.

[screenshot: dataset directory structure]

When training and validating the "word" model, there are no such problems and the results are consistent with the paper; the problem above only appears when training the "entity" model.
The following images show the results of the "word" model validation.

[screenshots: "word" model validation results]

I didn't make any changes to your code other than the yaml configuration.

The problem may not be in the data handling; it may be in the validation code itself. When I was training on the newly downloaded FUNSD, Pair_F1_MACRO = 1.00000 already appeared when the model was saved in the first epoch.

[screenshot: training log showing Pair_F1_MACRO = 1.00000]

I am very confused. Thank you for your patient answers.

NormXU (Owner) commented Oct 14, 2023

@yudezhi123456 Alright, I think I've found what causes this issue.

We provide two kinds of datasets and corresponding CollateFns, as shown below:

The `GraphCollateFn` is for the node classification task. Consider a scenario where we are presented with a collection of OCR boxes, each treated as an individual node within a graph. Our objective is to leverage the network to establish connections between certain nodes while disconnecting others. A fully connected linkage proves too dense for the network to learn from, so as a remedy we apply a sparser linkage based on geometric data and prior knowledge. The overlap between the fully connected and sparse linkages is then set as positive edges, while the remaining edges within the sparse graph are set as negative edges.
Check the code below:

```python
for index, target in enumerate(targets):
    if linkings is not None:
        # Linking annotations are available: use them directly as positive pairs
        positive_cell_pair = linkings[index]
        positive_cell_pair = [tuple(cell_pair) for cell_pair in positive_cell_pair]
    else:
        # Fallback: boxes with the same target value are linked as positive pairs
        positive_cell_pair = []
        for i, loc1 in enumerate(target):
            for j, loc2 in enumerate(target[i + 1:]):
                if loc1 == loc2:
                    positive_cell_pair.append((i, j + i + 1))
```
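For illustration, here is a toy walk-through of the fallback (`else`) branch above, with a hypothetical `target`: boxes sharing the same entity id become positive pairs.

```python
target = [0, 0, 1, 1, 1]  # hypothetical entity id per OCR box
positive_cell_pair = []
for i, loc1 in enumerate(target):
    for j, loc2 in enumerate(target[i + 1:]):
        if loc1 == loc2:  # same entity -> positive edge
            positive_cell_pair.append((i, j + i + 1))

print(positive_cell_pair)  # [(0, 1), (2, 3), (2, 4), (3, 4)]
```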

However, if relational linking information is already available, as is the case with the FUNSD_entity_graph dataset, we can use it directly. The dataset and collate_fn in the config file should then be set to GraphLayoutEntityDataset and GraphEntityCollateFn, respectively. Otherwise, all edges will be set to negative by the code above, which finally leads to Pair_F1_MACRO = 1.0.

In short:

• For entity-node (labeling):

```yaml
datasets:
  train:
    dataset:
      type: GraphLayoutDataset
    collate_fn:
      type: GraphCollateFn
```

• For entity-linking:

```yaml
datasets:
  train:
    dataset:
      type: GraphLayoutEntityDataset
    collate_fn:
      type: GraphEntityCollateFn
```
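As a quick sanity check after switching to the entity-linking config, the collected ground-truth edge labels should contain both classes before evaluation. A sketch against the metric code shown earlier (same `self.gt_labels_cell`, assuming it is reachable from your evaluation hook):

```python
import numpy as np

# If no positive (label=1) edges survive collation, Pair_F1_MACRO
# will trivially report 1.0, as happened in this issue.
labels = np.array(self.gt_labels_cell)
assert (labels == 1).any(), "no positive edges: check dataset/collate_fn types in the yaml"
```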

Sorry for the confusion. I will add an explanation to our documentation.

NormXU added the good question label Oct 14, 2023
NormXU pushed a commit that referenced this issue Oct 14, 2023
yudezhi123456 reopened this Oct 14, 2023
yudezhi123456 (Author) commented:


Thank you for your answer; it is of great help to my research. I will continue working in this direction.

CCball commented Oct 19, 2023

How do I train on the FUNSD dataset?

yudezhi123456 (Author) commented:


The README provided by the author is very clear; you can train step by step following its instructions.
