
On training a new model to reproduce the reported results: explanations & solutions #2

Open
ytyz1307zzh opened this issue Feb 8, 2021 · 0 comments

Due to the small size of the ProPara dataset, variance across different computational environments may lead to differences in the final F1 score even with identical hyperparameters. In unfortunate cases, you may get a 69.0~69.5 F1 when training your own KOALA model. However, there is a way to get closer to the reported score of 70.4 if you need to.

In predict.py, the function predict_consistent_loc applies some hard commonsense rules to refine impossible state sequences (e.g., an entity in state 'E' cannot transition to state 'O_D' without a 'D' in between, so we manually change the last 'E' to 'D'). Using such heuristic rules to fix obviously wrong predictions is a popular approach in previous ProPara models. However, in some ambiguous cases there are multiple plausible ways to refine a sequence, and it is difficult to know the better strategy beforehand. Therefore, we fine-tuned some of these rules according to the output patterns of our trained model.
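As a rough illustration (this is a minimal sketch, not the actual code in predict_consistent_loc, and the function name below is hypothetical), a rule like the one in the example above can be written as a single pass over the predicted state sequence: whenever 'O_D' directly follows 'E' with no intervening 'D', the last 'E' is rewritten as 'D' so the destruction step is made explicit.

```python
# Hypothetical sketch of one hard commonsense rule: an entity cannot go from
# 'E' (exists) straight to 'O_D' (outside, after destruction) without a 'D'
# (destroyed) step, so we rewrite the offending 'E' as 'D'.
def fix_missing_destroy(states):
    states = list(states)
    for i in range(1, len(states)):
        if states[i] == 'O_D' and states[i - 1] == 'E':
            # The destruction must have happened at step i-1.
            states[i - 1] = 'D'
    return states

print(fix_missing_destroy(['E', 'E', 'O_D', 'O_D']))
# -> ['E', 'D', 'O_D', 'O_D']
```

The ambiguity mentioned above arises because this is only one plausible repair: one could instead change the 'O_D' steps, or place the 'D' earlier, and which choice scores better depends on the error patterns of the particular trained model.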

Thus, if you train your own model on your own machine, some of the fine-tuned rules may not fit your model's outputs. If this happens and you indeed want to get close to the reported score, you can re-tune the rules in predict.py according to the output patterns of your own model. Given the small size of the ProPara test set, this can yield a 0.5~1.0 improvement in the final F1 score.
