The "anwser" for some examples is confusing #27

Closed
Zcchill opened this issue Jul 10, 2024 · 2 comments
Comments


Zcchill commented Jul 10, 2024

Ref: THUDM/LongBench#67
The LongBench dataset contains a sub-dataset, "qasper", but I found that the "answers" of several examples are confusing. I want to know whether this is my misunderstanding of the dataset or an issue with the data annotation.
{"pred": "No", "answers": ["Yes", "No"], "all_classes": null, "length": 2317, "input": "Does this method help in sentiment classification task improvement?", "_id": "bcfe56efad9715cc714ffd2e523eaa9ad796a453e7da77a6"}
{"pred": "unanswerable", "answers": ["Yes", "Unanswerable"], "all_classes": null, "length": 2284, "actual_length": 3533, "input": "Is jiant compatible with models in any programming language?", "_id": "e5d1d589ddb30f43547012f04b06ac2924a1f4fdcf56daab"}
{"pred": "BERTBase", "answers": ["BERTbase", "BERTbase"], "all_classes": null, "length": 3852, "actual_length": 5701, "input": "What BERT model do they test?", "_id": "2a51c07e65a9214ed2cd3c04303afa205e005f4e1ccb172a"}

pdasigi (Member) commented Jul 10, 2024

@Zcchill Can you elaborate on what is confusing about the answers? If it is that there are multiple answers that sometimes contradict each other, that is because the annotators do not always agree with each other, as is expected in a difficult task requiring expert knowledge. The disagreements are quantified in the paper associated with the dataset. Also, the prescribed evaluation method is to consider a prediction correct if it matches any of the ground-truth answers.
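As a rough illustration of that rule, a minimal sketch assuming exact match after light normalization; the dataset's own scorer takes the best score over all references with an F1-style metric, so this is a simplification, not the official evaluation code:

```python
def normalize(text: str) -> str:
    # Lowercase and collapse whitespace; a simplification of real answer normalization.
    return " ".join(text.lower().split())

def is_correct(pred: str, answers: list[str]) -> bool:
    # A prediction counts as correct if it matches ANY of the ground-truth answers.
    return any(normalize(pred) == normalize(a) for a in answers)

# Using the records quoted above:
print(is_correct("BERTBase", ["BERTbase", "BERTbase"]))     # True
print(is_correct("unanswerable", ["Yes", "Unanswerable"]))  # True
print(is_correct("No", ["Yes", "No"]))                      # True
```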

Zcchill (Author) commented Jul 10, 2024

I see. Thanks for getting back to me!

Zcchill closed this as completed Jul 10, 2024