I am reaching out for pointers as I am unable to reproduce the accuracy results from the paper while using RoBERTa-Base.
I fine-tuned the RoBERTa-Base model on the RACE dataset using the LRQA codebase. Next, I followed the instructions in the previous link to evaluate on BBQ. However, I obtained an average accuracy of 51.64% across categories, which is roughly 10 percentage points below the 61.4% reported in the paper.
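One thing that may be worth ruling out when comparing the two numbers is how the per-category accuracies are averaged. The sketch below contrasts an unweighted (macro) average with a pooled (micro) average. The helper names and the counts are made up for illustration; they are not taken from the paper, LRQA, or BBQ, and nothing here indicates which averaging the paper used.

```python
def macro_average(per_category):
    """Unweighted mean of the per-category accuracies."""
    accuracies = [correct / total for correct, total in per_category.values()]
    return sum(accuracies) / len(accuracies)


def micro_average(per_category):
    """Accuracy over all examples pooled together, ignoring category boundaries."""
    correct = sum(c for c, _ in per_category.values())
    total = sum(t for _, t in per_category.values())
    return correct / total


# Made-up (correct, total) counts for two placeholder categories of unequal size.
toy_counts = {"category_a": (90, 100), "category_b": (500, 1000)}
print(f"macro: {macro_average(toy_counts):.3f}")
print(f"micro: {micro_average(toy_counts):.3f}")
```

When categories differ in size, the two averages can diverge noticeably, so it helps to compute the figure the same way on both sides of the comparison.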
I used the same parameters reported in the paper:
Total Batch Size: 16 (simulated with a batch size of 4 and 4 gradient accumulation steps)
Learning Rate: 1e-5
Number of Epochs: 3
Max Token Length: 512
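As a sanity check on the batch size, the effective number of examples per optimizer step can be computed as below. This is only a sketch: `effective_batch_size` is an illustrative helper, not part of LRQA, and the `num_devices` parameter is an assumption, since the setup above does not state how many GPUs were used.

```python
def effective_batch_size(batch_size: int,
                         gradient_accumulation_steps: int,
                         num_devices: int = 1) -> int:
    """Number of examples that contribute to each optimizer step."""
    return batch_size * gradient_accumulation_steps * num_devices


# Settings listed above: batch size 4 with 4 gradient accumulation steps.
single_device = effective_batch_size(4, 4)

# Hypothetical case: the same settings launched on 2 devices.
two_devices = effective_batch_size(4, 4, num_devices=2)

print(single_device, two_devices)
```

If training runs on more than one device, the same settings yield a larger effective batch size than intended, which is one possible source of a mismatch.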
I am using the library versions pinned in the requirements.txt file:
transformers==4.5.2
tokenizers==0.10.1
datasets==1.1.2
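To confirm that the active environment really matches these pins, a check along the following lines could be run. It is a minimal sketch using only the standard library; `find_mismatches` and `PINNED` are illustrative names, not part of LRQA.

```python
from importlib import metadata

# Versions listed above from requirements.txt.
PINNED = {
    "transformers": "4.5.2",
    "tokenizers": "0.10.1",
    "datasets": "1.1.2",
}


def find_mismatches(pinned, get_version=metadata.version):
    """Return {package: (expected, installed)} for every package that differs.

    A package that is not installed is reported with installed=None.
    """
    mismatches = {}
    for name, expected in pinned.items():
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != expected:
            mismatches[name] = (expected, installed)
    return mismatches


if __name__ == "__main__":
    for name, (expected, installed) in find_mismatches(PINNED).items():
        print(f"{name}: expected {expected}, found {installed}")
```

An empty result means the installed versions match the pins; any printed line points to a package worth reinstalling before rerunning the evaluation.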
Do you have any idea why I cannot reproduce the reported accuracy when following the LRQA instructions? Any pointers would be much appreciated!
Thank you!
Gustavo