score_table.md

| Model Name | AVG Rank | MedQA-USMLE | MedQA-Mainland | PromptCBLUE | WebMedQA | CheckupQA | MedicineQA | DialogSumm | MedTriage (F1) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| GPT-4 | 1.25 | 1129 | 1117 | 1110 | 1116 | 1096 | 1098 | 1109 | 0.65 |
| PULSE-Pro | 1.75 | 1089 | 1092 | 1088 | 1119 | 1105 | 1083 | 1096 | 0.63 |
| ChatGPT | 4.00 | 1086 | 1057 | 1064 | 1053 | 1020 | 1029 | 1080 | 0.43 |
| PULSE-OS | 4.12 | 1042 | 1024 | 1039 | 1059 | 1049 | 1069 | 1076 | 0.40 |
| Baichuan2 | 4.50 | 1024 | 1041 | 1065 | 1044 | 1062 | 1035 | 1069 | 0.33 |
| ChatGLM3 | 5.62 | 1038 | 1062 | 997 | 1012 | 1003 | 1024 | 1021 | 0.06 |
| HuatuoGPT2 | 7.62 | 955 | 993 | 985 | 963 | 983 | 1003 | 980 | 0.01 |
| QiZhenGPT | 8.38 | 955 | 959 | 945 | 989 | 1039 | 932 | 921 | 0.00 |
| BenTsao | 8.75 | 961 | 921 | 936 | 910 | 927 | 986 | 920 | 0.02 |
| BianQue2 | 10.12 | 913 | 928 | 919 | 988 | 974 | 900 | 908 | 0.00 |
| MING | 10.75 | 902 | 909 | 924 | 867 | 862 | 960 | 918 | 0.01 |
| DoctorGLM | 11.12 | 906 | 896 | 930 | 879 | 880 | 880 | 905 | 0.00 |
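
The AVG Rank column appears to be the mean of each model's per-benchmark rank across the eight score columns. The sketch below recomputes it from the scores in the table, assuming that higher is better in every column and that a model's rank is one plus the number of models scoring strictly higher. The table does not state how ties are handled, so that rule is an assumption, and the names `SCORES` and `average_ranks` are illustrative only.

```python
# Scores copied from the table. Column order: MedQA-USMLE, MedQA-Mainland,
# PromptCBLUE, WebMedQA, CheckupQA, MedicineQA, DialogSumm, MedTriage (F1).
SCORES = {
    "GPT-4":      [1129, 1117, 1110, 1116, 1096, 1098, 1109, 0.65],
    "PULSE-Pro":  [1089, 1092, 1088, 1119, 1105, 1083, 1096, 0.63],
    "ChatGPT":    [1086, 1057, 1064, 1053, 1020, 1029, 1080, 0.43],
    "PULSE-OS":   [1042, 1024, 1039, 1059, 1049, 1069, 1076, 0.40],
    "Baichuan2":  [1024, 1041, 1065, 1044, 1062, 1035, 1069, 0.33],
    "ChatGLM3":   [1038, 1062, 997, 1012, 1003, 1024, 1021, 0.06],
    "HuatuoGPT2": [955, 993, 985, 963, 983, 1003, 980, 0.01],
    "QiZhenGPT":  [955, 959, 945, 989, 1039, 932, 921, 0.00],
    "BenTsao":    [961, 921, 936, 910, 927, 986, 920, 0.02],
    "BianQue2":   [913, 928, 919, 988, 974, 900, 908, 0.00],
    "MING":       [902, 909, 924, 867, 862, 960, 918, 0.01],
    "DoctorGLM":  [906, 896, 930, 879, 880, 880, 905, 0.00],
}


def average_ranks(scores):
    """Return each model's mean rank across all columns (1 = best).

    Assumption: rank = 1 + number of models with a strictly higher score,
    so tied models share the better rank. The source does not specify this.
    """
    names = list(scores)
    n_cols = len(next(iter(scores.values())))
    result = {}
    for name in names:
        ranks = []
        for col in range(n_cols):
            value = scores[name][col]
            higher = sum(1 for other in names if scores[other][col] > value)
            ranks.append(1 + higher)
        result[name] = sum(ranks) / n_cols
    return result


AVG_RANKS = average_ranks(SCORES)
for model in ("GPT-4", "PULSE-Pro", "ChatGPT"):
    print(f"{model}: {AVG_RANKS[model]:.2f}")
```

For the three models printed, none of which is involved in a tie, this gives 1.25, 1.75, and 4.00, matching the AVG Rank column. Rows that share a score with another model (for example the 0.00 entries under MedTriage) depend on the assumed tie rule and may differ from the published values.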