The scores calculated by VLMEvalKit differ from the scores calculated on the MMBench website #121
Hi @jdy18,

I think I have found the reason: I uploaded the openai_result table to the MMBench website instead of the original prediction file. Do you know which difference between the two files leads to the discrepancy in evaluation results?
Oh, you cannot upload the openai_result table for evaluation: it contains only a single pass for each question, so the corresponding result is under the VanillaEval setting, while the MMBench website evaluates under CircularEval, which requires all circularly-shifted passes of a question to be answered correctly.
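For context, here is a minimal sketch of how the two settings score the same predictions. This is an illustration, not VLMEvalKit's actual code; it assumes a DataFrame with an integer `index` column and a hypothetical boolean `hit` column marking each pass correct, plus the MMBench convention that circularly-shifted copies of a question share the same base index modulo 1e6.

```python
import pandas as pd

def vanilla_acc(df: pd.DataFrame) -> float:
    # VanillaEval: each pass is scored independently (one pass per
    # question in the openai_result table), so accuracy is simply
    # the mean hit rate over all rows.
    return df["hit"].mean()

def circular_acc(df: pd.DataFrame) -> float:
    # CircularEval: group the circularly-shifted passes of each question
    # (assumed convention: shifted copies are offset by multiples of 1e6
    # in `index`) and count a question as correct only if ALL passes hit.
    base = df["index"] % int(1e6)
    return df.groupby(base)["hit"].all().mean()
```

Because CircularEval requires every shifted pass to be correct, it is the stricter setting, so a 1-pass (VanillaEval) upload and a full CircularEval run will generally not produce the same numbers.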
MMBench_DEV_EN_openai_result.xlsx
When evaluating the predictions in this table:
The result on the MMBench website:
By VLMEvalKit:
"split","Overall","AR","CP","FP-C","FP-S","LR","RR",
"dev","0.7328178694158075","0.7638190954773869","0.8277027027027027","0.6223776223776224","0.7610921501706485","0.4745762711864407","0.7652173913043478"