[Feature Request] To evaluate MMMU test set, you need to transfer the xlsx output to a json file #124

Closed
StarCycle opened this issue Mar 24, 2024 · 4 comments

Comments

@StarCycle
Contributor

Hello,

Running VLMEvalKit on MMMU_TEST produces an xlsx output file, e.g.,

[image: screenshot of the xlsx output]

The online MMMU EvalAI server does not accept this format; it requires a specific json format.

The following code converts the xlsx file to the required json format:

import pandas as pd
import json

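# Read the xlsx file
def read_xlsx(file_path):
    # Load the xlsx file with pandas (requires openpyxl)
    df = pd.read_excel(file_path, engine='openpyxl')
    return df

# Convert to a single-dictionary JSON string
def convert_to_single_json(df):
    # Select the 1st and 23rd columns
    selected_columns = df.iloc[:, [0, 22]]

    # Empty dictionary to hold the results
    result_dict = {}

    # Iterate over every row
    for index, row in selected_columns.iterrows():
        # Use the 1st column's value as the key and the 23rd column's value as the value.
        # .iloc is positional; plain row[0] on a label-indexed Series is deprecated in recent pandas.
        result_dict[row.iloc[0]] = row.iloc[1]

    # Serialize the dictionary to a JSON string
    json_data = json.dumps(result_dict, indent=4)

    return json_data

# Main function
def main():
    # Path to the xlsx file
    file_path = 'hpt-air-mmmu_MMMU_TEST.xlsx'  # Replace with the path to your xlsx file

    # Read the xlsx file
    df = read_xlsx(file_path)

    # Convert to a single-dictionary JSON string
    json_data = convert_to_single_json(df)

    # Print the JSON data
    print(json_data)

    # Save the JSON data to a file
    with open('hpt-air-mmmu_MMMU_TEST.json', 'w') as f:
        f.write(json_data)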

if __name__ == '__main__':
    main()
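
One thing the script above does not handle is a missing prediction: an empty cell is read as NaN, and json.dumps would then write a bare NaN, which is not strict JSON. The following is a minimal sketch of the same id-to-prediction mapping that guards against this. The function name, the choice to map missing values to an empty string, and the conversion of every value to a string are assumptions for illustration; they are neither what the EvalAI server documents nor what VLMEvalKit does.

```python
import json
import math


def rows_to_submission(rows):
    """Build an {id: prediction} dict from (id, prediction) pairs.

    Assumption: missing predictions (None or NaN) are written as empty
    strings, and all ids and predictions are stored as strings.
    """
    result = {}
    for question_id, prediction in rows:
        is_nan = isinstance(prediction, float) and math.isnan(prediction)
        if prediction is None or is_nan:
            prediction = ""
        result[str(question_id)] = str(prediction)
    return result


# Hypothetical rows; real ids and predictions come from the xlsx file.
example = rows_to_submission([("q1", "A"), ("q2", float("nan"))])
print(json.dumps(example, indent=4))
```

With pandas, the pairs could come from something like zip(df.iloc[:, 0], df.iloc[:, 22]), mirroring the column positions used in the script above.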

Would you like to add this to VLMEvalKit?

Best,
StarCycle

@StarCycle StarCycle changed the title [feature request] To evaluate MMMU test set, you need to transfer the xlsx output to a json file [Feature Request] To evaluate MMMU test set, you need to transfer the xlsx output to a json file Mar 24, 2024
@kennymckormick
Member

@llllIlllll Please help take a look at this issue.

@kennymckormick
Member

Hi, @StarCycle,
This feature is now supported as of #130; please take a look.

@StarCycle
Contributor Author

Hello @kennymckormick and @llllIlllll,

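Sorry for the late reply. I suspect the json file will still not be generated when running off the official server, because the continue statement skips the rest of the loop body before the transfer step is reached?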


            if dataset_name in ['MMBench_TEST_CN', 'MMBench_TEST_EN', 'MMMU_TEST']:
                if not MMBenchOfficialServer():
                    logger.error(
                        f'Can not evaluate {dataset_name} on non-official servers, '
                        'will skip the evaluation. '
                    )
                    continue
            # noqa W293
            if rank == 0:
                if dataset_name in ['MMMU_TEST']:
                    result_json = MMMU_result_transfer(result_file)
                    logger.info(f'Transfer MMMU_TEST result to json for official evaluation, json file saved in {result_json}')    # noqa E501
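
To make the concern concrete: in a loop, continue jumps straight to the next iteration, so anything placed after it in the loop body never runs for that dataset. A minimal sketch of one possible reordering, with the conversion done before the server check, is shown below. All names here (process, on_official_server, transfer) are hypothetical stand-ins, not VLMEvalKit's actual code or the way the project resolved this.

```python
def process(dataset_names, on_official_server, transfer):
    """Return the datasets that would go on to evaluation.

    `transfer` is a hypothetical callback standing in for the
    xlsx-to-json conversion step.
    """
    evaluated = []
    for name in dataset_names:
        # Convert first, so the json file exists no matter where we run.
        if name == 'MMMU_TEST':
            transfer(name)
        if name in ('MMBench_TEST_CN', 'MMBench_TEST_EN', 'MMMU_TEST'):
            if not on_official_server:
                continue  # skips only the evaluation below
        evaluated.append(name)
    return evaluated


transferred = []
print(process(['MMMU_TEST', 'SomeOtherDataset'], False, transferred.append))
print(transferred)
```

With the conversion placed after the continue instead, transferred would stay empty whenever on_official_server is False.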

By the way, what do # noqa W293 and # noqa E501 mean here?

Best,
StarCycle

@kennymckormick
Member

# noqa just tells flake8 to skip certain warnings on that line during the code style check.
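
For reference, E501 is flake8's "line too long" check and W293 is "blank line contains whitespace". A small sketch of the usual form, with a colon before the code so that only that check is silenced on the line (the function below is a made-up example, not code from the repository):

```python
def describe(dataset_name):
    # The long line below would normally trigger E501; the trailing
    # comment asks flake8 to ignore that one check for this line only.
    return f'Transfer {dataset_name} result to json for official evaluation, json file saved next to the xlsx file'  # noqa: E501


print(describe('MMMU_TEST'))
```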
