Handling an overly long text prompt when training on the object365 dataset #61
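Hello, if I need to train on the object365 dataset, how should I handle the text prompt being too long?
If I convert the object365 dataset directly into a single jsonl file, the labelmap may become so long that it causes unexpected bugs. I read the issues in the original author's GitHub repository, where he mentions that the dataset can be split. Below is the data format I produced, which corresponds to the dataset.json file used by the training script. Is this the correct way to split it?
(I split the object365 dataset by category into 5 subsets of 73 classes each. Each subset has its own jsonl file recording the corresponding images and box information. Every subset has a different labelmap, the labelmaps do not overlap, and the indices in each labelmap file start from 0.)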
{
  "train": [
    {
      "root": "path/object365/",
      "anno": "path/obj365_train_split1.jsonl",
      "label_map": "obj365_split1_labelmap.json",
      "dataset_mode": "odvg"
    },
    {
      "root": "path/object365/",
      "anno": "path/obj365_train_split2.jsonl",
      "label_map": "obj365_split2_labelmap.json",
      "dataset_mode": "odvg"
    },
    {
      "root": "path/object365/",
      "anno": "path/obj365_train_split3.jsonl",
      "label_map": "obj365_split3_labelmap.json",
      "dataset_mode": "odvg"
    },
    {
      "root": "path/object365/",
      "anno": "path/obj365_train_split4.jsonl",
      "label_map": "obj365_split4_labelmap.json",
      "dataset_mode": "odvg"
    },
    {
      "root": "path/object365/",
      "anno": "path/obj365_train_split5.jsonl",
      "label_map": "obj365_split5_labelmap.json",
      "dataset_mode": "odvg"
    }
  ],
  "val": [
    {
      "root": "path/object365/",
      "anno": "path/obj365_val_split1.jsonl",
      "label_map": null,
      "dataset_mode": "coco"
    }
  ]
}
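For reference, the split described above (5 disjoint subsets of 73 classes, each labelmap re-indexed from 0) could be produced with something like the following. This is only a minimal sketch, not the repository's own tooling. It assumes each jsonl record looks like `{"filename": ..., "height": ..., "width": ..., "detection": {"instances": [{"bbox": ..., "label": ..., "category": ...}]}}` and that the global labelmap is a `{"0": "name", ...}` dict; the function names `split_label_map` and `split_records` are hypothetical.

```python
def split_label_map(label_map, num_splits):
    """Partition a global {"index": name} label map into disjoint chunks.

    Returns one (remap, local_map) pair per subset, where `remap` maps a
    global integer label to its new local label and `local_map` is the
    subset's own label map with indices restarting at 0.
    """
    ids = sorted(label_map, key=int)
    chunk = -(-len(ids) // num_splits)  # ceiling division, 73 for 365 / 5
    splits = []
    for s in range(num_splits):
        chunk_ids = ids[s * chunk:(s + 1) * chunk]
        remap = {int(g): local for local, g in enumerate(chunk_ids)}
        local_map = {str(local): label_map[g] for local, g in enumerate(chunk_ids)}
        splits.append((remap, local_map))
    return splits


def split_records(records, splits):
    """Distribute records (assumed ODVG-style dicts) over the subsets.

    For every subset, only the instances whose label belongs to that subset
    are kept, and their labels are rewritten to the local index. Images with
    no instance in a subset are left out of that subset. The input records
    are not modified.
    """
    out = [[] for _ in splits]
    for rec in records:
        instances = rec["detection"]["instances"]
        for k, (remap, _) in enumerate(splits):
            kept = [
                dict(inst, label=remap[inst["label"]])
                for inst in instances
                if inst["label"] in remap
            ]
            if kept:
                out[k].append(dict(rec, detection={"instances": kept}))
    return out
```

Each `local_map` would then be written with `json.dump` to its `obj365_splitN_labelmap.json`, and each list from `split_records` written one `json.dumps(record)` per line to the matching `obj365_train_splitN.jsonl`. Note that under this sketch an image containing classes from several subsets appears in each of those subsets, carrying only that subset's boxes.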