Visual Reasoning Skill Evaluation on PaintSkills

Dataset Setup

Create the $paintskills_dir directory.
From the Google Drive link, download metadata.json and the three skill directories (object/, count/, spatial/) into $paintskills_dir.

The $paintskills_dir directory has the following hierarchy:

$paintskills_dir/
    # skill name (i.e., object, count, and spatial)
    {skill}/

        # Scene configuration
        scenes/
            {skill}_train.json
            {skill}_val.json

        # GT Images (from {skill}/images.zip)
        images/

        # Bounding box annotations (for DETR finetuning)
        {skill}_train_bounding_boxes.json
        {skill}_val_bounding_boxes.json

    # metadata for all skills.
    metadata.json
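
Before running anything, it can help to confirm that the download matches the hierarchy above. The following is a minimal sketch, not part of the repository: the helper names `expected_paths` and `missing_paths` are hypothetical, and it assumes images/ has already been extracted from {skill}/images.zip.

```python
from pathlib import Path

SKILLS = ("object", "count", "spatial")
SPLITS = ("train", "val")


def expected_paths(paintskills_dir):
    """Return every file and directory named in the layout above."""
    root = Path(paintskills_dir)
    paths = [root / "metadata.json"]
    for skill in SKILLS:
        skill_dir = root / skill
        # GT images, assumed to be extracted from {skill}/images.zip
        paths.append(skill_dir / "images")
        for split in SPLITS:
            # Scene configuration
            paths.append(skill_dir / "scenes" / f"{skill}_{split}.json")
            # Bounding box annotations (for DETR finetuning)
            paths.append(skill_dir / f"{skill}_{split}_bounding_boxes.json")
    return paths


def missing_paths(paintskills_dir):
    """Return the expected paths that do not exist on disk."""
    return [p for p in expected_paths(paintskills_dir) if not p.exists()]
```

Calling `missing_paths("/path/to/paintskills")` returns an empty list when the layout is complete, and otherwise lists what still needs to be downloaded or unzipped.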

Scene Configuration

The scene configuration files (scenes/{skill}_{split}.json) have the following structure, where {skill} is one of object, count, or spatial, and {split} is either train or val.

For example, count_val.json:

{
    "data": [
        {
            "id": "count_val_00000",
            "scene": "HDR-KirbyCove",
            "text": "1 person",
            "skill": "count",
            "split": "val",
            "objects": [
                {
                    "id": 0,
                    "shape": "humanJosh",
                    "coconame": "person",
                    "color": "plain",
                    "relation": null,
                    "scale": 14.114588410729079,
                    "texture": "plain",
                    "rotation": null,
                    "state": "sitting"
                }
            ]
        },
        ...
    ]
}
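
As a rough illustration of how this structure can be consumed, the sketch below parses a scene configuration and extracts the caption for each datum. The helper names (`parse_scenes`, `captions_by_id`, `object_names`) are hypothetical and not part of the repository; the sample string is a trimmed copy of the example above.

```python
import json

# Trimmed copy of the count_val.json example above (one datum, fewer object fields).
SAMPLE = """
{
    "data": [
        {
            "id": "count_val_00000",
            "scene": "HDR-KirbyCove",
            "text": "1 person",
            "skill": "count",
            "split": "val",
            "objects": [
                {"id": 0, "shape": "humanJosh", "coconame": "person",
                 "relation": null, "state": "sitting"}
            ]
        }
    ]
}
"""


def parse_scenes(json_text):
    """Return the list of scene entries stored under the top-level "data" key."""
    return json.loads(json_text)["data"]


def captions_by_id(scenes):
    """Map each datum id to its caption (the "text" field)."""
    return {datum["id"]: datum["text"] for datum in scenes}


def object_names(datum):
    """List the COCO class name of every object in one scene."""
    return [obj["coconame"] for obj in datum["objects"]]


scenes = parse_scenes(SAMPLE)
print(captions_by_id(scenes))
print(object_names(scenes[0]))
```

For a real file, read its contents first, for example `parse_scenes(open(path, encoding="utf-8").read())`, where `path` points at scenes/{skill}_{split}.json.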

Evaluation of Text2Img models with DETR

Generate the skill-specific images in $image_dir from the captions (the text field in the scene data) with your text-to-image generation model (finetuned on PaintSkills). The evaluation script expects the generated images to have filenames of the form image_{datum['id']}.png. For example, if datum['id'] is count_val_00000, the filename should be image_count_val_00000.png.
Run the evaluation script:

skill='object' # switch to other skills (choices=['object', 'count', 'spatial'])
image_dir='/path/to/generated/images'
bash scripts/evaluate_skill_FT_DETR-R101-DC5.sh \
    --skill_name "$skill" \
    --paintskills_dir "$paintskills_dir" \
    --image_dir "$image_dir"
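
The filename convention from the first step can be checked before launching the evaluation. This is a minimal sketch under two assumptions: the generated images sit directly inside $image_dir, and `scenes` is the list stored under the "data" key of a scene configuration file. The helper names `generated_image_path` and `missing_images` are hypothetical.

```python
from pathlib import Path


def generated_image_path(image_dir, datum):
    """Build the expected filename: image_{datum['id']}.png inside image_dir."""
    return Path(image_dir) / f"image_{datum['id']}.png"


def missing_images(image_dir, scenes):
    """Return the ids of scene entries that have no generated image yet."""
    return [
        datum["id"]
        for datum in scenes
        if not generated_image_path(image_dir, datum).is_file()
    ]
```

For the datum with id count_val_00000, `generated_image_path` yields a path ending in image_count_val_00000.png, matching the example given in the first step.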

(Optional) 3D simulator

Please see https://github.com/aszala/PaintSkills-Simulator for our 3D Simulator implementation.

(Optional) Evaluation on GT images

skill='object' # or 'count', 'spatial'
bash scripts/evaluate_skill_FT_DETR-R101-DC5.sh \
    --skill_name $skill \
    --gt_data_eval \
    --paintskills_dir $paintskills_dir
