
Modelzoo Requests #383

Open
innerlee opened this issue Dec 27, 2020 · 6 comments
Labels: community/help wanted (extra attention is needed)

@innerlee (Contributor):

From time to time, we receive requests for more checkpoints. Since enriching the model zoo is a good thing, we are collecting checkpoint requests here. Please post detailed configs, settings, background, motivation, etc. below; others can thumbs-up 👍 the requests they care about. We will periodically review them and train & release checkpoints for those in high demand.

Happy Research!

@chaowentao's comment has been minimized.

@pablovela5620

Motivation: a smaller model better suited for real-time inference on the InterHand3D dataset
Configs: mobilenetv2_interhand3d_all_256x256.py
Datasets: InterHand3D
Details:

It would also be great to see input sizes smaller than 256x256, such as 96x96 or 128x128 (like MegaTrack) or 224x224 (like MediaPipe BlazeHand).
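For context on what those input sizes imply: the head in the config below stacks three stride-2 deconv layers on the stride-32 MobileNetV2 output, so the heatmap resolution works out to input_size / 4. A quick sketch of the arithmetic (the helper name is mine, not an mmpose API):

```python
# Heatmap resolution implied by each candidate input size, assuming the
# backbone output is at stride 32 and each of the 3 deconv layers doubles
# the spatial resolution (as in the config below). Helper is hypothetical.
def heatmap_size(input_size, backbone_stride=32, num_deconv_layers=3):
    return input_size // backbone_stride * (2 ** num_deconv_layers)

for size in (96, 128, 224, 256):
    print(size, '->', heatmap_size(size))  # 96 -> 24, 128 -> 32, 224 -> 56, 256 -> 64
```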

Example with input size 256x256

log_level = 'INFO'
load_from = None
resume_from = None
dist_params = dict(backend='nccl')
workflow = [('train', 1)]
checkpoint_config = dict(interval=1)
evaluation = dict(
    interval=1,
    metric=['MRRPE', 'MPJPE', 'Handedness_acc'],
    key_indicator='MPJPE_all')

optimizer = dict(
    type='Adam',
    lr=2e-4,
)
optimizer_config = dict(grad_clip=None)
# learning policy
lr_config = dict(policy='step', step=[15, 17])
total_epochs = 20
log_config = dict(
    interval=20,
    hooks=[
        dict(type='TextLoggerHook'),
        # dict(type='TensorboardLoggerHook')
    ])

channel_cfg = dict(
    num_output_channels=42,
    dataset_joints=42,
    dataset_channel=[list(range(42))],
    inference_channel=list(range(42)))
# model settings
model = dict(
    type='Interhand3D',
    pretrained='mmcls://mobilenet_v2',
    backbone=dict(type='MobileNetV2', widen_factor=1., out_indices=(7, )),
    keypoint_head=dict(
        type='Interhand3DHead',
        keypoint_head_cfg=dict(
            in_channels=1280,
            out_channels=21 * 64,
            depth_size=64,
            num_deconv_layers=3,
            num_deconv_filters=(256, 256, 256),
            num_deconv_kernels=(4, 4, 4),
        ),
        root_head_cfg=dict(
            in_channels=1280,
            heatmap_size=64,
            hidden_dims=(512, ),
        ),
        hand_type_head_cfg=dict(
            in_channels=1280,
            num_labels=2,
            hidden_dims=(512, ),
        ),
        loss_keypoint=dict(type='JointsMSELoss', use_target_weight=True),
        loss_root_depth=dict(type='L1Loss'),
        loss_hand_type=dict(type='BCELoss', use_target_weight=True),
    ),
    train_cfg={},
    test_cfg=dict(flip_test=False))

data_cfg = dict(
    image_size=[256, 256],
    heatmap_size=[64, 64, 64],
    heatmap3d_depth_bound=400.0,
    heatmap_size_root=64,
    root_depth_bound=400.0,
    num_output_channels=channel_cfg['num_output_channels'],
    num_joints=channel_cfg['dataset_joints'],
    dataset_channel=channel_cfg['dataset_channel'],
    inference_channel=channel_cfg['inference_channel'])

train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='HandRandomFlip', flip_prob=0.5),
    dict(type='TopDownRandomTranslation', trans_factor=0.15),
    dict(
        type='TopDownGetRandomScaleRotation',
        rot_factor=45,
        scale_factor=0.25,
        rot_prob=0.6),
    # dict(type='MeshRandomChannelNoise', noise_factor=0.2),
    dict(type='TopDownAffine'),
    dict(type='ToTensor'),
    dict(
        type='NormalizeTensor',
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225]),
    dict(
        type='MultitaskGatherTarget',
        pipeline_list=[
            [dict(
                type='Generate3DHeatmapTarget',
                sigma=2.5,
                max_bound=255,
            )], [dict(type='HandGenerateRelDepthTarget')],
            [
                dict(
                    type='RenameKeys',
                    key_pairs=[('hand_type', 'target'),
                               ('hand_type_valid', 'target_weight')])
            ]
        ],
        pipeline_indices=[0, 1, 2],
    ),
    dict(
        type='Collect',
        keys=['img', 'target', 'target_weight'],
        meta_keys=[
            'image_file', 'joints_3d', 'joints_3d_visible', 'center', 'scale',
            'rotation', 'flip_pairs'
        ]),
]

val_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='TopDownAffine'),
    dict(type='ToTensor'),
    dict(
        type='NormalizeTensor',
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225]),
    dict(
        type='Collect',
        keys=['img'],
        meta_keys=[
            'image_file', 'center', 'scale', 'rotation', 'flip_pairs',
            'heatmap3d_depth_bound', 'root_depth_bound'
        ]),
]

test_pipeline = val_pipeline

data_root = 'data/interhand2.6m'
data = dict(
    samples_per_gpu=16,
    workers_per_gpu=2,
    train=dict(
        type='InterHand3DDataset',
        ann_file=f'{data_root}/annotations/all/'
        'InterHand2.6M_train_data.json',
        camera_file=f'{data_root}/annotations/all/'
        'InterHand2.6M_train_camera.json',
        joint_file=f'{data_root}/annotations/all/'
        'InterHand2.6M_train_joint_3d.json',
        img_prefix=f'{data_root}/images/train/',
        data_cfg=data_cfg,
        use_gt_root_depth=True,
        rootnet_result_file=None,
        pipeline=train_pipeline),
    val=dict(
        type='InterHand3DDataset',
        ann_file=f'{data_root}/annotations/machine_annot/'
        'InterHand2.6M_val_data.json',
        camera_file=f'{data_root}/annotations/machine_annot/'
        'InterHand2.6M_val_camera.json',
        joint_file=f'{data_root}/annotations/machine_annot/'
        'InterHand2.6M_val_joint_3d.json',
        img_prefix=f'{data_root}/images/val/',
        data_cfg=data_cfg,
        use_gt_root_depth=True,
        rootnet_result_file=None,
        pipeline=val_pipeline),
    test=dict(
        type='InterHand3DDataset',
        ann_file=f'{data_root}/annotations/all/'
        'InterHand2.6M_test_data.json',
        camera_file=f'{data_root}/annotations/all/'
        'InterHand2.6M_test_camera.json',
        joint_file=f'{data_root}/annotations/all/'
        'InterHand2.6M_test_joint_3d.json',
        img_prefix=f'{data_root}/images/test/',
        data_cfg=data_cfg,
        use_gt_root_depth=True,
        rootnet_result_file=None,
        pipeline=val_pipeline),
)
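For a hypothetical 128x128 variant, only the resolution-dependent fields of the config above would need to change. A minimal sketch, with spatial values halved from the 256x256 settings (untested assumptions, not a released config):

```python
# Hypothetical 128x128 data_cfg, scaled from the 256x256 config above.
# All values here are untested assumptions, not a released configuration.
data_cfg = dict(
    image_size=[128, 128],
    heatmap_size=[32, 32, 32],    # 64 -> 32 at half resolution
    heatmap3d_depth_bound=400.0,  # depth bounds are in mm, left unchanged
    heatmap_size_root=32,
    root_depth_bound=400.0,
    num_output_channels=channel_cfg['num_output_channels'],
    num_joints=channel_cfg['dataset_joints'],
    dataset_channel=channel_cfg['dataset_channel'],
    inference_channel=channel_cfg['inference_channel'])
```

The keypoint head would presumably need the matching change (depth_size=32 and out_channels=21 * 32) so the 3D heatmap stays cubic.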

@jin-s13 jin-s13 added the community/help wanted extra attention is needed label Jul 22, 2021
@ly015 ly015 mentioned this issue Dec 5, 2021
@lucasjinreal

Could a Body25 keypoint model be supported? It could be trained by combining COCO and MPII, so that users could also get foot keypoints.
COCO's 17-keypoint model no longer meets many requirements.

@jin-s13 (Collaborator) commented Feb 14, 2022:

Thanks! BTW, we have already provided 133-kpt COCO-Wholebody models. You can run this model to obtain foot keypoints.
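If it helps, the foot keypoints can be sliced directly out of a COCO-WholeBody prediction. The layout below (17 body, 6 foot, 68 face, 21 + 21 hand keypoints) is my reading of the dataset format, so verify it against your model's metadata:

```python
import numpy as np

# Assumed COCO-WholeBody keypoint ordering: 17 body, 6 foot, 68 face,
# 21 left-hand, 21 right-hand keypoints (133 total).
BODY, FOOT, FACE, HAND = 17, 6, 68, 21
assert BODY + FOOT + FACE + 2 * HAND == 133

def foot_keypoints(kpts_133):
    """Return the 6 foot keypoints (assumed indices 17-22) from a (133, D) prediction."""
    return kpts_133[BODY:BODY + FOOT]

preds = np.zeros((133, 3))  # e.g. (x, y, score) per keypoint
print(foot_keypoints(preds).shape)  # (6, 3)
```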

@lucasjinreal

@jin-s13 For some real-time situations, 133 keypoints is too many (and probably not optimal for certain points). AlphaPose provides Body25 via the HALPE dataset, as does OpenPose. If mmpose offered such an option, it would be very helpful for users who want a simple yet useful human pose model.

@ly015 ly015 unpinned this issue Jun 3, 2022
@ly015 ly015 pinned this issue Nov 3, 2022
@Tau-J Tau-J unpinned this issue May 30, 2023
HAOCHENYE added a commit to HAOCHENYE/mmpose that referenced this issue Jun 27, 2023
…pen-mmlab#383)

* fix build multiple scheduler

* add new unit test

* fix comment and error message

* fix comment and error message

* extract _parse_scheduler_cfg

* always call build_param_scheduler during train and resume. If there is only one optimizer, the default value for scheduler will be a list; otherwise, with multiple optimizers, the default value of scheduler will be a dict

* minor refine

* rename runner test exp name

* fix as comment

* minor refine

* fix ut

* only check parameter scheduler

* minor refine
@nnop commented Jun 19, 2024:

+1

9 participants