
[Feature] Support ViTPose #1937

Merged
merged 18 commits into from
Apr 20, 2023

Conversation

Annbless
Contributor

@Annbless Annbless commented Jan 16, 2023

Motivation

Merge the ViTPose variants code and pre-trained models into mmpose.

Modification

  1. Add a ViT backbone model in mmpose/models/backbones; the __init__ file is modified accordingly.
  2. Add the config files and the corresponding markdown files in the configs folder.
  3. Fix a bug in the registration of layer-wise learning rate decay.
  4. Add a 'resize_upsample4' input transformation in mmpose/models/heads/topdown_heatmap_simple_head.py to support the simple decoder in ViTPose. It has no effect on other models.
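
Layer-wise learning-rate decay (item 3) assigns each backbone layer a learning rate scaled by a decay factor raised to the layer's distance from the output, so early layers train more slowly than late ones. A minimal sketch of that scaling rule; the function and parameter names here are illustrative, not mmpose's actual API:

```python
def layerwise_lr(base_lr, num_layers, decay_rate):
    """Return a per-layer learning rate for a transformer backbone.

    Layer 0 (closest to the input) gets the smallest rate; the last
    layer trains at decay_rate * base_lr.
    """
    return [base_lr * decay_rate ** (num_layers - layer_id)
            for layer_id in range(num_layers)]

# Example: a 12-layer ViT-B backbone with a decay rate of 0.75.
rates = layerwise_lr(base_lr=5e-4, num_layers=12, decay_rate=0.75)
assert rates[0] < rates[-1]  # earlier layers get smaller rates
```

The registration bug fixed in this PR concerned wiring such per-layer rates into the optimizer's parameter groups, which is why it only surfaces when a config actually enables the decay.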

BC-breaking (Optional)

No.

Use cases (Optional)

Checklist

Before PR:

  • I have read and followed the workflow indicated in the CONTRIBUTING.md to create this PR.
  • Pre-commit or linting tools indicated in CONTRIBUTING.md are used to fix the potential lint issues.
  • Bug fixes are covered by unit tests, the case that causes the bug should be added in the unit tests.
  • New functionalities are covered by complete unit tests. If not, please add more unit tests to ensure correctness.
  • The documentation has been modified accordingly, including docstring or example tutorials.

After PR:

  • CLA has been signed and all committers have signed the CLA in this PR.

@ly015
Member

ly015 commented Jan 16, 2023

Thank you very much for your help! For now, there are lint issues in the code. Could you please install the pre-commit hooks (see our docs) and run `pre-commit run --all-files` in your local repo? The lint issues will be fixed automatically.

@codecov

codecov bot commented Jan 16, 2023

Codecov Report

❗ No coverage uploaded for pull request base (dev-0.x@fd98b11). Click here to learn what that means.
Patch has no changes to coverable lines.

❗ Current head 22fbc7b differs from pull request most recent head 52ee52b. Consider uploading reports for the commit 52ee52b to get more accurate results

Additional details and impacted files
@@            Coverage Diff             @@
##             dev-0.x    #1937   +/-   ##
==========================================
  Coverage           ?   84.10%           
==========================================
  Files              ?      242           
  Lines              ?    21227           
  Branches           ?     3652           
==========================================
  Hits               ?    17853           
  Misses             ?     2450           
  Partials           ?      924           
| Flag | Coverage Δ |
| :--- | :--- |
| unittests | 84.01% <0.00%> (?) |

Flags with carried forward coverage won't be shown. Click here to find out more.


☔ View full report in Codecov by Sentry.

@Annbless
Contributor Author

Thanks for your instructions. The code passes the lint checks now. How can I upload the pre-trained weights and logs for the ViTPose variants? Can I provide links via OneDrive or Google Drive?

@jin-s13
Collaborator

jin-s13 commented Jan 16, 2023

Thanks. Both OneDrive and Google Drive are welcome.

BTW, would you mind adding some unit tests? An example can be found at https://github.com/open-mmlab/mmpose/pull/1907/files#diff-dadc2075341a40335f28131ceaf3d0d415e5c316c54bb6b0a0741aeb002db24e

@ly015 ly015 changed the title Merge ViTPose into mmpose [Feature] Support ViTPose Jan 17, 2023
@ly015
Member

ly015 commented Jan 17, 2023

The unit test of ViTPose seems to have failed. For quick debugging, you can run the unit tests locally with `pytest tests/`.

@Annbless
Contributor Author

Thanks a lot for your help! The files and configs have been updated. The pre-trained models and logs are available on OneDrive. The lines not covered by the unit tests mostly belong to the weight-initialization code that loads the pre-trained models (for example, renaming tensors between the MAE pre-trained checkpoints and the backbones). We are wondering how we can cover these parts in the unit tests. Could we produce pseudo checkpoints via torch.save in the unit tests to cover the renaming code? We have tested these parts by re-training the models for several epochs and found that they work well.
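
The pseudo-checkpoint idea can be sketched as follows. Plain strings stand in for tensors and pickle stands in for torch.save here, and `rename_mae_keys` is a hypothetical stand-in for the PR's actual conversion code; a real unit test would use small torch tensors instead:

```python
import io
import pickle

def rename_mae_keys(state_dict):
    """Map MAE-checkpoint key names onto the backbone's names.

    Only the patch-embedding rename is shown; the real converter
    may handle more prefixes.
    """
    return {k.replace('patch_embed.proj', 'patch_embed.projection'): v
            for k, v in state_dict.items()}

# Build a pseudo checkpoint with the old key names.
fake_ckpt = {
    'patch_embed.proj.weight': 'w',
    'patch_embed.proj.bias': 'b',
    'blocks.0.attn.qkv.weight': 'q',
}

# Round-trip through a byte stream, as torch.save/torch.load would.
buf = io.BytesIO()
pickle.dump(fake_ckpt, buf)
buf.seek(0)
loaded = pickle.load(buf)

converted = rename_mae_keys(loaded)
assert 'patch_embed.projection.weight' in converted
assert 'blocks.0.attn.qkv.weight' in converted  # untouched keys survive
```

This lets the test exercise the renaming branch without shipping a real multi-hundred-megabyte checkpoint.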

@Annbless
Contributor Author

We also fixed some bugs in the dataset files caused by the updated NumPy version. Please check the recent commits. By the way, it seems that the current failed build is caused by an HTTP error.

@Annbless
Contributor Author

Annbless commented Feb 8, 2023

Hi @ly015, would you mind restarting the failed checks? I just checked the logs, and it seems that the pip installation caused the error. Thanks a lot.

@Annbless
Contributor Author

Annbless commented Feb 8, 2023

It seems that the current failure is related to the Docker version... Should I open a new PR for the dev-1.x branch and close this PR instead? Thanks a lot for your patience.

@Annbless
Contributor Author

Annbless commented Feb 9, 2023

Hi @ly015, it seems that the current failure is in loading the video in test_inference.py, where no frames are detected after the command runs.
Is there anything we can do to help get this merged?

Thanks a lot.

@ly015
Member

ly015 commented Feb 9, 2023

We will help check and fix the CI problem.

@Annbless
Contributor Author

Hi, is there anything we can do to help fix the CI problem? Also, could we open a new PR based on the dev-1.x branch to merge the ViTPose variants into mmpose? Thanks for your response.

new_key = k.replace('patch_embed.proj',
                    'patch_embed.projection')
new_ckpt[new_key] = v
else:
Collaborator

The downloaded backbone checkpoint has keys norm.weight and norm.bias, but in the model these are called last_norm.weight and last_norm.bias. Is it necessary to add another conversion?

Contributor Author

No need. The last norm layer is re-initialized for the pose dataset.
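
For illustration, a hedged sketch of why no extra conversion is needed: with a non-strict load (in the spirit of PyTorch's `load_state_dict(strict=False)`), checkpoint keys that were never mapped are simply ignored, and the re-initialized last_norm parameters keep their fresh values. `load_nonstrict` below is a toy stand-in, with plain strings in place of tensors:

```python
def load_nonstrict(model_params, ckpt):
    """Copy only the checkpoint entries whose names exist in the model,
    mimicking load_state_dict(..., strict=False). Returns the names of
    model parameters that had no matching checkpoint entry."""
    missing = []
    for name in model_params:
        if name in ckpt:
            model_params[name] = ckpt[name]
        else:
            missing.append(name)  # e.g. the re-initialized last_norm
    return missing

model = {'last_norm.weight': 'init_w', 'blocks.0.mlp.weight': 'init_m'}
ckpt = {'norm.weight': 'pre_w', 'blocks.0.mlp.weight': 'pre_m'}

missing = load_nonstrict(model, ckpt)
assert missing == ['last_norm.weight']        # stays at its fresh init
assert model['blocks.0.mlp.weight'] == 'pre_m'  # mapped keys are loaded
```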

@LareinaM
Collaborator

LareinaM commented Mar 8, 2023

I have trained these models using your code and the downloaded pretrained backbones. However, the results for some models do not match your reported numbers.

With classic decoder

| Arch | Input Size | AP | AP50 | AP75 | AR | AR50 |
| :--- | :--- | :--- | :--- | :--- | :--- | :--- |
| ViTPose-S | 256x192 | 0.737 | 0.905 | 0.813 | 0.790 | 0.943 |
| ViTPose-B | 256x192 | 0.751 | 0.905 | 0.823 | 0.803 | 0.944 |
| ViTPose-L | 256x192 | 0.777 | 0.915 | 0.850 | 0.828 | 0.953 |
| ViTPose-H | 256x192 | 0.785 | 0.914 | 0.853 | 0.835 | 0.951 |

With simple decoder

| Arch | Input Size | AP | AP50 | AP75 | AR | AR50 |
| :--- | :--- | :--- | :--- | :--- | :--- | :--- |
| ViTPose-S | 256x192 | 0.736 | 0.904 | 0.812 | 0.790 | 0.942 |
| ViTPose-B | 256x192 | 0.750 | 0.905 | 0.827 | 0.805 | 0.944 |
| ViTPose-L | 256x192 | 0.774 | 0.911 | 0.847 | 0.826 | 0.950 |
| ViTPose-H | 256x192 | 0.785 | 0.915 | 0.854 | 0.835 | 0.952 |

The validation accuracy is the same for all models.

I also noticed that the highest accuracy for the large and huge models occurs at around the 80th epoch. Maybe there is a problem with the optimizer? Have you validated the training process in this PR?

@Annbless
Contributor Author

Hi,
We have re-trained the models and found that the performance drop is caused by differences between the transformer layers implemented in mmcv and in timm. Is it possible for us to use timm for the backbone implementation?

@ly015
Member

ly015 commented Mar 20, 2023

Yes, you can use timm for the backbone implementation. There is a tutorial in MMDetection on how to use timm backbones through an MMClassification wrapper, which should also be applicable to MMPose: https://mmdetection.readthedocs.io/en/latest/tutorials/how_to.html#use-backbone-network-in-timm-through-mmclassification

The above tutorial is just for your reference. You can use any approach to integrate timm backbones in your implementation.
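
For reference, a config along the lines of the linked tutorial might look roughly like this; the model name, field set, and out-of-the-box applicability to MMPose are assumptions to verify against the MMClassification docs:

```python
# Sketch of a config using a timm backbone through the
# MMClassification wrapper, adapted from the MMDetection tutorial.
# 'vit_base_patch16_224' is an illustrative timm model name.
model = dict(
    backbone=dict(
        _delete_=True,               # drop the default backbone config
        type='mmcls.TIMMBackbone',   # wrapper registered by MMClassification
        model_name='vit_base_patch16_224',
        pretrained=True))
```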

@ly015 ly015 mentioned this pull request Mar 20, 2023
@Annbless
Contributor Author

Hi there,

Thanks for your patience. We have uploaded a timm version of ViTPose.

The training logs are available here.
vitpose_base.log
vitpose_simple_base.log
vitpose_small.log
vitpose_simple_small.log

@Annbless
Contributor Author

Hi there,

Is there anything we can do to help get this PR merged? We are happy to provide more information.

Best,

@Tau-J Tau-J changed the base branch from master to dev-0.x April 20, 2023 08:07
@Tau-J Tau-J merged commit 6fb1280 into open-mmlab:dev-0.x Apr 20, 2023
@Tau-J Tau-J mentioned this pull request Apr 20, 2023
Ben-Louis pushed a commit to Ben-Louis/mmpose that referenced this pull request Apr 28, 2023