Official code repository for the paper: Open-Vocabulary Animal Keypoint Detection with Semantic-feature Matching

zhanghao5201/KDSM

KDSM

Official code repository for the paper: Open-Vocabulary Animal Keypoint Detection with Semantic-feature Matching [Hao Zhang, Lumin Xu, Shenqi Lai, Wenqi Shao, Nanning Zheng, Ping Luo, Yu Qiao, Kaipeng Zhang]

Abstract

Current image-based keypoint detection methods for animal (including human) bodies and faces are generally divided into fully supervised and few-shot class-agnostic approaches. The former typically relies on laborious and time-consuming manual annotations, posing considerable challenges in expanding keypoint detection to a broader range of keypoint categories and animal species. The latter, though less dependent on extensive manual input, still requires necessary support images with annotation for reference during testing. To realize zero-shot keypoint detection without any prior annotation, we introduce the Open-Vocabulary Keypoint Detection (OVKD) task, which is innovatively designed to use text prompts for identifying arbitrary keypoints across any species. In pursuit of this goal, we have developed a novel framework named Open-Vocabulary Keypoint Detection with Semantic-feature Matching (KDSM). This framework synergistically combines vision and language models, creating an interplay between language features and local keypoint visual features. KDSM enhances its capabilities by integrating Domain Distribution Matrix Matching (DDMM) and other special modules, such as the Vision-Keypoint Relational Awareness (VKRA) module, improving the framework’s generalizability and overall performance. Our comprehensive experiments demonstrate that KDSM significantly outperforms the baseline in terms of performance and achieves remarkable success in the OVKD task. Impressively, our method, operating in a zero-shot fashion, still yields results comparable to state-of-the-art few-shot species class-agnostic keypoint detection methods. Codes and data are available at https://github.com/zhanghao5201/KDSM.

Usage

Data preparation

Please follow the official guide to prepare the MP-100 dataset for training and evaluation, and organize the data structure properly. The MP-78 dataset uses the same images as MP-100. To obtain the annotation files for MP-78, please contact [email protected]. Please be aware that the annotations are intended for non-commercial use only.

Install

First install PyTorch and torchvision by following the official PyTorch documentation. Then run pip install -r requirements.txt.
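
Before moving on, it can help to confirm that both packages are visible to the Python interpreter you plan to use. The snippet below is a minimal sketch and is not part of this repository: the helper name missing_packages is hypothetical, and it only checks that the packages can be found, not that their versions match requirements.txt.

```python
import importlib.util


def missing_packages(names):
    """Return the names from `names` that cannot be found in this environment.

    Hypothetical helper for illustration; expects top-level package names.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # torch and torchvision are the two packages the install step names explicitly.
    missing = missing_packages(["torch", "torchvision"])
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("PyTorch and torchvision are importable.")
```

If the script reports a missing package, repeat the PyTorch install step in the same environment before running pip install -r requirements.txt.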

Training and Test

The training and testing code, as well as the trained model checkpoints, will be available soon.

Citation

@misc{zhang2023openvocabulary,
      title={Open-Vocabulary Animal Keypoint Detection with Semantic-feature Matching}, 
      author={Hao Zhang and Lumin Xu and Shenqi Lai and Wenqi Shao and Nanning Zheng and Ping Luo and Yu Qiao and Kaipeng Zhang},
      year={2023},
      eprint={2310.05056},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Acknowledgement

Thanks to:

License

This project is released under the Apache 2.0 license.
