We present HIPIE, a novel HIerarchical, oPen-vocabulary and unIvErsal image segmentation and detection model. It performs segmentation at various levels of granularity (whole, part, and subpart) and across tasks, including semantic segmentation, instance segmentation, panoptic segmentation, referring segmentation, and part segmentation, all within a unified framework of language-guided segmentation.
Hierarchical Open-vocabulary Universal Image Segmentation
Xudong Wang*, Shufan Li*, Konstantinos Kallidromitis*, Yusuke Kato, Kazuki Kozuka, Trevor Darrell
Berkeley AI Research, UC Berkeley; Panasonic AI Research
[project page] [arxiv] [paper] [bibtex]
Currently, this repo contains only the demo code. Stay tuned.
Please refer to INSTALL.md for more details.
- See Demo-Main for panoptic, part, instance, and referring segmentation
- See Demo-SD for combining our model with Stable Diffusion
- See Demo-SAM for combining our model with Segment Anything
Please check our project page for more demos!
We release two checkpoints at the moment.
- ResNet-50, pretrained with O365, COCO, RefCOCO, and Pascal Panoptic Parts
- ViT-H, pretrained with O365, COCO, and RefCOCO
We will release training and evaluation code with more checkpoints soon.
The majority of HIPIE is licensed under the MIT license. If you later add other third-party code, please keep this license info updated, and please let us know if that component is licensed under something other than CC-BY-NC, MIT, or CC0.
If you have any general questions, feel free to email Xudong Wang, Shufan Li, or Konstantinos Kallidromitis. If you have code or implementation-related questions, you can email us or open an issue in this codebase. We recommend opening an issue, because your questions may help others.
If you find our work inspiring or use our codebase in your research, please consider giving a star ⭐ and a citation.
@misc{wang2023hierarchical,
title={Hierarchical Open-vocabulary Universal Image Segmentation},
author={Xudong Wang and Shufan Li and Konstantinos Kallidromitis and Yusuke Kato and Kazuki Kozuka and Trevor Darrell},
year={2023},
eprint={2307.00764},
archivePrefix={arXiv},
primaryClass={cs.CV}
}