ZoeDepth: Zero-shot Transfer by Combining Relative and Metric Depth

Bhat, Shariq Farooq; Birkl, Reiner; Wofk, Diana; Wonka, Peter; Müller, Matthias

Abstract:This paper tackles the problem of depth estimation from a single image. Existing work either focuses on generalization performance disregarding metric scale, i.e. relative depth estimation, or state-of-the-art results on specific datasets, i.e. metric depth estimation. We propose the first approach that combines both worlds, leading to a model with excellent generalization performance while maintaining metric scale. Our flagship model, ZoeD-M12-NK, is pre-trained on 12 datasets using relative depth and fine-tuned on two datasets using metric depth. We use a lightweight head with a novel bin adjustment design called metric bins module for each domain. During inference, each input image is automatically routed to the appropriate head using a latent classifier. Our framework admits multiple configurations depending on the datasets used for relative depth pre-training and metric fine-tuning. Without pre-training, we can already significantly improve the state of the art (SOTA) on the NYU Depth v2 indoor dataset. Pre-training on twelve datasets and fine-tuning on the NYU Depth v2 indoor dataset, we can further improve SOTA for a total of 21% in terms of relative absolute error (REL). Finally, ZoeD-M12-NK is the first model that can jointly train on multiple datasets (NYU Depth v2 and KITTI) without a significant drop in performance and achieve unprecedented zero-shot generalization performance to eight unseen datasets from both indoor and outdoor domains. The code and pre-trained models are publicly available at this https URL .

Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2302.12288 [cs.CV]
	(or arXiv:2302.12288v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2302.12288

Computer Science > Computer Vision and Pattern Recognition

Title:ZoeDepth: Zero-shot Transfer by Combining Relative and Metric Depth

Submission history

Access Paper:

References & Citations

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators