BEV-Guided Multi-Modality Fusion for Driving Perception (CVPR 2023)

Yunze Man, Liang-Yan Gui, Yu-Xiong Wang

Project page: https://yunzeman.github.io/BEVGuide/

Abstract: Integrating multiple sensors and addressing diverse tasks in an end-to-end algorithm are challenging yet critical topics for autonomous driving. To this end, we introduce BEVGuide, a novel Bird's Eye-View (BEV) representation learning framework, representing the first attempt to unify a wide range of sensors under direct BEV guidance in an end-to-end fashion. Our architecture accepts input from a diverse sensor pool, including but not limited to Camera, Lidar and Radar sensors, and extracts BEV feature embeddings using a versatile and general transformer backbone. We design a BEV-guided multi-sensor attention block to take queries from BEV embeddings and learn the BEV representation from sensor-specific features. BEVGuide is efficient due to its lightweight backbone design and highly flexible as it supports almost any input sensor configurations. Extensive experiments demonstrate that our framework achieves exceptional performance in BEV perception tasks with a diverse sensor set.

The official PyTorch implementation is coming soon. In the meantime, the sketch below illustrates the core idea described in the abstract.
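The BEV-guided multi-sensor attention block described in the abstract is, at its core, cross-attention in which flattened BEV grid embeddings act as queries and sensor-specific feature tokens act as keys and values. Since the official code has not been released, the following is a minimal illustrative sketch built from standard PyTorch modules; the class name, tensor shapes, and hyperparameters are assumptions for demonstration, not the paper's actual design.

```python
import torch
import torch.nn as nn


class BEVGuidedSensorAttention(nn.Module):
    """Illustrative cross-attention block: BEV embeddings query sensor tokens.

    NOTE: This is a hedged approximation of the idea in the BEVGuide
    abstract, not the authors' released implementation.
    """

    def __init__(self, dim: int = 256, num_heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm_q = nn.LayerNorm(dim)
        self.norm_kv = nn.LayerNorm(dim)
        self.ffn = nn.Sequential(
            nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim)
        )

    def forward(self, bev_queries: torch.Tensor,
                sensor_tokens: torch.Tensor) -> torch.Tensor:
        # bev_queries:   (B, H*W, C) flattened BEV grid embeddings
        # sensor_tokens: (B, N, C) tokens gathered from sensor branches
        q = self.norm_q(bev_queries)
        kv = self.norm_kv(sensor_tokens)
        attn_out, _ = self.attn(q, kv, kv)   # BEV queries attend to sensors
        x = bev_queries + attn_out           # residual connection
        return x + self.ffn(x)               # position-wise feed-forward


if __name__ == "__main__":
    # Toy usage: a 50x50 BEV grid attending to camera and lidar tokens.
    block = BEVGuidedSensorAttention(dim=256)
    bev = torch.randn(2, 50 * 50, 256)        # BEV query embeddings
    cam = torch.randn(2, 600, 256)            # camera feature tokens
    lidar = torch.randn(2, 400, 256)          # lidar feature tokens
    tokens = torch.cat([cam, lidar], dim=1)   # shared sensor token pool
    out = block(bev, tokens)
    print(out.shape)  # torch.Size([2, 2500, 256])
```

One appeal of this query-from-BEV formulation, consistent with the flexibility claimed in the abstract, is that the key/value side is just a concatenated pool of tokens: adding or removing a sensor (e.g., radar) changes only which tokens are concatenated, not the block itself.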
