TransFuser: Imitation with Transformer-Based Sensor Fusion for Autonomous Driving

Chitta, Kashyap; Prakash, Aditya; Jaeger, Bernhard; Yu, Zehao; Renz, Katrin; Geiger, Andreas

arXiv:2205.15997 (cs)

[Submitted on 31 May 2022]

Abstract: How should we integrate representations from complementary sensors for autonomous driving? Geometry-based fusion has shown promise for perception (e.g. object detection, motion forecasting). However, in the context of end-to-end driving, we find that imitation learning based on existing sensor fusion methods underperforms in complex driving scenarios with a high density of dynamic agents. Therefore, we propose TransFuser, a mechanism to integrate image and LiDAR representations using self-attention. Our approach uses transformer modules at multiple resolutions to fuse perspective view and bird's eye view feature maps. We experimentally validate its efficacy on a challenging new benchmark with long routes and dense traffic, as well as the official leaderboard of the CARLA urban driving simulator. At the time of submission, TransFuser outperforms all prior work on the CARLA leaderboard in terms of driving score by a large margin. Compared to geometry-based fusion, TransFuser reduces the average collisions per kilometer by 48%.
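
The fusion idea named in the abstract (image and LiDAR features attending to each other through self-attention) can be illustrated with a toy sketch. This is not the authors' implementation. It assumes a single attention head, identity query/key/value projections, one resolution instead of several, and a residual add back into each branch. The names `self_attention` and `fuse` and the token values are hypothetical.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(tokens):
    # Single-head scaled dot-product self-attention.
    # Assumption: identity projections, so queries = keys = values = tokens.
    d = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        weights = softmax(scores)
        out.append([sum(w * v[i] for w, v in zip(weights, tokens)) for i in range(d)])
    return out

def fuse(image_tokens, lidar_tokens):
    # Attend jointly over both token sets, then add the attended
    # features back to each branch (assumed residual connection).
    tokens = image_tokens + lidar_tokens
    attended = self_attention(tokens)
    n = len(image_tokens)
    fused_image = [[x + y for x, y in zip(t, a)]
                   for t, a in zip(image_tokens, attended[:n])]
    fused_lidar = [[x + y for x, y in zip(t, a)]
                   for t, a in zip(lidar_tokens, attended[n:])]
    return fused_image, fused_lidar

# Toy inputs: two perspective-view tokens and one bird's eye view token,
# each a 2-dimensional feature vector (hypothetical values).
image_tokens = [[1.0, 0.0], [0.0, 1.0]]
lidar_tokens = [[1.0, 1.0]]
fused_image, fused_lidar = fuse(image_tokens, lidar_tokens)
```

Because every token attends over the concatenated set, each output in either branch mixes information from both sensors, which is the property that distinguishes this style of fusion from projecting one modality into the other's coordinate frame.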

Comments:	arXiv admin note: text overlap with arXiv:2104.09224
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Robotics (cs.RO)
Cite as:	arXiv:2205.15997 [cs.CV]
	(or arXiv:2205.15997v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2205.15997

Submission history

From: Kashyap Chitta
[v1] Tue, 31 May 2022 17:57:19 UTC (33,132 KB)
