Stars
Helpful DoggyBot: Open-World Object Fetching using Legged Robots and Vision-Language Models
The official repo for the paper "In-Context Imitation Learning via Next-Token Prediction"
This is the homepage of a new book entitled "Mathematical Foundations of Reinforcement Learning."
🐙 Guides, papers, lectures, notebooks and resources for prompt engineering
Robust Speech Recognition via Large-Scale Weak Supervision
A nearly-live implementation of OpenAI's Whisper.
SuperGlue: Learning Feature Matching with Graph Neural Networks (CVPR 2020, Oral)
An open, modular framework for zero-shot, language-conditioned pick-and-drop tasks in arbitrary homes.
Mobile manipulation research tools for roboticists
VoxPoser: Composable 3D Value Maps for Robotic Manipulation with Language Models
Language-Driven Semantic Segmentation
[ICRA2023] Implementation of Visual Language Maps for Robot Navigation
[RA-L] DRAGON: A Dialogue-Based Robot for Assistive Navigation with Visual Language Grounding
Republish Livox raw messages to standard PointCloud2
Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything
An open source implementation of CLIP.
CLIP (Contrastive Language-Image Pretraining): predict the most relevant text snippet given an image
Tool extensions for the ros2bag CLI
A curated list for vision-and-language navigation. ACL 2022 paper "Vision-and-Language Navigation: A Survey of Tasks, Methods, and Future Directions"
[ISER 2023] The official implementation of Audio Visual Language Maps for Robot Navigation
The repository provides code associated with the paper VLFM: Vision-Language Frontier Maps for Zero-Shot Semantic Navigation (ICRA 2024)
Official code and checkpoint release for mobile robot foundation models: GNM, ViNT, and NoMaD.
SC2-PCR: A Second Order Spatial Compatibility for Efficient and Robust Point Cloud Registration (CVPR 2022)
[CVPR 2021, Oral] PREDATOR: Registration of 3D Point Clouds with Low Overlap.