Stars
[ECCV 2024] Fully Sparse 3D Occupancy Prediction & RayIoU Evaluation Metric
[ICCV 2023] SurroundOcc: Multi-camera 3D Occupancy Prediction for Autonomous Driving
Dataset generation for NeuralGrasps https://arxiv.org/abs/2207.02959
Code repo for MultiGripperGrasp Dataset
Implementation for ICRA 2021 paper "RGB Matters: Learning 7-DoF Grasp Poses on Monocular RGBD Images"
🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch and FLAX.
Paper list on robotic grasping and related work
[ICLR'24 Spotlight] Uni3D: 3D Visual Representation from BAAI
Open-source multimodal large language model that can hear and talk while thinking, featuring real-time end-to-end speech input and streaming audio output for conversation.
[ACL 2024] IEPile: A Large-Scale Information Extraction Corpus
[Information Fusion 2024] A Survey on Occupancy Perception for Autonomous Driving: The Information Fusion Perspective
[NeurIPS'24] NeuRodin: A Two-stage Framework for High-Fidelity Neural Surface Reconstruction
Trim 3D Gaussian Splatting for Accurate Geometry Representation
[arXiv] Towards Generic Anomaly Detection and Understanding: Large-scale Visual-linguistic Model (GPT-4V) Takes the Lead.
[ECCV 2024] API code for T-Rex2: Towards Generic Object Detection via Text-Visual Prompt Synergy
The official implementation of SAGS (Segment Anything in 3D Gaussians)
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…
Vision-based 3D occupancy prediction in autonomous driving: a review and outlook
Point-SAM: The official repository of "Point-SAM: Promptable 3D Segmentation Model for Point Clouds". We provide code for running our demo and links to download checkpoints.
TC-Stereo (ECCV2024)
The official implementation of SAGA (Segment Any 3D GAussians)
Unified Efficient Fine-Tuning of 100+ LLMs (ACL 2024)
SeqGPT: An Out-of-the-box Large Language Model for Open Domain Sequence Understanding
[CVPR 2024 & NeurIPS 2024] EmbodiedScan: A Holistic Multi-Modal 3D Perception Suite Towards Embodied AI