Shanghai AI Lab
- Shanghai
- https://tai-wang.github.io/
- @wangtai97
Stars
GRUtopia: Dream General Robots in a City at Scale
Code&Data for Grounded 3D-LLM with Referent Tokens
Open-source evaluation toolkit for large vision-language models (LVLMs), supporting ~100 VLMs and 30+ benchmarks
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. A commercially usable open-source multimodal chat model approaching GPT-4o performance
Official implementation of the paper "PACER+: On-Demand Pedestrian Animation Controller in Driving Scenarios" (CVPR 2024).
Official Code for "SMPLer-X: Scaling Up Expressive Human Pose and Shape Estimation"
[CVPR 2024] EmbodiedScan: A Holistic Multi-Modal 3D Perception Suite Towards Embodied AI
Learning-based locomotion control from OpenRobotLab, including Hybrid Internal Model & H-Infinity Locomotion Control
[NeurIPS 2023] OV-PARTS: Towards Open-Vocabulary Part Segmentation
An Open-source Framework for Autonomous Language Agents
[ICLR 2024 Spotlight] Unified Human-Scene Interaction via Prompted Chain-of-Contacts
[ECCV 2024] PointLLM: Empowering Large Language Models to Understand Point Clouds
[ICCV 2023] GeoMIM: Towards Better 3D Knowledge Transfer via Masked Image Modeling for Multi-View 3D Understanding
[ECCV 2024] DriveLM: Driving with Graph Visual Question Answering
A lightweight framework for building LLM-based agents
3D Occupancy Prediction Benchmark in Autonomous Driving
GPT4RoI: Instruction Tuning Large Language Model on Region-of-Interest
Official release of InternLM2.5 7B base and chat models. 1M context support
Official Code for DragGAN (SIGGRAPH 2023)
Real-Time SLAM for Monocular, Stereo and RGB-D Cameras, with Loop Detection and Relocalization Capabilities
An open-source tool-augmented conversational language model from Fudan University
Topology Reasoning for Scene Perception in Autonomous Driving
[CoRL 2023] DORT: Modeling Dynamic Objects in Recurrent for Multi-Camera 3D Object Detection and Tracking
[IJCV 2024] P3Former: Position-Guided Point Cloud Panoptic Segmentation Transformer
[CVPR 2023] MV-JAR: Masked Voxel Jigsaw and Reconstruction for LiDAR-Based Self-Supervised Pre-Training
A collaboration-friendly studio for NeRFs