- CUHKSZ
- Shenzhen, China
- https://blog.csdn.net/zhanghm1995
Stars
VGGSfM: Visual Geometry Grounded Deep Structure From Motion
Official PyTorch implementation of D^2-World, the second-place solution in the CVPR 2024 Predictive World Model Challenge.
Reasoning 3D Segmentation - "segment anything"/grounding/part separation in 3D with natural conversations.
DoGaussian: Distributed-Oriented Gaussian Splatting for Large-Scale 3D Reconstruction Via Gaussian Consensus
[CVPR 2024 Oral, Best Paper Award Candidate] Official repository of "PaSCo: Urban 3D Panoptic Scene Completion with Uncertainty Awareness"
MuseV: Infinite-length and High Fidelity Virtual Human Video Generation with Visual Conditioned Parallel Denoising
Open-Sora: Democratizing Efficient Video Production for All
[CVPR'24] DNGaussian: Optimizing Sparse-View 3D Gaussian Radiance Fields with Global-Local Depth Normalization
[CVPR2024 Highlight] Editable Scene Simulation for Autonomous Driving via LLM-Agent Collaboration
The official PyTorch implementation of Google's Gemma models
(AAAI2024) Point-PEFT: Parameter-Efficient Fine-Tuning for 3D Pre-trained Models
A curated list of awesome world models for autonomous driving (continually updated)
A comprehensive survey of forging vision foundation models for autonomous driving, including challenges, methodologies, and opportunities.
[ECCV 2024] 3D World Model for Autonomous Driving
[ICRA 2024] RenderOcc: Vision-Centric 3D Occupancy Prediction with 2D Rendering Supervision. (Former version: UniOcc)
Pointcept: a codebase for point cloud perception research. Latest works: PTv3 (CVPR'24 Oral), PPT (CVPR'24), OA-CNNs (CVPR'24), MSC (CVPR'23)
A series of large language models trained from scratch by developers @01-ai
A JAX-based simulator for autonomous driving research.
[ICLR 2024] Real-time Photorealistic Dynamic Scene Representation and Rendering with 4D Gaussian Splatting
Talk2BEV: Language-Enhanced Bird's Eye View Maps (Accepted to ICRA'24)
An Invitation to 3D Vision: A Tutorial for Everyone
Meta-Transformer for Unified Multimodal Learning
[SIGGRAPH Asia 2023] Rerender A Video: Zero-Shot Text-Guided Video-to-Video Translation