- Baidu | HUST
- Shanghai, China
- https://ifzhang.github.io/
Stars
A Generalizable World Model for Autonomous Driving
DiG: Scalable and Efficient Diffusion Models with Gated Linear Attention
[GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-simple, user-friendly …
Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model
[CVPR 2024] LMDrive: Closed-Loop End-to-End Driving with Large Language Models
A method for matching the 3D point-cloud sub-map generated by a robot during SLAM against a 2D map.
[CVPR2024] Official Repository of Paper "Panacea: Panoramic and Controllable Video Generation for Autonomous Driving"
[ICRA'2024] Rethinking Imitation-based Planner for Autonomous Driving
[CVPR 2024] 4D Gaussian Splatting for Real-Time Dynamic Scene Rendering
GaussianDreamer: Fast Generation from Text to 3D Gaussians by Bridging 2D and 3D Diffusion Models (CVPR 2024)
Layout-guided multi-view driving-scene video generation with a latent diffusion model
Code and documents of LongLoRA and LongAlpaca (ICLR 2024 Oral)
Official codebase for I-JEPA, the Image-based Joint-Embedding Predictive Architecture. First outlined in the CVPR paper, "Self-supervised learning from images with a joint-embedding predictive arch…
All you need for End-to-end Autonomous Driving
[CoRL'23] Parting with Misconceptions about Learning-based Vehicle Motion Planning
PyTorch code for the paper "Model-Based Imitation Learning for Urban Driving".
A curated list of awesome End-to-End Autonomous Driving resources (continually updated)
[Information Fusion] Boosting Image Matting with Pretrained Plain Vision Transformers
[ICCV2023] VLPart: Going Denser with Open-Vocabulary Part Segmentation
[CVPR 2023] PyTorch implementation of ThinkTwice, a SOTA decoder for end-to-end autonomous driving under BEV.
Track-Anything is a flexible and interactive tool for video object tracking and segmentation, based on Segment Anything, XMem, and E2FGVI.
🤖 Assemble, configure, and deploy autonomous AI Agents in your browser.
AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools, so that you can focus on what matters.
RoboBEV: Towards Robust Bird's Eye View Perception under Common Corruption and Domain Shift
Grounded Segment Anything: From Objects to Parts
The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.