- Baidu | HUST
- Shanghai, China
- https://ifzhang.github.io/
Stars (sorted by recently starred)
- Official repo for the paper "Tora: Trajectory-oriented Diffusion Transformer for Video Generation"
- OmniTokenizer: one model and one weight for image-video joint tokenization.
- Autoregressive Model Beats Diffusion: 🦙 Llama for Scalable Image Generation
- A Generalizable World Model for Autonomous Driving
- DiG: Scalable and Efficient Diffusion Models with Gated Linear Attention
- [GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-simple, user-friendly …
- [ICML 2024] Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model
- [CVPR 2024] LMDrive: Closed-Loop End-to-End Driving with Large Language Models
- A method for matching the 3D point-cloud sub-map generated by a robot during SLAM against a 2D map.
- [CVPR 2024] Official repository of the paper "Panacea: Panoramic and Controllable Video Generation for Autonomous Driving"
- [ICRA 2024] Rethinking Imitation-based Planners for Autonomous Driving
- [CVPR 2024] 4D Gaussian Splatting for Real-Time Dynamic Scene Rendering
- GaussianDreamer: Fast Generation from Text to 3D Gaussians by Bridging 2D and 3D Diffusion Models (CVPR 2024)
- Layout-guided multi-view driving-scene video generation with a latent diffusion model
- Code and documents for LongLoRA and LongAlpaca (ICLR 2024 Oral)
- [IEEE T-PAMI] All You Need for End-to-End Autonomous Driving
- [CoRL 2023] Parting with Misconceptions about Learning-based Vehicle Motion Planning
- PyTorch code for the paper "Model-Based Imitation Learning for Urban Driving".
- A curated list of awesome end-to-end autonomous driving resources (continually updated)
- [Information Fusion (Vol. 103, Mar. 2024)] Boosting Image Matting with Pretrained Plain Vision Transformers
- [ICCV 2023] VLPart: Going Denser with Open-Vocabulary Part Segmentation
- [CVPR 2023] PyTorch implementation of ThinkTwice, a SOTA decoder for end-to-end autonomous driving under BEV.
- Track-Anything is a flexible and interactive tool for video object tracking and segmentation, based on Segment Anything, XMem, and E2FGVI.
- 🤖 Assemble, configure, and deploy autonomous AI Agents in your browser.