Starred repositories
[ACM MM 2024] Hybrid Cost Volume for Memory-Efficient Optical Flow
IGEV++: Iterative Multi-range Geometry Encoding Volumes for Stereo Matching
The official code of "TOGS: Gaussian Splatting with Temporal Opacity Offset for Real-Time 4D DSA Rendering"
GaussianDreamerPro: Text to Manipulable 3D Gaussians with Highly Enhanced Quality
This repository is the official implementation of our Autoregressive Pretraining with Mamba in Vision
DiG: Scalable and Efficient Diffusion Models with Gated Linear Attention
[arXiv'24] EVA-X: A foundation model for general chest X-ray analysis with self-supervised learning
MC$^2$: Multi-concept Guidance for Customized Multi-concept Generation
[ICLR 2024] "Less is More: One-shot Subgraph Reasoning on Large-scale Knowledge Graphs"
This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to it.
Official PyTorch Implementation of "The Hidden Attention of Mamba Models"
Code for paper LocalMamba: Visual State Space Model with Windowed Selective Scan
MIM4D: Masked Modeling with Multi-View Video for Autonomous Driving Representation Learning
[ECCV 2024] VideoMamba: State Space Model for Efficient Video Understanding
[CVPR 2024] Real-Time HDR Video Reconstruction
VideoSys: An easy and efficient system for video generation
[CVPR 2024] 4D Gaussian Splatting for Real-Time Dynamic Scene Rendering
GaussianDreamer: Fast Generation from Text to 3D Gaussians by Bridging 2D and 3D Diffusion Models (CVPR 2024)
[ACM MM 2024] WeakSAM: Segment Anything Meets Weakly-supervised Instance-level Recognition
[ICML 2024] Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model
U-Mamba: Enhancing Long-range Dependency for Biomedical Image Segmentation
AppAgent: Multimodal Agents as Smartphone Users, an LLM-based multimodal agent framework designed to operate smartphone apps.
ColBERT: state-of-the-art neural search (SIGIR'20, TACL'21, NeurIPS'21, NAACL'22, CIKM'22, ACL'23, EMNLP'23)