
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…

Jupyter Notebook 8,136 452 Updated Aug 4, 2024

Kolors Team

Python 2,910 166 Updated Aug 1, 2024

Repository for Meta Chameleon, a mixed-modal early-fusion foundation model from FAIR.

Python 1,622 102 Updated Jul 29, 2024

Autoregressive Model Beats Diffusion: 🦙 Llama for Scalable Image Generation

Python 1,104 39 Updated Jul 14, 2024

Your image is almost there!

Python 7,027 411 Updated Jul 26, 2024

Pandora: Towards General World Model with Natural Language Actions and Video States

Python 438 29 Updated May 27, 2024

MambaOut: Do We Really Need Mamba for Vision?

Python 1,910 30 Updated Jun 6, 2024
Python 93 4 Updated Apr 15, 2024

A flexible and efficient codebase for training visually-conditioned language models (VLMs)

Python 374 139 Updated Jul 4, 2024

[GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-simple, user-friendly …

Python 3,902 295 Updated Jul 16, 2024
Python 102 7 Updated Jun 21, 2024

Minimal multi-gpu implementation of EDM2: "Analyzing and Improving the Training Dynamics of Diffusion Models"

Python 23 Updated Mar 5, 2024

Official repo for "Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models"

Python 3,111 277 Updated May 4, 2024

Official PyTorch Implementation of "SiT: Exploring Flow and Diffusion-based Generative Models with Scalable Interpolant Transformers"

Python 565 25 Updated Mar 12, 2024

Open-Sora: Democratizing Efficient Video Production for All

Python 21,057 2,002 Updated Aug 4, 2024

Masked Diffusion Transformer is the SOTA for image synthesis. (ICCV 2023)

Python 490 35 Updated Apr 23, 2024

Mamba SSM architecture

Python 12,017 1,008 Updated Aug 3, 2024

[ECCV2024] VideoMamba: State Space Model for Efficient Video Understanding

Python 726 57 Updated Jul 6, 2024

Transparent Image Layer Diffusion using Latent Transparency

1,940 24 Updated Jun 16, 2024
Python 7,041 545 Updated Jul 25, 2024

[ICLR 2023] "Learning to Grow Pretrained Models for Efficient Transformer Training" by Peihao Wang, Rameswar Panda, Lucas Torroba Hennigen, Philip Greengard, Leonid Karlinsky, Rogerio Feris, David …

Python 80 8 Updated Feb 26, 2024
Python 161 11 Updated Jan 16, 2024

Notes on the Mamba and the S4 model (Mamba: Linear-Time Sequence Modeling with Selective State Spaces)

135 9 Updated Jan 7, 2024

[CVPR2024] Make Your Dream A Vlog

Python 394 40 Updated Mar 19, 2024

[ECCV 2024] The official code of paper "Open-Vocabulary SAM".

Python 861 27 Updated Jul 31, 2024
Python 547 25 Updated Feb 15, 2024

Let us democratise high-resolution generation! (CVPR 2024)

Jupyter Notebook 1,948 228 Updated Apr 15, 2024

🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch and FLAX.

Python 24,403 5,039 Updated Aug 4, 2024

PyTorch implementation of RCG https://arxiv.org/abs/2312.03701

Python 775 35 Updated Mar 25, 2024