Stars
A Collection of Variational Autoencoders (VAE) in PyTorch.
Official implementation of the "Multimodal Parameter-Efficient Few-Shot Class Incremental Learning" paper
Audio generation using diffusion models, in PyTorch.
PyTorch implementation of VQGAN (Taming Transformers for High-Resolution Image Synthesis) (https://arxiv.org/pdf/2012.09841.pdf)
[CVPR 2023] VideoMAE V2: Scaling Video Masked Autoencoders with Dual Masking
[NeurIPS 2022 Spotlight] VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training
Implementation of Transframer, DeepMind's U-Net + Transformer architecture for video generation of up to 30 seconds, in PyTorch
A collection of resources and papers on Diffusion Models
✨✨Latest Advances on Multimodal Large Language Models