Starred repositories
🔥ImageFolder: Autoregressive Image Generation with Folded Tokens
A framework for few-shot evaluation of language models.
Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"
🚀 Free subscription address, 🚀 free nodes, 🚀 updated every 6 hours; shared, high-quality, highly available nodes, completely free. Free Clash subscription address, free firewall circumvention, free proxy access, free ss/v2ray/trojan nodes, Google Play store access…
Inpaint anything using Segment Anything and inpainting models.
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…
The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
Official repo for Hierarchical Masked 3D Diffusion Model for Video Outpainting
[ECCV 2024] Be-Your-Outpainter https://arxiv.org/abs/2403.13745
[ArXiv 2024] Follow-Your-Canvas: This repo is the official implementation of "Follow-Your-Canvas: Higher-Resolution Video Outpainting with Extensive Content Generation"
PyTorch implementation of Transfusion, "Predict the Next Token and Diffuse Images with One Multi-Modal Model", from Meta AI
The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (V…
A high-throughput and memory-efficient inference and serving engine for LLMs
Controllable Text Generation for Large Language Models: A Survey
Official inference repo for FLUX.1 models
The official repo of Qwen2-Audio chat & pretrained large audio language model proposed by Alibaba Cloud.
Controllable video and image generation, SVD, Animate Anyone, ControlNet, ControlNeXt, LoRA
This repo contains the code for our paper "An Image is Worth 32 Tokens for Reconstruction and Generation"
[CVPR2024] DisCo: Referring Human Dance Generation in Real World
IQA: Deep Image Structure and Texture Similarity Metric
[ECCV 2024] Enhancing Perceptual Quality in Video Super-Resolution through Temporally-Consistent Detail Synthesis using Diffusion Models
Text- and image-to-video generation: CogVideoX (2024) and CogVideo (ICLR 2023)
Implementation of Dreambooth (https://arxiv.org/abs/2208.12242) with Stable Diffusion
ChatGLM-6B: An Open Bilingual Dialogue Language Model
Official repo for paper "MiraData: A Large-Scale Video Dataset with Long Durations and Structured Captions"
Enjoy the magic of Diffusion models!
High-Resolution Image Synthesis with Latent Diffusion Models