Stars
[NeurIPS 2023] Structural Pruning for Diffusion Models
High-Resolution Image Synthesis with Latent Diffusion Models
Official inference repo for FLUX.1 models
PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation
T-GATE: Temporally Gating Attention to Accelerate Diffusion Models for Free!
Open-Sora: Democratizing Efficient Video Production for All
Official Repo for SparseLLM: Global Pruning of LLMs (NeurIPS 2024)
[CVPR 2024] DeepCache: Accelerating Diffusion Models for Free
[CVPR 2024 Highlight] DistriFusion: Distributed Parallel Inference for High-Resolution Diffusion Models
FlexFlow Serve: Low-Latency, High-Performance LLM Serving
Learning materials for Stanford CS149: Parallel Computing
A PyTorch Extension: tools for easy mixed-precision and distributed training in PyTorch
"Machine Learning Systems: Design and Implementation" (Chinese version)
Latent Consistency Models: Synthesizing High-Resolution Images with Few-Step Inference
[ICLR 2020] Contrastive Representation Distillation (CRD), and benchmark of recent knowledge distillation methods
Denoising Diffusion Probabilistic Models
A high-throughput and memory-efficient inference and serving engine for LLMs
An open-source framework for training large multimodal models.
Open-sourced codes for MiniGPT-4 and MiniGPT-v2 (https://minigpt-4.github.io, https://minigpt-v2.github.io/)
QLoRA: Efficient Finetuning of Quantized LLMs
Generate high-definition short videos with one click using AI large language models.
CLIP (Contrastive Language-Image Pretraining): predict the most relevant text snippet given an image
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
A Neural Net Training Interface on TensorFlow, with focus on speed + flexibility