Stars
Depth Any Video with Scalable Synthetic Data
Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model
Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
A collection of ZSH frameworks, plugins, themes and tutorials.
Official implementations for paper: Dynamic Typography: Bringing Text to Life via Video Diffusion Prior
Memory optimized finetuning scripts for CogVideoX using TorchAO and DeepSpeed
ControlNet++: All-in-one ControlNet for image generation and editing!
[ECCV 2024] Official Repository for DiffiT: Diffusion Vision Transformers for Image Generation
Fine-Tuning Image-Conditional Diffusion Models is Easier than You Think
(NeurIPS 2024 Oral 🔥) Improved Distribution Matching Distillation for Fast Image Synthesis
Open-MAGVIT2: Democratizing Autoregressive Visual Generation
(ෆ`꒳´ෆ) A Survey on Text-to-Image Generation/Synthesis.
Diffusion Illusions: Hiding Images in Plain Sight
PT Assistant Plus (PT 助手 Plus), a browser extension (Web Extensions) for Microsoft Edge, Google Chrome, and Firefox, mainly used to assist with downloading torrents from PT sites.
Generative Camera Dolly: Extreme Monocular Dynamic Novel View Synthesis (ECCV 2024 Oral) - Official Implementation
[ICML'24] SeeAct is a system for generalist web agents that autonomously carry out tasks on any given website, with a focus on large multimodal models (LMMs) such as GPT-4V(ision).
Official code for "DPM-Solver: A Fast ODE Solver for Diffusion Probabilistic Model Sampling in Around 10 Steps" (NeurIPS 2022 Oral)
[NeurIPS 2023] T2I-CompBench: A Comprehensive Benchmark for Open-world Compositional Text-to-image Generation
[CVPR 2024 🔥] Grounding Large Multimodal Model (GLaMM), the first-of-its-kind model capable of generating natural language responses that are seamlessly integrated with object segmentation masks.
High-resolution models for human tasks.
Evaluating text-to-image/video/3D models with VQAScore
MixTeX: multimodal LaTeX, ZhEn, and table OCR. It performs efficient CPU-based inference locally and offline on Windows.