Stars
High-resolution models for human tasks.
TouchDesigner implementation for real-time Stable Diffusion interactive generation with StreamDiffusion.
Hello AI World guide to deploying deep-learning inference networks and deep vision primitives with TensorRT and NVIDIA Jetson.
DeepStream Libraries offer CVCUDA, NvImageCodec, and PyNvVideoCodec modules as Python APIs for seamless integration into custom frameworks.
Official code for the paper "StreamMultiDiffusion: Real-Time Interactive Generation with Region-Based Semantic Control."
A prompting enhancement library for transformers-type text embedding systems
Live2Diff: A pipeline that processes live video streams with a uni-directional video diffusion model.
ControlNet++: All-in-one ControlNet for image generation and editing!
ViT Prisma is a mechanistic interpretability library for Vision Transformers (ViTs).
Implementation of TiTok, proposed by Bytedance in "An Image is Worth 32 Tokens for Reconstruction and Generation"
Sparse autoencoders for Contra text embedding models
Official implementation for "pOps: Photo-Inspired Diffusion Operators"
A deep dive into the entire history of deep learning
Code signing and transparency for containers and binaries
A multi-voice TTS system trained with an emphasis on quality
Convert PDF to markdown quickly with high accuracy
Benchmark LLMs by fighting in Street Fighter 3! The new way to evaluate the quality of an LLM
Training LLMs with QLoRA + FSDP
Faster Whisper transcription with CTranslate2
Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.
SkyPilot: Run AI and batch jobs on any infra (Kubernetes or 12+ clouds). Get unified execution, cost savings, and high GPU availability via a simple interface.
Go engine with no human-provided knowledge, modeled after the AlphaGo Zero paper.