- Los Angeles, USA
Stars
A lightning-fast workflow builder that supports multimodal interaction and highly customizable extensions, and is intuitive to use even without any coding knowledge.
Agentic components of the Llama Stack APIs
This repository is an implementation that recreates the SketchGuidance feature of "ToonCrafter".
[ACL 2024] This is the PyTorch code for our paper "StyleDubber: Towards Multi-Scale Style Learning for Movie Dubbing"
Multitrack music mixing style transfer given a reference song using differentiable mixing console.
Generative models for conditional audio generation
Prompty makes it easy to create, manage, debug, and evaluate LLM prompts for your AI applications. Prompty is an asset class and format for LLM prompts designed to enhance observability, understand…
A big_vision-inspired repo that implements a generic Auto-Encoder class capable of representation learning and generative modeling.
Learn How Transformers work in Generative AI with Interactive Visualization
A novel framework manipulating CLIP embeddings via projection to remove objects using Stable Diffusion prior.
Implementation of "Disentangled Motion Modeling for Video Frame Interpolation"
code for "Diffusion Forcing: Next-token Prediction Meets Full-Sequence Diffusion"
Efficient Multi-modal Models via Stage-wise Visual Context Compression
Polyffusion: A Diffusion Model for Polyphonic Score Generation with Internal and External Controls
Gemma 2 optimized for your local machine.
OpenAI Triton backend for Intel® GPUs
AuraSR: GAN-based Super-Resolution for real-world
A project that optimizes Whisper for low latency inference using NVIDIA TensorRT
Cambrian-1 is a family of multimodal LLMs with a vision-centric design.
libsoni: A Python Toolbox for Sonifying Music Annotations and Feature Representations
Plug and Play XAI: Explain Your AI Models with Ease
Official Implementation for "The Devil is in the Details: StyleFeatureEditor for Detail-Rich StyleGAN Inversion and High Quality Image Editing"
Fourier123: One Image to High-Quality 3D Object Generation with Hybrid Fourier Score Distillation
Official implementation of "AsyncDiff: Parallelizing Diffusion Models by Asynchronous Denoising"