Stars
Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
MINT-1T: A one trillion token multimodal interleaved dataset.
The Blazor WebAssembly app that inspired the Microsoft //Build 2023 demo app.
TEN Agent is the world’s first real-time multimodal agent integrated with the OpenAI Realtime API and RTC, featuring weather checks, web search, vision, and RAG capabilities.
We write your reusable computer vision tools. 💜
[ACM MM 2024] This is the official code for "AniTalker: Animate Vivid and Diverse Talking Faces through Identity-Decoupled Facial Motion Encoding"
Hallo: Hierarchical Audio-Driven Visual Synthesis for Portrait Image Animation
Lifelike Audio-Driven Portrait Animations through Editable Landmark Conditioning
Local vector database written in C#; supports Cosine Similarity and Jaccard Dissimilarity as well as Euclidean, Manhattan, Chebyshev and Canberra distances
A tool to download whole playlists, channels or single videos from YouTube and optionally convert them to almost any format you would like
The Open Toolkit library is a fast, low-level C# wrapper for OpenGL, OpenAL & OpenCL. It also includes windowing, mouse, keyboard and joystick input and a robust and fast math library, giving you e…
This is a ComfyUi-windows implementation for the image animation project -> UniAnimate: Taming Unified Video Diffusion Models for Consistent Human Image Animation
This node was designed to help AI image creators generate prompts for human portraits.
Stable Diffusion and Flux in pure C/C++
Whisper.net. Speech to text made simple using Whisper Models
Official implementation of OV-DINO: Unified Open-Vocabulary Detection with Language-Aware Selective Fusion
CatVTON is a simple and efficient virtual try-on diffusion model with 1) Lightweight Network (899.06M parameters in total), 2) Parameter-Efficient Training (49.57M trainable parameters) and 3) Simpl…
InstantDrag: Improving Interactivity in Drag-based Image Editing
⚡️HivisionIDPhotos: a lightweight and efficient AI tool for making ID photos.
3DTopia-XL: High-Quality 3D PBR Asset Generation via Primitive Diffusion