Stars
Document to Markdown OCR library with Llama 3.2 vision
Convert any PDF into a podcast episode!
An Open Source Python alternative to NotebookLM's podcast feature: Transforming Multimodal Content into Captivating Multilingual Audio Conversations with GenAI
Implementation of SoundStorm, Efficient Parallel Audio Generation from Google DeepMind, in PyTorch
RAG architecture: index and query any data using LLM and natural language, track sources, show citations, asynchronous memory patterns.
📺 An End-to-End Solution for High-Resolution and Long Video Generation Based on Transformer Diffusion
rmusser01 / tldw
Forked from the-crypt-keeper/tldw
tl/dw (Too Long, Didn't Watch): Your Personal Research Multi-Tool - a naive attempt at 'A Young Lady's Illustrated Primer'
Official implementation of Dynamic Frame Avatar with Non-autoregressive Diffusion Framework for Talking Head Video Generation
[ECCV 2024] HeadStudio: Text to Animatable Head Avatars with 3D Gaussian Splatting.
A library to generate LaTeX expression from Python code.
VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech
The Multi-Agent Reasoning framework creates an interactive chatbot where AI agents collaborate via structured reasoning and Swarm Integration for optimal answers. Simulating a team that discusses, …
Entropy Based Sampling and Parallel CoT Decoding
A one-stop, open-source, high-quality data extraction tool that supports PDF, webpage, and multi-format e-book extraction.
This repository provides tutorials and implementations for various Generative AI Agent techniques, from basic to advanced. It serves as a comprehensive guide for building intelligent, interactive A…
Effortlessly run LLM backends, APIs, frontends, and services with one command.
Talking Head (3D): A JavaScript class for real-time lip-sync using Ready Player Me full-body 3D avatars.
[NeurIPS 2024] Official code for PuLID: Pure and Lightning ID Customization via Contrastive Alignment
This repository will host the code for the SIGGRAPH Asia 2024 Paper titled: "GaussianHeads: End-to-End Learning of Drivable Gaussian Head Avatars from Coarse-to-fine Representations"
Run your own AI cluster at home with everyday devices 📱💻 🖥️⌚