- Defer to Expertise LLC
- Virginia, USA
- https://defertoexpertise.com
Stars
DimensionX: Create Any 3D and 4D Scenes from a Single Image with Controllable Video Diffusion
Hallo2: Long-Duration and High-Resolution Audio-driven Portrait Image Animation
Source code for the SIGGRAPH 2024 paper "X-Portrait: Expressive Portrait Animation with Hierarchical Motion Attention"
Text and image to video generation: CogVideoX (2024) and CogVideo (ICLR 2023)
Trench — Open-Source Analytics Infrastructure. A single production-ready Docker image built on ClickHouse, Kafka, and Node.js for tracking events, users, page views, and interactions.
VPN client in a thin Docker container for multiple VPN providers, written in Go, using OpenVPN or WireGuard and DNS over TLS, with a few proxy servers built in.
High-quality multi-lingual text-to-speech library by MyShell.ai. Supports English, Spanish, French, Chinese, Japanese and Korean.
Implementation of E2-TTS, "Embarrassingly Easy Fully Non-Autoregressive Zero-Shot TTS", in PyTorch
ELLA: Equip Diffusion Models with LLM for Enhanced Semantic Alignment
Text to Shorts/Tiktoks, AI Video Engine
All my self trained & released AI upscaling models. After gathering and applying over 600 different upscaling models, I learned how to train my own models, and these are the results.
Memory optimized finetuning scripts for CogVideoX using TorchAO and DeepSpeed
Quantized Attention that achieves speedups of 2.1x and 2.7x compared to FlashAttention2 and xformers, respectively, without losing end-to-end metrics across various models.
V-Express aims to generate a talking head video under the control of a reference image, an audio, and a sequence of V-Kps images.
MuseTalk: Real-Time High Quality Lip Synchronization with Latent Space Inpainting
Depth Pro: Sharp Monocular Metric Depth in Less Than a Second.
A Node for ComfyUI that does what you ask it to do
This is a collection of ComfyUI workflows to upscale images to 2K, 4K or 8K. Great for general upscaling of photos and illustrations with Magnific-like results.
High-quality Text-to-Audio Generation with Efficient Diffusion Transformer
Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model
Media Hoarder - THE media frontend for data hoarders and movie lovers
OneTrainer is a one-stop solution for all your Stable Diffusion training needs.