Stars
Container runtimes on macOS (and Linux) with minimal setup
first base model for full-duplex conversational audio
OmniGen: Unified Image Generation. https://arxiv.org/pdf/2409.11340
[ECCV 2024 Oral🔥] Arc2Face: A Foundation Model for ID-Consistent Human Faces
InstantID: Zero-shot Identity-Preserving Generation in Seconds 🔥
PoC associated with the talk "Attacking Samsung Galaxy A* Boot Chain" (https://www.blackhat.com/us-24/briefings/schedule/#attacking-samsung-galaxy-a-boot-chain-and-beyond-38526)
A tool that uses the Windows Filtering Platform (WFP) to block Endpoint Detection and Response (EDR) agents from reporting security events to the server.
DIAMOND (DIffusion As a Model Of eNvironment Dreams) is a reinforcement learning agent trained in a diffusion world model. NeurIPS 2024 Spotlight.
Deobfuscated & Decompiled Minecraft source code with Mojang mapping
Bring portraits to life in real time! ONNX/TensorRT support! Real-time portrait animation!
Distribute and run LLMs with a single file.
Run your own AI cluster at home with everyday devices 📱💻 🖥️⌚
Convert any PDF into a podcast episode!
Open-source multimodal large language model that can hear and talk while thinking. Features real-time, end-to-end speech input and streaming audio output for conversation.
LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.
Finetune Llama 3.2, Mistral, Phi, Qwen & Gemma LLMs 2-5x faster with 80% less memory
StyleSwap: Style-Based Generator Empowers Robust Face Swapping (ECCV 2022)
This repository contains the code for "A Lip Sync Expert Is All You Need for Speech to Lip Generation In the Wild", published at ACM Multimedia 2020. For the HD commercial model, try Sync Labs.
[CVPR 2023] SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation
MuseTalk: Real-Time High Quality Lip Synchronization with Latent Space Inpainting
The all-in-one Desktop & Docker AI application with built-in RAG, AI agents, and more.
20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale.
Creates a complete full-text historical archive for an RSS or Atom feed.
🔥 Turn entire websites into LLM-ready markdown or structured data. Scrape, crawl and extract with a single API.
🖧🔍 Wi-Fi / LAN intruder detector. Scans for devices connected to your network and alerts you if new or unknown devices are found.