Starred repositories
This is a user guide for the MiniCPM and MiniCPM-V series of small language models (SLMs) developed by ModelBest. “面壁小钢炮” (ModelBest's "Little Steel Cannon") focuses on achieving exceptional performance on the edge.
KAG is a knowledge-enhanced generation framework based on the OpenSPG engine, used to build rigorous, knowledge-enhanced decision-making and information retrieval services.
Optimized implementation for color-icon-matrix barcodes
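m3u8 [m3u8-downloader]: online video extraction tool for streaming media download, video download, m3u8 download, and Bilibili video download; desktop client for Windows and macOS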
Implementation of UltraPixel: Advancing Ultra-High-Resolution Image Synthesis to New Peaks
An open-source multimodal large language model that can hear and talk while thinking, featuring real-time end-to-end speech input and streaming audio output for conversation.
FaceChain is a deep-learning toolchain for generating your Digital-Twin.
CSGO: Content-Style Composition in Text-to-Image Generation 🔥
Controllable video and image generation: SVD, Animate Anyone, ControlNet, ControlNeXt, LoRA
Official Implementation of "Lumina-mGPT: Illuminate Flexible Photorealistic Text-to-Image Generation with Multimodal Generative Pretraining"
Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model
A novel Multimodal Large Language Model (MLLM) architecture, designed to structurally align visual and textual embeddings.
Unofficial implementation of PuLID (diffusers) for ComfyUI
[NeurIPS 2024] Official code for PuLID: Pure and Lightning ID Customization via Contrastive Alignment
Effortless data labeling with AI support from Segment Anything and other awesome models.
Controllable and fast Text-to-Speech for over 7000 languages!
Outfit Anyone (latest fixed version): Ultra-high quality virtual try-on for Any Clothing and Any Person
An open-source RAG-based tool for chatting with your documents.
Incremental Knowledge Graphs Constructor Using Large Language Models
Finetune Llama 3.1, Mistral, Phi & Gemma LLMs 2-5x faster with 80% less memory
Dead simple FLUX LoRA training UI with LOW VRAM support
LongCite: Enabling LLMs to Generate Fine-grained Citations in Long-context QA
Official repository for paper "MagicMan: Generative Novel View Synthesis of Humans with 3D-Aware Diffusion and Iterative Refinement"
MiniCPM3-4B: An edge-side LLM that surpasses GPT-3.5-Turbo.