Stars
One-stop data intelligence agent that provides insights from all mainstream data formats, including documents, databases, business systems, and images, in a single dialogue box…
Towards Open-source GPT-4o with Vision, Speech and Duplex Capabilities.
"LightRAG: Simple and Fast Retrieval-Augmented Generation"
[SIGIR'2024] "GraphGPT: Graph Instruction Tuning for Large Language Models"
The "virtual_human_stream" project is a real-time digital human system supporting audio-video dialogue. It integrates models like ernerf, musetalk, and wav2lip for voice cloning, video stitching, a…
Hallo2: Long-Duration and High-Resolution Audio-driven Portrait Image Animation
⭐ Dynamically generate stats SVG from your GitHub, LeetCode, Steam, and more in #Cyberpunk style :)
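AI outbound calling system built on natural language processing (NLP), automatic speech recognition (ASR), text-to-speech (TTS), and telephony (freeswitch). It provides automated voice response, switches between listening and speaking states in real time, and communicates with customers through natural, lifelike conversation.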
This is the official reproduction of FancyVideo.
Official implementation for "Identifying and Solving Conditional Image Leakage in Image-to-Video Diffusion Model" (NeurIPS 2024)
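Digital Base (数字底座) is a unified and secure management support platform for the digital transformation of large government bodies and enterprises, built on identity authentication, organizational structure, positions and duties, application systems, resource roles, and related functions. It follows the three-administrator management model, supports microservices, multi-tenancy, containerization, and domestically produced (Chinese) technology stacks, lets users quickly build their own business applications with a code generator, and can be linked to many mature, easy-to-use internal ecosystem applications.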
A Multimodal Native Agent Framework for Smart Hardware and More
A Python library for converting images into FPGA-displayable pixel art.
Nexa SDK is a comprehensive toolkit for supporting ONNX and GGML models. It supports text generation, image generation, vision-language models (VLM), automatic speech recognition (ASR), and text-to-spee…
Awesome LLMs on Device: A Comprehensive Survey
Open-source multimodal large language model that can hear and talk while thinking, featuring real-time, end-to-end conversational capabilities with speech input and streaming audio output.
Next-Generation Interactive Intelligent Programming Assistant
Cocos simplifies game creation and distribution with Cocos Creator, a free, open-source, cross-platform game engine. Empowering millions of developers to create high-performance, engaging 2D/3D gam…
Accelerate your Stable Diffusion inference with the library's universal, cross-platform C/C++ framework design, powered by ONNXRuntime.
eBPF-based Cloud Native Monitoring Tool
The Official Implementation of PyramidKV: Dynamic KV Cache Compression based on Pyramidal Information Funneling
[NeurIPS 2024 Datasets and Benchmarks Track] Closed-Loop E2E-AD Benchmark Enhanced by World Model RL Expert
Open toolkit for creating an AI model that detects and prevents fraudulent transactions for a financial company.