Starred repositories
Triton Model Analyzer is a CLI tool that helps you understand the compute and memory requirements of Triton Inference Server models.
Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.
Use PEFT or Full-parameter to finetune 400+ LLMs or 100+ MLLMs. (LLM: Qwen2.5, Llama3.2, GLM4, Internlm2.5, Yi1.5, Mistral, Baichuan2, DeepSeek, Gemma2, ...; MLLM: Qwen2-VL, Qwen2-Audio, Llama3.2-V…
Finetune Llama 3.2, Mistral, Phi, Qwen 2.5 & Gemma LLMs 2-5x faster with 80% less memory
Convert PDF to markdown quickly with high accuracy
SearXNG is a free internet metasearch engine which aggregates results from various search services and databases. Users are neither tracked nor profiled.
A High-efficiency Open-source Toolkit for Table-to-Latex Task
trzsz is a simple file transfer tool, similar to lrzsz ( rz / sz ), and compatible with tmux.
A Comprehensive Toolkit for High-Quality PDF Content Extraction
A curated list of resources on the topic of Document Understanding (DU)
The Open-Source Data Annotation Platform
A lightweight, fast, and secure code execution environment that supports multiple programming languages
A simple, easy-to-hack GraphRAG implementation
[ACL24] Official repo for "Synthesizing Text-to-SQL Data from Weak and Strong LLMs"
This repository showcases various advanced techniques for Retrieval-Augmented Generation (RAG) systems. RAG systems combine information retrieval with generative models to provide accurate and cont…
MindSpore online courses: Step into LLM
🔥🕷️ Crawl4AI: Open-source LLM Friendly Web Crawler & Scraper
Object Detection toolkit based on PaddlePaddle. It supports object detection, instance segmentation, multiple object tracking and real-time multi-person keypoint detection.
A Unified Toolkit for Deep Learning Based Document Image Analysis
A high-quality tool for converting PDF to Markdown and JSON. A one-stop, open-source, high-quality data extraction tool that converts PDF into Markdown and JSON formats.
Fast numerical array expression evaluator for Python, NumPy, Pandas, PyTables and more
Chat with your database (SQL, CSV, pandas, polars, mongodb, noSQL, etc). PandasAI makes data analysis conversational using LLMs (GPT 3.5 / 4, Anthropic, VertexAI) and RAG.
Perplexica is an AI-powered search engine. It is an open-source alternative to Perplexity AI.
The flexible backend for all your projects 🐰 Turn your DB into a headless CMS, admin panels, or apps with a custom UI, instant APIs, auth & more.
The official implementation of RAPTOR: Recursive Abstractive Processing for Tree-Organized Retrieval
Deploy your agentic workflows to production
A modular graph-based Retrieval-Augmented Generation (RAG) system
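cube studio: an open-source, cloud-native, one-stop machine learning / deep learning / large-model AI platform, supporting SSO login, multi-tenancy, big data platform integration, online notebook development, drag-and-drop pipeline orchestration, multi-node multi-GPU distributed training, hyperparameter search, VGPU inference serving, edge computing, serverless, an annotation platform, automated annotation, dataset management, large-model fine-tuning, vLLM large-model inference, LLMOps, a private knowledge base, an AI model app store, one-click model development/inference/fine-tuning,…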