Stars
A one-stop data processing system to make data higher-quality, juicier, and more digestible for (multimodal) LLMs! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷
ModelScope-Agent: An agent framework connecting models in ModelScope with the world
A modular graph-based Retrieval-Augmented Generation (RAG) system
Build AI Agents with memory, knowledge, tools and reasoning. Chat with them using a beautiful Agent UI.
DSPy: The framework for programming—not prompting—foundation models
Agent framework and applications built upon Qwen>=2.0, featuring Function Calling, Code Interpreter, RAG, and Chrome extension.
Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting yo…
MiniCPM3-4B: An edge-side LLM that surpasses GPT-3.5-Turbo.
LlamaIndex is a data framework for your LLM applications
High accuracy RAG for answering questions from scientific documents with citations
A list of awesome papers and resources on recommender systems with large language models (LLMs).
Unified Efficient Fine-Tuning of 100+ LLMs (ACL 2024)
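A curated collection of open-source Chinese large language models, focusing on smaller models that can be privately deployed and are cheaper to train, covering base models, domain-specific fine-tuning and applications, datasets, and tutorials.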
JARVIS, a system to connect LLMs with ML community. Paper: https://arxiv.org/pdf/2303.17580.pdf
🦜🔗 Build context-aware reasoning applications
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
Chinese LLaMA & Alpaca LLMs, with local CPU/GPU training and deployment
ChatGLM-6B: An Open Bilingual Dialogue Language Model
TextGen: Implementation of text generation models, including LLaMA, ChatGLM, BLOOM, GPT2, Seq2Seq, BART, T5, UDA, SongNet, and so on, with training and prediction that work out of the box.
The ChatGPT Retrieval Plugin lets you easily find personal or work documents by asking questions in natural language.
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
Code and documentation to train Stanford's Alpaca models, and generate the data.
A Chinese-language guide to prompting ChatGPT. Usage guides for various scenarios. Learn how to make it do what you say.
OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and retrieve information dynamically to do so.
A curated list of research papers in Sentence Representation Learning and an STS leaderboard of sentence embeddings.
Open-source release of the MuCGEC Chinese error correction dataset and SOTA text correction models; code and data for our NAACL 2022 paper "MuCGEC: a Multi-Reference Multi-Source Evaluation Dataset for Chinese Grammatical Error Correction"
Pretrain, finetune ANY AI model of ANY size on multiple GPUs, TPUs with zero code changes.
A system for quickly generating training data with weak supervision