Starred repositories
Biterm Topic Model (BTM): modeling topics in short texts
GLM-4 series: Open Multilingual Multimodal Chat LMs
GPT4V-level open-source multi-modal model based on Llama3-8B
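All NLP you Need Here. Currently includes PyTorch implementations of 15 NLP demos (much of the code borrows from other open-source projects; it started as a personal playground and was later open-sourced).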
SuperSonic is the next-generation BI+AI platform that unifies Chat BI (powered by LLM) and Headless BI (powered by semantic layer) paradigms.
DuckDB is an analytical in-process SQL database management system
Get up and running with Llama 3.2, Mistral, Gemma 2, and other large language models.
Time series forecasting with PyTorch
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
bert-base-chinese example
A Streamlit component to render ECharts.
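A collection of concepts and explicit expression patterns for Chinese compound event extraction, later evolved into ComplexEventGraph. The project proposes the concept and explicit patterns of Chinese compound events, covers the extraction of conditional, causal, sequential, and reversal events, and builds an event logic graph from them.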
💫 Industrial-strength Natural Language Processing (NLP) in Python
Use PEFT or Full-parameter to finetune 400+ LLMs or 100+ MLLMs. (LLM: Qwen2.5, Llama3.2, GLM4, Internlm2.5, Yi1.5, Mistral, Baichuan2, DeepSeek, Gemma2, ...; MLLM: Qwen2-VL, Qwen2-Audio, Llama3.2-V…
MiniCPM3-4B: An edge-side LLM that surpasses GPT-3.5-Turbo.
An Autonomous LLM Agent for Complex Task Solving
Retrieval and Retrieval-augmented LLMs
LlamaIndex is a data framework for your LLM applications
Chinese LLaMA-2 & Alpaca-2 LLMs (phase two of the project), with 64K long-context models
LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
A lightweight framework for building LLM-based agents
Official release of InternLM2.5 base and chat models. 1M context support
State-of-the-Art Text Embeddings
A state-of-the-art-level open visual language model | multimodal pretrained model
AI Native Data App Development framework with AWEL (Agentic Workflow Expression Language) and Agents