Stars
A one-stop data processing system to make data higher-quality, juicier, and more digestible for LLMs! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷
A series of large language models developed by Baichuan Intelligent Technology
Chinese LLaMA & Alpaca large language models, with local CPU/GPU training and deployment
pycorrector is a toolkit for text error correction. It applies models such as Kenlm, T5, MacBERT, ChatGLM3, and LLaMA to error correction scenarios and works out of the box.
jcorrector: a Chinese text error correction tool with spelling check
Continuation of Clash Verge - A Clash Meta GUI based on Tauri (Windows, MacOS, Linux)
A GUI client for Windows that supports Xray core, v2fly core, and others
Minimalistic large language model 3D-parallelism training
A series of large language models trained from scratch by developers @01-ai
Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks.
A quick guide (especially) for trending instruction finetuning datasets
The first Chinese chat model fine-tuned specifically for Chinese through ORPO, based on the Meta-Llama-3-8B-Instruct model.
OCR, layout analysis, reading order, line detection in 90+ languages
Convert PDF to markdown quickly with high accuracy
Qwen2 is the large language model series developed by Qwen team, Alibaba Cloud.
FinGPT: Open-Source Financial Large Language Models! Revolutionize 🔥 We release the trained model on HuggingFace.
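MNBVC (Massive Never-ending BT Vast Chinese corpus): an ultra-large-scale Chinese corpus, benchmarked against the 40T of data used to train ChatGPT. The MNBVC dataset covers not only mainstream culture but also niche subcultures and even "Martian script" (火星文) data. It includes plain-text Chinese data in every form: news, essays, novels, books, magazines, papers, scripts, forum posts, wiki, classical poetry, lyrics, product descriptions, jokes, embarrassing anecdotes, chat logs, and more.
BELLE: Be Everyone's Large Language model Engine (an open-source Chinese conversational large language model)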
Implementation of Nougat Neural Optical Understanding for Academic Documents
SimPO: Simple Preference Optimization with a Reference-Free Reward
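Daily updated LLM papers. Subscriptions welcome 👏 If you like it, give it a 🌟
Firefly: a large model training tool that supports training Qwen2, Yi1.5, Phi-3, Llama3, Gemma, MiniCPM, Yi, Deepseek, Orion, Xverse, Mixtral-8x7B, Zephyr, Mistral, Baichuan2, Llama2, Llama, Qwen, Baichuan, ChatGLM2, InternLM, Ziya2, Vicuna, Bloom, and other large models
Firefly Chinese LLaMA-2 large models, supporting incremental pre-training of Baichuan2, Llama2, Llama, Falcon, Qwen, Baichuan, InternLM, Bloom, and other large models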
[ACL 2024] LLM2LLM: Boosting LLMs with Novel Iterative Data Enhancement