125 results for source starred repositories

A one-stop data processing system to make data higher-quality, juicier, and more digestible for LLMs! 🍎 🍋 🌽 ➡️ ➡️ 🍸 🍹 🍷

Python 1,653 108 Updated Jun 21, 2024

A series of large language models developed by Baichuan Intelligent Technology

Python 4,018 285 Updated Jun 22, 2024

Chinese LLaMA & Alpaca large language models, plus local CPU/GPU training and deployment

Python 17,808 1,829 Updated Apr 30, 2024

pycorrector is a toolkit for text error correction. It applies models such as Kenlm, T5, MacBERT, ChatGLM3, and LLaMA to error-correction scenarios and works out of the box.

Python 5,305 1,075 Updated May 17, 2024

jcorrector: a Chinese text error correction tool with spelling check

Java 46 14 Updated Jan 18, 2023

Continuation of Clash Verge - A Clash Meta GUI based on Tauri (Windows, MacOS, Linux)

TypeScript 23,975 1,800 Updated Jun 22, 2024

A Clash client for Windows that supports Mihomo

C# 4,572 581 Updated May 5, 2024

A GUI client for Windows that supports Xray core, v2fly core, and others

C# 63,522 10,814 Updated Jun 22, 2024

unified embedding model

Python 778 58 Updated Sep 1, 2023

Converts Microsoft Word docx to LaTeX

XSLT 503 48 Updated Jun 18, 2024

Minimalistic large language model 3D-parallelism training

Python 924 81 Updated Jun 22, 2024

A series of large language models trained from scratch by developers @01-ai

Python 7,400 453 Updated Jun 19, 2024

Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks.

Python 1,701 101 Updated Jun 22, 2024

Python 251 41 Updated Nov 2, 2023

A quick guide (especially) for trending instruction finetuning datasets

2,167 141 Updated Nov 28, 2023

This is the first Chinese chat model specifically fine-tuned for Chinese through ORPO based on the Meta-Llama-3-8B-Instruct model.

277 14 Updated May 6, 2024

OCR, layout analysis, reading order, line detection in 90+ languages

Python 8,830 549 Updated Jun 21, 2024

Convert PDF to markdown quickly with high accuracy

Python 13,132 650 Updated Jun 17, 2024

Qwen2 is the large language model series developed by Qwen team, Alibaba Cloud.

Shell 5,406 293 Updated Jun 21, 2024

FinGPT: Open-Source Financial Large Language Models! Revolutionize 🔥 We release the trained model on HuggingFace.

Jupyter Notebook 12,465 1,751 Updated Jun 16, 2024

A summary of Prompt & LLM papers, open-source data and models, and AIGC applications

2,316 220 Updated Jun 20, 2024

MNBVC project: ShareGPT corpus cleaning

Python 12 Updated Oct 4, 2023

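MNBVC (Massive Never-ending BT Vast Chinese corpus) is a very large-scale Chinese corpus that targets the 40T of data used to train chatGPT. The MNBVC dataset covers not only mainstream culture but also niche subcultures and even "Martian script" (火星文) text. It includes plain-text Chinese data in every form: news, essays, novels, books, magazines, papers, scripts, forum posts, wiki, classical poetry, lyrics, product descriptions, jokes, embarrassing anecdotes, chat logs, and more.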

3,151 219 Updated Jun 18, 2024

BELLE: Be Everyone's Large Language model Engine (open-source Chinese dialogue LLM)

HTML 7,676 743 Updated Mar 15, 2024

Implementation of Nougat Neural Optical Understanding for Academic Documents

Python 8,371 536 Updated Apr 16, 2024

SimPO: Simple Preference Optimization with a Reference-Free Reward

Python 455 28 Updated Jun 2, 2024

Daily updated LLM papers. Subscriptions welcome 👏 If you like it, please give it a 🌟

706 23 Updated Jun 18, 2024

Firefly: an LLM training tool that supports training Qwen2, Yi1.5, Phi-3, Llama3, Gemma, MiniCPM, Yi, Deepseek, Orion, Xverse, Mixtral-8x7B, Zephyr, Mistral, Baichuan2, Llama2, Llama, Qwen, Baichuan, ChatGLM2, InternLM, Ziya2, Vicuna, Bloom, and other large models

Python 5,123 469 Updated Jun 7, 2024

Firefly Chinese LLaMA-2 large model, supporting incremental pre-training of Baichuan2, Llama2, Llama, Falcon, Qwen, Baichuan, InternLM, Bloom, and other large models

Python 381 27 Updated Oct 21, 2023

[ACL 2024] LLM2LLM: Boosting LLMs with Novel Iterative Data Enhancement

Python 125 8 Updated Mar 25, 2024