WenhaoZhang-Git

Starred repositories

A one-stop data processing system to make data higher-quality, juicier, and more digestible for (multimodal) LLMs! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷

Python 2,432 150 Updated Sep 1, 2024

A series of large language models developed by Baichuan Intelligent Technology

Python 4,072 293 Updated Jun 22, 2024

Chinese LLaMA & Alpaca large language models, with local CPU/GPU training and deployment (Chinese LLaMA & Alpaca LLMs)

Python 18,118 1,853 Updated Apr 30, 2024

pycorrector is a toolkit for text error correction. It applies models such as Kenlm, T5, MacBERT, ChatGLM3, and LLaMA to error-correction scenarios and works out of the box.

Python 5,443 1,086 Updated Aug 28, 2024

jcorrector: a Chinese text error correction tool with spelling check

Java 51 14 Updated Jan 18, 2023

A Clash GUI based on tauri. Supports Windows, macOS and Linux.

TypeScript 21,014 3,159 Updated Nov 3, 2023

Continuation of Clash Verge - A Clash Meta GUI based on Tauri (Windows, MacOS, Linux)

TypeScript 31,350 2,396 Updated Aug 23, 2024

A Clash client for Windows with Mihomo support

C# 4,717 585 Updated Jun 29, 2024

A GUI client for Windows that supports Xray core, v2fly core, and others

C# 66,434 11,131 Updated Sep 1, 2024

unified embedding model

Python 809 61 Updated Sep 1, 2023

Converts Microsoft Word docx to LaTeX

XSLT 517 48 Updated Jul 12, 2024

Minimalistic large language model 3D-parallelism training

Python 1,079 103 Updated Sep 1, 2024

A series of large language models trained from scratch by developers @01-ai

Jupyter Notebook 7,572 463 Updated Aug 22, 2024

Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks.

Python 1,897 131 Updated Aug 30, 2024

Python 263 42 Updated Nov 2, 2023

A quick guide (especially) for trending instruction finetuning datasets

2,397 155 Updated Nov 28, 2023

This is the first Chinese chat model specifically fine-tuned for Chinese through ORPO based on the Meta-Llama-3-8B-Instruct model.

300 16 Updated May 6, 2024

OCR, layout analysis, reading order, line detection in 90+ languages

Python 9,711 629 Updated Aug 26, 2024

Convert PDF to markdown quickly with high accuracy

Python 16,042 892 Updated Aug 21, 2024

Qwen2 is the large language model series developed by Qwen team, Alibaba Cloud.

Shell 7,134 422 Updated Aug 30, 2024

FinGPT: Open-Source Financial Large Language Models! Revolutionize 🔥 We release the trained model on HuggingFace.

Jupyter Notebook 13,419 1,873 Updated Jul 18, 2024

A summary of Prompt & LLM papers, open-source data & models, and AIGC applications

2,543 255 Updated Aug 29, 2024

MNBVC project: ShareGPT corpus cleaning

Python 12 Updated Oct 4, 2023

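MNBVC (Massive Never-ending BT Vast Chinese corpus) is an ultra-large-scale Chinese corpus, benchmarked against the 40T of data used to train ChatGPT. The MNBVC dataset covers not only mainstream culture but also various niche cultures, and even text written in "Martian script" (火星文). It includes plain-text Chinese data in every form: news, essays, novels, books, magazines, papers, scripts, forum posts, wiki, classical poetry, lyrics, product descriptions, jokes, embarrassing anecdotes, chat logs, and more.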

3,341 231 Updated Sep 1, 2024

BELLE: Be Everyone's Large Language model Engine (an open-source Chinese conversational large language model)

HTML 7,795 753 Updated Mar 15, 2024

Implementation of Nougat Neural Optical Understanding for Academic Documents

Python 8,675 553 Updated Apr 16, 2024

SimPO: Simple Preference Optimization with a Reference-Free Reward

Python 619 36 Updated Aug 22, 2024

Daily updated LLM papers. Subscriptions welcome 👏 If you like it, give it a 🌟

886 33 Updated Jul 31, 2024

Firefly: a large-model training tool that supports training Qwen2, Yi1.5, Phi-3, Llama3, Gemma, MiniCPM, Yi, Deepseek, Orion, Xverse, Mixtral-8x7B, Zephyr, Mistral, Baichuan2, Llama2, Llama, Qwen, Baichuan, ChatGLM2, InternLM, Ziya2, Vicuna, Bloom, and other large models

Python 5,582 502 Updated Jul 16, 2024

Firefly Chinese LLaMA-2 large models, with support for incremental pre-training of Baichuan2, Llama2, Llama, Falcon, Qwen, Baichuan, InternLM, Bloom, and other large models

Python 394 30 Updated Oct 21, 2023