qiugen

shawn_will_be_fine qiugen

6 followers · 9 following

Xiamen University

Stars

hiyouga / LLaMA-Factory

Unified Efficient Fine-Tuning of 100+ LLMs (ACL 2024)

Python 34,024 4,190 Updated Nov 10, 2024

liucongg / NLPDataSet

A record of some datasets I have compiled

1,003 131 Updated Jun 16, 2022

ChenghaoMou / text-dedup

All-in-one text de-duplication

Python 619 71 Updated May 21, 2024

xsysigma / TencentLLMEval

TencentLLMEval is a comprehensive and extensive benchmark for human evaluation of large models that includes task trees, standards, data verification methods, and more.

38 1 Updated Aug 20, 2024

togethercomputer / RedPajama-Data

The RedPajama-Data repository contains code for preparing large datasets for training large language models.

Python 4,569 350 Updated Oct 17, 2024

quqixun / ReadWiki-ZH

Convert WIKI dumped XML (Chinese) to human readable documents in markdown and txt.

Python 6 2 Updated Mar 25, 2020

attardi / wikiextractor

A tool for extracting plain text from Wikipedia dumps

Python 3,748 967 Updated May 23, 2024

brandontrabucco / wikipedia_dataset

This is a repository using the Wiki Extractor to build and prepare WIKIPEDIA for use in tensorflow.

Python 1 Updated Jul 21, 2018

facebookresearch / mlqe

We release a dataset based on Wikipedia sentences and the corresponding translations in 6 different languages along with the scores (scale 1 to 100) generated through human evaluations that represen…

81 14 Updated Aug 31, 2021

tatsu-lab / alpaca_eval

An automatic evaluator for instruction-following language models. Human-validated, high-quality, cheap, and fast.

Jupyter Notebook 1,514 242 Updated Oct 23, 2024

microsoft / Megatron-DeepSpeed

Forked from NVIDIA/Megatron-LM

Ongoing research training transformer language models at scale, including: BERT & GPT-2

Python 1,889 344 Updated Oct 18, 2024

HIT-SCIR / huozi

Huozi general-purpose large language model

Python 352 21 Updated Sep 12, 2024

immersive-translate / immersive-translate

Immersive Dual Web Page Translation Extension: supports input-box translation, mouse-hover translation, and translation of PDF, Epub, subtitle, and TXT files

14,208 786 Updated Nov 5, 2024

TigerResearch / TigerBot

TigerBot: A multi-language multi-task LLM

Python 2,240 194 Updated Jun 7, 2024

janlle / 12306

12306 ticket-booking program with automatic login and automatic order placement

Python 23 10 Updated Jan 17, 2023

langchain-ai / langchain

🦜🔗 Build context-aware reasoning applications

Jupyter Notebook 94,632 15,313 Updated Nov 9, 2024

LianjiaTech / BELLE

BELLE: Be Everyone's Large Language model Engine (open-source Chinese conversational large language model)

HTML 7,908 759 Updated Oct 16, 2024

qiugen / self-instruct

Forked from yizhongw/self-instruct

Aligning pretrained language models with instruction data generated by themselves.

Python 1 Updated Mar 10, 2023

BrianPulfer / PapersReimplementations

CarperAI / trlx

shengcaishizhan / kkndme_tianya

openai / summarize-from-feedback

anthropics / hh-rlhf

anthropics / ConstitutionalHarmlessnessPaper

PaddlePaddle / Paddle2ONNX

PaddlePaddle / PaddleHelix

microsoft / calculator

Lynten / smt

moses-smt / giza-pp

google-research / bert