Stars
A quick guide, especially for trending instruction finetuning datasets
RewardBench: the first evaluation tool for reward models.
Reading notes on a wide range of research topics: domain generalization, domain adaptation, causality, robustness, prompting, optimization, and generative models
[ICML 2024] LESS: Selecting Influential Data for Targeted Instruction Tuning
This is the homepage of a new book entitled "Mathematical Foundations of Reinforcement Learning."
Recipes to train reward models for RLHF.
A recipe for online RLHF and online iterative DPO.
An index of algorithms for reinforcement learning from human feedback (RLHF)
Awesome papers in LLM interpretability
Official implementation of ICLR'24 paper, "Curiosity-driven Red Teaming for Large Language Models" (https://openreview.net/pdf?id=4KqkizXgXU)
MII makes low-latency and high-throughput inference possible, powered by DeepSpeed.
A high-throughput and memory-efficient inference and serving engine for LLMs
An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & RingAttention)
Large-scale, Informative, and Diverse Multi-round Chat Data (and Models)
InsTag: A Tool for Data Analysis in LLM Supervised Fine-tuning
🦜🔗 Build context-aware reasoning applications
Collection of training data management explorations for large language models
Code and documentation to train Stanford's Alpaca models, and generate the data.
A curated list of safety-related papers, articles, and resources focused on Large Language Models (LLMs). This repository aims to provide researchers, practitioners, and enthusiasts with insights i…
Let ChatGPT teach your own chatbot in hours with a single GPU!
The Learning Interpretability Tool: Interactively analyze ML models to understand their behavior in an extensible and framework agnostic interface.
The official repository for the paper "LLMaAA: Making Large Language Models as Active Annotators"
The official Python library for the OpenAI API
An unofficial Python package that returns responses from Google Bard via a cookie value.
A series of large language models developed by Baichuan Intelligent Technology