
A quick guide (especially) for trending instruction finetuning datasets

2,608 168 Updated Nov 28, 2023

RewardBench: the first evaluation tool for reward models.

Python 425 50 Updated Oct 23, 2024

MOSS-RLHF

Python 1,291 101 Updated Mar 3, 2024

Reading notes on a wide range of research on domain generalization, domain adaptation, causality, robustness, prompt, optimization, and generative models

1,170 101 Updated Dec 14, 2023

[ICML 2024] LESS: Selecting Influential Data for Targeted Instruction Tuning

Jupyter Notebook 368 32 Updated Oct 20, 2024

This is the homepage of a new book entitled "Mathematical Foundations of Reinforcement Learning."

MATLAB 3,818 514 Updated Nov 6, 2024

Recipes to train reward model for RLHF.

Python 789 66 Updated Nov 8, 2024

A recipe for online RLHF and online iterative DPO.

Python 414 46 Updated Nov 8, 2024

LaTeX template for Hunan University undergraduate theses (science category)

TeX 2 2 Updated May 5, 2024

An index of algorithms for reinforcement learning from human feedback (RLHF)

87 1 Updated Apr 17, 2024

awesome papers in LLM interpretability

315 12 Updated Nov 8, 2024

Deep Reinforcement Learning

3,330 587 Updated Dec 10, 2022
Python 2,501 305 Updated May 19, 2024

Official implementation of ICLR'24 paper, "Curiosity-driven Red Teaming for Large Language Models" (https://openreview.net/pdf?id=4KqkizXgXU)

Jupyter Notebook 60 10 Updated Mar 15, 2024

MII makes low-latency and high-throughput inference possible, powered by DeepSpeed.

Python 1,891 176 Updated Nov 8, 2024

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 29,834 4,504 Updated Nov 9, 2024

An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & RingAttention)

Python 2,536 247 Updated Nov 9, 2024

Large-scale, Informative, and Diverse Multi-round Chat Data (and Models)

Python 2,250 115 Updated Mar 13, 2024

InsTag: A Tool for Data Analysis in LLM Supervised Fine-tuning

214 7 Updated Aug 20, 2023

🦜🔗 Build context-aware reasoning applications

Jupyter Notebook 94,577 15,304 Updated Nov 9, 2024

Collection of training data management explorations for large language models

282 29 Updated Aug 2, 2024

Code and documentation to train Stanford's Alpaca models, and generate the data.

Python 29,533 4,050 Updated Jul 17, 2024

A curated list of safety-related papers, articles, and resources focused on Large Language Models (LLMs). This repository aims to provide researchers, practitioners, and enthusiasts with insights i…

975 54 Updated Nov 8, 2024

Let ChatGPT teach your own chatbot in hours with a single GPU!

Python 3,166 285 Updated Mar 17, 2024

The Learning Interpretability Tool: Interactively analyze ML models to understand their behavior in an extensible and framework agnostic interface.

TypeScript 3,490 356 Updated Nov 6, 2024

The official repository for paper "LLMaAA: Making Large Language Models as Active Annotators"

Python 34 3 Updated Apr 14, 2024

The official Python library for the OpenAI API

Python 22,899 3,204 Updated Nov 6, 2024

The unofficial python package that returns response of Google Bard through cookie value.

Python 5,325 527 Updated Apr 24, 2024

A series of large language models developed by Baichuan Intelligent Technology

Python 4,086 295 Updated Nov 8, 2024