Cherry-CBS

1 follower · 0 following

Stars

RLHFlow / RLHF-Reward-Modeling

Recipes to train reward model for RLHF.

Python 586 49 Updated Aug 28, 2024

PKU-Alignment / safe-rlhf

Safe RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback

Python 1,284 118 Updated Jun 13, 2024

microsoft / DeepSpeedExamples

Example models using DeepSpeed

Python 5,974 1,008 Updated Aug 28, 2024

jianzhnie / awesome-instruction-datasets

A collection of awesome-prompt-datasets and awesome-instruction-datasets for training ChatLLMs such as ChatGPT. Gathers a wide variety of instruction datasets for training ChatLLM models.

485 23 Updated Apr 7, 2024

openai / summarize-from-feedback

Code for "Learning to summarize from human feedback"

Python 972 143 Updated Sep 5, 2023

jackaduma / ChatGLM-LoRA-RLHF-PyTorch

A full pipeline to finetune ChatGLM LLM with LoRA and RLHF on consumer hardware. Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the ChatGLM architecture. Basically Ch…

Python 121 10 Updated Apr 28, 2023

lucidrains / PaLM-rlhf-pytorch

Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the PaLM architecture. Basically ChatGPT but with PaLM

Python 7,670 668 Updated Jan 14, 2024

huggingface / trl

Train transformer language models with reinforcement learning.

Python 9,156 1,144 Updated Aug 31, 2024

HumanSignal / RLHF

Collection of links, tutorials, and best practices on how to collect the data and build an end-to-end RLHF system to finetune generative AI models

Jupyter Notebook 163 35 Updated Jul 24, 2023

tatsu-lab / stanford_alpaca

Code and documentation to train Stanford's Alpaca models, and generate the data.

Python 29,309 4,024 Updated Jul 17, 2024

OpenRLHF / OpenRLHF

An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & Mixtral)

Python 1,958 191 Updated Aug 29, 2024

opendilab / awesome-RLHF

A curated list of reinforcement learning with human feedback resources (continually updated)

3,194 201 Updated Aug 30, 2024

ericyangyu / PPO-for-Beginners

A simple and well styled PPO implementation. Based on my Medium series: https://medium.com/@eyyu/coding-ppo-from-scratch-with-pytorch-part-1-4-613dfc1b14c8.

Python 711 106 Updated Dec 22, 2023

MorvanZhou / Reinforcement-learning-with-tensorflow

Simple reinforcement learning tutorials, 莫烦Python (Morvan Python) Chinese-language AI tutorials

Python 8,818 4,999 Updated Mar 31, 2024

EbookFoundation / bookshelf-management

Application for managing bookshelves on the Project Gutenberg site.

Python 12 11 Updated May 19, 2021

getaurora / download

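Aurora official release download page: censorship-circumvention proxy, international network accelerator, VPN, and router support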

3,673 457 Updated Aug 13, 2024

addisonlynch / iexfinance

Python SDK for IEX Cloud

Python 649 136 Updated Apr 6, 2022

pydata / pandas-datareader

Extract data from a wide range of Internet sources into a pandas DataFrame.

Python 2,900 681 Updated Aug 8, 2024

hongtaocai / googlefinance

Python module to get real-time stock data from Google Finance API

Python 700 172 Updated Sep 23, 2018

AI4Finance-Foundation / FinRL-Trading

For trading. Please star.

Jupyter Notebook 1,995 726 Updated Jul 1, 2024

huseinzol05 / Stock-Prediction-Models

Gathers machine learning and deep learning models for Stock forecasting including trading bots and simulations

Jupyter Notebook 7,824 2,785 Updated Apr 16, 2023