Recipes to train reward model for RLHF.

Python 586 49 Updated Aug 28, 2024

Safe RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback

Python 1,284 118 Updated Jun 13, 2024

Example models using DeepSpeed

Python 5,974 1,008 Updated Aug 28, 2024

A collection of prompt and instruction datasets (awesome-prompt-datasets, awesome-instruction-dataset) for training ChatLLM models such as ChatGPT.

485 23 Updated Apr 7, 2024

Code for "Learning to summarize from human feedback"

Python 972 143 Updated Sep 5, 2023

A full pipeline to finetune ChatGLM LLM with LoRA and RLHF on consumer hardware. Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the ChatGLM architecture. Basically Ch…

Python 121 10 Updated Apr 28, 2023

Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the PaLM architecture. Basically ChatGPT but with PaLM

Python 7,670 668 Updated Jan 14, 2024

Train transformer language models with reinforcement learning.

Python 9,156 1,144 Updated Aug 31, 2024

Collection of links, tutorials, and best practices on how to collect data and build an end-to-end RLHF system to fine-tune generative AI models

Jupyter Notebook 163 35 Updated Jul 24, 2023

Code and documentation to train Stanford's Alpaca models, and generate the data.

Python 29,309 4,024 Updated Jul 17, 2024

An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & Mixtral)

Python 1,958 191 Updated Aug 29, 2024

A curated list of reinforcement learning with human feedback resources (continually updated)

3,194 201 Updated Aug 30, 2024

A simple and well-styled PPO implementation. Based on my Medium series: https://medium.com/@eyyu/coding-ppo-from-scratch-with-pytorch-part-1-4-613dfc1b14c8

Python 711 106 Updated Dec 22, 2023

Simple reinforcement learning tutorials from 莫烦Python (Chinese-language AI teaching)

Python 8,818 4,999 Updated Mar 31, 2024

Application for managing bookshelves on the Project Gutenberg site.

Python 12 11 Updated May 19, 2021

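Official release download page for 极光 (Aurora): firewall circumvention, proxy, unrestricted internet access, external network, accelerator, "ladder" (VPN), router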

3,673 457 Updated Aug 13, 2024

Python SDK for IEX Cloud

Python 649 136 Updated Apr 6, 2022

Extract data from a wide range of Internet sources into a pandas DataFrame.

Python 2,900 681 Updated Aug 8, 2024

Python module to get real-time stock data from Google Finance API

Python 700 172 Updated Sep 23, 2018

For trading. Please star.

Jupyter Notebook 1,995 726 Updated Jul 1, 2024

Gathers machine learning and deep learning models for stock forecasting, including trading bots and simulations

Jupyter Notebook 7,824 2,785 Updated Apr 16, 2023