Minimal implementation of clipped objective Proximal Policy Optimization (PPO) in PyTorch

Python 1,559 331 Updated Dec 8, 2023

A curated list of Human Preference Datasets for LLM fine-tuning, RLHF, and eval.

276 12 Updated Oct 4, 2023

Reference implementation for DPO (Direct Preference Optimization)

Python 1,849 143 Updated May 23, 2024

A minimal PyTorch implementation of probabilistic diffusion models for 2D datasets.

Jupyter Notebook 581 53 Updated May 7, 2024

The simplest, fastest repository for training/finetuning medium-sized GPTs.

Python 34,564 5,320 Updated Jun 27, 2024

Official codebase for Decision Transformer: Reinforcement Learning via Sequence Modeling.

Python 2,253 427 Updated Apr 29, 2024

This is the repository for our paper "Learning Latent Representations to Co-Adapt to Humans" [https://arxiv.org/abs/2212.09586]

Python 1 6 Updated Apr 11, 2024

A minimum example of aligning language models with RLHF similar to ChatGPT

Python 206 28 Updated Sep 26, 2023

A guide to contributing to open source

Ruby 8,617 1,792 Updated Jun 27, 2024

A collection of MARL benchmarks based on TorchRL

Python 186 22 Updated Jul 8, 2024

Overcooked human-AI experiment platform

Python 30 3 Updated Dec 21, 2023
JavaScript 12 2 Updated Jan 4, 2024

This is a repository for Hidden-utility Self-Play.

JavaScript 26 1 Updated Jul 27, 2023

Official codebase for Generating Diverse Cooperative Agents by Learning Incompatible Policies (notable-top-25% @ ICLR 2023)

Python 15 1 Updated May 10, 2024

A standard format for offline reinforcement learning datasets, with popular reference datasets and related utilities

Python 237 38 Updated Jul 8, 2024

This is a curated list of "Embodied AI or robot with Large Language Models" research. Watch this repository for the latest updates!

723 38 Updated Jul 1, 2024

Offline Multi-Agent Reinforcement Learning Implementations: Solving Overcooked Game with Data-Driven Method

Python 29 4 Updated Dec 15, 2021

A benchmark environment for fully cooperative human-AI performance.

Jupyter Notebook 656 137 Updated Jun 25, 2024

An index of algorithms for offline reinforcement learning (offline-rl)

883 86 Updated May 23, 2024

A curated list of awesome Human-AI Interaction

293 40 Updated May 12, 2021

High-quality single file implementation of Deep Reinforcement Learning algorithms with research-friendly features (PPO, DQN, C51, DDPG, TD3, SAC, PPG)

Python 4,849 563 Updated Jul 2, 2024

The source code for the blog post The 37 Implementation Details of Proximal Policy Optimization

Python 580 87 Updated Mar 23, 2024