Osuke Sashida O-suke12

BSc in CSE | Intern @ghelia

2 followers · 14 following

Waseda University
Tokyo
https://o-suke12.github.io/

Stars

nikhilbarhate99 / PPO-PyTorch

Minimal implementation of clipped objective Proximal Policy Optimization (PPO) in PyTorch

Python 1,559 331 Updated Dec 8, 2023

glgh / awesome-llm-human-preference-datasets

A curated list of Human Preference Datasets for LLM fine-tuning, RLHF, and eval.

276 12 Updated Oct 4, 2023

eric-mitchell / direct-preference-optimization

Reference implementation for DPO (Direct Preference Optimization)

Python 1,849 143 Updated May 23, 2024

tanelp / tiny-diffusion

A minimal PyTorch implementation of probabilistic diffusion models for 2D datasets.

Jupyter Notebook 581 53 Updated May 7, 2024

karpathy / nanoGPT

The simplest, fastest repository for training/finetuning medium-sized GPTs.

Python 34,564 5,320 Updated Jun 27, 2024

kzl / decision-transformer

Official codebase for Decision Transformer: Reinforcement Learning via Sequence Modeling.

Python 2,253 427 Updated Apr 29, 2024

PKU-MARL / Multi-Agent-Transformer

Python 306 56 Updated Dec 28, 2023

VT-Collab / RILI_co-adaptation

This is the repository for our paper "Learning Latent Representations to Co-Adapt to Humans" [https://arxiv.org/abs/2212.09586]

Python 1 6 Updated Apr 11, 2024

ethanyanjiali / minChatGPT

A minimum example of aligning language models with RLHF similar to ChatGPT

Python 206 28 Updated Sep 26, 2023

freeCodeCamp / how-to-contribute-to-open-source

A guide to contributing to open source

Ruby 8,617 1,792 Updated Jun 27, 2024

facebookresearch / BenchMARL

A collection of MARL benchmarks based on TorchRL

Python 186 22 Updated Jul 8, 2024

liyang619 / COLE-Platform

Overcooked human-AI experiment platform

Python 30 3 Updated Dec 21, 2023

LxzGordon / PECAN

JavaScript 12 2 Updated Jan 4, 2024

samjia2000 / HSP

This is a repository for Hidden-utility Self-Play.

JavaScript 26 1 Updated Jul 27, 2023

51616 / marl-lipo

Official codebase for Generating Diverse Cooperative Agents by Learning Incompatible Policies (notable-top-25% @ ICLR 2023)

Python 15 1 Updated May 10, 2024

Farama-Foundation / Minari

A standard format for offline reinforcement learning datasets, with popular reference datasets and related utilities

Python 237 38 Updated Jul 8, 2024

zchoi / Awesome-Embodied-Agent-with-LLMs

This is a curated list of "Embodied AI or robot with Large Language Models" research. Watch this repository for the latest updates!

723 38 Updated Jul 1, 2024

bic4907 / Overcooked-AI

Offline Multi-Agent Reinforcement Learning Implementations: Solving Overcooked Game with Data-Driven Method

Python 29 4 Updated Dec 15, 2021

HumanCompatibleAI / overcooked_ai

A benchmark environment for fully cooperative human-AI performance.

Jupyter Notebook 656 137 Updated Jun 25, 2024

hanjuku-kaso / awesome-offline-rl

An index of algorithms for offline reinforcement learning (offline-rl)

883 86 Updated May 23, 2024

bwang514 / awesome-HAI

A curated list of awesome Human-AI Interaction

293 40 Updated May 12, 2021

vwxyzjn / cleanrl

High-quality single file implementation of Deep Reinforcement Learning algorithms with research-friendly features (PPO, DQN, C51, DDPG, TD3, SAC, PPG)

Python 4,849 563 Updated Jul 2, 2024

vwxyzjn / ppo-implementation-details

The source code for the blog post The 37 Implementation Details of Proximal Policy Optimization

Python 580 87 Updated Mar 23, 2024