sweetice

😀

Not bad

Johnny He sweetice

A Ph.D. student @ Tuebingen AI Center. Research interests include Reinforcement Learning, Machine Learning, and Deep Learning.

244 followers · 26 following

Tuebingen, Germany
sweetice.github.io

Online-RLHF Public
Forked from RLHFlow/Online-RLHF

A recipe for online RLHF.

Python Updated Jun 20, 2024
trl Public
Forked from huggingface/trl

Train transformer language models with reinforcement learning.

Python Apache License 2.0 Updated Nov 29, 2023
ERC-ECML-23 Public

Anonymous code for ICML submission 45

Python 1 Updated Nov 17, 2023
BEER-ICLR2024 Public

The present anonymous repository serves as a guide for reproducing the results of the "BEER" method proposed in our ICLR submission "Adaptive Regularization of Representation Rank as an Implicit Co…

Python 1 Updated Sep 23, 2023
LLM4Arxiv Public
Forked from xihuai18/arxiv-sanity-x

Python Other Updated Jul 20, 2023
PEER-CVPR23 Public

Authors' implementation of PEER

Python · 8 stars · 1 fork · MIT License · Updated Jul 13, 2023
ColossalAI Public
Forked from hpcaitech/ColossalAI

Making large AI models cheaper, faster and more accessible

Python Apache License 2.0 Updated Mar 29, 2023
dalai_llama Public
Forked from cocktailpeanut/dalai

The simplest way to run LLaMA on your local machine

CSS Updated Mar 25, 2023
Deep-reinforcement-learning-with-pytorch Public

PyTorch implementation of DQN, AC, ACER, A2C, A3C, PG, DDPG, TRPO, PPO, SAC, TD3 and ....

algorithm deep-learning deep-reinforcement-learning pytorch dqn policy-gradient sarsa

Python · 3,759 stars · 837 forks · MIT License · Updated Mar 24, 2023
stanford_alpaca Public
Forked from tatsu-lab/stanford_alpaca

Code and documentation to train Stanford's Alpaca models, and generate the data.

Python Apache License 2.0 Updated Mar 21, 2023
llama Public
Forked from meta-llama/llama

Inference code for LLaMA models

Python GNU General Public License v3.0 Updated Mar 15, 2023
voltron-robotics Public
Forked from siddk/voltron-robotics

Voltron: Language-Driven Representation Learning for Robotics

Python MIT License Updated Feb 27, 2023
RWKV-LM Public
Forked from BlinkDL/RWKV-LM

RWKV is an RNN with transformer-level LLM performance. It can be directly trained like a GPT (parallelizable). So it's combining the best of RNN and transformer - great performance, fast inference, …

Python Apache License 2.0 Updated Feb 17, 2023
ffn_geyang Public
Forked from geyang/ffn

Public Repo for the paper "Overcoming The Spectral-Bias of Neural Value Approximation"

Python Updated Jan 11, 2023
MEPE Public

Official implementation of MEPE

Python Updated Oct 28, 2022
learned-fourier-features Public
Forked from alexlioralexli/learned-fourier-features

Code for the paper "Functional Regularization for Reinforcement Learning via Learned Fourier Features"

Python Updated Oct 2, 2022
LibMTL Public
Forked from median-research-group/LibMTL

A PyTorch Library for Multi-Task Learning

Python MIT License Updated Sep 3, 2022
sweetice.github.io_abondon Public

JavaScript MIT License Updated Sep 1, 2022
sweetice.github.io_old Public

HTML 1 MIT License Updated Jul 17, 2022
reward-surfaces Public
Forked from RyanNavillus/reward-surfaces

Python MIT License Updated May 20, 2022
drqv2 Public
Forked from facebookresearch/drqv2

DrQ-v2: Improved Data-Augmented Reinforcement Learning

Python MIT License Updated Jul 21, 2021
snrl Public
Forked from floringogianu/snrl

Python Updated Jun 26, 2021
TD3_BC Public
Forked from sfujim/TD3_BC

Author's PyTorch implementation of TD3+BC, a simple variant of TD3 for offline RL

Python MIT License Updated Jun 16, 2021
neural-approx-ss-lfi Public
Forked from cyz-ai/neural-approx-ss-lfi

Codes for ICLR 21 paper: Neural Approximate Sufficient Statistics for Implicit Models

Jupyter Notebook Updated Jun 15, 2021
dice_rl Public
Forked from google-research/dice_rl

Python Apache License 2.0 Updated Jun 1, 2021
mpo Public
Forked from daisatojp/mpo

PyTorch Implementation of the Maximum a Posteriori Policy Optimisation

Python GNU General Public License v3.0 Updated May 22, 2021
deep-successor-features-for-transfer Public
Forked from mike-gimelfarb/deep-successor-features-for-transfer

A reusable framework for successor features for transfer in deep reinforcement learning using keras.

Python Other Updated May 11, 2021
pderl Public
Forked from crisbodnar/pderl

Code for "Proximal Distilled Evolutionary Reinforcement Learning", accepted at AAAI 2020

Python Updated Feb 24, 2021
tqc_pytorch_1epo Public
Forked from SamsungLabs/tqc_pytorch

Implementation of Truncated Quantile Critics method for continuous reinforcement learning. https://bayesgroup.github.io/tqc/

Python MIT License Updated Feb 16, 2021
gulf Public
Forked from riejohnson/gulf

GULF: GUided Learning through successive Functional gradient optimization (author implementation of DPCNN included)

Python MIT License Updated Jan 29, 2021
