High performance RL library for robot manipulation task given sparse reward.

jc-bao/srl

Sparse-RL

| Note: Please use Isaac Gym v3 rather than v4 to avoid robot control issues.

Features

  • Fast: fully GPU-based pipeline (Isaac support, fast buffer indexing, fast batch relabeling)
  • Designed for sparse reward (HER)
  • Easy to use: single config file, multi-algorithm support (RedQ, SAC, TD3, DDPG, PPO), resume from cloud, etc.

Use Cases

Basics

python run.py

Examples:

# debug: run the environment check
python run.py +exp=1ho_debug

# train: run the main experiment
python run.py +corl=nho

# eval
python run.py +corl=2ho_render

# env: render the environment with the hand-written policy
python envs/franka_cube.py -e
# env: render the environment with a random policy
python envs/franka_cube.py -r

Curriculum Learning

Add these lines to the config file to enable the automatic curriculum:

curri:
  table_gap: # env parameter to change
    now: 0
    end: 0.2
    step: 0.05
    bar: -0.2 # reward threshold for advancing the curriculum parameter
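
The sketch below shows one way such a config entry could drive a curriculum. It is an illustration, not the library's implementation: the `CurriculumParam` class, its `update` method, and the rule "advance by `step` whenever the mean reward reaches `bar`" are assumptions inferred from the key names and comments above.

```python
from dataclasses import dataclass


@dataclass
class CurriculumParam:
    """Hypothetical holder for one entry under `curri:` (fields mirror the config keys)."""
    now: float   # current value of the env parameter
    end: float   # final value of the env parameter
    step: float  # increment applied on each curriculum advance
    bar: float   # reward threshold that triggers an advance

    def update(self, mean_reward: float) -> bool:
        """Advance `now` toward `end` if the reward bar is cleared.

        Returns True if the parameter changed.
        """
        if mean_reward < self.bar or self.now == self.end:
            return False
        # Move toward `end` in either direction without overshooting.
        if self.end >= self.now:
            self.now = min(self.now + abs(self.step), self.end)
        else:
            self.now = max(self.now - abs(self.step), self.end)
        return True


# Values taken from the config fragment above.
table_gap = CurriculumParam(now=0.0, end=0.2, step=0.05, bar=-0.2)

# Made-up evaluation rewards, only to exercise the update rule.
for reward in [-0.5, -0.1, -0.15, -0.3, 0.0, 0.0, 0.0]:
    table_gap.update(reward)
```

Clamping to `end` keeps floating-point accumulation from pushing the parameter past its target, and the update becomes a no-op once the target is reached.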

Issues

  • HER relabeling leads to fluctuations in success rate
    • check relabel boundary case
    • check relabel reward, mask
    • check trajectory buffer
    • check
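
For reference when working through the checks above, here is a minimal NumPy sketch of "future" HER relabeling with an explicit boundary check. It is not the code used in this repository: the function name `her_relabel`, the array layout, the distance tolerance `tol`, and the 0 / -1 sparse reward convention are all assumptions made for illustration.

```python
import numpy as np


def her_relabel(ag, traj_len, t, rng, tol=0.05):
    """Relabel sampled transitions with the 'future' strategy (illustrative only).

    ag:       (n_traj, max_len + 1, goal_dim) achieved goals, padded past traj_len
    traj_len: (n_traj,) number of valid transitions in each trajectory
    t:        (n_traj,) sampled transition index, 0 <= t < traj_len
    """
    n = np.arange(len(t))
    # Draw the future index from [t + 1, traj_len] so padding is never read.
    future = rng.integers(t + 1, traj_len + 1)
    new_goal = ag[n, future]
    # Assumed sparse reward: 0 if the next achieved goal is within tol, else -1.
    dist = np.linalg.norm(ag[n, t + 1] - new_goal, axis=-1)
    reward = np.where(dist > tol, -1.0, 0.0).astype(np.float32)
    return new_goal, reward, future


rng = np.random.default_rng(0)
max_len, goal_dim = 5, 3
traj_len = np.array([5, 2, 1])            # trajectories of different lengths
ag = rng.normal(size=(3, max_len + 1, goal_dim))
t = rng.integers(0, traj_len)             # one sampled step per trajectory
new_goal, reward, future = her_relabel(ag, traj_len, t, rng)

# Boundary check: the relabeled goal must come from a valid later step.
assert np.all((future > t) & (future <= traj_len))
```

Running the same assertion on batches sampled from the real buffer is one quick way to rule out the boundary case as the source of the fluctuations.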

TODO

L1

  • Add SAC
  • Add PPO
  • Add RedQ
  • Add HER
    • add info (env_step, traj_idx) to HER
    • add trajectory extra info (traj_len, ag_pool) to HER
  • Toy Env
    • reach
    • pnp
    • handover
  • Eval Func (in agent)
  • Normalizer
  • logger
  • check buffer function for multi done collect
  • env reset function
  • Add Hydra
  • sub task evaluation
  • hand-written normalizer data check

L2

  • Add Attention Dense Net
  • Resume run from cloud
  • Add Isaac Env (fast auto reset)
    • PNP
    • Handover
    • Multi Robot Environment
  • Isaac render + viewer
  • Render function
  • merge all buffer together (fast indexing)
  • get_env_params, obs_parser, info_parser function
  • curriculum learning

L3

  • add ray tune (wandb)
  • update according to collected steps
  • merge buffer into agent
  • fix relabel for to generate to left index
  • add vec transitions at a time

Misc

  • resume from remote
  • update env info dim automatically
  • classified log info

Requirements

See requirements.txt.
