RialTo Policy Learning

This repository provides the official implementation of the RialTo system, as proposed in Reconciling Reality through Simulation: A Real-to-Sim-to-Real approach for Robust Manipulation The manuscript is available on arXiv. See the project page

If you use this codebase, please cite

@article{torne2024reconciling,
  title={Reconciling Reality through Simulation: A Real-to-Sim-to-Real Approach for Robust Manipulation},
  author={Torne, Marcel and Simeonov, Anthony and Li, Zechu and Chan, April and Chen, Tao and Gupta, Abhishek and Agrawal, Pulkit},
  journal={arXiv preprint arXiv:2403.03949},
  year={2024}
}

Installation

Download omniverse: https://www.nvidia.com/en-us/omniverse/
Install isaac-sim 2022.2.1 (from omniverse launcher)

Launch isaac-sim to complete the installation

Install orbit https://docs.omniverse.nvidia.com/isaacsim/latest/ext_omni_isaac_orbit.html

git clone [email protected]:NVIDIA-Omniverse/orbit.git
git checkout f2d97bdcddb3005d17d0ebd1546c7064bc7ae8bc

export ISAACSIM_PATH=<path to isaacsim>

# enter the cloned repository
cd orbit
# create a symbolic link
ln -s ${ISAACSIM_PATH} _isaac_sim

Create conda environment from orbit

./orbit.sh --conda isaac-sim
conda activate isaac-sim
orbit -i
orbit -e

Make sure orbit was correctly set up:

python -c "import omni.isaac.orbit; print('Orbit configuration is now complete.')"

Clone repo

git clone [email protected]:real-to-sim-to-real/RialToPolicyLearning.git
git checkout dev-massive-huge
cd PolicyLearning
pip install -r requirements.txt
pip install torch-scatter -f https://data.pyg.org/whl/torch-1.13.0+cu117.html --no-index
pip install torch-cluster -f https://data.pyg.org/whl/torch-1.13.0+cu117.html --no-index

conda install conda-build
conda develop dependencies

You should be ready to go :) Test it our running the teleoperation script below

Pipeline:

https://github.com/real-to-sim-to-real/RialToPolicyLearning/blob/main/materials/teaserv14.pdf

Create your environment using the RialTo GUI

Running the code

Collect teleoperation data in sim

booknshelf

python collect_demos_teleop_franka.py --env_name=isaac-env --img_width=1024 --img_height=1024 --demo_folder=booknshelvenew --max_path_length=150 --extra_params=booknshelve,booknshelve,booknshelve_debug_mid_randomness --usd_path=/home/marcel/USDAssets/scenes --offset=1

mugandshelf

python collect_demos_teleop_franka.py --env_name=isaac-env --img_width=1024 --img_height=1024 --demo_folder=mugandshelfnew2 --max_path_length=150 --extra_params=mugandshelf,mugandshelf_mid_rot_randomness --usd_path=/home/marcel/USDAssets/scenes

cabinet

python collect_demos_teleop_franka.py --env_name=isaac-env --img_width=1024 --img_height=1024 --demo_folder=cabinet --max_path_length=150 --extra_params=cabinet,cabinet_mid_randomness --usd_path=/home/marcel/USDAssets/scenes

drawer

python collect_demos_teleop_franka.py --env_name=isaac-env --img_width=1024 --img_height=1024 --demo_folder=drawertras --max_path_length=150 --extra_params=wooden_drawer_bigger,drawer_debug_high_randomness --usd_path=/home/marcel/USDAssets/scenes

dishinrack

python collect_demos_teleop_franka.py --env_name=isaac-env --img_width=1024 --img_height=1024 --demo_folder=dishinrackv3 --max_path_length=200 --extra_params=dishnrackv2,dishnrack_high_randomness,no_action_rand

kitchentoaster

python collect_demos_teleop_franka.py --env_name=isaac-env --img_width=1024 --img_height=1024 --demo_folder=kitchentoaster --max_path_length=150 --extra_params=kitchentoaster,kitchentoaster_randomness,no_action_rand

RL-finetuning

cabinet

python launch_ppo.py --num_envs=4096 --n_steps=90 --max_path_length=90 --usd_path=/home/marcel/USDAssets/scenes --bc_loss --bc_coef=0.1 --filename=isaac-envcabinet --extra_params=cabinet,cabinet_mid_rot_randomness,no_action_rand --datafolder=demos --num_demos=12 --ppo_batch_size=31257 --run_path=locobot-learn/cabinet.usdppo-finetune/rw4wp46f --model_name=model_policy_567 --from_ppo

mugandshelf

python launch_ppo.py --num_envs=4096 --n_steps=150 --max_path_length=150 --bc_loss --bc_coef=0.1 --filename=isaac-envmugandshelf --extra_params=mugandshelf,mugandshelf_mid_rot_randomness,no_action_rand --datafolder=demos --num_demos=6 --ppo_batch_size=31257 --model_name=model_policy_279 --run_path=locobot-learn/mugandshelf.usdppo-finetune/ef5nbwyb --from_ppo

booknshelve

python launch_ppo.py --num_envs=4096 --n_steps=130 --max_path_length=130 --extra_params=booknshelve,booknshelve,booknshelve_debug_high_rot_randomness,no_action_rand --bc_loss --bc_coef=0.1 --filename=booknshelvesreal2sim --datafolder=/data/pulkitag/data/marcel/data --num_demos=14 --ppo_batch_size=31257 --num_envs=2048

kitchen toaster

python launch_ppo.py --num_envs=4096 --n_steps=130 --max_path_length=130 --extra_params=kitchentoaster,kitchentoaster_randomness,no_action_rand --bc_loss --bc_coef=0.1 --filename=isaac-envkitchentoaster --datafolder=demos --num_demos=12 --ppo_batch_size=31257 --num_envs=2048 --usd_path=/home/marcel/USDAssets/scenes

dishsinklab

python launch_ppo.py --num_envs=4096 --n_steps=110 --max_path_length=110 --extra_params=dishsinklab,dishsinklab_low_randomness,no_action_rand --bc_loss --bc_coef=0.1 --filename=isaac-envdishsinklablow --datafolder=demos --num_demos=15 --ppo_batch_size=31257 --num_envs=2048 --usd_path=/home/marcel/USDAssets/scenes

Visualize PPO

python distillation.py --extra_params=booknshelve,booknshelve,booknshelve_debug_high_rot_randomness,no_action_rand --max_path_length=130 --filename=trash --run_path=locobot-learn/booknshelve.usdppo-finetune/ruk0ihdc --model_name=policy_finetune_step_210 --datafolder=/data/pulkitag/data/marcel/data/ --visualize_traj --max_demos=10 --sensors=rgb,pointcloud

Distillation from synthetic pointclouds

booknshelve

python distillation.py --extra_params=booknshelve,booknshelve,booknshelve_debug_high_rot_randomness,no_action_rand --max_path_length=130 --filename=booknshelvenewsynthetic --run_path=locobot-learn/booknshelve.usdppo-finetune/ruk0ihdc --model_name=policy_finetune_step_210 --datafolder=/data/pulkitag/data/marcel/data/ --use_synthetic_pcd --max_demos=15000 --eval_freq=1 --use_state --visualize_traj

mugandshelf

python distillation.py --max_path_length=150 --extra_params=mugandshelf,mugandshelf_mid_rot_randomness,no_action_rand --filename=mugandshelfsynthetic --run_path=locobot-learn/mugandshelf.usdppo-finetune/ke2wi6xb --model_name=policy_finetune_step_369 --datafolder=/data/pulkitag/data/marcel/data/ --use_synthetic_pcd --max_demos=15000 --eval_freq=1 --use_state --visualize_traj

cabinet

python distillation.py --max_path_length=90 --extra_params=cabinet,cabinet_mid_rot_randomness,no_action_rand --filename=cabinetsynthetic --run_path=locobot-learn/cabinet.usdppo-finetune/uqn3jbmg --model_name=policy_finetune_step_393 --datafolder=/data/pulkitag/data/marcel/data/ --use_synthetic_pcd --max_demos=15000 --eval_freq=1 --use_state --visualize_traj

mugupright

python distillation.py --max_path_length=100 --extra_params=mugupright,mugupright_mid_randomness,no_action_rand --filename=muguprightsynthetic --run_path=locobot-learn/mugupright.usdppo-finetune/5vocshfn --model_name=model_policy_830 --from_ppo --datafolder=/data/pulkitag/data/marcel/data/ --use_synthetic_pcd --max_demos=15000 --eval_freq=1 --use_state --visualize_traj --usd_path=/home/marcel/USDAssets/scenes

Distillation from sim pcd

cabinet

python distillation.py --max_path_length=90 --extra_params=cabinet,cabinet_mid_rot_randomness,no_action_rand --filename=cabinetsimpcd --run_path=locobot-learn/cabinet.usdppo-finetune/uqn3jbmg --model_name=policy_finetune_step_393 --datafolder=/data/pulkitag/data/marcel/data/ --max_demos=2500 --eval_freq=1 --use_state --visualize_traj --policy_batch_size=32 --num_envs=9 --run_path_student=locobot-learn/distillation_cabinet/47r0waz0 --model_name_student=policy_distill_step_71

mugupright

python distillation.py --max_path_length=100 --extra_params=mugupright,mugupright_mid_randomness,no_action_rand --filename=muguprightsimpcd --run_path=locobot-learn/mugupright.usdppo-finetune/5vocshfn --model_name=model_policy_830 --from_ppo --datafolder=/home/marcel/data/ --max_demos=2500 --eval_freq=1 --use_state --visualize_traj --num_envs=12 --usd_path=/home/marcel/USDAssets/scenes --run_path_student=locobot-learn/distillation_mugupright/tjlds4ds --model_name_student=policy_distill_step_407

mugandshelf

python distillation.py --max_path_length=150 --extra_params=mugandshelf,mugandshelf_mid_rot_randomness,no_action_rand --filename=mugandshelfsimpcd --run_path=locobot-learn/mugandshelf.usdppo-finetune/ke2wi6xb --model_name=policy_finetune_step_369 --datafolder=/data/pulkitag/data/marcel/data/ --max_demos=2500 --eval_freq=1 --use_state --num_envs=12 --visualize_traj --model_name_student=policy_distill_step_381 --run_path_student=locobot-learn/distillation_mugandshelf/valc8x68

booknshelve

python distillation.py --extra_params=booknshelve,booknshelve,booknshelve_debug_high_rot_randomness,no_action_rand --max_path_length=130 --filename=booknshelvenewsimpcd --run_path=locobot-learn/booknshelve.usdppo-finetune/ruk0ihdc --model_name=policy_finetune_step_210 --datafolder=/data/pulkitag/data/marcel/data/ --num_envs=9 --max_demos=2500 --eval_freq=1 --use_state --visualize_traj

Collecting distractor data

python distillation.py --max_path_length=80 --extra_params=wooden_drawer_bigger,drawer_debug_extra_randomness --filename=drawerdistractorsfixedv2 --run_path=locobot-learn/drawerbiggerhandle.usdppo-finetune/cpyibf8w --model_name=model_policy_16 --from_ppo --datafolder=/data/pulkitag/data/marcel/data/ --eval_freq=1 --use_state --visualize_traj --policy_batch_size=32 --max_demos=5000 --num_envs=9 --policy_train_steps=1 --distractors=distractors_fixed --seed=2

Running Dagger

python distillation.py --max_path_length=65 --extra_params=wooden_drawer_bigger,drawer_debug_high_randomness,no_action_rand --filename=drawerfromrealdistractors,drawerbiggerreal --run_path=locobot-learn/drawerbiggerhandle.usdppo-finetune/l3zbpqta --model_name=model_policy_196 --from_ppo --datafolder=/data/scratch-oc40/pulkitag/marcel --eval_freq=1 --use_state --visualize_traj --policy_batch_size=32 --policy_train_steps=10000 --eval_freq=3 --max_demos=500 --num_envs=5 --run_path_student=locobot-learn/distillation_drawer_bigger/7vl0dkb3 --model_name_student=policy_distill_step_17 --dagger --sampling_expert=0 --policy_batch_size=8

Running in the real world

Install environment

We use Polymetis. You would need to install polymetis on the robot side. We give the instructions on how to install and create the environments on the GPU side.

conda create -n franka-env-new-cuda python=3.8.15
conda install pytorch==1.13.1 torchvision==0.14.1 torchaudio==0.13.1 pytorch-cuda=11.7 -c pytorch -c nvidia
conda install -c pytorch -c fair-robotics -c aihabitat -c conda-forge polymetis

Clone and install airobot
Clone and install improbable_rdt
- git clone --recurse [email protected]:anthonysimeonov/improbable_rdt.git
- git checkout lightweight-marcel

pip install torch-scatter -f https://data.pyg.org/whl/torch-1.13.0+cu117.html --no-index
pip install torch-cluster -f https://data.pyg.org/whl/torch-1.13.0+cu117.html --no-index

pip install -r franka_requirements.txt

Evaluation

conda activate franka-env-new-cuda
export RDT_SOURCE_DIR=~/improbable_rdt/src/rdt
export PB_PLANNING_SOURCE_DIR=~/improbable_rdt/pybullet-planning/

dishinrack

python evaluate_policy_real.py --run_path=locobot-learn/distillation_dishnrack/zmglvp19 --model_name=policy_distill_step_67 --max_path_length=130 --cam_index=2 --extra_params=booknshelve,booknshelve,booknshelve_debug_high_rot_randomness,no_action_rand --use_state --hz=2 --gripper_force=100 --background_loop --total_loop_time=0.3 --interp_steps=30 --start_interp_offset=3

Collect teleop data in the real world

conda activate franka-env-new-cuda
export RDT_SOURCE_DIR=~/improbable_rdt/src/rdt
export PB_PLANNING_SOURCE_DIR=~/improbable_rdt/pybullet-planning/

python teleop_franka.py --demo_folder=mugandshelfreal --offset=0 --hz=2 --max_path_length=100 --extra_params=booknshelve,booknshelve_debug_low_randomness,booknshelve2

python teleop_franka.py --demo_folder=cupntrashreal --offset=0 --hz=2 --max_path_length=100 --extra_params=cupntrash,kitchentoaster_randomness,no_action_rand --num_demos=15 --cam_index=1

python teleop_franka.py --demo_folder=cupntrashreal --offset=0 --hz=2 --max_path_length=100 --extra_params=cupntrash,kitchentoaster_randomness,no_action_rand --num_demos=15 --cam_index=1

Name		Name	Last commit message	Last commit date
Latest commit History 32 Commits
dependencies		dependencies
distractormeshes		distractormeshes
franka		franka
materials		materials
rialto		rialto
README.md		README.md
camera_utils.py		camera_utils.py
collect_demos_teleop_franka.py		collect_demos_teleop_franka.py
config.yaml		config.yaml
distillation.py		distillation.py
evaluate_policy_real.py		evaluate_policy_real.py
franka_requirements.txt		franka_requirements.txt
generate_meshes.py		generate_meshes.py
generate_meshes_distractors.py		generate_meshes_distractors.py
launch_ppo.py		launch_ppo.py
render_mesh_utils.py		render_mesh_utils.py
requirements.txt		requirements.txt
teleop_franka.py		teleop_franka.py
utils.py		utils.py
utils_frames.py		utils_frames.py
utils_real.py		utils_real.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

RialTo Policy Learning

Installation

Pipeline:

Running the code

Collect teleoperation data in sim

RL-finetuning

Visualize PPO

Distillation from synthetic pointclouds

Distillation from sim pcd

Collecting distractor data

Running Dagger

Running in the real world

Install environment

Evaluation

Collect teleop data in the real world

About

Releases

Packages

Languages

real-to-sim-to-real/RialToPolicyLearning

Folders and files

Latest commit

History

Repository files navigation

RialTo Policy Learning

Installation

Pipeline:

Running the code

Collect teleoperation data in sim

RL-finetuning

Visualize PPO

Distillation from synthetic pointclouds

Distillation from sim pcd

Collecting distractor data

Running Dagger

Running in the real world

Install environment

Evaluation

Collect teleop data in the real world

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages