HOLD Reward Models

Official implementation of the following paper:

Learning Reward Functions for Robotic Manipulation by Observing Humans
Minttu Alakuijala, Gabriel Dulac-Arnold, Julien Mairal, Jean Ponce, Cordelia Schmid
ICRA 2023
[Paper] | [Project website]

This repository contains code for training HOLD models (a.k.a. functional distance models) on video data. The implementation is based on Scenic.

For the RL policy training experiments in the paper, see https://github.com/minttusofia/hold-policies.
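
As background on what these models provide: a functional distance model maps a video frame (and a goal) to an estimate of how far the task is from completion. The following is a purely illustrative sketch of one way such a distance could be turned into a dense reward; the exact reward formulation used in the paper is defined in the policy-training repository linked above.

# Illustrative sketch only, not necessarily the paper's exact formulation:
# given functional distances predicted for two consecutive observations,
# a simple shaped reward is the decrease in predicted distance to the goal.
def shaped_reward(d_prev: float, d_curr: float) -> float:
    return d_prev - d_curr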

Installation

Training models requires Python 3.9 (needed by dmvr, a dependency of scenic).

To install this codebase, run

$ git clone https://github.com/minttusofia/hold-rewards.git
$ cd hold-rewards
$ pip install .

For a GPU-enabled installation of jax (recommended), see https://github.com/google/jax/tree/jax-v0.2.28#pip-installation-gpu-cuda.
For example, to install jax for CUDA >= 11.1 and cuDNN >= 8.2, run:

$ pip install "jax[cuda11_cudnn82]>=0.2.21,<0.3" -f https://storage.googleapis.com/jax-releases/jax_cuda_releases.html

Training HOLD on Something-Something v2

Make the following changes to the scenic config file
(in scenic/projects/func_dist/configs/holdr/vivit_large_factorized_encoder.py for HOLD-R, or
scenic/projects/func_dist/configs/holdc/resnet50.py for HOLD-C); an illustrative sketch of the edited values follows the list:

  • Set DATA_DIR to the directory where Something-Something v2 data (and optionally, any pretrained model checkpoints) are saved.
  • Set NUM_DEVICES to the number of GPUs / TPUs to use.
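
As a rough sketch (the path and device count are placeholders; only the DATA_DIR and NUM_DEVICES names come from the instructions above, and the rest of the config file is omitted), the edited values might look like:

# Hypothetical excerpt of the edited config file; values are examples only.
DATA_DIR = '/path/to/something-something-v2'  # dataset and optional pretrained checkpoints
NUM_DEVICES = 8  # number of GPUs / TPUs to train on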

Run

python scenic/projects/func_dist/main.py \
--config=scenic/projects/func_dist/configs/holdr/vivit_large_factorized_encoder.py \
--workdir=/PATH/TO/OUT_DIR

where /PATH/TO/OUT_DIR is the directory to which experiment checkpoints will be written.

For HOLD-C, use --config=scenic/projects/func_dist/configs/holdc/resnet50.py.

Trained models

We release the trained model checkpoints used in the paper:

To use these as reward models in policy training with SAC (as in the paper), please refer to our policy training repo https://github.com/minttusofia/hold-policies.

Citing HOLD

If you find this implementation or the released models useful, please consider citing our paper:

@article{alakuijala2023learning,  
    title={Learning Reward Functions for Robotic Manipulation by Observing Humans},  
    author={Alakuijala, Minttu and Dulac-Arnold, Gabriel and Mairal, Julien and Ponce, Jean and Schmid, Cordelia},  
    journal={2023 IEEE International Conference on Robotics and Automation (ICRA)},  
    year={2023},  
}
