GR-MG

This repo contains code for the paper:

Leveraging Partially Annotated Data via Multi-Modal Goal Conditioned Policy

Peiyan Li, Hongtao Wu^*‡, Yan Huang^*, Chilam Cheang, Liang Wang, Tao Kong

^*Corresponding author ^‡ Project lead

🌐 Project Website | 📄 Paper

News

(🔥 New) (2024.08.27) We have released the code and checkpoints of GR-MG!

Preparation

Note: We have only tested GR-MG with CUDA 12.1 and Python 3.9.
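As a quick sanity check against the tested versions noted above, the following is a minimal sketch that reports the local Python and CUDA toolkit versions. The helper name `version_matches` is hypothetical and not part of this repository; the script assumes `python3` is on your PATH and only reports the CUDA release if `nvcc` is installed.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of GR-MG): succeed if a version string
# equals the expected major.minor or starts with "<major.minor>.".
version_matches() {
    local actual="$1" expected="$2"
    case "$actual" in
        "$expected"|"$expected".*) return 0 ;;
        *) return 1 ;;
    esac
}

# Report the local Python version (assumes python3 is on PATH).
if command -v python3 >/dev/null 2>&1; then
    py_version="$(python3 -c 'import platform; print(platform.python_version())')"
    if version_matches "$py_version" "3.9"; then
        echo "Python ${py_version}: matches the tested version"
    else
        echo "Python ${py_version}: differs from the tested version 3.9"
    fi
fi

# nvcc ships with the CUDA toolkit; it may be absent even when a driver is installed.
if command -v nvcc >/dev/null 2>&1; then
    nvcc --version | grep -i "release" || true
fi
```

Other versions may still work; this check only tells you whether your setup differs from the tested one.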

# clone this repository
git clone https://github.com/bytedance/GR-MG.git
cd GR-MG
# install dependencies for goal image generation model
bash ./goal_gen/install.sh
# install dependencies for multi-modal goal conditioned policy
bash ./policy/install.sh

Download the pretrained InstructPix2Pix weights from Hugging Face and save them in resources/IP2P/. Download the pretrained MAE encoder mae_pretrain_vit_base.pth and save it in resources/MAE/. Then download and unzip the CALVIN dataset.
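Before training, it can help to confirm that the downloaded files ended up where the instructions above expect them. The following is a minimal sketch; the function name `check_resources` is hypothetical, and it only checks the two `resources/` paths mentioned above (the CALVIN dataset location is up to you, so it is not checked).

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of GR-MG): report expected resource paths
# that are missing under a given repository root.
check_resources() {
    local root="${1:-.}"
    local missing=0
    # resources/IP2P/ should exist and contain the InstructPix2Pix weights.
    if [ -z "$(ls -A "$root/resources/IP2P" 2>/dev/null)" ]; then
        echo "missing: resources/IP2P/ (InstructPix2Pix weights)"
        missing=1
    fi
    # The MAE encoder checkpoint is expected under resources/MAE/.
    if [ ! -f "$root/resources/MAE/mae_pretrain_vit_base.pth" ]; then
        echo "missing: resources/MAE/mae_pretrain_vit_base.pth"
        missing=1
    fi
    return "$missing"
}

# Run from the repository root.
check_resources . || echo "Download the missing files before training."
```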

Checkpoints

Training

1. Train Goal Image Generation Model

# modify the variables in the script before running the following command
bash ./goal_gen/train_ip2p.sh  ./goal_gen/config/train.json

2. Pretrain Multi-modal Goal Conditioned Policy

We use the method described in GR-1 and pretrain our policy with Ego4D videos. You can download the pretrained model checkpoint here. You can also pretrain the policy yourself using the scripts we provide. Before doing this, you'll need to download the Ego4D dataset.

# pretrain multi-modal goal conditioned policy
bash ./policy/main.sh  ./policy/config/pretrain.json

3. Train Multi-modal Goal Conditioned Policy

After pretraining, set pretrained_model_path in ./policy/config/train.json to the path of your pretrained checkpoint, then run the following command to train the policy.

# train multi-modal goal conditioned policy
bash ./policy/main.sh  ./policy/config/train.json
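If you prefer to script the config change rather than edit the file by hand, the following is a minimal sketch. It assumes `pretrained_model_path` is a top-level key in the JSON config, which may not match the actual layout of `train.json`; the function name `set_pretrained_path` is hypothetical, and `python3` is assumed to be on your PATH.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of GR-MG): set "pretrained_model_path"
# in a JSON config file, leaving the other keys unchanged.
set_pretrained_path() {
    local config="$1" ckpt="$2"
    # Python's json module handles parsing and quoting; the script is read from stdin.
    python3 - "$config" "$ckpt" <<'EOF'
import json
import sys

config_path, ckpt_path = sys.argv[1], sys.argv[2]
with open(config_path) as f:
    cfg = json.load(f)
cfg["pretrained_model_path"] = ckpt_path  # assumption: top-level key
with open(config_path, "w") as f:
    json.dump(cfg, f, indent=4)
EOF
}

# Example (the checkpoint path is a placeholder):
# set_pretrained_path ./policy/config/train.json /path/to/pretrained_checkpoint
```

Note that the file is rewritten with 4-space indentation, so its formatting may change even though the remaining values do not.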

Evaluation

To evaluate our model on CALVIN, run the following command:

# Evaluate GR-MG on CALVIN
bash ./evaluate/eval.sh  ./policy/config/train.json

In the eval.sh script, you can specify which goal image generation model and policy to use. Additionally, we provide multi-GPU evaluation code, allowing you to evaluate different training epochs of the policy simultaneously.
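To illustrate the idea of evaluating several epochs at once, the following is a minimal sketch that assigns epochs to GPUs round-robin and prints the resulting plan. It is not the multi-GPU evaluation code provided in this repository, and it does not call eval.sh; the function name `plan_eval` and the epoch numbers are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical planner (not part of GR-MG): assign policy epochs to GPUs
# round-robin. It only prints the plan and launches nothing.
plan_eval() {
    local num_gpus="$1"
    shift
    local i=0 epoch
    for epoch in "$@"; do
        echo "gpu=$((i % num_gpus)) epoch=${epoch}"
        i=$((i + 1))
    done
}

# Example: three epochs spread over two GPUs.
plan_eval 2 10 15 20
```

Each printed GPU index could then be passed to a process through the `CUDA_VISIBLE_DEVICES` environment variable so that the evaluations run side by side.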

Contact

If you have any questions about the project, please contact [email protected].

Acknowledgements

We thank the authors of the following projects for making their code and dataset open source:

Citation

If you find this project useful, please star the repository and cite our paper:

@article{li2024gr,
  title={GR-MG: Leveraging Partially Annotated Data via Multi-Modal Goal Conditioned Policy},
  author={Li, Peiyan and Wu, Hongtao and Huang, Yan and Cheang, Chilam and Wang, Liang and Kong, Tao},
  journal={arXiv preprint arXiv:2408.14368},
  year={2024}
}
