This repo contains code for the paper:
GR-MG: Leveraging Partially-Annotated Data Via Multi-Modal Goal-Conditioned Policy
Peiyan Li, Hongtao Wu*‡, Yan Huang*, Chilam Cheang, Liang Wang, Tao Kong
*Corresponding author ‡ Project lead
- (🔥 New) (2024.12.18) Our paper was accepted by IEEE Robotics and Automation Letters (RA-L)!
- (🔥 New) (2024.08.27) We have released the code and checkpoints of GR-MG!
Note: We have only tested GR-MG with CUDA 12.1 and Python 3.9.
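Before installing, it may help to confirm which Python and CUDA toolkit versions your environment exposes. The snippet below is only a minimal sketch: the function name check_env is hypothetical and not part of this repo, and it assumes python3 and nvcc are the relevant executables on your PATH.

```shell
# check_env is a hypothetical helper, not part of this repo.
check_env() {
    # Print the Python version seen by the current environment.
    if command -v python3 >/dev/null 2>&1; then
        python3 --version
    else
        echo "python3 not found on PATH"
    fi

    # Print the CUDA toolkit release line, if nvcc is on PATH.
    if command -v nvcc >/dev/null 2>&1; then
        nvcc --version | grep release
    else
        echo "nvcc not found on PATH"
    fi
}

check_env
```

Compare the printed versions against the tested configuration noted above.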
# clone this repository
git clone https://github.com/bytedance/GR-MG.git
cd GR-MG
# install dependencies for goal image generation model
bash ./goal_gen/install.sh
# install dependencies for multi-modal goal conditioned policy
bash ./policy/install.sh
Download the pretrained InstructPix2Pix weights from Hugging Face and save them in resources/IP2P/.
Download the pretrained MAE encoder mae_pretrain_vit_base.pth and save it in resources/MAE/.
Download and unzip the CALVIN dataset.
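The target directories for the InstructPix2Pix weights and the MAE encoder can be prepared ahead of the downloads. This is a minimal sketch that assumes you run it from the repository root. It only creates the two directories and reports whether the MAE checkpoint is in place; it does not download anything.

```shell
# Assumption: the current directory is the repository root.
mkdir -p resources/IP2P resources/MAE

# Report whether the MAE encoder checkpoint has been placed yet.
if [ -f resources/MAE/mae_pretrain_vit_base.pth ]; then
    echo "MAE encoder found in resources/MAE/"
else
    echo "MAE encoder missing: place mae_pretrain_vit_base.pth in resources/MAE/"
fi
```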
# modify the variables in the script before you execute the following instruction
bash ./goal_gen/train_ip2p.sh ./goal_gen/config/train.json
We use the method described in GR-1 and pretrain our policy with Ego4D videos. You can download the pretrained model checkpoint here. You can also pretrain the policy yourself using the scripts we provide. Before doing this, you'll need to download the Ego4D dataset.
# pretrain multi-modal goal conditioned policy
bash ./policy/main.sh ./policy/config/pretrain.json
After pretraining, set pretrained_model_path in ./policy/config/train.json to your pretrained checkpoint, then execute the following instruction to train the policy.
# train multi-modal goal conditioned policy
bash ./policy/main.sh ./policy/config/train.json
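If you prefer to script the pretrained_model_path edit rather than change the config by hand, something like the following could work. It is a sketch under assumptions: the helper name set_pretrained_path is hypothetical, the key is assumed to hold a double-quoted string on a single line, and the checkpoint path is assumed to contain no |, & or backslash characters.

```shell
# set_pretrained_path CONFIG CKPT  (hypothetical helper, not part of this repo)
# Rewrites the string value of "pretrained_model_path" in CONFIG to CKPT.
set_pretrained_path() {
    config=$1
    ckpt=$2
    # Write to a temporary file first, then replace the original on success.
    sed "s|\"pretrained_model_path\"[[:space:]]*:[[:space:]]*\"[^\"]*\"|\"pretrained_model_path\": \"${ckpt}\"|" \
        "$config" > "$config.tmp" && mv "$config.tmp" "$config"
}
```

For example, set_pretrained_path ./policy/config/train.json /path/to/your/checkpoint, where the checkpoint path is a placeholder for your own file. Run this before the training command above.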
To evaluate our model on CALVIN, you can execute the following instruction:
# Evaluate GR-MG on CALVIN
bash ./evaluate/eval.sh ./policy/config/train.json
In the eval.sh script, you can specify which goal image generation model and policy to use. Additionally, we provide multi-GPU evaluation code, allowing you to evaluate different training epochs of the policy simultaneously.
If you have any questions about the project, please contact [email protected].
We thank the authors of the following projects for making their code and dataset open source:
If you find this project useful, please star the repository and cite our paper:
@article{li2025gr,
title={GR-MG: Leveraging Partially-Annotated Data Via Multi-Modal Goal-Conditioned Policy},
author={Li, Peiyan and Wu, Hongtao and Huang, Yan and Cheang, Chilam and Wang, Liang and Kong, Tao},
journal={IEEE Robotics and Automation Letters},
year={2025},
publisher={IEEE}
}