
Open-Ended Instructable Embodied Agents with Memory-Augmented Large Language Models

Website

This repo contains code and data for running HELPER.


Todo List

  • Add HELPER-X support for ALFRED, Dialfred, and Tidy Task

Installation

Environment

(1) Start by cloning the repository:

git clone https://github.com/Gabesarch/HELPER.git

(1a) (optional) If you are using conda, create an environment:

conda create -n helper python=3.8
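
Then activate it so the packages below install into this environment:

conda activate helper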

(2) Install PyTorch with the CUDA version you have. For example, run the following for CUDA 11.1:

pip install torch==1.10.0+cu111 torchvision==0.11.0+cu111 torchaudio==0.10.0 -f https://download.pytorch.org/whl/torch_stable.html
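
(Optional) Before continuing, you can verify that PyTorch was installed with CUDA support; this should print the torch version and True:

python -c "import torch; print(torch.__version__, torch.cuda.is_available())"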

(3) Install additional requirements:

pip install -r requirements.txt

(4) Install Detectron2 (needed for SOLQ detector) with correct PyTorch and CUDA version. E.g. for PyTorch 1.10 & CUDA 11.1:

python -m pip install detectron2 -f https://dl.fbaipublicfiles.com/detectron2/wheels/cu111/torch1.10/index.html

(5) Install teach:

pip install -e teach

(6) Build SOLQ deformable attention:

cd ./SOLQ/models/ops && sh make.sh && cd ../../..
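
If the build succeeded, the compiled deformable attention extension should import cleanly. A minimal check, assuming the extension is named MultiScaleDeformableAttention as in Deformable-DETR-style builds (which SOLQ follows):

python -c "import MultiScaleDeformableAttention"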

(7) Clone the ZoeDepth repo and check out the pinned commit:

git clone https://github.com/isl-org/ZoeDepth.git
cd ZoeDepth
git checkout edb6daf45458569e24f50250ef1ed08c015f17a7

TEACh Dataset

  1. Download the TEACh dataset following the instructions in the TEACh repo:
teach_download 
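
Per the TEACh repo, teach_download also accepts a -d flag to choose the download directory, e.g.:

teach_download -d ./data/teach-dataset

Whichever directory you use here is what --teach_data_dir should point to when running the agent below.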

Model Checkpoints and GPT Embeddings

To run our model on the TEACh dataset, you'll first need the GPT embeddings for example retrieval:

  1. Download the GPT embeddings for example retrieval here. Unzip it to get the gpt_embeddings folder inside the ./data folder (or in a desired folder, setting the --gpt_embedding_dir argument accordingly). Alternatively, you can download the file with gdown (pip install gdown):
cd data
gdown 1kqZZXdglNICjDlDKygd19JyyBzkkk-UL
unzip gpt_embeddings.zip
rm gpt_embeddings.zip
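
After unzipping, the embeddings should sit at ./data/gpt_embeddings, matching the --gpt_embedding_dir value used in the commands below; a quick check:

ls ./data/gpt_embeddings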

To run our model with estimated depth and segmentation, download the SOLQ and ZoeDepth checkpoints:

  1. Download SOLQ checkpoint: here. Place it in the ./checkpoints folder (or anywhere you want and specify the path with --solq_checkpoint). Alternatively, you can download the file with gdown (pip install gdown):
cd checkpoints
gdown 1hTCtTuygPCJnhAkGeVPzWGHiY3PHNE2j
  2. Download ZoeDepth checkpoint: here. Place it in the ./checkpoints folder (or anywhere you want and specify the path with --zoedepth_checkpoint). Also make sure you have cloned the ZoeDepth repo as in installation step (7). Alternatively, you can download the file with gdown (pip install gdown):
cd checkpoints
gdown 1gMe8_5PzaNKWLT5OP-9KKEYhbNxRjk9F

Running TEACh benchmark

Running the TfD evaluation

  1. (If required) Start an X server if one is not already running on your machine. Open a screen on the desired node and run the following to start an X server on that node:
python startx.py 0

Specify the server port number with the argument --server_port (default 0).

  2. Set OpenAI keys. If using Azure, set the Azure keys:
export AZURE_OPENAI_KEY={KEY}
export AZURE_OPENAI_ENDPOINT={ENDPOINT}

Important: if not using Azure and instead using the OpenAI API directly, append --use_openai to the arguments, then set the OpenAI key:

export OPENAI_API_KEY={KEY}
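
You can quickly confirm the relevant variables are set in the current shell (each echo prints "set" only if the variable is non-empty):

echo "Azure: ${AZURE_OPENAI_KEY:+set} ${AZURE_OPENAI_ENDPOINT:+set}"
echo "OpenAI: ${OPENAI_API_KEY:+set}"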
  3. Run agent. To run the agent with all modules and estimated perception on TfD validation unseen, run the following:
python main.py \
 --mode teach_eval_tfd \
 --split valid_unseen \
 --gpt_embedding_dir ./data/gpt_embeddings \
 --teach_data_dir PATH_TO_TEACH_DATASET \
 --server_port X_SERVER_PORT_HERE \
 --episode_in_try_except \
 --use_llm_search \
 --use_constraint_check \
 --run_error_correction_llm \
 --zoedepth_checkpoint ./checkpoints/ZOEDEPTH-model-00015000.pth \
 --solq_checkpoint ./checkpoints/SOLQ-model-00023000.pth \
 --set_name HELPER_teach_tfd_validunseen

Change the split to --split valid_seen to evaluate on the validation seen set.

Metrics

All metrics will be saved to ./output/metrics/{set_name}. Metrics and videos will also automatically be logged to wandb.

Movie generation

To create movies of the agent, append --create_movie to the arguments. By default, this creates a movie for every episode, rendered to ./output/movies. To change the episode frequency of logging, alter --log_every (e.g., --log_every 10 to render videos every 10 episodes). To remove the map visualization, append --remove_map_vis to the arguments; this can speed up episodes, since rendering the map visual slows them down.
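
For example, appending the following to the evaluation command above renders a movie every 10 episodes without the map visualization:

--create_movie --log_every 10 --remove_map_vis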

Ablations

The following argument changes run the ablations (an example is sketched after this list):

  1. Remove memory-augmented prompting: add --ablate_example_retrieval.
  2. Remove LLM search (locator; random search only): remove --use_llm_search.
  3. Remove constraint check (inspector): remove --use_constraint_check.
  4. Remove error correction (rectifier): remove --run_error_correction_llm.
  5. Change the OpenAI model type: change the --openai_model argument (e.g., --openai_model gpt-3.5-turbo).
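
For example, a sketch of an ablation removing both the constraint check (inspector) and LLM error correction (rectifier); all other arguments are as in the full command above, and the set name is just an illustrative label:

python main.py \
 --mode teach_eval_tfd \
 --split valid_unseen \
 --gpt_embedding_dir ./data/gpt_embeddings \
 --teach_data_dir PATH_TO_TEACH_DATASET \
 --server_port X_SERVER_PORT_HERE \
 --episode_in_try_except \
 --use_llm_search \
 --zoedepth_checkpoint ./checkpoints/ZOEDEPTH-model-00015000.pth \
 --solq_checkpoint ./checkpoints/SOLQ-model-00023000.pth \
 --set_name HELPER_teach_tfd_validunseen_no_inspector_rectifier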

Ground truth

The following arguments can be added to run with ground truth (an example is sketched after this list):

  1. GT depth: --use_gt_depth. Recommended to also add --increased_explore when using estimated segmentation, for best performance.
  2. GT segmentation --use_gt_seg.
  3. GT action success --use_gt_success_checker.
  4. GT error feedback --use_GT_error_feedback.
  5. GT constraint check using controller metadata --use_GT_constraint_checks.
  6. Increase max API fails --max_api_fails {MAX_FAILS}.
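
For example, a sketch of a run with GT depth and GT segmentation (all other arguments as in the full command above; the perception checkpoint flags are omitted here since they may go unused with GT perception, and the set name is just an illustrative label):

python main.py \
 --mode teach_eval_tfd \
 --split valid_unseen \
 --gpt_embedding_dir ./data/gpt_embeddings \
 --teach_data_dir PATH_TO_TEACH_DATASET \
 --server_port X_SERVER_PORT_HERE \
 --episode_in_try_except \
 --use_llm_search \
 --use_constraint_check \
 --run_error_correction_llm \
 --use_gt_depth \
 --use_gt_seg \
 --set_name HELPER_teach_tfd_validunseen_gt_depth_seg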

User Feedback

To run with user feedback, add --use_progress_check. Two additional metric files (for feedback query 1 & 2) will be saved to ./output/metrics/{set_name}.

Running the EDH evaluation

See the teach_edh branch for how to run the TEACh EDH evaluation.

Citation

If you find this work useful, please cite us:

@inproceedings{sarch2023helper,
    title = "Open-Ended Instructable Embodied Agents with Memory-Augmented Large Language Models",
    author = "Sarch, Gabriel and Wu, Yue and Tarr, Michael and Fragkiadaki, Katerina",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2023",
    year = "2023"
}
