A basic implementation (so far) of Model-Agnostic Meta-Learning (MAML) applied to Reinforcement Learning problems in TensorFlow 2.
This repo is heavily inspired by the original implementation cbfinn/maml_rl (TensorFlow 1.x), as well as the excellent implementations by Tristan Deleu from MILA, tristandeleu/pytorch-maml-rl (PyTorch), and Jonas Rothfuss, jonasrothfuss/ProMP (TensorFlow 1.x).
I highly recommend checking out all three implementations as well.
You can use the main.py script to train the algorithm with MAML:
python main.py --env-name 2DNavigation-v0 --num-workers 20 --fast-lr 0.1 --max-kl 0.01 --fast-batch-size 20 --meta-batch-size 40 --num-layers 2 --hidden-size 100 --num-batches 500 --gamma 0.99 --tau 1.0 --cg-damping 1e-5 --ls-max-steps 15
This script was tested with Python 3.6.
The basic MAML algorithm, which uses TRPO as the optimization method, can already be used. Examples, experiment scripts, and more environments (e.g. MetaWorld) will follow.
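For readers unfamiliar with the algorithm, the two-level structure of MAML can be sketched on a toy problem. The snippet below is a hypothetical illustration only, not the repo's TRPO-based meta-RL implementation: it meta-trains a single scalar weight on 1-D linear regression tasks using plain gradient descent and a first-order approximation of the meta-gradient.

```python
import numpy as np

# Toy sketch of MAML's inner/outer update on 1-D regression tasks y = a * x.
# Assumptions: mean-squared-error loss, one inner gradient step, and a
# first-order meta-gradient (the second-order term is dropped for brevity).

def loss_grad(w, x, y):
    # Gradient of 0.5 * mean((w*x - y)^2) with respect to w.
    return np.mean((w * x - y) * x)

def maml_step(w, tasks, fast_lr=0.1, meta_lr=0.01, rng=None):
    if rng is None:
        rng = np.random.default_rng(0)
    meta_grad = 0.0
    for a in tasks:  # each task is defined by its slope a
        x = rng.uniform(-1, 1, size=20)
        y = a * x
        # Inner loop: adapt to the task with one "fast" gradient step.
        w_fast = w - fast_lr * loss_grad(w, x, y)
        # Outer loop: accumulate the gradient of the post-adaptation loss
        # on fresh data (first-order MAML approximation).
        x2 = rng.uniform(-1, 1, size=20)
        meta_grad += loss_grad(w_fast, x2, a * x2)
    return w - meta_lr * meta_grad / len(tasks)

w = 0.0
tasks = [0.5, 1.0, 1.5]
for _ in range(200):
    w = maml_step(w, tasks)
# w is now a meta-learned initialization: one inner gradient step
# adapts it toward any individual task's optimum.
```

In the actual repo, the inner loop corresponds to the fast policy-gradient adaptation (controlled by --fast-lr and --fast-batch-size), while the outer loop is the TRPO meta-update over the meta-batch of tasks.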
TF2 graph support with 'tf.function' will be added soon.
In the future, more variants such as CAVIA and ProMP will also be implemented.
This project is, for the most part, a reproduction of the original implementation cbfinn/maml_rl in TensorFlow 2. The experiments are based on the paper
Chelsea Finn, Pieter Abbeel, and Sergey Levine. Model-agnostic meta-learning for fast adaptation of deep networks. International Conference on Machine Learning (ICML), 2017 [ArXiv]
If you want to cite this paper:
@article{DBLP:journals/corr/FinnAL17,
author = {Chelsea Finn and Pieter Abbeel and Sergey Levine},
title = {Model-{A}gnostic {M}eta-{L}earning for {F}ast {A}daptation of {D}eep {N}etworks},
journal = {International Conference on Machine Learning (ICML)},
year = {2017},
url = {https://arxiv.org/abs/1703.03400}
}