The mbrl.examples package can be used to train models using our example MBRL algorithm implementations. We currently have examples for PETS, MBPO, and PlaNet.

The examples can be run by typing

python -m mbrl.examples.main ${hydra_options}

where ${hydra_options} is any set of Hydra overrides. To see the available overrides, take a look at our configuration files. The config files are generally organized into four groups:

  • algorithm: includes options specific to each algorithm that typically don't vary across experiments.
  • dynamics_model: describes the dynamics model to use.
  • overrides: describes experiment-specific configuration and hyperparameters. Typically, we include one file per environment, populated with the best hyperparameters we have found so far for that environment.
  • action_optimizer: describes possible optimizers to use for action selection. Some algorithms, like MBPO, ignore this.

For example, to run MBPO on gym's Hopper environment using the standard ensemble version of GaussianMLP, you can type

python -m mbrl.examples.main \
  algorithm=mbpo \
  overrides=mbpo_hopper \
  dynamics_model=gaussian_mlp_ensemble \
  algorithm.agent.batch_size=256 \
  overrides.validation_ratio=0.2 \
  dynamics_model.activation_fn_cfg._target_=torch.nn.ReLU

where we have overridden some defaults, just to show how Hydra's command-line syntax works. The number of possible options is extensive, and the best way to explore them is to look at the configuration files.
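When launching many runs from a script, it can help to assemble the overrides programmatically. The following is a minimal sketch, not part of the library: build_command is a hypothetical helper that turns a dictionary of Hydra overrides into the same command shown above. It only formats strings and does not start a training run.

```python
import shlex


def build_command(overrides):
    """Return the argv list for mbrl.examples.main with the given Hydra overrides.

    build_command is a hypothetical helper for illustration; it is not part
    of mbrl. Each override is rendered in Hydra's key=value form.
    """
    argv = ["python", "-m", "mbrl.examples.main"]
    argv += [f"{key}={value}" for key, value in overrides.items()]
    return argv


# Same overrides as the MBPO/Hopper command above.
overrides = {
    "algorithm": "mbpo",
    "overrides": "mbpo_hopper",
    "dynamics_model": "gaussian_mlp_ensemble",
    "algorithm.agent.batch_size": 256,
    "overrides.validation_ratio": 0.2,
    "dynamics_model.activation_fn_cfg._target_": "torch.nn.ReLU",
}

argv = build_command(overrides)
command_line = shlex.join(argv)  # single string, safe to paste into a shell
print(command_line)
```

The argv list can be passed to subprocess.run as is, which avoids shell quoting issues if an override value ever contains spaces or brackets.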

Finally, keep in mind that not all models and algorithms are compatible, and the correct combination needs to be specified manually on the command line. For example, running PlaNet requires passing both algorithm=planet and dynamics_model=planet, in addition to any other arguments you wish to change.
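Because incompatible combinations are not rejected for you, a launcher script may want to check the pairing before starting a run. The sketch below is an assumption-laden illustration: check_compatibility and the REQUIRED_DYNAMICS_MODEL table are hypothetical names, and the table encodes only the one requirement stated above (PlaNet needs dynamics_model=planet). Any other rules would have to be added by hand.

```python
# Hypothetical table for illustration; it lists only the requirement stated
# in the text (algorithm=planet needs dynamics_model=planet). It is not an
# exhaustive list taken from the library.
REQUIRED_DYNAMICS_MODEL = {"planet": "planet"}


def check_compatibility(algorithm, dynamics_model):
    """Raise ValueError if the algorithm has a known required dynamics model
    and a different one was requested. Unknown algorithms pass unchecked."""
    required = REQUIRED_DYNAMICS_MODEL.get(algorithm)
    if required is not None and dynamics_model != required:
        raise ValueError(
            f"algorithm={algorithm} requires dynamics_model={required}, "
            f"got dynamics_model={dynamics_model}"
        )


# A valid pairing passes silently.
check_compatibility("planet", "planet")
print("planet + planet: ok")
```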

By default, all algorithms save results in a CSV file called results.csv, inside a folder whose path looks like ./exp/mbpo/default/gym___HalfCheetah-v2/yyyy.mm.dd/hhmmss. You can change the root directory (./exp) by passing root_dir=path-to-your-dir, and the experiment sub-folder (default) by passing experiment=your-name. The logger also saves a file called model_train.csv with training information for the dynamics model.
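To inspect results after a run, one option is to locate the newest run folder and load results.csv with the standard library. This is a rough sketch that assumes the layout root_dir/algorithm/experiment/environment/date/time seen in the example path above. latest_run_dir and read_csv_rows are hypothetical helpers, and since the column names of results.csv are not listed here, the code simply prints whatever header it finds.

```python
import csv
from pathlib import Path


def latest_run_dir(root_dir, algorithm, experiment, env_name):
    """Return the most recent run folder, or None if there is none.

    Assumes runs are stored as <date>/<time> under the environment folder,
    with zero-padded names so that sorted order is chronological order.
    """
    env_dir = Path(root_dir) / algorithm / experiment / env_name
    if not env_dir.is_dir():
        return None
    runs = sorted(p for p in env_dir.glob("*/*") if p.is_dir())
    return runs[-1] if runs else None


def read_csv_rows(path):
    """Read a CSV file into a list of dicts keyed by the header row."""
    with open(path, newline="") as f:
        return list(csv.DictReader(f))


run_dir = latest_run_dir("./exp", "mbpo", "default", "gym___HalfCheetah-v2")
if run_dir is None or not (run_dir / "results.csv").exists():
    print("no results found")
else:
    rows = read_csv_rows(run_dir / "results.csv")
    columns = list(rows[0].keys()) if rows else []
    print(f"{run_dir}: {len(rows)} rows, columns: {columns}")
```

The same read_csv_rows call works for model_train.csv in the same folder.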