Tensorflow code for "Learning Self-Imitating Diverse Policies" (ICLR 2019)

tgangwani/SelfImitationDiverse

This repo contains the code for our paper "Learning Self-Imitating Diverse Policies", published at ICLR 2019.

The code was tested with the following packages:

  • python 3.5.2
  • tensorflow 1.4.0
  • gym 0.9.2

Running command

To train a self-imitation agent in an episodic reward environment, use:

python main.py --env_id HalfCheetah-v1 --seed=$(echo $RANDOM) --mu=0.8 --episodic

The parameter 'mu' is defined in Equation 5 of the paper.
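To repeat the run above over several seeds, the command line can be assembled programmatically. The following is a minimal sketch, not part of this repo: the helper name build_command and the seed values 0, 1, 2 are illustrative assumptions, and only the flags shown in the command above are used.

```python
def build_command(env_id, seed, mu):
    """Return the argument list for one episodic training run.

    Hypothetical helper: it only mirrors the flags from the README command.
    """
    return [
        "python", "main.py",
        "--env_id", env_id,
        "--seed={}".format(seed),
        "--mu={}".format(mu),
        "--episodic",
    ]


if __name__ == "__main__":
    # Seeds are arbitrary illustrative values, not recommendations.
    for seed in (0, 1, 2):
        print(" ".join(build_command("HalfCheetah-v1", seed, 0.8)))
```

Each printed line is a complete command for the shell. The list returned by build_command can also be passed directly to subprocess.run.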

SVPG for diverse multi-agent training

This functionality is provided in a separate codebase. To use it, set the following in that codebase's default_config.yaml: divergence: js, dre_type: nce
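As a sketch, the two settings named above might appear in default_config.yaml as follows. This assumes both are top-level keys; the file's actual nesting and its other keys are not described here, so adjust to match the real file.

```yaml
# Assumed layout: both keys at the top level of default_config.yaml
divergence: js
dre_type: nce
```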

Credits

The code is built on OpenAI Baselines and uses many of its utilities.
