Solving the environment requires an average total reward of over 2500 across 100 consecutive episodes.
Training of HopperBulletEnv is performed with the Twin Delayed DDPG (TD3) algorithm; see
the original paper, Addressing Function Approximation Error in Actor-Critic Methods.
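The central idea of TD3 is clipped double Q-learning: two critics are trained, and the Bellman target uses the minimum of their estimates to reduce overestimation bias. A minimal sketch of the target computation (function name and scalar inputs are illustrative, not from this repo's code):

```python
import numpy as np

def td3_target(reward, done, q1_next, q2_next, gamma=0.99):
    # Clipped double Q-learning: take the minimum of the two target
    # critics' estimates of the next state-action value, so a single
    # overestimating critic cannot inflate the target.
    min_q = np.minimum(q1_next, q2_next)
    # Standard discounted Bellman target; (1 - done) zeroes the
    # bootstrap term at terminal transitions.
    return reward + gamma * (1.0 - done) * min_q

# Example: reward 1.0, non-terminal, critics disagree (2.0 vs 3.0)
y = td3_target(1.0, 0.0, 2.0, 3.0)  # uses min(2.0, 3.0) = 2.0
```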
In this directory we solve the HopperBulletEnv environment in 3240 episodes with the noise parameter std = 0.03,
and in 5438 episodes with noise std = 0.02.
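The std values above control the Gaussian exploration noise added to the actor's output during training. A small sketch of how such noise is typically applied (the function name and action bounds are assumptions, not taken from this repo):

```python
import numpy as np

def noisy_action(policy_action, noise_std=0.03, low=-1.0, high=1.0):
    # Add zero-mean Gaussian exploration noise; noise_std is the
    # tunable parameter discussed above (0.03 vs 0.02 runs).
    noise = np.random.normal(0.0, noise_std, size=np.shape(policy_action))
    # Clip back into the environment's valid action range.
    return np.clip(policy_action + noise, low, high)

action = noisy_action(np.zeros(3), noise_std=0.03)
```

A smaller std explores less aggressively, which here corresponded to slower convergence (5438 vs 3240 episodes).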
The score of 2500 was achieved at episode 3240 after 25 hours 28 minutes of training.
The score of 2500 was achieved at episode 5438 after 36 hours 59 minutes of training.
See also: Three aspects of Deep RL: noise, overestimation and exploration.
See the video Lucky Hopper on YouTube.
The source paper is Addressing Function Approximation Error in Actor-Critic Methods
by Scott Fujimoto, Herke van Hoof, and David Meger.