YHL04/DQN

DQN

Sample GIF

Dueling Double DQN with ε-greedy exploration replaced by Noisy Nets:

alt text

Dueling Double DQN progress after 40+ hours, before the screen froze:

alt text

My guess for the instability of DQN: because experience replay replaces the oldest memories with new ones, once the agent learns a near-optimal policy the replay buffer ends up populated mostly by the same states, and the network overfits to them. This causes a sudden collapse of the policy; the agent then recovers quickly as the replay is repopulated with the "bad" states it had forgotten. (Someone please explain to me if I'm wrong.)
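
To make the guess above concrete, here is a minimal sketch of a first-in-first-out replay buffer and a toy simulation of how its state diversity can shrink as the policy improves. This is not the code from this repository: the `FIFOReplay` class, the `simulate` function, the state counts (50 exploratory states, 5 "good" states), and the transition layout `(state, action, reward, next_state, done)` are all assumptions chosen for illustration.

```python
import random
from collections import deque


class FIFOReplay:
    """Hypothetical fixed-capacity buffer: the oldest transition is dropped first."""

    def __init__(self, capacity):
        # deque with maxlen silently evicts from the left when full
        self.buffer = deque(maxlen=capacity)

    def add(self, transition):
        self.buffer.append(transition)

    def sample(self, batch_size, rng=random):
        # Uniform sampling without replacement, as in a basic DQN replay
        return rng.sample(list(self.buffer), batch_size)

    def distinct_states(self):
        # Number of different states currently stored (transition[0] is the state)
        return len({t[0] for t in self.buffer})


def simulate(capacity=100, early_steps=100, late_steps=100, seed=0):
    """Toy model (an assumption, not a real environment)."""
    rng = random.Random(seed)
    replay = FIFOReplay(capacity)

    # Early training: an exploring agent visits many different states (0..49)
    for _ in range(early_steps):
        s = rng.randrange(50)
        replay.add((s, 0, 0.0, s, False))
    early = replay.distinct_states()

    # Late training: a near-optimal policy keeps revisiting a few states (0..4),
    # and every new transition pushes out one of the older, more varied ones
    for _ in range(late_steps):
        s = rng.randrange(5)
        replay.add((s, 0, 1.0, s, False))
    late = replay.distinct_states()

    return early, late


if __name__ == "__main__":
    early, late = simulate()
    print("distinct states early:", early)
    print("distinct states late:", late)
```

In this toy setup the buffer holds at most 5 distinct states once the late phase has fully overwritten it, so every sampled batch comes from that narrow set. Whether this is what actually drives the collapse seen here is still only a guess; the sketch shows the mechanism, not evidence for it.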
