DQN

Dueling Double DQN with ε-greedy exploration replaced by Noisy Nets:

Dueling Double DQN progress after 40+ hours, before the screen froze:

My guess for the instability of DQN is this: because experience replay overwrites the oldest memories with new ones, as the agent approaches the optimal policy the replay buffer becomes populated mostly by the same states, and the network overfits to them. This causes a sudden collapse of the policy. The agent then recovers quickly as the buffer is repopulated with the "bad" states it had forgotten. (Please correct me if I'm wrong.)
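To make the guess above concrete, here is a minimal sketch of a first-in-first-out replay buffer and a toy simulation of how its state diversity could shrink once the policy stops exploring. It is not the code in memory.py; the `ReplayBuffer` class, the `simulate` function, the capacity of 100, and the assumption that a converged policy cycles through only 5 states are all hypothetical and chosen only for illustration.

```python
from collections import deque


class ReplayBuffer:
    """Fixed-capacity FIFO buffer: appending when full evicts the oldest item."""

    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)

    def add(self, transition):
        # transition is a (state, action, reward) tuple in this toy example
        self.buffer.append(transition)

    def distinct_states(self):
        # Number of different states currently stored in the buffer
        return len({state for state, *_ in self.buffer})


def simulate(capacity=100, explore_steps=100, exploit_steps=100):
    """Record buffer diversity after every step of a two-phase toy run."""
    buf = ReplayBuffer(capacity)
    diversity = []

    # Exploration phase (assumption): every step visits a brand-new state.
    for t in range(explore_steps):
        buf.add((t, 0, 0.0))
        diversity.append(buf.distinct_states())

    # Converged phase (assumption): the policy cycles through only 5 states.
    for t in range(exploit_steps):
        buf.add((1000 + t % 5, 0, 1.0))
        diversity.append(buf.distinct_states())

    return diversity


diversity = simulate()
print("after exploration:", diversity[99])
print("halfway through convergence:", diversity[149])
print("after convergence:", diversity[-1])
```

In this toy run the buffer holds 100 distinct states at the end of exploration and only 5 once the converged policy has overwritten every old entry. Whether this is what actually drives the collapse in the plots is still only a hypothesis.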

__pycache__
logs
saves/BreakoutDeterministic-v4
test_gifs
.gitignore
40hour_dueling_dqn.png
60hour_dueling_noisy_dqn.png
README.md
config.py
dqn.py
environment.py
main.py
memory.py
model.py
plot.py
test.py
utils.py

YHL04/DQN
