This repository has been archived by the owner on Apr 25, 2023. It is now read-only.

missing the initialization of target action value and refreshing the Qhat #13

Open
fi000 opened this issue May 3, 2018 · 1 comment

Comments

@fi000

fi000 commented May 3, 2018

I have several questions:
1. When I compare this code with the algorithm presented in "Human-level control through deep reinforcement learning", I cannot find the third initialization step (initializing the target action-value function Q̂). I also cannot find the last step, "every C steps reset Q̂ = Q". Could you please explain where these are implemented, or how the code achieves the same effect? These steps seem essential!
2. I have my own environment. If I want to feed the DQN a state = [a, b, c] instead of a single input representing the state, what should I do?
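For reference, the two steps from the Nature DQN pseudocode mentioned above can be sketched in plain Python. This is a hypothetical illustration (the `TinyQNet` class, its weights, and the sync period `C` are made up for the example, not taken from this repository):

```python
# Hypothetical sketch of the Nature DQN target-network steps:
#   initialization: Q̂ <- Q, then every C steps reset Q̂ = Q.

class TinyQNet:
    """Stand-in for a Q-network: weights are just a list of floats."""
    def __init__(self, weights):
        self.weights = list(weights)

    def get_weights(self):
        return list(self.weights)

    def set_weights(self, weights):
        self.weights = list(weights)


q_net = TinyQNet([0.0, 0.0])
target_net = TinyQNet(q_net.get_weights())  # third initialization: Q̂ <- Q

C = 4  # sync period, an illustrative hyperparameter
for step in range(1, 13):
    # ...a real training step would update q_net's weights by gradient descent;
    # here we just nudge them so the sync is observable...
    q_net.weights = [w + 0.1 for w in q_net.weights]
    if step % C == 0:
        target_net.set_weights(q_net.get_weights())  # every C steps: Q̂ <- Q
```

In Keras-style DQN implementations this sync usually appears as a small helper that copies the online model's weights into the target model.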

@fi000 fi000 changed the title multi-inputs instead of one input and missing the initialization of target action value and refreshing the Qhat missing the initialization of target action value and refreshing the Qhat May 29, 2018
@WorksWellWithOthers

  1. There is a function that updates the target model. Does this answer your question?
  2. How about state = [[a, b, c]]?
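The `[[a, b, c]]` suggestion amounts to wrapping the state in a batch dimension: most DQN networks expect input of shape `(batch_size, state_size)`, so a single 3-component state is fed as one row. A minimal NumPy sketch (the values of `a`, `b`, `c` are illustrative):

```python
import numpy as np

# A 3-component state, e.g. from a custom environment.
a, b, c = 0.5, -1.0, 2.0

# Networks typically take a batch of states, so a single state
# becomes a batch of one: shape (1, 3) rather than (3,).
state = np.array([[a, b, c]])
```

Equivalently, `np.array([a, b, c]).reshape(1, -1)` produces the same shape; the network's input layer would then be sized to 3 features.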
