Algorithms for Policy Evaluation, Estimation of Action Values, Policy Improvement, Policy Iteration, Truncated Policy Evaluation, Truncated Policy Iteration, and Value Iteration. From Udacity's Deep Reinforcement Learning Nanodegree program.
reinforcement-learning
openai-gym
gym
dynamic-programming
policy-evaluation
policy-iteration
value-iteration
bellman-equation
frozenlake
policy-improvement
state-value-function
action-value-function
Updated Apr 3, 2019 - Jupyter Notebook