This project implements value iteration and Q-learning to solve Markov decision processes (MDPs).
The generic agents created here use these algorithms to maximize their long-term reward in three settings: a simple gridworld (Sutton 1998), a simulated robot controller (Crawler), and Pac-Man.
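
As a rough illustration of the two algorithms named above, here is a minimal sketch in Python. It is not the project's actual agent code: the toy MDP in `TRANSITIONS`, the function names `value_iteration` and `q_update`, and the default values for `gamma`, `alpha`, and `iterations` are all assumptions made for this example.

```python
from collections import defaultdict

# Hypothetical two-state MDP, invented for this sketch (not part of the project).
# TRANSITIONS[state][action] is a list of (probability, next_state, reward).
TRANSITIONS = {
    "A": {"stay": [(1.0, "A", 0.0)], "go": [(0.8, "B", 1.0), (0.2, "A", 0.0)]},
    "B": {"stay": [(1.0, "B", 2.0)], "go": [(1.0, "A", 0.0)]},
}


def value_iteration(transitions, gamma=0.9, iterations=100):
    """Run a fixed number of batch Bellman backups and return state values."""
    values = {state: 0.0 for state in transitions}
    for _ in range(iterations):
        # Batch update: every new value is computed from the previous sweep.
        values = {
            state: max(
                sum(p * (r + gamma * values[nxt]) for p, nxt, r in outcomes)
                for outcomes in actions.values()
            )
            for state, actions in transitions.items()
        }
    return values


def q_update(q, state, action, reward, next_state, next_actions,
             alpha=0.5, gamma=0.9):
    """Apply one Q-learning update from a single observed transition."""
    # Value of the best action in the next state (0.0 if there are none).
    best_next = max((q[(next_state, a)] for a in next_actions), default=0.0)
    sample = reward + gamma * best_next
    # Move the old estimate toward the new sample by the learning rate alpha.
    q[(state, action)] = (1 - alpha) * q[(state, action)] + alpha * sample
    return q[(state, action)]


if __name__ == "__main__":
    print(value_iteration(TRANSITIONS))

    q_values = defaultdict(float)  # unseen (state, action) pairs start at 0.0
    q_update(q_values, "A", "go", 1.0, "B", ["stay", "go"])
    print(dict(q_values))
```

Value iteration needs the full transition model up front, while the Q-learning update only needs one observed transition at a time, which is why the latter suits settings where the agent must learn by acting.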
The Pacman AI projects were developed at UC Berkeley, primarily by John DeNero ([email protected]) and Dan Klein ([email protected]).
For more info on the code used for this project, see https://inst.eecs.berkeley.edu/~cs188/sp09/pacman.html