RL_Nim

##Python##

The code is written in Python 2.

Temporal-Difference Learning

##Installation##

a) The code requires the "matplotlib" and "numpy" packages. If needed, run "pip install matplotlib" and "pip install numpy". If pip itself needs upgrading, run "pip install --upgrade pip" first.

b) To run the notebooks, you will need Jupyter Notebook. To install it, run "pip install jupyter".

NOTA BENE: On Windows, the commands take this form instead: "python -m pip install package_name".
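To confirm that the packages listed above are importable before opening the notebooks, a small check like the following may help. It is only a sketch: the helper name `missing_packages` is made up for this example and is not part of the repository.

```python
from __future__ import print_function  # keeps print() working on Python 2

import importlib


def missing_packages(names):
    """Return the names in `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Packages named in the installation steps above.
    required = ["numpy", "matplotlib"]
    missing = missing_packages(required)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are installed.")
```

Any package reported as missing can then be installed with the pip commands given above.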

##Running TD-learning##

a) Once everything is installed, open a terminal and go to the folder containing the Jupyter notebooks (.ipynb files). Then type "jupyter notebook" to launch the notebook server (a new page will open in your web browser after a moment). From there you can select a notebook to open.

b) There are two notebooks: "TD_main.ipynb", which contains the main learning code (you can train an agent, plot results to evaluate it, and even play against it), and "TD_gridSearch.ipynb", which contains the code for the grid search over the parameters.

c) Once inside a notebook, you can run the code one block at a time by clicking in a block and pressing "Ctrl"+"Enter".
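For readers new to the method, the core temporal-difference update can be sketched in a few lines. This is the generic textbook TD(0) rule, not code taken from the notebooks, whose implementation may differ. The function name `td0_update`, the representation of a Nim position as a tuple of heap sizes, and the default values of `alpha` and `gamma` are all assumptions made for this example.

```python
def td0_update(values, state, reward, next_state, alpha=0.1, gamma=1.0):
    """Apply one TD(0) update to the value table `values` and return the new value.

    values:     dict mapping a state to its estimated value (unseen states count as 0)
    state:      the position before the move, e.g. a tuple of heap sizes (assumed encoding)
    reward:     the reward observed after the move
    next_state: the resulting position, or None if the game is over
    alpha:      learning rate (assumed default)
    gamma:      discount factor (assumed default)
    """
    v = values.get(state, 0.0)
    # A terminal position has no future value.
    v_next = 0.0 if next_state is None else values.get(next_state, 0.0)
    # Move the estimate toward the bootstrapped target reward + gamma * V(next_state).
    values[state] = v + alpha * (reward + gamma * v_next - v)
    return values[state]


values = {}
# A winning final move from position (1, 0, 0) yields reward 1 and ends the game.
td0_update(values, (1, 0, 0), 1.0, None, alpha=0.5)
# An earlier position that leads to (1, 0, 0) inherits part of that value.
td0_update(values, (1, 2, 0), 0.0, (1, 0, 0), alpha=0.5)
```

Repeating such updates over many simulated games is what gradually propagates the final reward back to earlier positions.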

Deep Reinforcement Learning

Results for Deep Q-Learning and Deep Policy Gradient are not as convincing as those for Q-Learning. However, you can run our implementations by following the instructions below:

##Running Deep Q-Learning##

a) Make sure you have PyTorch installed. See https://pytorch.org/ for instructions; the installation command depends on your machine.

b) You will also need to run "pip install matplotlib" if you do not already have the package installed.

c) Run "python DQN_Nim.py". The script performs a parameter grid search over the method. You can adjust the parameters at the top of the script, and plot the output of the grid search by uncommenting the code at the bottom of the script.
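As a rough illustration of what a parameter grid search does, the sketch below tries every combination of values and keeps the best-scoring one. It is not the code from "DQN_Nim.py": the parameter names `learning_rate` and `gamma`, their values, and the placeholder objective `toy_score` are assumptions chosen for the example. In practice, the scoring function would train an agent and return a measure such as its win rate.

```python
import itertools


def grid_search(param_grid, evaluate):
    """Evaluate every combination in `param_grid` and return (best_params, best_score).

    param_grid: dict mapping a parameter name to the list of values to try
    evaluate:   function taking a dict of parameters and returning a score (higher is better)
    """
    names = sorted(param_grid)
    best_params, best_score = None, float("-inf")
    for combo in itertools.product(*(param_grid[name] for name in names)):
        params = dict(zip(names, combo))
        score = evaluate(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_score(params):
    # Placeholder objective standing in for "train an agent and measure its performance".
    return -abs(params["learning_rate"] - 0.01) - abs(params["gamma"] - 0.9)


# Hypothetical grid: 3 x 2 = 6 combinations are evaluated.
grid = {"learning_rate": [0.001, 0.01, 0.1], "gamma": [0.9, 0.99]}
best_params, best_score = grid_search(grid, toy_score)
print(best_params, best_score)
```

Because the number of combinations is the product of the list lengths, adding values to the grid quickly increases the running time.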

Some sources:

Demo of RL using Deep Q-Learning: https://cs.stanford.edu/people/karpathy/convnetjs/demo/rldemo.html

Website of Richard S. Sutton (RL book & courses): https://incompleteideas.net/sutton/

An Introduction to RL (Online textbook): https://incompleteideas.net/sutton/book/the-book.html

Courses on RL (Sutton): https://incompleteideas.net/sutton/609%20dropbox/

Reinforcement Learning for Board Games: The Temporal Difference Algorithm: https://www.gm.fh-koeln.de/ciopwebpub/Kone15c.d/TR-TDgame_EN.pdf

RL for NIM: https://www.diva-portal.org/smash/get/diva2:814832/FULLTEXT01.pdf

RL for NIM 2: https://www.csc.kth.se/utbildning/kth/kurser/DD143X/dkand11/Group6Lars/erikjarleberg.pdf
