Deep Q Network based Project Submission for Udacity Deep Reinforcement Learning Nanodegree

sayonpalit/p1_navigation

Navigation

Deep Q Network based project submission for the Udacity Deep Reinforcement Learning Nanodegree, by Sayon Palit.

Project Environment

For this project, we will train an agent to navigate (and collect yellow bananas!) in a large, square world.

A reward of +1 is provided for collecting a yellow banana, and a reward of -1 is provided for collecting a blue banana. Thus, the goal of your agent is to collect as many yellow bananas as possible while avoiding blue bananas.

The state space has 37 dimensions and contains the agent's velocity, along with ray-based perception of objects around the agent's forward direction. Given this information, the agent has to learn how to best select actions. Four discrete actions are available, corresponding to:

  • 0 - move forward.
  • 1 - move backward.
  • 2 - turn left.
  • 3 - turn right.
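As a minimal sketch of how an agent might pick among these four actions, the snippet below uses epsilon-greedy selection over per-action value estimates. The function name `select_action`, the `q_values` input, and the choice of epsilon-greedy itself are assumptions for illustration; the notebook in this repository may select actions differently.

```python
import random
from typing import Optional, Sequence

# Action indices as listed above.
ACTIONS = {0: "move forward", 1: "move backward", 2: "turn left", 3: "turn right"}
STATE_SIZE = 37  # dimensions of the state vector described above


def select_action(q_values: Sequence[float], epsilon: float,
                  rng: Optional[random.Random] = None) -> int:
    """Epsilon-greedy choice over the four discrete actions.

    q_values is a hypothetical list of per-action value estimates, e.g. the
    output of a Q-network evaluated on one 37-dimensional state.
    """
    if len(q_values) != len(ACTIONS):
        raise ValueError("expected one value estimate per action")
    rng = rng or random
    # With probability epsilon, explore by picking a uniformly random action.
    if rng.random() < epsilon:
        return rng.randrange(len(ACTIONS))
    # Otherwise exploit: pick the action with the highest estimated value.
    return max(range(len(ACTIONS)), key=lambda a: q_values[a])
```

For example, `select_action([0.1, 0.5, 0.2, 0.0], epsilon=0.0)` always exploits and picks action 1 (move backward), while `epsilon=1.0` always picks a random action.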

The task is episodic, and in order to solve the environment, your agent must get an average score of +13 over 100 consecutive episodes.
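The solve criterion can be checked with a sliding window over episode scores. The sketch below is one way to do it; the helper name `first_solved_episode` is hypothetical, and it assumes that an average of exactly +13 counts as solved.

```python
from collections import deque
from typing import Iterable, Optional

SOLVE_SCORE = 13.0  # required average score
WINDOW = 100        # number of consecutive episodes to average over


def first_solved_episode(scores: Iterable[float]) -> Optional[int]:
    """Return the 1-based index of the episode that completes the first
    100-episode window whose mean score is at least +13, or None."""
    window = deque(maxlen=WINDOW)  # keeps only the most recent 100 scores
    for episode, score in enumerate(scores, start=1):
        window.append(score)
        if len(window) == WINDOW and sum(window) / WINDOW >= SOLVE_SCORE:
            return episode
    return None
```

The returned index is the last episode of the qualifying window. Some training scripts instead report the first episode of that window, which is 99 lower; which convention this repository's reported figures follow is not stated here.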

Requirements

To run the code, follow these steps:

  • Create a new environment:
    • Linux or Mac:
     conda create --name dqn python=3.6
     source activate dqn
    • Windows:
     conda create --name dqn python=3.6 
     activate dqn
  • Perform a minimal install of OpenAI Gym
    • If using Windows,
      • download SWIG for Windows and add it to the Windows PATH
      • install Microsoft Visual C++ Build Tools
    • then run this command
     pip install gym
  • Install the dependencies under the folder python/
     cd python
     pip install .
  • Create an IPython kernel for the dqn environment and start Jupyter
     python -m ipykernel install --user --name dqn --display-name "dqn"
     jupyter notebook
  • Once started, change the kernel through the menu Kernel>Change kernel>dqn
  • If necessary, change the path to the Unity environment inside the .ipynb files

Results

[Figure: result]

The current model solves the environment in 400-500 episodes on average.
