sanitgupta/pac-planning
Introduction

Markov Decision Processes (MDPs) are a fundamental mathematical abstraction for modelling sequential decision making under uncertainty, and serve as the standard model for discrete-time stochastic control and reinforcement learning problems. Of particular importance is the planning problem: computing an optimal policy, which maps each state of the MDP to the action to follow in that state. An optimal policy is one that maximises the expected cumulative reward gathered while traversing the MDP.
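By way of illustration (this is a minimal sketch, not code from this repository), the snippet below builds a small randomly generated MDP and computes an optimal policy with value iteration; the state count `S`, action count `A`, and discount factor `gamma` are arbitrary assumptions for the example.

```python
import numpy as np

# Illustrative only: a tiny random MDP with S states and A actions.
S, A, gamma = 3, 2, 0.95
rng = np.random.default_rng(0)
P = rng.dirichlet(np.ones(S), size=(S, A))  # P[s, a] is a distribution over next states
R = rng.uniform(0.0, 1.0, size=(S, A))      # R[s, a] is the expected immediate reward

# Value iteration: apply the Bellman optimality operator until convergence.
V = np.zeros(S)
while True:
    Q = R + gamma * (P @ V)   # Q[s, a] = R[s, a] + gamma * sum_s' P[s, a, s'] * V[s']
    V_new = Q.max(axis=1)
    if np.abs(V_new - V).max() < 1e-8:
        break
    V = V_new

policy = Q.argmax(axis=1)     # the optimal action at each state
print("optimal policy:", policy)
```

Since the Bellman optimality operator is a gamma-contraction, the loop above is guaranteed to converge, and acting greedily with respect to the resulting Q-values yields an optimal policy.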

We study the planning problem under the assumption that a near-perfect simulator of the MDP is available. Since gathering data can be expensive, the time required to find a near-optimal policy for many problems is dominated by the number of calls made to the simulator. A good MDP planning algorithm in this setting therefore minimises the number of simulator calls needed to learn, with high probability, a policy that is close to optimal. This is known as the probably approximately correct (PAC) framework.
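To make the simulator-call accounting concrete, here is a hypothetical generative-model interface of the kind PAC analyses are stated against; the class and function names are illustrative (not this repository's API), and the sample size comes from a standard Hoeffding bound, assuming rewards and values are bounded and non-negative.

```python
import math
import random

class GenerativeModel:
    """Hypothetical simulator: sample(s, a) returns a next state and a
    reward, and keeps count of how many times it has been called."""

    def __init__(self, P, R):
        self.P = P          # P[s][a] is a list of next-state probabilities
        self.R = R          # R[s][a] is the expected immediate reward
        self.calls = 0      # total number of simulator calls so far

    def sample(self, s, a):
        self.calls += 1
        probs = self.P[s][a]
        s_next = random.choices(range(len(probs)), weights=probs)[0]
        return s_next, self.R[s][a]

def estimate_q(sim, s, a, V, gamma, eps, delta):
    """Estimate Q(s, a) = R(s, a) + gamma * E[V(s')] to within eps, with
    probability at least 1 - delta, using a Hoeffding-style sample size."""
    v_max = max(V)  # each sampled gamma * V(s') lies in [0, gamma * v_max]
    n = max(1, math.ceil((gamma * v_max / eps) ** 2 * math.log(2 / delta) / 2))
    total = 0.0
    reward = 0.0
    for _ in range(n):
        s_next, reward = sim.sample(s, a)
        total += V[s_next]
    return reward + gamma * total / n
```

A PAC planning algorithm built on such an interface would route all of its Bellman backups through calls like `estimate_q` and report `sim.calls` as its sample complexity.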

About

PAC Optimal MDP Planning
