Implements a dynamic programming algorithm solving a joint probabilistic constraint.

chanceDPA

About this project

Dynamic Programming Algorithm (DPA) to solve a problem with joint probabilistic constraints. It can be used to solve a path planning problem in which the chance of hitting an obstacle must be kept below a threshold. This project implements the algorithm of [1] and can be used directly to solve problems with a large state-action-noise space efficiently.

The following animation shows the result, step by step, of the DPA on a path planning problem.

Path planning problem definition

As in [1], the code solves a path planning problem with the following dynamics:
x_{k+1} = f_k(x_k, u_k, w_k) = x_k + u_k + w_k
where the control input u_k is bounded in norm and the noise w_k follows a Gaussian distribution.
As shown in [1], this model is general enough to cover a Mars entry, descent and landing problem. Note that for the path planning problem considered here, x_k ranges over 100x100 states, u_k can take up to 81 different values on our discretized grid, and the noise distribution further increases the computation required by the DPA, so an efficient implementation is needed.

As in [1], we use sigma = 1.67. To reduce the complexity of the problem, we approximate w_k by discretized values, with |w_k| < 5.
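For concreteness, the truncated, discretized noise distribution can be sketched as follows (Python/NumPy for illustration; the truncation radius and all names here are assumptions, not the variables actually used in the MATLAB code):

```python
import numpy as np

def noise_kernel(sigma=1.67, wmax=4):
    """Discretized, truncated 2-D Gaussian noise kernel (|w| < 5 per axis).

    sigma and the truncation follow the text above; the exact grid used
    in riskDPA_main.m may differ.
    """
    w = np.arange(-wmax, wmax + 1)            # integer offsets -4..4
    g = np.exp(-w**2 / (2.0 * sigma**2))      # 1-D Gaussian weights
    kern = np.outer(g, g)                     # separable 2-D kernel
    return kern / kern.sum()                  # normalize to sum to 1

w_kern = noise_kernel()                       # a 9x9 probability filter
```

The normalization step matters: the truncation at |w| < 5 discards some probability mass, so the kernel is rescaled to remain a valid distribution.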

The cost-to-go is built from a terminal cost g_N(x_N) and a stage cost g_k(x_k, u_k), defined as in [1].

We also define the Lagrangian, which needs to be minimized, and an indicator telling whether a state is an obstacle:
L_k(x_k, u_k) = g_k(x_k, u_k) + lambda * I_k(x_k)
where I_k(x_k) = 1 if x_k lies on an obstacle and 0 otherwise.
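On the grid, the Lagrangian is just an elementwise combination of the two cost maps. A minimal sketch (Python/NumPy for illustration; the grid size matches the text, but the obstacle layout, the uniform stage cost, and all variable names are hypothetical):

```python
import numpy as np

n = 100
I_obs = np.zeros((n, n))           # indicator I_k: 1 on obstacle cells
I_obs[40:60, 45:55] = 1.0          # an example rectangular obstacle
g = np.ones((n, n))                # stage cost g_k (uniform here)
lam = 1e-4                         # Lagrange multiplier lambda

L_grid = g + lam * I_obs           # Lagrangian L_k = g_k + lambda * I_k
```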

To compute the expected cost efficiently for each state and action, we convolve the costs for each action with a probability filter w_kern. Indeed, for a fixed action u_k,
J_k(x_k) = L_k(x_k, u_k) + E_{w_k}[ J_{k+1}(x_k + u_k + w_k) ]
         = L_k(x_k, u_k) + sum_{w_k} p(w_k) * J_{k+1}(x_k + u_k + w_k)
and the sum over w_k is exactly a 2-D convolution of J_{k+1} with w_kern.

This avoids a stochastic approach in which many sampled w_k values would be averaged (Monte Carlo) to estimate the cost of each action; we tested that approach and found it computationally too expensive.

Results

These paths were obtained with the Dynamic Programming Algorithm (DPA) while respecting the probability constraint on hitting an obstacle. Note that different paths are obtained depending on the cost penalization by lambda, with higher values leading to less risk-prone policies. At the extreme, lambda = 0 yields the fastest path between the start and end points, shown in blue on the plot.

The next animation shows the costs obtained at each timestep by the DPA, with lambda = 1e-4. One can see that policies depend on the timestep: there is no benefit in risking a collision once the goal state is no longer reachable, for instance at late timesteps (k > 30) from states far from the goal. Also, higher cost values ("hills" on the plot) reveal the obstacles and lead to policies that reduce the probability of remaining on an obstacle.


When the obstacle constraint is removed (lambda = 0), the optimal policy does not avoid obstacles, as they incur no additional cost. Typical cost results after N = 50 steps are shown in the plot below.

Execution time

Running the Dynamic Programming Algorithm without evaluating the risk of the resulting policy currently takes approximately 3.8 s (N = 50 steps on 100x100 states, with 81 inputs plus noise, tested on an Intel Core i5-4210U CPU @ 1.70 GHz with 4 GB of RAM).

The algorithm finds an optimal solution satisfying the 1% chance constraint in approximately 80 s, for the error tolerance set by convergence_criteria.

How to run it, easy!

The code was tested in MATLAB R2016b and should not require any MATLAB toolbox.
To run it, simply download the src folder and run the riskDPA_main.m script.
To set a different probability constraint on obstacle collision, change the parameter delta (0.01 corresponds to a 1% chance of hitting an obstacle under the optimal policy).

Further work

  • Implementing Brent's method for efficient interval computation on lambda. The convergence rate is currently suboptimal, although the solution meets the final chance constraint.
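As a sketch of what that would look like: Brent's method can find the lambda at which the executed policy's collision risk equals delta, without derivatives and faster than plain bisection (Python/SciPy for illustration; the risk function below is a toy monotone stand-in, since the real code would re-run the DPA and evaluate the policy at each lambda):

```python
import numpy as np
from scipy.optimize import brentq

delta = 0.01            # chance-constraint level (1 %)

def risk(lam):
    """Toy stand-in for the collision risk of the optimal policy at a
    given lambda; the real risk is also monotonically decreasing in
    lambda, which is what makes root bracketing work."""
    return 0.2 * np.exp(-2000.0 * lam)

# Brent's method finds the root of risk(lambda) - delta inside a bracket
# where the function changes sign (risk(0) > delta, risk(1) < delta).
lam_star = brentq(lambda lam: risk(lam) - delta, 0.0, 1.0)
```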

References

  • [1] M. Ono, M. Pavone, Y. Kuwata and J. Balaram, "Chance-constrained dynamic programming with application to risk-aware robotic space exploration", Autonomous Robots, 2015.
  • [2] D.P. Bertsekas, "Dynamic Programming and Optimal Control", Vol. I, 3rd edition, Athena Scientific, 2005.
  • [3] D.P. Bertsekas, "Nonlinear Programming", 2nd edition, Athena Scientific, 1999.
