Author: Jack Robbins
The N-Puzzle problem can be stated as follows:

Given an initial configuration of an $N\times N$ grid of tiles, each labeled with a distinct number from $0$ to $N^2 - 1$, determine the shortest sequence of single-step moves of the $0$ tile that puts the puzzle in numerical row-major order, with the $0$ slider in the row $N-1$, column $N-1$ position.
The definition can be a bit hard to parse on its own, so let's look at a sample starting configuration and the corresponding goal configuration.
1 | 7 | 15 | 4 |
---|---|---|---|
0 | 6 | 3 | 8 |
2 | 5 | 14 | 11 |
9 | 13 | 10 | 12 |
The goal configuration for all $4\times 4$ puzzles is:
1 | 2 | 3 | 4 |
---|---|---|---|
5 | 6 | 7 | 8 |
9 | 10 | 11 | 12 |
13 | 14 | 15 | 0 |
While this puzzle may look easy to the uninitiated, writing a program that finds the solution in a reasonable amount of time is quite the challenge. This particular example is solved optimally in 24 moves of the 0 slider.
This project contains several programs, written in C, that use the A* heuristic search algorithm to solve the N-Puzzle. A heuristic function that is, to my knowledge, novel, *generalized linear conflict*, is discussed. Additionally, this README.md contains a full writeup of the theoretical basis and ideas implemented in the programs.
The A* search algorithm is the core of this project. Given a great deal of time and memory, it would be possible to explore every possible solution path for each problem instance. However, we can do much better than this. To manage the complexity of the search, A* uses a prediction function to explore the most promising states first.
Here is the general idea of our search algorithm:
```
A* SEARCH ALGORITHM

add starting state to fringe
initialize closed to an empty list
while fringe is not empty:
    dequeue the lowest-cost state from fringe
    if state is goal:
        trace back solution path
        exit
    end if
    expand state into its successors
    for each successor generated:
        if successor is a repeat (already in closed or fringe):
            destroy successor
            continue
        else:
            update prediction function for successor
            priority queue insert successor into fringe
        end if
    end for
    add expanded state to closed
end while
if we reach here, no solution exists
```
In the algorithm above, we see that there are two data structures: the fringe and the closed list. The fringe is a priority queue, where states with the lowest prediction function value (i.e., seemingly closest to the goal) come first. This ensures that when the algorithm dequeues, it always dequeues and expands the most promising state. The closed list is simply a linked list that stores every state we have previously expanded. Maintaining these two data structures is of vital importance to the performance of the algorithm. If we didn't keep track of which states we have already visited, the algorithm would likely re-explore states, and it could even get stuck in an infinite loop.
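As a rough sketch of these two structures in C (the names here are illustrative, not the project's actual definitions), the fringe can be kept as a linked list sorted by f(n) = g(n) + h(n), and the closed list as a plain linked list checked by board comparison:

```c
#include <string.h>

#define N 4

struct state {
    int board[N][N];     /* tile layout, 0 = blank            */
    int g;               /* previous travel (depth so far)    */
    int h;               /* heuristic estimate to the goal    */
    struct state *next;  /* linked-list pointer               */
};

/* Fringe: a priority queue implemented as a sorted linked list.
 * States with the lowest f(n) = g(n) + h(n) sit at the head, so
 * dequeueing the head always yields the most promising state. */
static void priority_insert(struct state **fringe, struct state *s) {
    int f = s->g + s->h;
    while (*fringe && (*fringe)->g + (*fringe)->h <= f)
        fringe = &(*fringe)->next;
    s->next = *fringe;
    *fringe = s;
}

/* Closed (or fringe) membership test: a state is a repeat if its
 * board matches any state already in the list. */
static int is_repeat(const struct state *list, const struct state *s) {
    for (; list; list = list->next)
        if (memcmp(list->board, s->board, sizeof s->board) == 0)
            return 1;
    return 0;
}
```

Note that both the sorted insert and the repeat check are linear scans, which is one reason the choice of these data structures matters so much for performance.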
The other key component of this algorithm is the prediction function, the most important part of our approach. Effectively, this function "guesses" how promising a state is: the lower its value, the more promising the state seems. Good guesses allow the algorithm to finish faster, while bad guesses cause it to explore dead-end paths and waste time. The prediction function implemented here focuses on estimating the cost to get from the current state to the goal state.
The format of our prediction function is: f(n) = g(n) + h(n). g(n) is referred to as the "previous travel", and is the simplest to calculate: it is the number of predecessors (ancestors) that a node has. This lets us factor in the cost of reaching a state so far, which is important if we want to find the shortest solution path possible. Additionally, g(n) can be calculated during state generation by simply adding one to the predecessor's previous travel, making it essentially free to compute.
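This bookkeeping amounts to a single addition at generation time. A sketch in C, with hypothetical names (the project's actual successor generation may differ): when a successor is created by sliding the blank one step, its previous travel is just the parent's plus one:

```c
#include <stdlib.h>
#include <string.h>

#define N 4

struct state {
    int board[N][N];
    int zero_row, zero_col;   /* cached position of the blank  */
    int g;                    /* previous travel               */
    struct state *parent;     /* for tracing back the solution */
};

/* Generate one successor by sliding the blank by (d_row, d_col).
 * Returns NULL if the move would leave the board. */
struct state *generate_successor(const struct state *p, int d_row, int d_col) {
    int r = p->zero_row + d_row, c = p->zero_col + d_col;
    if (r < 0 || r >= N || c < 0 || c >= N)
        return NULL;
    struct state *s = malloc(sizeof *s);
    memcpy(s->board, p->board, sizeof s->board);
    /* slide the tile at (r, c) into the blank's old spot */
    s->board[p->zero_row][p->zero_col] = p->board[r][c];
    s->board[r][c] = 0;
    s->zero_row = r;
    s->zero_col = c;
    s->g = p->g + 1;          /* g(n): one move deeper than the parent */
    s->parent = (struct state *)p;
    return s;
}
```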
However, the heuristic function h(n) is where most of the heavy lifting happens. This function uses calculations to guess the number of moves required to transform the current state into the goal state. These calculations are referred to as heuristics, and more specifically as admissible heuristics. For a heuristic to be admissible, it must always underestimate or equal the true cost of reaching the goal from any given state. This ensures that we always find the optimal, meaning shortest, path to the solution. Heuristics that violate admissibility will not crash the program or cause any drastic errors; they will simply lead to suboptimal solutions.
This project currently uses two heuristics in combination: *Manhattan distance* and *generalized linear conflict*.
The Manhattan distance heuristic is quite simple. Given a board and the goal state, it calculates the sum of the horizontal and vertical distances from each tile's current position to its goal position. It can be imagined as picking up each tile and sliding it horizontally and vertically into its goal position. Because it does not take the interactions between tiles into consideration, this heuristic gives an effective lower bound on the number of moves required to reach the goal. For this reason, Manhattan distance is an admissible heuristic: it never overestimates the cost of reaching the goal state.
From our example before:
1 | 7 | 15 | 4 |
---|---|---|---|
0 | 6 | 3 | 8 |
2 | 5 | 14 | 11 |
9 | 13 | 10 | 12 |
The Manhattan distance would be: 0 + 2 + 3 + 0 + 0 + 1 + 0 + 3 + 2 + 2 + 1 + 1 + 1 + 2 + 1 = 19
It is important to note that we skip the calculation for the 0 tile, as it is the blank we are allowed to move. As it turns out, skipping the 0 tile in Manhattan distance calculations leads to much better cost estimation, since the 0 tile is what we use to manipulate the board.
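A minimal C sketch of this heuristic, skipping the 0 tile as described (the function name is illustrative, not necessarily the one used in the source):

```c
#include <stdlib.h>

#define N 4

/* Sum of horizontal + vertical distances from each tile's current
 * position to its goal position.  In row-major order, tile t (t > 0)
 * belongs at row (t - 1) / N, column (t - 1) % N. */
int manhattan_distance(const int board[N][N]) {
    int total = 0;
    for (int row = 0; row < N; row++) {
        for (int col = 0; col < N; col++) {
            int tile = board[row][col];
            if (tile == 0)
                continue;  /* skip the blank */
            int goal_row = (tile - 1) / N;
            int goal_col = (tile - 1) % N;
            total += abs(row - goal_row) + abs(col - goal_col);
        }
    }
    return total;
}
```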
As stated before, Manhattan distance does not take into account the tile interactions that movement requires. This means it is quite a drastic underestimate of how many moves are actually needed, leaving room for improvement. We make that improvement with our next heuristic, generalized linear conflict.
Generalized linear conflict makes up for Manhattan distance's ignorance of tile interactions. Traditional linear conflict looks for two tiles that are both already in their goal row or goal column but are "swapped", meaning that their Manhattan distances are both 1, and counts them as a linear conflict. Even though both Manhattan distances are 1, it will take at least 2 additional moves to swap the two tiles around, because we have to use the 0 tile to do so. In the traditional heuristic, we count the number of such swapped pairs and multiply by 2 to get the linear conflict value. In my version of the heuristic, I generalize this conflict to the entire row and column: instead of only searching for tiles that are 1 apart, I search for any mismatched tiles in the entire line. With this improved version of the heuristic, we get a more accurate prediction of how many moves are required to properly rearrange the entire board. It is of course still an underestimate, and therefore admissible, because sometimes it takes more than 2 moves to swap two tiles. In fact, it often requires more than 2 moves, but we must keep the heuristic an underestimate to ensure that we always find the shortest path.
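Since the exact implementation lives in the project's source files, here is one plausible C sketch of the generalized idea: scan every row and column for pairs of tiles that both sit in their goal line but in reversed relative order, and add two moves per conflicting pair. The details below are illustrative, not the project's exact code:

```c
#define N 4

/* Count pairs of tiles that are both in their goal row (or column)
 * but in reversed relative order; each such pair requires at least
 * two extra moves beyond Manhattan distance. */
int linear_conflict(const int board[N][N]) {
    int conflicts = 0;
    /* row conflicts */
    for (int row = 0; row < N; row++)
        for (int a = 0; a < N; a++)
            for (int b = a + 1; b < N; b++) {
                int ta = board[row][a], tb = board[row][b];
                if (ta == 0 || tb == 0) continue;
                /* both tiles belong in this row, but are out of order */
                if ((ta - 1) / N == row && (tb - 1) / N == row &&
                    (ta - 1) % N > (tb - 1) % N)
                    conflicts++;
            }
    /* column conflicts */
    for (int col = 0; col < N; col++)
        for (int a = 0; a < N; a++)
            for (int b = a + 1; b < N; b++) {
                int ta = board[a][col], tb = board[b][col];
                if (ta == 0 || tb == 0) continue;
                /* both tiles belong in this column, but are out of order */
                if ((ta - 1) % N == col && (tb - 1) % N == col &&
                    (ta - 1) / N > (tb - 1) / N)
                    conflicts++;
            }
    return 2 * conflicts;
}
```

In this sketch, h(n) would be the Manhattan distance plus the linear conflict value; the combination stays admissible because the two moves counted per conflict are in addition to, and never double-count, the tiles' Manhattan distances.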
**Note**

Potential improvement idea: try to find a more accurate conflict multiplier. Initial research online suggested using `linear_conflict_count / (N - 1) + linear_conflict_count % (N - 1)`, but my results with this method have shown that it violates admissibility in some cases.
With these two heuristics combined, we have a powerful search tool that allows very complex puzzles (30+ moves) to be solved in less than 5 seconds. The heuristic is by no means perfect, though, and sometimes it requires hundreds of thousands of iterations to solve random instances of the puzzle.
This project contains several source files for different purposes. The file generate_start_config.c provides a convenient way of generating starting configurations to solve; it works by taking the goal state for any $N$ and applying a user-chosen number of random moves to it (the initial "complexity"). The multi-threaded solver uses the pthreads library in C to parallelize the search. For anyone curious about how this parallelization works, I would encourage you to look at the source code, as it is well documented. The multi-threaded solver is faster at solving large, complex puzzles, but is often actually slower on simpler configurations, due to the overhead of thread creation and management.
The interaction between the source files can be a little complex until you get used to it. Fortunately, this has all been abstracted away for the end user through the runner script run.sh. To run the solver yourself, first download the source code on a Unix-based operating system and navigate to the src folder. From there, run the following:
```
example@bash: ~/N-Puzzle-Solver/src $ chmod +x run.sh
example@bash: ~/N-Puzzle-Solver/src $ ./run.sh
Enter a positive integer for the NxN puzzle size: 4
Enter a positive integer for complexity of initial configuration: 200
Do you want to use multithreading[Y/n]: y
#Puzzle output and solution displays below here
```
In this example, I've told the program to create a 4x4 puzzle with an initial complexity of 200 and to solve it using multithreading. It is impossible to predict exactly how long the program will take to run, but configurations under 300 initial complexity usually solve in less than 10 seconds. If you are interested in seeing how the solver works, I greatly encourage you to download the source code and give it a try yourself!
**Important**

Everything below this point is experimental and a work in progress. Unfortunately, due to a busy schedule, I have not had enough time to refine the pattern database heuristic to the point that I would like, but basic functionality has been achieved, and there have even been some promising initial results with it.
For future work, there are other heuristics, like walking distance, that could be used in both the single- and multi-threaded versions of the solver to potentially improve speed. My next improvement will be implementing the priority queue as a min-heap.
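As a rough sketch of that last idea (hypothetical, not yet part of the project), a binary min-heap keyed on f(n) gives O(log n) insertion and removal, versus the O(n) insertion of a sorted linked list. A minimal version over plain f values:

```c
#include <stdlib.h>

/* Minimal binary min-heap of f(n) values; a real version would store
 * state pointers and key on s->g + s->h.  (Error handling elided.) */
struct heap {
    int *vals;
    int len, cap;
};

static void heap_push(struct heap *h, int f) {
    if (h->len == h->cap) {
        h->cap = h->cap ? 2 * h->cap : 16;
        h->vals = realloc(h->vals, h->cap * sizeof *h->vals);
    }
    int i = h->len++;
    /* sift up: smaller f values bubble toward the root */
    while (i > 0 && h->vals[(i - 1) / 2] > f) {
        h->vals[i] = h->vals[(i - 1) / 2];
        i = (i - 1) / 2;
    }
    h->vals[i] = f;
}

static int heap_pop(struct heap *h) {
    int min = h->vals[0];
    int last = h->vals[--h->len];
    int i = 0;
    /* sift down: move the hole toward the leaves */
    for (;;) {
        int child = 2 * i + 1;
        if (child >= h->len) break;
        if (child + 1 < h->len && h->vals[child + 1] < h->vals[child])
            child++;
        if (h->vals[child] >= last) break;
        h->vals[i] = h->vals[child];
        i = child;
    }
    h->vals[i] = last;
    return min;
}
```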