The aim of this project is to randomly generate KenKen puzzles, with a difficulty metric to help find interesting ones. As of 2018, CS was using it to provide KenKen puzzles for Caltech's newspaper.
Run the following commands to compile the TypeScript code:
npm i
npm run build
main.js
creates a random puzzle solution and then continually generates cagings which can be solved to give that solution.
For larger grid sizes, main.js
may need to be run for at least about 20 minutes, especially if you are looking for a very difficult puzzle.
The program runs until interrupted by you unless you give it some limits on
the number of puzzles to generate (see below).
The shared solution to all these puzzles is stored in solution.sbv
, and the different cagings are stored in the cagings
folder, grouped by difficulty.
For example, cagings/5/1.sbv
is the first puzzle generated with difficulty 5, cagings/5/2.sbv
is the second puzzle generated with difficulty 5, and so on.
Running main.js
continuously shows a progress odometer, which looks something
like this:
$ ./main.js 5 #generate 5x5 puzzles
Saved solution
59.44%; 0: 189, 2: 3, 3: 83, 4: 128, 5: 39, 6: 10, 7: 12, 8: 1, 10: 1
# Of the cagings tried so far, 189 were not solvable, 3 had difficulty 2, 83 had difficulty 3, and so on
# 59.44% of the cagings tried so far were solvable using the available solving strategies
You can limit the number of puzzles generated by using the option
-count lev/num
with main.js
. This stops puzzle generation when num
puzzles of difficulty level lev
(or higher) are generated. You can suppress
the odometer output entirely by using the -odo
option, or set the line length
to n
with -odo n
.
The generated cagings are stored in a structure-bytes format to minimize the space they take up on disk, as there may be thousands of them.
To view a generated puzzle, run render.js
, which creates a puzzle.html
:
./render.js cagings/10/1.sbv
open puzzle.html
The solution steps for a puzzle may be seen by running solver.js
which gives
you a step-by-step solution to the puzzle. Unfortunately, the solver is not a
general solver (see Solving a caging below) and some puzzles not generated
by main.js
are unsolvable.
You can also translate the internal (binary) form of the puzzle into a portable,
human-readable one by running the
trans.js
program. You can solve puzzles in the human-readable format
directly using mysolver.js
. (Not all puzzles are solvable; see above. For
general solvers, see here).
The shell script generate.sh
will generate a suite of puzzles of some given
size and level of difficulty. Use it like so:
sh generate.sh 6 -count 12/10 -out /tmp/puzzle.out
which will write a set of 10 puzzles of difficulty level 12 into the file
/tmp/puzzle.out
. The puzzles will be of size 6x6.
There are essentially 3 steps to generating an n
by n
puzzle:
- Create a random solution grid (i.e. every row and column has each number
1
ton
)—inmake-board.ts
- Create a random caging of the grid—in
make-board.ts
- Attempt to solve the caging using relatively human-friendly strategies; if solving was successful, computing a difficulty metric—in
solve.ts
This is the simplest of the 3 steps.
The program starts with a known valid grid, containing 1, 2, ..., n
in the first row and each subsequent row is shifted over once more.
If n
is 4, for example, the starting grid looks like this:
1 2 3 4
2 3 4 1
3 4 1 2
4 1 2 3
The program then uses the fact that swapping rows maintains a valid grid, as does transposing the grid. So the program chooses 2 random rows to swap and then takes the transpose of this new grid, and repeats this process 100,000 times.
A list of "unallocated regions" is maintained, from which cages are carved out. Initially, the whole grid is a single unallocated region. While unallocated regions remain, the largest one is selected to have a cage carved out of it. A random desired cage size is computed, and an initial box in the unallocated region is randomly chosen to be in the cage. Adjacent boxes to the boxes already added to the cage are added to the cage until the cage reaches the desired size or the unallocated region is exhausted. The remaining boxes in the unallocated regions are assembled into new unallocated regions.
For example, consider a 3x3 grid, where the numbers denote unallocated regions:
1. Initially 2. Cage A of size 3 3. Cage B of size 2
1 1 1 1 A 2 1 A 2
1 1 1 A A 2 A A 2
1 1 1 2 2 2 B B 2
4. Cage C of size 4 (capped at 3) 5. Cage D of size 3 (capped at 1)
1 A C D A C
A A C A A C
B B C B B C
Once the boxes in each cage have been decided, the cage operation is picked randomly.
Cages with a single box are forced to be =
; /
and -
are prioritized for
two box cages if it is possible to make the value a positive integer;
+
or *
can be chosen in any case.
The solving process attempts to mimic how a human would solve the puzzle, so for example, it does not guess a box's value and check whether that would create conflicts later in the solving process. The solving rules are as follows:
-
Arithmetic restriction: look at each cage and find all possible ways to fill in the boxes with numbers
1
ton
. Throw out all combinations that would lead to the same number being in any row or column. For each box, restrict its possible values to the values it could take on in any valid combination. In addition, if every possible combination causes some row to have a box with some numberC
,C
is excluded from the rest of that row.For example, in a 6x6 puzzle, a cage with 3 boxes and
6*
as its target has the following possible combinations of values:[ 1, 1, 6 ] [ 1, 2, 3 ] [ 1, 3, 2 ] [ 1, 6, 1 ] [ 2, 1, 3 ] [ 2, 3, 1 ] [ 3, 1, 2 ] [ 3, 2, 1 ] [ 6, 1, 1 ]
-
Pick uniques: if any box is the only box in its row or column which can have a certain value, it must have that value. This can be thought of as a special case of the "find isolated groups" strategy with a group size of
n - 1
, except that works even whenn
is large. -
Find isolated groups: if
k
boxes in a row or column (called a group of sizek
) all have possibility sets which are subsets of a set of sizek
, no other box in that row or column can have any of thosek
values. Equivalently, if the union of the possibility sets of thosek
boxes has sizek
, that union can be removed from all other boxes in the row or column. The strategy only checks relatively small groups, both to reduce processing time and because it is hard for humans to see groups larger than 3 or 4. Note that ifk
is 1, this strategy is simply removing the value of determined boxes from the other boxes in their rows and columns. -
Cross-row eliminate: if there are
k
rows, and in each, the possible locations of some numberC
is some subset of the samek
columns, no box in any of thosek
columns that is not in one of thek
rows can haveC
. The same is true when using columns instead of rows.This is easier to understand when
k
is 2. Consider this 4x4 grid, and focus on columns 1 and 2:2 or 3 3 or 4 any any 1 or 2 1 or 4 any any 1 or 3 1 or 3 any any 4 2 any any
You can see that in each column, 1 must be in either the second or third boxes and 3 must be in either the first or third boxes -- just based on the columns themselves. Even though we don't know which box in each column contains the 1 or the 3, we know that no other box in the second or third rows can have 1, and no other box in the first or third rows can have 3.
This is by no means a complete list of solving strategies, but any others are much more difficult to program.
For example, it is often possible to find the sum of all the boxes in a row or column except for one, and since the sum of each row and column is 1 + ... + n
, the value of the remaining box's value can be determined.
Informally, the difficulty of a puzzle is measured by the number of sequential logical steps required to solve it. The idea is, for example, that if every square's value can be found by making at most 5 logical inferences, then this is probably an easier puzzle than if 10 inferences were required to find some square's value.
If the set of partially filled-in grids is considered as a directed acyclic graph where edges represent single inferences, the difficulty metric is the minimum number of edges on a path from the empty grid to the solved grid. This metric is computed by applying each solver independently to the previous grid, and then taking the intersection of the possibilities given by each of the solvers to compute the next grid. The number of steps required to solve the grid is then the difficulty metric. This is probably not the most accurate metric of what makes a puzzle difficult to a human, but it yields a rough measure of their relative difficulty to know which puzzles to choose among. Among other things, it doesn't consider the breadth of the inference graph or the relative difficulty of steps made by the different solvers.
Caleb Sander (CS), 2018, ghfbsd, 2024