A simple image processing application that uses OpenCV and C++ to solve any given 10x10 sudoku puzzles, by reading images of them taken from a newspaper!
An arbitrary sized image of any newspaper sudoku puzzle.
The solved sudoku, displayed to the user.
The input image:
- Primarily contains the sudoku (i.e. is focused on the sudoku)
- Is clear, and the grid lines of the sudoku are visible and distinct from background
- Contains only one sudoku that isn't folded (but angle of capture can vary)
- Doesn't have shadows
- Doesn't have complicated noise (i.e. apart from noise that can be removed by smoothing)
- Contains an unsolved sudoku
- Is captured in atleast dimly lit environment
We first remove the unnecessary detail of the given newspaper image, by cropping out only the sudoku.
- Gaussian blurring
- Adaptive thresholding and inverting
- Dilation (using a plus shaped kernel) to connect broken components
- Flood filling for identifying the largest blob (the main sudoku grid)
- Obtaining the coordinates of the grid by detecting lines using Hough transforms.
- Warping perspective of the original image using the above points to focus on the sudoku
We now split the sudoku into it's individual "cells", where a cell would be the smallest grid containing a number in the sudoku. Thus, we expect there to be 100 such cells according to our consideration.
- On the warped image, we apply adaptive thresholding.
- We now obtain an image containing all horizontal grid lines only, and similarly an image containing all vertical grid lines only.
- We add the above two images to get a mask.
- All the grid lines are eliminated from the image by subtracting it with the above mask.
- We now iterate through each cell and crop it by cell size to obtain either a blank image, or an image containing a digit.
Each cell image would need to be classified to get the number present within the cell, and 0 otherwise (a sudoku cell varies from 1 to 9, and thus 0 is chosen to represent no digit.)
- We center each of the cell images, by calculating the bound lines where the digit starts and ends.
- Then, we crop and resize the cell image to 28x28 pixel size.
- A decision is made whether the image contains a number or not by calculating the area of moments. If above a threshold, we assume a number is present. Else, we return a zero for that cell.
- We now use the K-Nearest Neighbours algorithm on the MNIST dataset that is trained, to predict the number in the cell image, and return it.
A straightforward backtracking based algorithm is implemented that can solve any solvable sudoku.
OpenCV>=3.4.0
gcc version 7.4.0
MNIST database in .csv files
> git clone .
> cd Sudoku_Solver
Edit paths to the MNIST database in sudoku_former.cpp
> make
> ./sudoku_former sudoku.jpg
Press <Enter> upon viewing the Undistorted image
Paste the reference sudoku present in Reference_Sudokus.txt
The accuracy of sudoku formation is only at 92.6 %. Accuracy was calculated by the total number of correct grid cells classified.
For example, the digit "1" is being classified as digit "7". Or the digit "9" as "0", and so on. To solve a sudoku, we would require a 100 % accurate sudoku formation, as even one misclassified digit could make the sudoku unsolvable.
Possible reasons behind this issue: KNN was trained on MNIST database of handwritten digits, but it's used on printed digits. Another reason could be the accuracy of the KNN itself - which is around 96.65 % (for a test set of 10,000 images).