In this notebook, we will be using Python and machine learning to solve sudoku puzzles! This is a pretty well-covered topic, so I won't go into great detail, but will instead provide an overview of each step of the process and how it all comes together!
The goal of this model is to take a picture of a sudoku puzzle, say from a newspaper, find the puzzle within the picture, and solve it.
With any project, I believe it is crucial to identify the steps needed in order to reach the project's goal. First, break down the goal into the main steps, and then work your way down from there.
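In this case, our main goal can be achieved in 3 main steps: isolating the puzzle from the rest of the image, extracting the digits from the puzzle, and solving the puzzle and displaying the output.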
Sounds easy enough, right? Let's dive right in.
To begin, we'll need to isolate the actual sudoku puzzle from the rest of the image. We can use computer vision algorithms, such as those provided by OpenCV, to achieve this.
Let's first take a look at the raw image that we'll be working with throughout this example.
import os
import cv2
import matplotlib.pyplot as plt
file_list = os.listdir("data/puzzle_imgs/")
puzzle = cv2.imread("data/puzzle_imgs/" + file_list[2])
plt.figure(figsize = (10, 10))
# OpenCV loads images in BGR order, while matplotlib expects RGB
plt.imshow(cv2.cvtColor(puzzle, cv2.COLOR_BGR2RGB))
plt.title("Raw Input Image")
plt.show()
Great, now we have an idea of the sort of image we are working with. We can now apply some OpenCV techniques to identify the puzzle itself.
from src.read_puzzle import pre_process_img, find_puzzle_bounds, crop_and_warp, remove_extra
# function to convert image to greyscale and apply thresholding to identify key image components
pre_processed_puzzle = pre_process_img(puzzle)
plt.figure(figsize = (10, 10))
plt.imshow(pre_processed_puzzle, cmap = "gray")
plt.title("Puzzle after Color Convert, Gaussian Blur, and Thresholding")
plt.show()
# finds the 4 corners of the largest contour in the image (in this case, the puzzle)
puzzle_bounds = find_puzzle_bounds(pre_processed_puzzle)
print(puzzle_bounds)
[array([485, 67], dtype=int32), array([1253, 67], dtype=int32), array([1265, 876], dtype=int32), array([440, 842], dtype=int32)]
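The corners come back ordered top-left, top-right, bottom-right, bottom-left. One common way to pick those four points out of a contour is to look at the extremes of x + y and x - y. The sketch below shows just that idea on a handful of points; `pick_corners` is a hypothetical helper, the points other than the four printed above are made up, and the real `find_puzzle_bounds` may well work differently.

```python
import numpy as np


def pick_corners(points):
    """Illustrative only: choose the 4 extreme points of a roughly rectangular contour.

    `points` is an (N, 2) array of (x, y) pixel coordinates.
    Returns corners ordered top-left, top-right, bottom-right, bottom-left.
    """
    pts = np.asarray(points)
    sums = pts[:, 0] + pts[:, 1]   # x + y: smallest at top-left, largest at bottom-right
    diffs = pts[:, 0] - pts[:, 1]  # x - y: largest at top-right, smallest at bottom-left
    top_left = pts[np.argmin(sums)]
    bottom_right = pts[np.argmax(sums)]
    top_right = pts[np.argmax(diffs)]
    bottom_left = pts[np.argmin(diffs)]
    return [top_left, top_right, bottom_right, bottom_left]


# The four corners printed above, plus a few made-up points along the edges
contour = np.array([
    [860, 70], [1253, 67], [1260, 500], [1265, 876],
    [850, 860], [440, 842], [460, 450], [485, 67],
])
corners = pick_corners(contour)
print(corners)
```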
# applies a transformation on the pre-processed image using the bounds from above
# to create a top-down view of the puzzle
cropped_puzzle = crop_and_warp(pre_processed_puzzle, puzzle_bounds)
plt.figure(figsize = (10, 10))
plt.imshow(cropped_puzzle, cmap = "gray")
plt.title("Puzzle Cropped and Warped")
plt.show()
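Under the hood, this kind of warp is a perspective transform: a 3x3 matrix that sends the four detected corners to the corners of a square. In OpenCV this is typically done with `cv2.getPerspectiveTransform` and `cv2.warpPerspective`. To show what that matrix is doing, here is a small NumPy-only sketch that solves for it directly; the helper names and the 810-pixel output size are assumptions for illustration, not taken from `crop_and_warp`.

```python
import numpy as np


def homography_from_corners(src, dst):
    """Illustrative only: solve for the 3x3 matrix H that maps each src corner to dst."""
    rows, rhs = [], []
    for (x, y), (u, v) in zip(src, dst):
        # Each correspondence gives two linear equations in the 8 unknowns of H
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        rhs.extend([u, v])
    h = np.linalg.solve(np.array(rows, dtype=float), np.array(rhs, dtype=float))
    # The ninth entry of H is fixed to 1 to remove the overall scale ambiguity
    return np.append(h, 1.0).reshape(3, 3)


def apply_homography(H, point):
    """Map a single (x, y) point through H using homogeneous coordinates."""
    x, y, w = H @ np.array([point[0], point[1], 1.0])
    return x / w, y / w


# Corners printed above, ordered top-left, top-right, bottom-right, bottom-left
src = [(485, 67), (1253, 67), (1265, 876), (440, 842)]
side = 810  # output size in pixels; an arbitrary choice that divides evenly by 9
dst = [(0, 0), (side, 0), (side, side), (0, side)]

H = homography_from_corners(src, dst)
for corner in src:
    print(corner, "->", np.round(apply_homography(H, corner), 3))
```

Each detected corner should land on the matching corner of the square, which is what gives the top-down view.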
Now that our puzzle has been located and isolated from the rest of the image, it is time to move onward to the second step of the process - digit extraction. This is a two-fold step.
First, we'll need to slice up our preprocessed image into sections that only contain one number.
Then, we'll feed those sections into a neural network that will classify the number in the image. This is done to transfer the information from the puzzle to a more workable format, such as an array.
def grid_slice(grid, i, j, width):
    """
    Returns a single section of the image based on its coordinates
    """
    i_start = i * width
    i_end = i_start + width
    j_start = j * width
    j_end = j_start + width
    section = grid[i_start:i_end, j_start:j_end]
    return section
Let's take a quick look at the results of our slicing function before moving onto the next step.
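h, w = cropped_puzzle.shape # NumPy shapes are (rows, columns), i.e. (height, width)
width = w // 9 # get width of each digit "section" (the warped puzzle is expected to be square)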
# print the first row of the puzzle as individual sections
for i in range(1):
    for j in range(9):
        digit_section = grid_slice(cropped_puzzle, i, j, width)
        plt.figure(figsize = (5, 5))
        plt.imshow(digit_section, cmap = "gray")
Great, that's working as expected! However, we can see that our section slicing is far from perfect, and many images contain a great deal of the "border" between sections. This could throw off our digit recognition step, so we want to remove that as best we can.
# print the first row of the puzzle as individual sections (with cleaning)
for i in range(1):
    for j in range(9):
        digit_section = grid_slice(cropped_puzzle, i, j, width)
        digit_section = digit_section.astype(float)/255
        digit_section = remove_extra(digit_section)
        plt.figure(figsize = (5, 5))
        plt.imshow(digit_section, cmap = "gray")
Great, our image looks a lot cleaner now!
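`remove_extra` is defined in the repository, so its details aren't shown here. The simplest version of the idea is to blank out a fixed margin around each cell, since that is where leftover grid lines tend to sit. The sketch below shows only that simplified approach; the name `crop_margin` and the 15% margin are chosen purely for illustration and are not the project's actual cleaning logic.

```python
import numpy as np


def crop_margin(cell, margin=0.15):
    """Illustrative only: blank out a fixed fraction around the edge of a cell image."""
    h, w = cell.shape
    dy, dx = int(h * margin), int(w * margin)
    cleaned = np.zeros_like(cell)
    # Keep only the central region, where the digit is expected to sit
    cleaned[dy:h - dy, dx:w - dx] = cell[dy:h - dy, dx:w - dx]
    return cleaned


# Made-up 20x20 cell: a bright grid line on the left and a "digit" blob in the middle
cell = np.zeros((20, 20))
cell[:, 0:2] = 1.0      # leftover grid line
cell[8:12, 8:12] = 1.0  # digit pixels
cleaned = crop_margin(cell)
print(cleaned[:, 0:2].sum(), cleaned[8:12, 8:12].sum())
```

A fixed margin is crude: it can clip digits that sit off-center, which is one reason a more careful cleaning step is worth having.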
We can now move on to digit classification using a neural network.
import cv2
import torch
from src.model_selection import get_model
from src.classifier_model import smart_classify
net = get_model("resnet50")
path_to_model = "resnet50_mnist.pth"
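# map_location lets weights saved on a GPU load on a CPU-only machine
net.load_state_dict(torch.load(path_to_model, map_location = "cpu"))
net = net.to(device = "cpu")
net.eval() # inference mode: uses stored batch-norm statistics and disables dropout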
digit_list = list()
for i in range(9):
    for j in range(9):
        digit_section = grid_slice(cropped_puzzle, i, j, width)
        digit_section = digit_section.astype(float)/255
        digit_section = remove_extra(digit_section)
        resized_digit = cv2.resize(digit_section, (28, 28), interpolation = cv2.INTER_AREA)
        digit_list.append(smart_classify(resized_digit, net, conf_threshold = 0.9))
print(digit_list)
['7', ' ', ' ', ' ', '8', ' ', ' ', ' ', ' ', '9', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', '8', '4', ' ', ' ', ' ', '7', '6', ' ', ' ', '6', '7', ' ', ' ', ' ', '5', '9', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', '7', ' ', ' ', ' ', '4', '3', ' ', ' ', '4', '3', ' ', ' ', ' ', '7', '2', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', '8', ' ', ' ', ' ', '6', '1', ' ', ' ', ' ', '3']
Excellent! Our network seems to be doing the job wonderfully. Now all that's left is to complete the puzzle and print the output!
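A note on the `conf_threshold = 0.9` argument: a digit classifier always produces some prediction, even for an empty cell, so a common trick is to accept a prediction only when the network is confident and otherwise treat the cell as blank. The sketch below shows that idea with a plain softmax; `classify_with_threshold` is a hypothetical stand-in, and the real `smart_classify` in the repository may do more than this.

```python
import math


def classify_with_threshold(logits, conf_threshold=0.9):
    """Illustrative only: return the predicted digit as a string, or " " if unsure.

    `logits` is a list of 10 raw network outputs, one per digit class 0-9.
    """
    # Softmax (shifted by the max for numerical stability) turns logits into probabilities
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    probs = [e / total for e in exps]

    best = max(range(len(probs)), key=probs.__getitem__)
    return str(best) if probs[best] >= conf_threshold else " "


confident = [0.1, 0.0, 0.2, 0.1, 0.0, 0.3, 0.1, 9.0, 0.2, 0.1]  # strong peak at class 7
unsure = [1.0, 1.1, 0.9, 1.0, 1.2, 1.0, 0.9, 1.1, 1.0, 1.0]     # nearly flat
print(repr(classify_with_threshold(confident)), repr(classify_with_threshold(unsure)))
```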
Now that we have all of our starting information in a format that Python can easily manipulate, we can move on to solving the puzzle and displaying the output.
Since this topic is pretty widely covered, I opted to scour the internet for resources on sudoku solving algorithms rather than write my own. Surprisingly, this was some of the simplest code across the entire project! I won't list the code here, but it can be found in the repository for this project on my GitHub if you'd like to take a look for yourself.
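For a sense of how simple these solvers can be, a common approach is backtracking: find an empty cell, try each digit that doesn't break a row, column, or box rule, recurse, and undo the choice if it leads to a dead end. The following is a minimal generic sketch of that idea, not the code used in this project, and the function names are illustrative.

```python
def find_empty(board):
    """Return (row, col) of the first cell containing 0, or None if the board is full."""
    for r in range(9):
        for c in range(9):
            if board[r][c] == 0:
                return r, c
    return None


def is_allowed(board, r, c, value):
    """Check that `value` does not already appear in the row, column, or 3x3 box."""
    if value in board[r]:
        return False
    if any(board[i][c] == value for i in range(9)):
        return False
    box_r, box_c = 3 * (r // 3), 3 * (c // 3)
    return all(
        board[i][j] != value
        for i in range(box_r, box_r + 3)
        for j in range(box_c, box_c + 3)
    )


def solve(board):
    """Fill `board` in place; return True if a solution was found."""
    cell = find_empty(board)
    if cell is None:
        return True  # no empty cells left, so the board is solved
    r, c = cell
    for value in range(1, 10):
        if is_allowed(board, r, c, value):
            board[r][c] = value
            if solve(board):
                return True
            board[r][c] = 0  # undo and try the next candidate
    return False


# Small demo: a valid grid built from a shifting pattern, with one cell per row blanked
solution = [[(3 * (r % 3) + r // 3 + c) % 9 + 1 for c in range(9)] for r in range(9)]
demo_board = [row[:] for row in solution]
for i in range(9):
    demo_board[i][i] = 0
solved = solve(demo_board)
print(solved, demo_board == solution)
```

Plain backtracking can be slow on puzzles with very few givens, but it is usually more than fast enough for newspaper-style puzzles.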
Let's get solving!
# convert empty spaces to 0s for displaying puzzle grid
def prettify(digit):
    if digit == " ":
        return 0
    else:
        return int(digit)
from src.sudoku import find_empty_grid, solve_array, valid, print_board
plt.figure(figsize = (10, 10))
plt.imshow(255 - cropped_puzzle, cmap = "gray")
plt.title("Original Puzzle Grid")
plt.show()
import numpy as np
digits_as_int = np.array([prettify(digit) for digit in digit_list])
digits_as_int = digits_as_int.reshape((9, 9))
digits_list = [list(row) for row in digits_as_int] # one inner list per puzzle row
print("Original Board in Numerical Representation")
print("\n")
print_board(digits_list)
Original Board in Numerical Representation

7 . . | . 8 . | . . .
9 . . | . . . | . . .
. . 8 | 4 . . | . 7 6
- - - - - - - - - - - - -
. . 6 | 7 . . | . 5 9
. . . | . . . | . . .
. 7 . | . . 4 | 3 . .
- - - - - - - - - - - - -
4 3 . | . . 7 | 2 . .
. . . | . . . | . . 8
. . . | 6 1 . | . . 3
# actual solving of the puzzle
completed_digit_list = [list(row) for row in digits_list] # copy each row so the original board is left untouched
solve_array(completed_digit_list)
True
print("Completed Puzzle Board")
print("\n")
print_board(completed_digit_list)
Completed Puzzle Board

7 1 2 | 3 8 6 | 5 9 4
9 6 4 | 1 7 5 | 8 3 2
3 5 8 | 4 2 9 | 1 7 6
- - - - - - - - - - - - -
2 8 6 | 7 3 1 | 4 5 9
1 4 3 | 9 5 8 | 6 2 7
5 7 9 | 2 6 4 | 3 8 1
- - - - - - - - - - - - -
4 3 1 | 8 9 7 | 2 6 5
6 2 7 | 5 4 3 | 9 1 8
8 9 5 | 6 1 2 | 7 4 3
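As a quick sanity check, it is easy to confirm that a completed board really obeys the sudoku rules: every row, column, and 3x3 box must contain the digits 1 to 9 exactly once. The helper below is a hypothetical addition, not part of `src.sudoku`, and the board is the completed puzzle printed above, typed in by hand.

```python
def is_valid_solution(board):
    """Illustrative only: True if every row, column, and 3x3 box holds 1-9 exactly once."""
    expected = set(range(1, 10))
    rows = [set(row) for row in board]
    cols = [set(board[r][c] for r in range(9)) for c in range(9)]
    boxes = [
        set(board[r][c] for r in range(br, br + 3) for c in range(bc, bc + 3))
        for br in (0, 3, 6)
        for bc in (0, 3, 6)
    ]
    return all(group == expected for group in rows + cols + boxes)


# The completed board printed above
completed = [
    [7, 1, 2, 3, 8, 6, 5, 9, 4],
    [9, 6, 4, 1, 7, 5, 8, 3, 2],
    [3, 5, 8, 4, 2, 9, 1, 7, 6],
    [2, 8, 6, 7, 3, 1, 4, 5, 9],
    [1, 4, 3, 9, 5, 8, 6, 2, 7],
    [5, 7, 9, 2, 6, 4, 3, 8, 1],
    [4, 3, 1, 8, 9, 7, 2, 6, 5],
    [6, 2, 7, 5, 4, 3, 9, 1, 8],
    [8, 9, 5, 6, 1, 2, 7, 4, 3],
]
print(is_valid_solution(completed))
```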
And there you have it! A sudoku puzzle, solved completely using Python. Thanks for reading!
All of the source code for this project can be found on my GitHub.