
MK-OCR

Table of Contents

  • What/Why
  • How
  • Requirements
  • Usage
  • License

What/Why

Well, it's code that uses machine learning to analyze Mario Kart race result screenshots and dump all the results into a spreadsheet. Basically it's a very silly, highly specific OCR routine.

I wrote it after playing a bunch of Mario Kart online and wondering how the game awards points after a race. It's obvious that you get more points for beating players with a better rating than you, and lose more points for losing to someone with a worse rating than you, but the exact rules aren't clear. I thought I might be able to work out the pattern if I could analyze the data. I haven't figured it out really (I'll probably post more about that another time), but maybe someone else wants to take a crack at it. If nothing else, this might be useful as an example of a simple machine learning task.

How

The stats shown after a race are an easy target for simple machine learning since the digits are very distinct and consistent in appearance.

The first step is to extract the relevant pixel data from the race results screen. We need:

  • Each digit for each player's current rating (called "VR", for Versus Rating)
  • Each digit for the number of points awarded to each player
  • The sign (+ or -) of the awarded points, designating a gain or a loss
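
In case it helps, here's a minimal sketch of what "extracting the pixel data" means in practice. The filename and crop coordinates below are placeholders, not the regions MKImageLoader actually uses:

from PIL import Image
import numpy as np

# Purely illustrative: placeholder filename and made-up box coordinates.
img = Image.open('images-redacted/example.jpg')
digit_box = (1050, 120, 1065, 143)                  # (left, upper, right, lower) for one digit
digit_pixels = np.asarray(img.crop(digit_box).convert('L'))
digit_pixels.shape                                  # (23, 15), like the arrays shown later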

This is a supervised classification problem, so the next step is to create a training data set. I did this already by manually labeling a bunch of the pixel data, for use in a classifier. This classification problem is not particularly hard and I suspect the choice of classifier doesn't make a whole lot of difference, but I chose a multi-class support vector machine (SVM). SVMs are really binary classifiers (distinguish A from B only), so scikit-learn implements a one-vs-rest scheme for the multi-class case. That just means that, for the classifier to succeed, a given digit's pixel data need to be separable from the pixel data of all the other digits.
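
For a rough sense of what that means in scikit-learn terms, here's a tiny sketch of a one-vs-rest SVM. It illustrates the idea rather than the project's exact model setup:

from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import SVC

# One binary SVM per class: each learns "this digit" vs. "every other digit".
# X_train would be flattened pixel vectors, y_train the digit labels (hypothetical names).
ovr_svm = OneVsRestClassifier(SVC(C=0.1))
# ovr_svm.fit(X_train, y_train); ovr_svm.predict(X_new)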

Rather than feed the raw pixels into the classifier, they're processed first by:

  • applying a sharpening filter, which makes each digit stand out against the background scenery (more on this below)
  • flattening and normalizing each digit's pixel region into a feature vector
  • reducing the dimensionality of those feature vectors with factor analysis

The model fitting process uses cross-validation to choose the number of components to retain from the factor analysis and to tune the regularization strength of the SVM. It then selects the best-performing model and exposes it to the user.
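
The tuning is handled inside MKImageClassifier, but a minimal sketch of that kind of search, assuming a FactorAnalysis-plus-SVM pipeline and hypothetical X and y arrays, looks something like this:

from sklearn.pipeline import Pipeline
from sklearn.decomposition import FactorAnalysis
from sklearn.svm import SVC
from sklearn.model_selection import GridSearchCV

# X: flattened pixel vectors, y: digit labels (hypothetical names).
pipe = Pipeline([('fa', FactorAnalysis()), ('svm', SVC())])
grid = GridSearchCV(
    pipe,
    param_grid={'fa__n_components': [4, 15, 20],   # how many factors to keep
                'svm__C': [0.01, 0.1]},            # SVM regularization strength
    cv=10, n_jobs=4)
# grid.fit(X, y); grid.best_estimator_ would then be the model to keep.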

Requirements

  • Mario Kart 8 Deluxe screenshots. I captured mine using a Switch Lite. I suspect a full-sized Switch would be fine, too, but I didn't try it. You can also just use my screenshots.
  • Python3 (I'm using 3.7.1)
  • PIL
  • NumPy
  • scikit-learn
  • pandas (and xlrd or openpyxl)
  • joblib
  • pigeon, only if you want to label your own training data for some reason

Usage

High-level functions

The high-level MKDataCompiler class is the easiest place to start, and it's very simple to use. Just give it training data and your screenshots, and get a pandas DataFrame.

from MKDataCompiler import MKDataCompiler
import glob
paths_to_labeled = {'vr_digits':  'vr_digits_labeled.xlsx',
                    'pts_digits': 'pts_digits_labeled.xlsx',
                    'pts_signs':  'pts_signs_labeled.xlsx'}
mkdc = MKDataCompiler( paths_to_labeled, n_jobs=4)
df = mkdc.compile( glob.glob( 'images-redacted/*.jpg'))
df.head(20).fillna('')
tuning vr_digits...done.
classification accuracy w/ 10-fold xval: 100.0% using ncomp=20 and C=0.1
tuning pts_digits...done.
classification accuracy w/ 10-fold xval: 100.0% using ncomp=15 and C=0.01
tuning pts_signs...done.
classification accuracy w/ 10-fold xval: 100.0% using ncomp=4 and C=0.1
              VR  points is user
race rank
1    1     13453      20
     2     10178      16
     3      2417      21
     4     10937       4
     5     10342      -2
     6     10243      -7       x
     7     10455     -16
     8     10010     -19
2    1     10311      27
     2     10119      21
     3     12154      17
     4     17105       3
     5      9956       9
     6     11916       2
     7     11778      -2
     8      9104      -1
     9     10265      -9       x
     10    12190     -18
     11    11686     -24
     12     1007      -2

That's it! It works well; the cross-validation results suggest the classifiers have perfect accuracy. It's not super surprising, but pretty cool.

The biggest trouble with this classification task has to do with detecting blank spaces that don't contain digits. Blanks can occur for a few reasons: if there are fewer than 12 players, if a "VR" rating has fewer than 5 digits, or if the points awarded are single-digit. In any of those scenarios, some random part of the game's background visuals ends up in the extracted data.

This issue made it tough at first to get perfect accuracy; there were always a couple of samples that got misclassified. My first solution was just to add more training data, which is why there is such a stupidly large number of labeled samples in my training set. That didn't really help, though, so I started playing around with pre-processing and found that applying a sharpening filter to the image made a big difference: it makes each digit stand out much better against the background scenery.
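
The filter itself is nothing fancy. Here's a hedged sketch of that kind of pre-processing with PIL; the exact filter and parameters used in the loader may differ, and the filename is a placeholder:

from PIL import Image, ImageFilter
import numpy as np

img = Image.open('images-redacted/example.jpg')          # placeholder filename
sharpened = img.filter(ImageFilter.SHARPEN)               # digit edges pop against busy backgrounds
pixels = np.asarray(sharpened.convert('L'), dtype=float)  # grayscale feature values for the classifier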

Using the individual classifiers

You don't need to use the high-level MKDataCompiler class. Using the classifiers directly might also be interesting:

from MKImageClassifier import MKImageClassifier
import matplotlib.pyplot as plt
import numpy as np

Train the SVM for the points-awarded digits

clf_pts_digits = MKImageClassifier( 'pts_digits_labeled.xlsx', 'pts_digits', n_splits=10)
_ = clf_pts_digits.tune( n_jobs=4)
tuning pts_digits...done.
classification accuracy w/ 10-fold xval: 100.0% using ncomp=15 and C=0.01

Plot the low-dimensional training data features (i.e. scores from factor analysis)

fig,axs = plt.subplots( ncols=3, figsize=(13,4))
for d in np.unique( clf_pts_digits.mdl.y_train):
    mask = (clf_pts_digits.mdl.y_train == d)
    _ = axs[0].scatter( clf_pts_digits.mdl.scores[mask,1], clf_pts_digits.mdl.scores[mask,3], s=2)
    _ = axs[0].set_xlabel( 'Factor 2', size=16)
    _ = axs[0].set_ylabel( 'Factor 4', size=16)
    
    _ = axs[1].scatter( clf_pts_digits.mdl.scores[mask,3], clf_pts_digits.mdl.scores[mask,6], s=2)
    _ = axs[1].set_xlabel( 'Factor 4', size=16)
    _ = axs[1].set_ylabel( 'Factor 7', size=16)
    
    _ = axs[2].scatter( clf_pts_digits.mdl.scores[mask,0], clf_pts_digits.mdl.scores[mask,6], s=2)
    _ = axs[2].set_xlabel( 'Factor 1', size=16)
    _ = axs[2].set_ylabel( 'Factor 7', size=16)
plt.tight_layout()

Each dot is the low-dimensional representation of one sample of digit pixels. Different samples of the same digit are shown in the same color, and they tend to cluster together (which is why the classifier can tell them apart). Cluster separation isn't great along every factor/dimension, so I selected a couple of the more interesting planes for plotting. The clusters appear elongated in the low-D space; there's probably a way to adjust the normalization to fix this, but it works well anyway, so it's Probably Fine as-is.

Fit the other two classifiers

clf_pts_signs = MKImageClassifier( 'pts_signs_labeled.xlsx', 'pts_signs', n_splits=10)
_ = clf_pts_signs.tune( n_jobs=4)

clf_vr_digits = MKImageClassifier( 'vr_digits_labeled.xlsx', 'vr_digits', n_splits=10)
_ = clf_vr_digits.tune( n_jobs=4)
tuning pts_signs...done.
classification accuracy w/ 10-fold xval: 100.0% using ncomp=4 and C=0.1
tuning vr_digits...done.
classification accuracy w/ 10-fold xval: 100.0% using ncomp=20 and C=0.1

Load a bunch of images to classify (using MKImageLoader, more on that below)...

import glob
from MKImageLoader import load_images
vr_digits, pts_digits, pts_signs, user_ranks = load_images( glob.glob( 'images-redacted\\*.jpg'))

...and run the loaded data through each classifier

vr_digits_hat  = clf_vr_digits.predict( vr_digits)
pts_digits_hat = clf_pts_digits.predict( pts_digits)
pts_signs_hat  = clf_pts_signs.predict( pts_signs)

Take a look at the predictions

img = 4
fig,axs = plt.subplots( nrows=12, ncols=8, figsize=(9,13))
for rank in range(12):
    for digit in range(5):
        axs[rank,digit+3].imshow( vr_digits[:,:,digit,rank,img])
        axs[rank,digit+3].set_title( vr_digits_hat[digit,rank,img], fontsize=26)
    
    _ = axs[rank,0].imshow( pts_signs[:,:,0,rank,img])
    _ = axs[rank,0].set_title( pts_signs_hat[0,rank,img], fontsize=26)
    
    _ = axs[rank,1].imshow( pts_digits[:,:,0,rank,img])
    _ = axs[rank,1].set_title( pts_digits_hat[0,rank,img], fontsize=26)
    
    _ = axs[rank,2].imshow( pts_digits[:,:,1,rank,img])
    _ = axs[rank,2].set_title( pts_digits_hat[1,rank,img], fontsize=26)
for ax in axs.flatten():
    ax.set_xticks([])
    ax.set_yticks([])
plt.tight_layout()

Work with image data directly

You can use the image loader class, MKImageLoader, if you want more direct access to the pixel data.

from MKImageLoader import MKImageLoader
mkil = MKImageLoader( 'images-redacted\\2020082317571100-16851BE00BC6068871FE49D98876D6C5.jpg')
mkil.main_region

View the winner's pixels

mkil.player_regions[0]

View your own pixels

The image loader auto-detects which place you came in, so you can track yourself easily. This is also important because the loader has to invert the colors of your own stats; otherwise they wouldn't be white-on-a-black-background like all the other players'.

mkil.player_regions[mkil.user_rank]
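
For reference, that inversion boils down to flipping 8-bit pixel values. Conceptually it's something like this (a sketch with a hypothetical region variable, not the actual MKImageLoader code):

import numpy as np

# For 8-bit pixel data, color inversion is just subtraction from the maximum value.
region = np.asarray(some_player_region, dtype=np.uint8)   # hypothetical image-like region
inverted = 255 - region                                   # highlighted row becomes white-on-black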

Extract every VR rating as a numpy array of image data (height-by-width-by-digit-by-rank)

vr_digits = mkil.get_vr_digits()
type( vr_digits), vr_digits.shape
(numpy.ndarray, (23, 15, 5, 12))

Inspect individual VR digits from numpy array

fig,axs = plt.subplots( ncols=5, figsize=(4,1))
for i,ax in enumerate( axs):
    ax.imshow( vr_digits[:,:,i,mkil.user_rank])

Do the same with the sign and digits of points awarded

pts_signs = mkil.get_pts_signs()
pts_digits = mkil.get_pts_digits()
pts_signs.shape, pts_digits.shape
((13, 13, 1, 12), (18, 12, 2, 12))
fig,axs = plt.subplots( ncols=3, figsize=(2.5,1))
_ = axs[0].imshow( pts_signs[:,:,0,mkil.user_rank])
_ = axs[1].imshow( pts_digits[:,:,0,mkil.user_rank])
_ = axs[2].imshow( pts_digits[:,:,1,mkil.user_rank])

Load a whole folder full of images

The resulting numpy arrays have essentially the same dimensions as above, except one more dimension is added to stack multiple images.

all_vr_digits, all_pts_digits, all_pts_signs, all_user_ranks = load_images( glob.glob( 'images-redacted\\*.jpg'))
type(all_vr_digits), all_vr_digits.shape # height,width,ndigit,nplayer,nimages
(numpy.ndarray, (23, 15, 5, 12, 747))
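
In other words, the per-image arrays (each 23 x 15 x 5 x 12 for the VR digits) are presumably just stacked along a new trailing axis, e.g.:

import numpy as np

# Illustrative only: three dummy per-image arrays stacked the way load_images' output is shaped.
per_image = [np.zeros((23, 15, 5, 12)) for _ in range(3)]
stacked = np.stack(per_image, axis=-1)
stacked.shape   # (23, 15, 5, 12, 3)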

This also returns a list of the user's ranks for each race (rank goes from 0-11, not 1-12)

type(all_user_ranks), len(all_user_ranks)
(list, 747)

Label your own training data

You don't need to do this, but it's here if you want to for some reason. You can label your own data any way you like, but I used pigeon and thought it was pretty convenient (a bare-bones pigeon example is sketched at the end of this section). The MKImageLabeler class is provided to help with this:

from MKImageLabeler import MKImageLabeler
mklbl = MKImageLabeler( glob.glob( 'images-redacted\\*.jpg'))

Label the digits for VR ratings (capped at 5 samples just for the example)

mklbl.label_vr_digits( 5)

Convert your labeled data to a pandas DataFrame

df_vr_digits = mklbl.vr_digits_as_df()
df_vr_digits
   rank  digit                                                path label
0     0      0  images-redacted\2020082314144200-16851BE00BC60...     1
1     0      1  images-redacted\2020082314144200-16851BE00BC60...     3
2     0      2  images-redacted\2020082314144200-16851BE00BC60...     4
3     0      3  images-redacted\2020082314144200-16851BE00BC60...     5
4     0      4  images-redacted\2020082314144200-16851BE00BC60...     3
df_vr_digits.to_excel( 'vr_digits_labeled_example.xlsx', index=False)

Repeat for the digits and signs of points awarded

mklbl.label_pts_digits( 5)

df_pts_digits = mklbl.pts_digits_as_df()
df_pts_digits
   rank  digit                                                path label
0     0      0  images-redacted\2020082314144200-16851BE00BC60...     2
1     0      1  images-redacted\2020082314144200-16851BE00BC60...     0
2     1      0  images-redacted\2020082314144200-16851BE00BC60...     1
3     1      1  images-redacted\2020082314144200-16851BE00BC60...     6
4     2      0  images-redacted\2020082314144200-16851BE00BC60...     2
mklbl.label_pts_signs( 5)

df_pts_signs = mklbl.pts_signs_as_df()
df_pts_signs
   rank  digit                                                path label
0     0      0  images-redacted\2020082314144200-16851BE00BC60...     +
1     1      0  images-redacted\2020082314144200-16851BE00BC60...     +
2     2      0  images-redacted\2020082314144200-16851BE00BC60...     +
3     3      0  images-redacted\2020082314144200-16851BE00BC60...     +
4     4      0  images-redacted\2020082314144200-16851BE00BC60...     -
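
And, as promised, here's what driving pigeon directly looks like if you'd rather skip MKImageLabeler. The list of crops and the 'blank' label are hypothetical; this is just the general pattern, not MKImageLabeler's internals:

from pigeon import annotate
from PIL import Image
from IPython.display import display

# digit_crops: a hypothetical list of 2-D uint8 pixel arrays to label
annotations = annotate(
    digit_crops,
    options=[str(d) for d in range(10)] + ['blank'],
    display_fn=lambda crop: display(Image.fromarray(crop)))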

License

TBD

About

image processing + machine learning pipeline to extract race stats from screenshots of Mario Kart. "But Why?"
