GitHub - andrewsimplebet/expected_goals_deep_dive: A tutorial on using cross validation and calibrating predictions for expected goals models in soccer

Expected Goals Deep Dive:

In this repository, I will build upon the brilliant work done by David Sumpter and @clekraus on expected goals as part of the Friends of Tracking series.

David's tutorials: https://www.youtube.com/channel/UCUBFJYcag8j2rm_9HkrrA7w/videos

David's code: https://github.com/Friends-of-Tracking-Data-FoTD/SoccermaticsForPython

@cleKraus' code: https://github.com/CleKraus/soccer_analytics

The repo will consist of 4 Jupyter notebooks, plus a notebook for reading in the Wyscout data and hand-constructing features that I believe will be helpful in building the model.
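As a rough illustration of the kind of hand-constructed features an expected goals model tends to lean on, the sketch below computes shot distance and the angle subtended by the goal mouth. It assumes shot coordinates given as percentages of the pitch (0 to 100 on both axes, attacking toward x = 100), a 105 m by 68 m pitch, and a 7.32 m goal. The function name and pitch dimensions are illustrative assumptions, not the code from the data notebook.

```python
import math

PITCH_LENGTH_M = 105.0  # assumed pitch length
PITCH_WIDTH_M = 68.0    # assumed pitch width
GOAL_WIDTH_M = 7.32     # standard goal width


def shot_features(x_pct, y_pct):
    """Return (distance_m, angle_rad) for a shot at percentage coordinates."""
    # Offsets in metres from the centre of the goal being attacked.
    dx = (100.0 - x_pct) * PITCH_LENGTH_M / 100.0
    dy = abs(y_pct - 50.0) * PITCH_WIDTH_M / 100.0
    distance = math.hypot(dx, dy)
    # Angle between the two lines from the shot location to each post.
    # atan2 keeps the result in [0, pi] even for shots very close to goal.
    angle = math.atan2(
        GOAL_WIDTH_M * dx,
        dx**2 + dy**2 - (GOAL_WIDTH_M / 2) ** 2,
    )
    return distance, angle


# A central shot 11 m from goal (roughly the penalty spot under these assumptions).
distance_m, angle_rad = shot_features(100.0 - 1100.0 / 105.0, 50.0)
print(f"distance={distance_m:.2f} m, angle={math.degrees(angle_rad):.1f} deg")
```

Shots that are closer and more central get a shorter distance and a wider angle, which is why these two features usually carry a lot of the signal.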

Notebook 1: Building a Random Forest Model for Expected Goals: In this notebook, we examine how to avoid overfitting and other common mistakes, learn how to cross-validate our models, and determine the best hyperparameters. Finally, a short section at the end looks at which features are important in a Random Forest.

Link to Notebook 1: https://github.com/andrewsimplebet/expected_goals_deep_dive/blob/master/1.%20Random%20Forest%20Cross%20Validation%20And%20Hyperparameter%20Tuning.ipynb
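For readers who want the general shape of this workflow before opening the notebook, here is a minimal sketch using scikit-learn. It runs on synthetic data rather than the Wyscout shots, and the parameter grid, fold count, and scoring metric are assumptions chosen for illustration, not the settings used in Notebook 1.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, StratifiedKFold

# Synthetic stand-in for shot data: a rare positive class, like goals among shots.
X, y = make_classification(
    n_samples=600,
    n_features=5,
    n_informative=3,
    weights=[0.9, 0.1],
    random_state=0,
)

# Hypothetical, deliberately tiny grid so the example runs quickly.
param_grid = {"max_depth": [3, 6], "min_samples_leaf": [5, 25]}

# Stratified folds keep the goal rate similar in every split.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

search = GridSearchCV(
    RandomForestClassifier(n_estimators=50, random_state=0),
    param_grid,
    scoring="neg_log_loss",  # scores predicted probabilities, not just labels
    cv=cv,
)
search.fit(X, y)

best_params = search.best_params_
best_log_loss = -search.best_score_  # undo the sign flip used by the scorer
importances = search.best_estimator_.feature_importances_

print("best params:", best_params)
print("cross-validated log loss:", round(best_log_loss, 4))
print("feature importances:", importances.round(3))
```

Scoring on log loss rather than accuracy is a design choice here: an expected goals model is judged on its probabilities, and with so few goals a model that always predicts "no goal" would still look accurate.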

Notebook 2: Comparing Logistic Regression and Random Forest For Expected Goals: In this notebook, we take a look at building a logistic regression model, with an emphasis on comparing its strengths, weaknesses, and pain points to those of a Random Forest. We then take a look at why the traditional data science methodologies used thus far are not sufficient for building a strong expected goals model, and preview what will come in the rest of the series.

Link to Notebook 2: https://github.com/andrewsimplebet/expected_goals_deep_dive/blob/master/2.%20Basic%20Logistic%20Regression%20and%20Comparison%20To%20Random%20Forests.ipynb
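One simple way to put the two model families side by side is to score both on the same cross-validation folds. The sketch below does that with scikit-learn on synthetic data; the model settings and the choice of log loss as the metric are assumptions for illustration and may differ from what Notebook 2 does.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Synthetic stand-in for shot data with a rare positive class.
X, y = make_classification(
    n_samples=600,
    n_features=5,
    n_informative=3,
    weights=[0.9, 0.1],
    random_state=0,
)

# Hypothetical settings; neither model is tuned here.
models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(
        n_estimators=50, min_samples_leaf=10, random_state=0
    ),
}

# Reusing one splitter means both models see exactly the same folds.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

scores = {}
for name, model in models.items():
    fold_scores = cross_val_score(model, X, y, scoring="neg_log_loss", cv=cv)
    scores[name] = -fold_scores.mean()  # lower log loss is better

for name, score in scores.items():
    print(f"{name}: log loss = {score:.4f}")
```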

Notebook 3: Calibrating Expected Goals Models: When our model spits out that a shot has a 30% chance of going in, how do we know that's accurate? In this notebook, we examine how to determine whether our model's predicted probabilities are calibrated, and look at tools to correct the model if they are not.

Link to Notebook 3: https://github.com/andrewsimplebet/expected_goals_deep_dive/blob/master/3.%20Calibrating%20Expected%20Goals%20Models.ipynb
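To make the idea of calibration concrete, the sketch below builds a small reliability table in plain Python: predictions are grouped into probability bins, and each bin's average prediction is compared with the share of shots in that bin that were actually goals. The function name, bin count, and the ten example shots are made up for illustration and are not taken from Notebook 3.

```python
def reliability_table(probs, outcomes, n_bins=5):
    """Compare mean predicted probability with observed goal rate per bin."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, outcomes):
        # Equal-width bins over [0, 1]; a prediction of exactly 1.0 goes in the last bin.
        idx = min(int(p * n_bins), n_bins - 1)
        bins[idx].append((p, y))

    rows = []
    for idx, members in enumerate(bins):
        if not members:
            continue  # nothing to compare in an empty bin
        mean_pred = sum(p for p, _ in members) / len(members)
        observed = sum(y for _, y in members) / len(members)
        rows.append((idx, len(members), mean_pred, observed))
    return rows


# Made-up predictions and outcomes (1 = goal), for illustration only.
probs = [0.05, 0.10, 0.10, 0.15, 0.30, 0.30, 0.35, 0.70, 0.75, 0.90]
outcomes = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1]

for idx, n, mean_pred, observed in reliability_table(probs, outcomes):
    print(f"bin {idx}: n={n} mean_pred={mean_pred:.2f} observed={observed:.2f}")
```

A well-calibrated model has `mean_pred` close to `observed` in every bin. scikit-learn offers `calibration_curve` for the same diagnostic and `CalibratedClassifierCV` (sigmoid or isotonic) as one way to recalibrate a model.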

Notebook 4: Sanity Checking Our Expected Goals Model and Final Thoughts: In this notebook, we look at some specific shots to get a better sense of our expected goals models, and discuss the drawbacks of our model and ways to improve it in the future.

Link to Notebook 4: https://github.com/andrewsimplebet/expected_goals_deep_dive/blob/master/4.%20Sanity%20Checking%20Our%20Expected%20Goals%20Models%20And%20Final%20Thoughts.ipynb

Data: For information on how to download the Wyscout data, please refer to David's video: Setting Up To Do Data Science. In addition, please download the player and matches data, as these tables are used in various spots in the series. Please put the Wyscout data in the "Wyscout" folder of the repo to ensure that the code runs properly.

Special thanks to Wyscout for the data and FCPython for the code!

Files in this repo:

Wyscout (folder)
.gitattributes
0. Setting Our Data Up.ipynb
1. Random Forest Cross Validation And Hyperparameter Tuning.ipynb
2. Basic Logistic Regression and Comparison To Random Forests.ipynb
3. Calibrating Expected Goals Models.ipynb
4. Sanity Checking Our Expected Goals Models And Final Thoughts.ipynb
FCPython.py
README.md
