Journey through the world of data science from beginner to advanced concepts and methodologies. In this repository, I will walk you through DS skills, hints, tips, and tricks all with Sports relevant examples!

ant-vessicchio/learn_data_science_through_sports

Learn Data Science Through Sports

I will provide walkthroughs, notebooks, and code snippets weekly using some of my favorite examples from the sports world. We will go over concepts such as Python fundamentals, SQL fundamentals, Data Visualization, Data Cleaning, Data Mining, Algorithm development, automation, machine learning, AI, and more!

Love sports? Want to get into the world of data science? You've come to the right place!

Please connect with me on LinkedIn: https://www.linkedin.com/in/anthony-vessicchio/

The official Subreddit of this course: https://www.reddit.com/r/sportsanddatascience/

And check out my personal website: https://ant-vessicchio.github.io/

Before we Begin

The following curriculum is hosted in a Jupyter notebook environment. If you are not familiar with Jupyter or do not have it installed, follow this tutorial: https://www.dataquest.io/blog/jupyter-notebook-tutorial/

All of my notebooks in the curriculum are .ipynb files (Jupyter notebook files). Once you have this environment set up, you can easily follow along with the entire curriculum!
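
For the curious: an .ipynb file is plain JSON under the hood, a list of cells that are each tagged as markdown or code. The sketch below is only an illustration, not part of the curriculum. It builds a tiny hypothetical notebook in memory (the cell contents are made up) and counts its cells by type using nothing but the standard library.

```python
import json

# A minimal, hypothetical notebook in the nbformat 4 layout.
# The cell contents are invented for illustration only.
notebook_text = json.dumps({
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Warm-up"]},
        {"cell_type": "code", "metadata": {}, "execution_count": None,
         "outputs": [], "source": ["print('Play ball!')"]},
    ],
})


def count_cells(raw_json):
    """Return a dict mapping each cell type to how many cells have it."""
    notebook = json.loads(raw_json)
    counts = {}
    for cell in notebook["cells"]:
        cell_type = cell["cell_type"]
        counts[cell_type] = counts.get(cell_type, 0) + 1
    return counts


cell_counts = count_cells(notebook_text)
print(cell_counts)
```

To try this on a real notebook, read the file's text with `open(path, encoding="utf-8").read()` and pass it to `count_cells`.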

The Curriculum

(Read through each of my descriptions and run each cell yourself. Try to make some modifications to each cell and enter some of your own content ideas!)

Rookie 1.0 (Your first steps in Python)

  1. Introduction to Python

https://github.com/ant-vessicchio/learn_data_science_through_sports/blob/main/Introduction_To_Python.ipynb

  2. Variables and Naming

https://github.com/ant-vessicchio/learn_data_science_through_sports/blob/main/Variables%20and%20Naming.ipynb

  3. Data Types

https://github.com/ant-vessicchio/learn_data_science_through_sports/blob/main/Data_Types.ipynb

  4. Data Types Advanced

https://github.com/ant-vessicchio/learn_data_science_through_sports/blob/main/Data_Types_Advanced.ipynb

Rookie 2.0 (Let's start with some logic!)

  1. If/Else Statements

https://github.com/ant-vessicchio/learn_data_science_through_sports/blob/main/If_Else.ipynb

  2. While Loops

https://github.com/ant-vessicchio/learn_data_science_through_sports/blob/main/While_Loops.ipynb

  3. For Loops

https://github.com/ant-vessicchio/learn_data_science_through_sports/blob/main/For_Loops.ipynb

  4. Python Functions

https://github.com/ant-vessicchio/learn_data_science_through_sports/blob/main/Python_Functions.ipynb

  5. String Formatting

https://github.com/ant-vessicchio/learn_data_science_through_sports/blob/main/String_Formatting.ipynb

Rookie Challenge 1 (NFL Combine)

https://github.com/ant-vessicchio/learn_data_science_through_sports/blob/main/Beginner_Challenge.ipynb

Rookie 3.0 (Let's "Look" at Some Data using Matplotlib)

  1. Plotting

https://github.com/ant-vessicchio/learn_data_science_through_sports/blob/main/Plotting_in_Matplotlib.ipynb

  2. Scatterplots and Bar Graphs

https://github.com/ant-vessicchio/learn_data_science_through_sports/blob/main/Scatterplots_and_Bar_Graphs.ipynb

  3. Histograms and Pie Charts

https://github.com/ant-vessicchio/learn_data_science_through_sports/blob/main/Histograms_and_Pie_Charts.ipynb

  4. Rookie Final Challenge (Telling a story about a baseball roster)

https://github.com/ant-vessicchio/learn_data_science_through_sports/blob/main/Rookie_Final_Challenge.ipynb

Congrats on completing the Rookie Section!

By now, you can tell how Python can be a great foundational tool in your Sports Data Science journey. The next portion of the curriculum is going to be very "learn by example" based. I feel that this is not only an engaging way to learn, but also the best way to get the creative juices flowing!

I will be introducing new concepts, modules, and methodologies in the following examples without specifically dedicating a learning section to them. With repetition and "doing", these concepts will naturally stick! You will also begin to develop an analytical mindset and become familiar with some very common data science processes and workflows. I will also include a list of skills you will learn under each project name!

Novice Section (Let's begin on some real life projects)

(MLB Databank from 1871-2015)

The following Novice Projects will be centered around a databank composed of several dataset files containing various information on baseball players, teams and games from 1871 to 2015. The first part will serve as an "exploratory" look into the data and teach you some valuable workflow skills when dealing with a dataset you've never seen before. As we move further into this section, the Parts will become more difficult (but also more useful, applicable, and creative!)

Skills Used: Ingesting datasets from Kaggle, reading CSV files into DataFrames, exploring DataFrames, .loc and .iloc (accessing DataFrame elements), extracting data from specific columns, merging tables, intro to cleaning data, simple visualizations from a cleaned dataset, feature engineering, thinking critically about your data, connecting the industry/problem statement to tell a relevant story.
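
To give a feel for a few of these skills before you open the notebooks, here is a minimal sketch using pandas. It is not taken from the projects themselves: the two in-memory "files", their column names, and every value in them are hypothetical stand-ins, not rows from the actual databank.

```python
import io

import pandas as pd

# Hypothetical stand-ins for two CSV files; column names and values are made up.
players_csv = io.StringIO(
    "player_id,name\n"
    "p1,Player One\n"
    "p2,Player Two\n"
)
batting_csv = io.StringIO(
    "player_id,year,at_bats,hits\n"
    "p1,2001,400,120\n"
    "p2,2001,500,125\n"
)

# Reading CSV data into DataFrames.
players = pd.read_csv(players_csv)
batting = pd.read_csv(batting_csv)

# .iloc selects by integer position; .loc selects by label or boolean mask.
first_row = batting.iloc[0]
busy_hitters = batting.loc[batting["at_bats"] >= 450, ["player_id", "hits"]]

# Merging the two tables on their shared key column.
merged = batting.merge(players, on="player_id", how="left")

# Simple feature engineering: batting average = hits / at-bats.
merged["batting_avg"] = merged["hits"] / merged["at_bats"]

print(merged[["name", "year", "batting_avg"]])
```

With real files you would pass a file path to `pd.read_csv` instead of an `io.StringIO` object; the rest of the workflow stays the same.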

Novice Project 1.0 (Exploration)

https://github.com/ant-vessicchio/learn_data_science_through_sports/tree/main/Baseball_Databank_Exploration

Novice Project 2.0 (Was Babe Ruth Really That Good?)

https://github.com/ant-vessicchio/learn_data_science_through_sports/blob/main/Baseball_Databank_Exploration/Babe_Ruth_Evaluation.ipynb

Pro Section (Your Intro to Machine Learning, Everyone's Favorite Buzzword)

In this section I will be doing various examples and exercises illustrating different machine learning techniques and methodologies. I believe it is important to understand at least some of the math and fundamentals behind the models, so I will try to balance the technical terminology with the sports application in each example. Although this is a BIG step up in your Data Science journey, I believe this is where it becomes the most fun, as it opens your toolbox up to take on so many different challenges. So sit tight and let's get it!

Pro Project 1.0 (Your First Decision Tree: Will a player get drafted or not based on their NFL Combine performance?)

https://github.com/ant-vessicchio/learn_data_science_through_sports/blob/main/NFL_Combine/NFL_Combine_1st_Decision_Tree.ipynb
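
As a small taste of the math behind a decision tree, the sketch below shows how a single split can be chosen with Gini impurity, one common splitting criterion (and the default in scikit-learn's `DecisionTreeClassifier`). This is an illustration in plain Python, not the notebook's code: the helper names `gini` and `best_threshold` are hypothetical, and the 40-yard dash times and draft outcomes are made up.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels: 1 - sum of squared class shares."""
    if not labels:
        return 0.0
    p_drafted = sum(labels) / len(labels)
    return 1.0 - p_drafted ** 2 - (1.0 - p_drafted) ** 2


def best_threshold(values, labels):
    """Try each midpoint between sorted unique values.

    Returns (threshold, weighted impurity) for the split with the lowest
    weighted Gini impurity across the two resulting groups.
    """
    unique = sorted(set(values))
    best = (None, float("inf"))
    for low, high in zip(unique, unique[1:]):
        threshold = (low + high) / 2
        left = [y for x, y in zip(values, labels) if x <= threshold]
        right = [y for x, y in zip(values, labels) if x > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best[1]:
            best = (threshold, score)
    return best


# Hypothetical 40-yard dash times (seconds) and draft outcomes (1 = drafted).
forty_times = [4.4, 4.5, 4.6, 4.9, 5.0, 5.2]
drafted = [1, 1, 1, 0, 0, 0]

threshold, impurity = best_threshold(forty_times, drafted)
print(threshold, impurity)
```

A full decision tree repeats this search over every feature, then applies it again inside each resulting group until a stopping rule is reached.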

All Star Section (Putting it all together, advanced machine learning, visualizations, and more!)
