Journey through the world of data science, from beginner to advanced concepts and methodologies. In this repository, I will walk you through data science (DS) skills, hints, tips, and tricks, all with sports-relevant examples!
I will provide walkthroughs, notebooks, and code snippets weekly using some of my favorite examples from the sports world. We will go over concepts such as Python fundamentals, SQL fundamentals, data visualization, data cleaning, data mining, algorithm development, automation, machine learning, AI, and more!
Love sports? Want to get into the world of data science? You've come to the right place!
Please connect with me on LinkedIn: https://www.linkedin.com/in/anthony-vessicchio/
The official Subreddit of this course: https://www.reddit.com/r/sportsanddatascience/
And check out my personal website: https://ant-vessicchio.github.io/
The following curriculum is hosted in a Jupyter notebook environment. If you are not familiar with Jupyter or do not have it installed, follow this tutorial: https://www.dataquest.io/blog/jupyter-notebook-tutorial/
All of my notebooks in the curriculum are .ipynb files (Jupyter notebook files). Once you have this environment set up, you can easily follow along with the entire curriculum!
(Read through each of my descriptions and run each cell yourself. Try to make some modifications to each cell and enter some of your own content ideas!)
- Introduction to Python
- Variables and Naming
- Data Types
- Data Types Advanced
- If/Else Statements
- While Loops
- For Loops
- Python Functions
- String Formatting
- Plotting
- Scatterplots and Bar Graphs
- Histograms and Pie Charts
- Rookie Final Challenge (Telling a story about a baseball roster)
By now, you can see that Python is a great foundational tool in your sports data science journey. The next portion of the curriculum is going to be very "learn by example" based. I feel that this is not only an engaging way to learn, but also the best way to get the creative juices flowing!
I will be introducing new concepts, modules, and methodologies in the following examples without specifically dedicating a learning section to them. With repetition and "doing", these concepts will naturally stick! You will also begin to develop an analytical mindset and become familiar with some very common data science processes and workflows. I will also include a list of skills you will learn under each project name!
The following Novice Projects will be centered around a databank composed of several dataset files containing various information on baseball players, teams, and games from 1871 to 2015. The first part will serve as an "exploratory" look into the data and teach you some valuable workflow skills for dealing with a dataset you've never seen before. As we move further into this section, the parts will become more difficult (but also more useful, applicable, and creative!).
Skills Used: Ingesting datasets from Kaggle, reading CSV files into DataFrames, exploring DataFrames, .loc and .iloc (accessing DataFrame elements), extracting data from specific columns, merging tables, intro to cleaning data, simple visualizations from a cleaned dataset, feature engineering, thinking critically about your data, and connecting the industry/problem statement to tell a relevant story.
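To give a feel for a few of these skills before you open the notebooks, here is a minimal sketch using pandas. It is not taken from the course notebooks: the two tiny tables, their column names (`player_id`, `team_id`, `at_bats`, `hits`), and every value in them are made up for illustration and only stand in for the real databank files.

```python
import io

import pandas as pd

# Hypothetical, made-up rows standing in for two CSV files from the databank.
# The column names and values are assumptions for illustration only.
players_csv = io.StringIO(
    "player_id,name,team_id\n"
    "p1,Player One,T1\n"
    "p2,Player Two,T2\n"
    "p3,Player Three,T1\n"
)
batting_csv = io.StringIO(
    "player_id,at_bats,hits\n"
    "p1,100,30\n"
    "p2,80,20\n"
    "p3,50,\n"  # missing hits value, to practice cleaning
)

# Reading CSV data into DataFrames
players = pd.read_csv(players_csv)
batting = pd.read_csv(batting_csv)

# Exploring: peek at the first rows and count missing values per column
print(players.head())
print(batting.isna().sum())

# .iloc selects by integer position; .loc selects by label or boolean mask
first_row = players.iloc[0]
team_t1 = players.loc[players["team_id"] == "T1", ["player_id", "name"]]

# Merging tables on a shared key column
merged = players.merge(batting, on="player_id", how="left")

# Intro to cleaning: drop rows where the hits value is missing
cleaned = merged.dropna(subset=["hits"]).copy()

# Feature engineering: batting average = hits / at-bats
cleaned["batting_avg"] = cleaned["hits"] / cleaned["at_bats"]
print(cleaned[["name", "batting_avg"]])
```

With real Kaggle files you would pass a file path to `pd.read_csv` instead of the in-memory strings used here; the rest of the workflow stays the same.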
In this section, I will work through various examples and exercises illustrating different machine learning techniques and methodologies. I believe it is important to understand at least some of the math and fundamentals behind the models, so I will try to balance the technical terminology with the sports application in each example. Although this is a BIG step up in your data science journey, I believe this is where it becomes the most fun, as it opens up your toolbox to take on so many different challenges. So sit tight and let's get it!