- To gain a deeper understanding of player play styles. Many NBA players play hybrid positions, such as a mix of point guard and small forward. We want to describe their play style mathematically, e.g. as 30% point guard and 70% small forward.
- This project was spun out of my broader efforts to predict fantasy basketball points. I found that the player labels assigned by the league only capture the position the player plays most often, so valuable information about the player is lost when these labels are used in other prediction tasks. I want a probability distribution over all labels instead.
- ~77% top-1 accuracy, ~91% top-2 accuracy, ~99% top-3 accuracy
- Languages: Python, Bash
- Packages: scikit-learn, pandas, Flask
- Technologies: Docker
- Models tried: Decision Tree (baseline), XGBoost, Random Forest, Multinomial Logistic Regression, DBSCAN with PCA, Kernel SVM (Gaussian RBF, polynomial of degree two and three)
- Built web scrapers to collect NBA and ESPN data
- Exploratory data analysis to examine the structure of the data and identify which models are good candidates
- Deployed as API via Flask in Docker
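
To make the "probability distribution over all labels" and the top-k accuracy figures concrete, here is a minimal sketch in plain Python. It is not the project's code: the five position labels, the `describe` and `top_k_accuracy` helpers, and the example probabilities are all made up for illustration. In practice the per-player probabilities would presumably come from a fitted classifier (for instance a scikit-learn model's `predict_proba`).

```python
# Hypothetical label set; the project's actual labels are an assumption here.
POSITIONS = ["PG", "SG", "SF", "PF", "C"]


def describe(probs, min_share=0.05):
    """Render one player's distribution as a readable position mix."""
    pairs = sorted(zip(probs, POSITIONS), reverse=True)
    return ", ".join(f"{p:.0%} {pos}" for p, pos in pairs if p >= min_share)


def top_k_accuracy(prob_rows, true_labels, k):
    """Fraction of players whose true label is among the k most probable."""
    hits = 0
    for probs, truth in zip(prob_rows, true_labels):
        # Indices of the labels, ordered from most to least probable.
        ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
        top_k = {POSITIONS[i] for i in ranked[:k]}
        hits += truth in top_k
    return hits / len(true_labels)


# Made-up probabilities for three hypothetical players (illustration only).
example_probs = [
    [0.30, 0.00, 0.70, 0.00, 0.00],
    [0.10, 0.55, 0.30, 0.05, 0.00],
    [0.00, 0.05, 0.15, 0.45, 0.35],
]
# League-style single labels for the same three players (also made up).
example_labels = ["PG", "SG", "C"]

print(describe(example_probs[0]))  # 70% SF, 30% PG
for k in (1, 2, 3):
    print(f"top-{k} accuracy: {top_k_accuracy(example_probs, example_labels, k):.2f}")
```

The first hypothetical player is labeled a point guard but is mostly a small forward in this toy distribution, so the top-1 prediction counts as a miss while top-2 counts as a hit. That is the same effect that separates the top-1 and top-2 figures reported above.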