Modelling with Tidymodels and Parsnip - A Tidy Approach to a Classification Problem

DiegoUsaiUK/Classification_Churn_with_Parsnip

Modelling with Tidymodels and Parsnip

A Tidy Approach to a Classification Problem

22 June 2019

I recently completed the Business Analysis With R online course, which focuses on applied data and business science with R and introduced me to a couple of new modelling concepts and approaches. One that especially captured my attention is parsnip and its attempt to implement a unified modelling and analysis interface (similar to Python's scikit-learn) to seamlessly access several modelling platforms in R.

parsnip is the brainchild of RStudio's Max Kuhn (of caret fame) and Davis Vaughan and forms part of tidymodels, a growing ensemble of tools to explore and iterate modelling tasks that shares a common philosophy (and a few libraries) with the tidyverse.

Although the packages are at different stages of development, I have decided to take tidymodels "for a spin", so to speak, and build and run a "tidy" modelling workflow to tackle a classification problem. My aim is to show how easy it is to fit a simple logistic regression with R's glm and then quickly switch to a cross-validated random forest using the ranger engine by changing only a few lines of code.
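To give a flavour of that switch, here is a minimal sketch rather than the code from the article. It assumes the parsnip and ranger packages are installed and uses the built-in mtcars data as a stand-in for the churn dataset; the `cars` object, the formula, and the `trees = 200` setting are illustrative choices only.

```r
library(parsnip)

# Toy stand-in data (assumption): predict transmission type from weight and horsepower
cars <- mtcars
cars$am <- factor(cars$am, labels = c("automatic", "manual"))

# Logistic regression, estimated with the glm engine
logit_fit <- logistic_reg(mode = "classification") %>%
  set_engine("glm") %>%
  fit(am ~ wt + hp, data = cars)

# Random forest, estimated with the ranger engine:
# only the model specification and the engine name change
rf_fit <- rand_forest(mode = "classification", trees = 200) %>%
  set_engine("ranger") %>%
  fit(am ~ wt + hp, data = cars)

# Both fits share the same predict() interface and return a tibble of classes
logit_pred <- predict(logit_fit, new_data = head(cars))
rf_pred    <- predict(rf_fit, new_data = head(cars))
```

The formula, the data, and the prediction call stay the same across both models, which is the point of the unified interface.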

For this post in particular I'm focusing on four libraries from the tidymodels suite: rsample for data sampling and cross-validation, recipes for data preprocessing, parsnip for model setup and estimation, and yardstick for model assessment.
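As a rough illustration of how the four packages fit together, the sketch below chains them on the built-in mtcars data. It is not the article's workflow: the dataset, the 75/25 split, the seed, and the single normalisation step are all assumptions made to keep the example short, and cross-validation (for instance with rsample's `vfold_cv()`) is left out.

```r
library(rsample)
library(recipes)
library(parsnip)
library(yardstick)

set.seed(123)  # arbitrary seed so the split is reproducible

# Toy stand-in data (assumption): a two-class outcome
cars <- mtcars[, c("am", "wt", "hp")]
cars$am <- factor(cars$am, labels = c("automatic", "manual"))

# rsample: hold out a test set
split <- initial_split(cars, prop = 0.75)
train <- training(split)
test  <- testing(split)

# recipes: define and estimate the preprocessing on the training data only
rec <- recipe(am ~ wt + hp, data = train) %>%
  step_normalize(all_numeric_predictors()) %>%
  prep()

train_baked <- bake(rec, new_data = NULL)  # processed training set
test_baked  <- bake(rec, new_data = test)  # same steps applied to the test set

# parsnip: specify and fit the model
model_fit <- logistic_reg(mode = "classification") %>%
  set_engine("glm") %>%
  fit(am ~ ., data = train_baked)

# yardstick: compare predicted and observed classes on the test set
results <- cbind(test_baked, predict(model_fit, new_data = test_baked))
acc <- accuracy(results, truth = am, estimate = .pred_class)
```

With so few rows the accuracy figure itself means little; the sketch is only meant to show where each package enters the workflow.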

Links

You can find the final article on my website

I've also published the article on Towards Data Science
