Capstone Project Repository for DS-535: Advanced Data Mining
Project Goal: Identify the factors that drive demand for bike share rentals, build statistical models, and use those models to predict rentals from the available information.
Moshood F. Yussuf
Ride-sharing companies like Uber and Lyft run successful business models that give customers convenient, affordable, and efficient transportation without the hassle of owning or operating a vehicle. However, as the number of automobiles grows, car-based ride sharing is not efficient enough, especially in crowded, busy areas such as city downtowns. Bike sharing is therefore a brilliant idea: it gives people another short-range transportation option that lets them travel without worrying about being stuck in traffic, and perhaps enjoy the city view or even get a workout at the same time. In fact, bike sharing programs in the United States started about 15 years before Uber's ride share program started.
The dataset comes from Kaggle. It contains 5000 observations and 12 variables. The variables are:
- "datetime", containing the date and hour of the observation in timestamp format;
- "season", containing integers 1 to 4 representing "Winter", "Spring", "Summer", "Fall";
- "holiday", containing a Boolean flag (1 or 0) indicating whether the day of the observation is a holiday;
- "workingday", containing a Boolean flag (1 or 0) indicating whether the day of the observation is a working day;
- "weather", containing integers 1 to 4 representing four categories of weather conditions:
  1. Clear or cloudy,
  2. Mist,
  3. Light rain or snow,
  4. Heavy rain, snow, or even worse weather;
- "temp", containing the temperature at the given time;
- "atemp", containing the "feels-like" temperature at the given time;
- "humidity", containing the relative humidity level at the given time, on a scale of 1 to 100;
- "windspeed", containing the wind speed in mph (miles per hour);
- "casual", containing the count of non-registered user rentals, across all stations;
- "registered", containing the count of registered user rentals, across all stations;
- "count", containing the total count of rentals at the given hour, across all stations.
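
The variable list above can be turned into a small loading routine. The following is a minimal sketch rather than the project's actual code: it uses only the Python standard library, and the helper names (`parse_row`, `load_rentals`), the timestamp layout, and the integer/float split between columns are assumptions. The two sample rows are made up for illustration and are not taken from the Kaggle file.

```python
import csv
import io
from datetime import datetime

# Code-to-label mappings, following the variable descriptions above.
SEASONS = {1: "Winter", 2: "Spring", 3: "Summer", 4: "Fall"}
WEATHER = {
    1: "Clear or cloudy",
    2: "Mist",
    3: "Light rain or snow",
    4: "Heavy rain, snow, or worse",
}

# Assumed column types; adjust if the actual file stores them differently.
INT_COLUMNS = ("humidity", "casual", "registered", "count")
FLOAT_COLUMNS = ("temp", "atemp", "windspeed")


def parse_row(row):
    """Convert one CSV row (a dict of strings) into typed Python values."""
    parsed = {
        # Assumed timestamp layout: "YYYY-MM-DD HH:MM:SS".
        "datetime": datetime.strptime(row["datetime"], "%Y-%m-%d %H:%M:%S"),
        "season": SEASONS[int(row["season"])],
        "holiday": row["holiday"] == "1",
        "workingday": row["workingday"] == "1",
        "weather": WEATHER[int(row["weather"])],
    }
    for name in INT_COLUMNS:
        parsed[name] = int(row[name])
    for name in FLOAT_COLUMNS:
        parsed[name] = float(row[name])
    return parsed


def load_rentals(handle):
    """Read rental records from an open CSV file or file-like object."""
    return [parse_row(row) for row in csv.DictReader(handle)]


# Two made-up rows for illustration only (not real observations).
SAMPLE_CSV = """\
datetime,season,holiday,workingday,weather,temp,atemp,humidity,windspeed,casual,registered,count
2020-06-01 08:00:00,3,0,1,1,22.5,25.0,60,8.0,10,90,100
2020-06-01 09:00:00,3,0,1,2,23.0,25.5,58,10.5,15,70,85
"""

rentals = load_rentals(io.StringIO(SAMPLE_CSV))

# "count" is described as the total number of rentals, so it should equal
# casual + registered in every row; this check flags rows where it does not.
mismatches = [r for r in rentals if r["casual"] + r["registered"] != r["count"]]
print(f"{len(rentals)} rows loaded, {len(mismatches)} with inconsistent totals")
```

To run the same check on the real data, pass an open file instead of the in-memory sample, for example `load_rentals(open("bike_share.csv", newline=""))`, where the file name is a placeholder for wherever the Kaggle CSV is saved.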