Data analysis of the development of the Linux operating system, based on exploring its Git repository history.
Updated Dec 11, 2018 - Jupyter Notebook
Cruise Reviews - NLP - Text Classification
Project No. 4 in the Udacity Data Analyst Nanodegree Winter 2019-2020. Using Python, we’ll gather data from a variety of sources, assess its quality and tidiness, then clean it. We’ll document our wrangling efforts in a Jupyter Notebook, plus showcase them through analyses and visualizations using Python and SQL.
This repository creates an ETL pipeline which takes in movie data from Kaggle and Wikipedia. The ETL_create_database.ipynb file contains all the code necessary to perform all three steps.
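The three ETL steps named above (extract, transform, load) can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not the code from ETL_create_database.ipynb: the sample CSV, the column names, the `movies` table, and the in-memory SQLite target are all assumptions made for the example.

```python
import csv
import io
import sqlite3

# Hypothetical raw input standing in for a downloaded movie file.
RAW_CSV = "title,year,budget\nExample Film ,1999,1000000\nAnother Film,,250000\n"


def extract(text):
    """Extract: parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: drop rows without a year, trim titles, cast numbers."""
    cleaned = []
    for row in rows:
        if not row["year"]:
            continue  # skip rows missing a release year
        cleaned.append((row["title"].strip(), int(row["year"]), int(row["budget"])))
    return cleaned


def load(rows, conn):
    """Load: write the cleaned rows into a SQLite table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS movies (title TEXT, year INTEGER, budget INTEGER)"
    )
    conn.executemany("INSERT INTO movies VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")  # assumed target; a real pipeline might use a server database
load(transform(extract(RAW_CSV)), conn)
stored = conn.execute("SELECT title, year, budget FROM movies").fetchall()
```

Keeping each step as its own function makes it possible to rerun or test the transform logic without touching the source files or the database.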
Wrangling and analyzing data from the WeRateDogs Twitter account, which rates people's dogs with a humorous comment about each dog.
Data wrangling project from Udacity's professional Data Analysis Nanodegree
Customer segmentation is one of the crucial analyses for a business's marketing strategy. Starting from a raw customer purchasing history, this project uses Python to clean, explore, and prepare the data for further analysis. I applied two different approaches to customer segmentation: traditional and RFM.
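RFM describes each customer by recency (days since the last purchase), frequency (number of purchases), and monetary value (total spend). The sketch below shows one way to derive those three values from a purchase history. It is a rough illustration, not the repository's code: the function name `rfm_table`, the tuple layout, and the sample purchases are all hypothetical.

```python
from collections import defaultdict
from datetime import date


def rfm_table(purchases, as_of):
    """Return {customer_id: (recency_days, frequency, monetary)}.

    `purchases` is an iterable of (customer_id, purchase_date, amount).
    """
    last_seen = {}
    frequency = defaultdict(int)
    monetary = defaultdict(float)
    for customer_id, day, amount in purchases:
        # Recency depends only on the most recent purchase date.
        if customer_id not in last_seen or day > last_seen[customer_id]:
            last_seen[customer_id] = day
        frequency[customer_id] += 1
        monetary[customer_id] += amount
    return {
        cid: ((as_of - last_seen[cid]).days, frequency[cid], monetary[cid])
        for cid in last_seen
    }


# Hypothetical sample data, not taken from the project's dataset.
purchases = [
    ("A", date(2020, 1, 5), 20.0),
    ("A", date(2020, 3, 1), 35.0),
    ("B", date(2019, 11, 20), 120.0),
]
table = rfm_table(purchases, as_of=date(2020, 3, 31))
```

With this data, customer A is recent and frequent but low-spending, while customer B is a single large purchase from months earlier. Segments are typically formed by binning each of the three values and grouping customers by the resulting scores.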
Energy management, grid dependability, and the distribution of sustainable resources all depend heavily on understanding and forecasting energy demand trends. This project is extremely important in a number of areas.
R / Shiny - Clean, merge, and visualize a BWIN datamart in Shiny.
Udacity Data Analyst Nanodegree - Project IV
Dirty data project completed as part of Data Analysis course 🎓
Repo to show some basic techniques of data wrangling
This repository provides a Jupyter notebook covering basic data cleaning and exploratory data analysis on a CSV file that was scraped from a real estate website in Belgium.
The Play Store apps data has enormous potential to drive app-making businesses to success. Actionable insights can be drawn for developers to work on and capture the Android market!
Analyze Diwali sales data using the Pandas, NumPy, Matplotlib, and Seaborn libraries to improve customer experience and sales.
End-to-end project: customer churn prediction using the Random Forest classifier algorithm with 97% accuracy. It covers pre-processing, EDA and visualization, fitting the data to the algorithm, and hyper-parameter tuning to reduce TN and FN values so the model performs well on new data. Finally, the model is deployed as a Streamlit web app.
TL;DR - I did this project to practice the Data Cleaning Through SQL exercise, which was created and published by Shuki Molk. I'm adding a link to the exercise here
This project analyzes roughly 380,000 Kickstarter projects. It leads you through a simple data exploration in Excel to reveal interesting insights about Kickstarter projects and which attributes matter when examining the success (or failure) of a given project.