GitHub - Emmanuel-Ncube/Udacity-Data-Analyst-Nanodegree: Projects in fulfillment of the Udacity Data Analyst Nanodegree

Udacity Data Analyst Nanodegree

This repository serves as a showcase of my skills, a platform to share my projects, and a way to track my progress in Data Analytics and Data Science-related topics.

Installation

This project uses Python 3 and is designed to be completed in Jupyter Notebook. The Anaconda distribution is highly recommended for installing Python, since it bundles Jupyter Notebook along with most of the required libraries; any that are missing can be installed separately with conda or pip. The following libraries are used in this project:

NumPy
pandas
Matplotlib
Seaborn
tweepy
json (part of the Python standard library)
requests
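As a quick sanity check before opening the notebooks, the sketch below reports which of the listed libraries cannot be found in the current environment. It is only an illustration: the helper name `missing_modules` is hypothetical, and it assumes the last entry in the list refers to the `requests` HTTP library.

```python
from importlib.util import find_spec

# Import names for the libraries listed above. "json" ships with Python itself,
# and "requests" is assumed to be the HTTP library the list refers to.
REQUIRED = ["numpy", "pandas", "matplotlib", "seaborn", "tweepy", "json", "requests"]


def missing_modules(names):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    if missing:
        print("Missing libraries:", ", ".join(missing))
    else:
        print("All listed libraries are available.")
```

Anything reported as missing can then be installed with conda or pip before running the project notebooks.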

Portfolio Projects

This section briefly describes each data analytics project.

Project 1 - Investigate A Dataset

This project analyzes a dataset from Kaggle containing information about 10,866 movies collected from The Movie Database (TMDb), including user ratings, popularity, revenue, budget, cast, and genres.

Project 2 - Wrangle and Analyze Data

The dataset I wrangled, analyzed, and visualized is the tweet archive of Twitter user @dog_rates, also known as WeRateDogs. WeRateDogs is a Twitter account that rates people's dogs with a humorous comment about the dog. These ratings almost always have a denominator of 10. The numerators, though? Almost always greater than 10. 11/10, 12/10, 13/10, etc. Why? Because "they're good dogs Brent." WeRateDogs has over 4 million followers and has received international media coverage.

Goal: Wrangle WeRateDogs Twitter data to create interesting and trustworthy analyses and visualizations. The Twitter archive is great, but it only contains very basic tweet information. Additional gathering, then assessing and cleaning is required for "Wow!"-worthy analyses and visualizations.
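One cleaning step this kind of data typically needs is pulling the numerator and denominator out of the tweet text. The sketch below shows one possible approach with a regular expression. It is not necessarily the method used in the project notebook; the function name `extract_rating` is hypothetical and the sample tweets are made up for illustration.

```python
import re

# Matches "numerator/denominator", allowing a decimal numerator such as 13.5/10.
RATING_PATTERN = re.compile(r"(\d+(?:\.\d+)?)/(\d+)")


def extract_rating(text):
    """Return (numerator, denominator) for the first rating-like match, or None."""
    match = RATING_PATTERN.search(text)
    if match is None:
        return None
    return float(match.group(1)), int(match.group(2))


if __name__ == "__main__":
    # Made-up sample tweets, not taken from the real archive.
    samples = [
        "This is Bella. She is a very good girl. 13/10 would pet",
        "Meet Max. Loves naps. 13.5/10",
        "No rating in this tweet",
    ]
    for tweet in samples:
        print(extract_rating(tweet), "<-", tweet)
```

Taking the first match is a simplification: a tweet that contains another fraction-like string before the rating (a date, for example) would be parsed incorrectly, so extracted ratings still need to be assessed by eye.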

Project 3 - Communicate Data Findings

The Prosper loan dataset contains 113,937 loans with 81 variables on each loan, including loan amount, Prosper ratings, estimated loss, Prosper and credit scores, borrower rate (or interest rate), current loan status, borrower income, and many others. A data dictionary explains the features in the dataset. The project is not expected to explore every variable; instead, it focuses on roughly 10-15 visualizations built from a subset of them.
