This is the final project I completed for the Big Data Expert Program at U-TAD in September 2017. It uses the following technologies: Apache Spark v2.2.0, Python v2.7.3, Jupyter Notebook (PySpark), HDFS, Hive, Cloudera Impala, Cloudera HUE and Tableau.
python
linux
big-data
spark
apache-spark
hive
hadoop
jupyter
analytics
functional-programming
impala
cloudera
jupyter-notebook
pyspark
hdfs
mapreduce
hue
tableau
datawarehouse
datamart
Updated May 4, 2018 - Jupyter Notebook