The goal of this project is to build a Docker cluster that provides access to Hadoop, HDFS, Hive, PySpark, Sqoop, Airflow, Kafka, Flume, Postgres, Cassandra, Hue, Zeppelin, Kadmin, Kafka Control Center, and pgAdmin. The cluster is intended solely for use in a development environment. Do not use it to run production workloads.
airflow
kafka
spark
cassandra
hive
hadoop
schema-registry
postgresql
python3
pyspark
hdfs
flume
hue
zeppelin
pgadmin4
kadmin
sqoop
conda-environment
control-center
Updated Feb 27, 2023 - Shell