A prototype big data platform project, containing the source code for the book Big Data Platform Architecture and Prototype
Updated Aug 12, 2020 (Java)
Cloudera_Material: Study material to help people prepare for the Cloudera CCA Spark and Hadoop Developer Exam (CCA175). Feel free to collaborate.
This is our first project using Apache Spark. We downloaded loan, customer credit card, and transaction datasets from Kaggle and cleaned the data. Then, using new tools and technologies…
Big Data Engineering Capstone Project. Tech stack: Linux, MySQL, Sqoop, HDFS, Hive, Impala, SparkSQL, SparkML, Git
Apache Sqoop tutorial
MapReduce job development, RDD programming, medical data management, sales analysis, and efficient data integration for big data analysis. Spark: big data processing, Sqoop integration, and Spark Structured Streaming for real-time data.
Built a data pipeline by creating tables in a MySQL database, ingesting them into Hadoop for data warehousing, and building HiveQL views. The Hive views on a Linux VM were connected to Power BI on Windows to create visualizations.
Created a utility, implemented in Bash, to import data from traditional databases into HDFS using Sqoop
Real-Time & Batch Data Processing Pipeline
ETL pipeline for Spar Nord Bank to analyze the refill frequency of ATMs across Europe
This repository consists of the source code and the screenshots of the output. This project uses Hive, SQL, and Sqoop to perform analysis.
Heart data analysis
Import data into Hive using Sqoop.
Build a data pipeline (using Hadoop HDFS, Sqoop, and HiveQL) for data analysis from ambiguous and incomplete instructions.
A query system for a hypothetical bank scenario