A tool for scale and performance testing of HDFS with a specific focus on the NameNode.
Updated Jan 11, 2024 - Java
Data Engineering Project with Hadoop HDFS and Kafka
Big Data Engineering capstone project. Tech stack: Linux, MySQL, Sqoop, HDFS, Hive, Impala, SparkSQL, SparkML, Git
MapReduce Python Example
Simulation of a Hadoop distributed file system
News Sentiment Analysis using ETL pipeline
A Hadoop word-count job that retrieves tweets and runs a MapReduce word counter for sentiment analysis
WIP: hdfs/libhdfs drop-in replacements without Java
Yelp data analysis using HBase Java API and building a QA application
The Ararajuba script identifies whether Optimized Row Columnar (ORC) files in the Hadoop Distributed File System (HDFS) are corrupted. To do so, it uses the count method and analyzes schema differences between the tables.
This project aims to address Egypt's energy challenges by leveraging data-driven solutions. With increasing demand from urban centers and industries, conventional approaches such as random power cuts have proven ineffective. To tackle this issue, we are adopting a proactive strategy grounded in data analytics.
This is an old repository from my archive. I hope it might help me in the near future.
Big Data project: a web client for HDFS that works in the terminal and can manipulate both local and Hadoop storage
MapReduce implementation of the KNN algorithm in Apache Hadoop, based on the Kaggle Titanic dataset