A tool for scale and performance testing of HDFS with a specific focus on the NameNode.
Updated Jan 11, 2024 - Java
A basic introductory example of using Hadoop's MapReduce libraries to load and analyse large datasets, in this case a US patent dataset sourced from https://www.nber.org/research/data/us-patents
This repository contains a simple Hadoop-like (MapReduce) distributed computing platform implemented in Java. It is extended from a course project at UIUC that was awarded best Java implementation, and it is open-sourced for reference.
PageRank algorithm implemented in Java using the MapReduce framework
MapReduce in Cluster.
Titanic data analysis with Hadoop
Code samples, summaries, cheatsheets and other study material for Hadoop MapReduce and Apache Spark
Twitter data analysis using Hadoop (HDFS), Flume, MapReduce and Hive. Sentiment analysis is also performed on tweets related to the Indian election using the AFINN dictionary.
Cloud-based SQL engine using Spark, where data is accessible as a JDBC/ODBC data source via Spark ThriftServer.