A tool for scale and performance testing of HDFS with a specific focus on the NameNode.
Updated Jan 11, 2024 - Java
Cloud-based SQL engine using Spark, where data is accessible as a JDBC/ODBC data source via the Spark ThriftServer.
This repository contains a simple Hadoop-like (MapReduce) distributed computing platform implemented in Java. It extends a course project at UIUC that was awarded best Java implementation, and it is open-sourced for reference.
Code samples, summaries, cheatsheets and other study material for Hadoop MapReduce and Apache Spark
Twitter data analysis using Hadoop (HDFS), Flume, MapReduce, and Hive. Sentiment analysis of tweets related to the Indian election is also performed using the AFINN dictionary.
PageRank algorithm implemented in Java using the MapReduce framework
A basic introductory example of using Hadoop's MapReduce libraries to load and analyse large datasets, in this case a US patent dataset sourced from https://www.nber.org/research/data/us-patents
MapReduce on a cluster.
Titanic data analysis with Hadoop