MapReduce Analysis on Amazon Food Review Dataset (Big-Data)
Updated Aug 6, 2017
Assignments belonging to the course Supercomputing for Big Data (ET4310) at TU Delft
Lab to familiarize yourself with Amazon Elastic MapReduce (EMR)
My AWS Playground
Completed a big data project using Hadoop, HBase, and Sqoop to ingest, process, and analyze a large dataset of taxi rides on an AWS EMR cluster. Developed MapReduce jobs to perform a variety of tasks, and exported the results of each job to an RDS instance for visualization and analysis.
This project analyzes the Amazon reviews dataset provided by AWS.
This repository contains the projects that I did for the Data Engineering Nanodegree by Udacity.
We build an ETL pipeline using Airflow that downloads data from an AWS S3 bucket, runs a Spark/Spark SQL job on the downloaded data to produce a cleaned-up dataset of orders that missed their delivery deadline, and then uploads the cleaned-up dataset back to the same S3 bucket in a folder primed for higher-level analytics.
Uses Apache Spark for ETL to prepare data, builds a machine learning model for natural language processing (NLP) classification, and deploys the model in a Gradio web application for interactive use.
Generic Python library for provisioning EMR clusters from YAML config files (Configuration as Code).
Code and documentation for the demonstration example of the real-time bushfire alerting with the Complex Event Processing (CEP) in Apache Flink on Amazon EMR and a simulated IoT sensor network as described on the AWS Big Data Blog: Real-time bushfire alerting with Complex Event Processing in Apache Flink on Amazon EMR and IoT sensor network.
Goal: Develop a machine learning application in a distributed environment using AWS services with Spark.
Data Science and Engineering project - Programming for Big Data @ Simon Fraser University (SFU)
This repo contains all the assignments and project work for the Engineering Big Data Systems coursework.