krithivas91/Flight-Data-Analysis-AWS-MapReduce

Description

Hadoop was configured in fully distributed mode on AWS EC2 instances. In this project, I deployed a MapReduce application on AWS that takes 22 years of flight data as input and analyzes it. I used the Oozie workflow engine on top of Hadoop to coordinate the workflow.

More information can be found in the project report.
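
To illustrate the kind of analysis a MapReduce job like this performs, here is a minimal plain-Python sketch of a map step and a reduce step that estimate each airline's probability of being on schedule. It is not the project's code and does not use the Hadoop API. The column names `UniqueCarrier` and `ArrDelay`, the `NA` marker for missing values, and the 10-minute on-schedule threshold are all assumptions made for this example.

```python
from collections import defaultdict

# Assumption: an arrival delay of at most this many minutes counts as "on schedule".
ON_TIME_THRESHOLD_MIN = 10


def map_on_time(record):
    """Map step: emit (carrier, 1) for an on-schedule flight, else (carrier, 0)."""
    carrier = record["UniqueCarrier"]  # assumed column name
    delay = record["ArrDelay"]         # assumed column name
    if delay in ("", "NA"):            # skip rows with no recorded delay
        return
    yield carrier, 1 if float(delay) <= ON_TIME_THRESHOLD_MIN else 0


def reduce_on_time(pairs):
    """Reduce step: on-schedule flights divided by total flights, per carrier."""
    on_time = defaultdict(int)
    total = defaultdict(int)
    for carrier, flag in pairs:
        on_time[carrier] += flag
        total[carrier] += 1
    return {carrier: on_time[carrier] / total[carrier] for carrier in total}


def on_time_probability(records):
    """Run the map step over every record, then reduce the emitted pairs."""
    pairs = (pair for record in records for pair in map_on_time(record))
    return reduce_on_time(pairs)
```

Calling `on_time_probability` on an iterable of row dictionaries (for example, rows read with `csv.DictReader`) returns a dictionary from carrier code to a probability between 0 and 1. On a real cluster, the grouping by carrier would be handled by the framework's shuffle phase rather than by an in-memory dictionary.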


OUTPUT
—————————————————————————————————————————————
Airlines

Highest probability of airlines on schedule	0.0
HA	0.7494274
AQ	0.62106043
DH	0.6090783

Lowest probability of airlines on schedule	0.0
PI	0.32256523
PS	0.4177755
HP	0.45224547


Airports

Airports with Longest taxi-in	0.0
CKB	183.0
LNY	88.01384
MTH	14.65625

Airports with Shortest taxi-in	0.0
BFF	2.0
PVU	2.5
DUT	2.5461006

Airports with Longest taxi-out	0.0
ACK	32.30936
SOP	26.157728
BQN	25.006365

Airports with Shortest taxi-out	0.0
MKK	4.5104165
KSM	6.1116753
VIS out	6.2633705


Cancellation

The most common reason for flight cancellations	0.0
Carrier	317971.0
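
The rankings in the output above come down to two operations: averaging a value per key, then picking the three largest or smallest averages. The sketch below shows one way to do this in plain Python. It is an illustration under stated assumptions, not the project's implementation. The function names are hypothetical, and the mapping of cancellation codes `A` to `D` to reasons is assumed from the commonly published airline on-time dataset rather than confirmed by this project.

```python
import heapq
from collections import Counter, defaultdict


def average_by_key(pairs):
    """Average the values emitted for each key, e.g. (airport, taxi_minutes)."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for key, value in pairs:
        sums[key] += value
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}


def top_and_bottom(averages, n=3):
    """Return the n largest and the n smallest (key, average) pairs."""
    items = averages.items()
    longest = heapq.nlargest(n, items, key=lambda kv: kv[1])
    shortest = heapq.nsmallest(n, items, key=lambda kv: kv[1])
    return longest, shortest


# Assumption: cancellation codes follow this mapping.
CANCELLATION_REASONS = {"A": "Carrier", "B": "Weather", "C": "NAS", "D": "Security"}


def most_common_cancellation(codes):
    """Return (reason, count) for the most frequent known code, or None."""
    counts = Counter(
        CANCELLATION_REASONS[code] for code in codes if code in CANCELLATION_REASONS
    )
    return counts.most_common(1)[0] if counts else None
```

`average_by_key` takes the pairs a map step would emit, and `top_and_bottom` turns the averages into the "longest" and "shortest" lists. Rows with an empty or unknown cancellation code are ignored by `most_common_cancellation`.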
