anandaverma/pigeon

Pigeon - An Event-Driven Job Control Framework for Hadoop to Help Chain Multiple MapReduce Jobs

Overview

Hadoop MapReduce is a parallel computation framework for processing large, distributed data sets. Users often want to chain multiple MapReduce jobs to accomplish a complex task. Such tasks are usually data-driven, with data funneled through a sequence of jobs. This project implements a distributed notification system for Hadoop that helps chain multiple MapReduce jobs based on events occurring in a Hadoop cluster.

## Prerequisites

To run the Pigeon publisher and subscriber, you need to download [ActiveMQ](http://activemq.apache.org/download.html). We used the ActiveMQ 5.8.0 release to build and test our application.

## Steps to Run Pigeon

  • Run the ActiveMQ service on all the machines:

~path-to-ActiveMQ-bin-dir~$ ./activemq start

  • Run the subscriber on all machines to listen for event notifications:

~path-to-pigeon-bin-dir~$ ./pigeon.sh subscribe <topicname> <jobscript> <tcp://host:port>

  • Run the publisher on the machine that publishes the event notification message:

~path-to-pigeon-bin-dir~$ ./pigeon.sh publish <topicname> <eventmessage> <tcp://host:port>
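
As a rough illustration of how the subscribe and publish commands above fit together, the following sketch fills in the placeholders and prints the resulting command lines instead of running them. The topic name, job script path, and event message are hypothetical values chosen for the example, and the broker URL assumes ActiveMQ is listening on its default OpenWire port (61616) on the local machine. The helper functions `subscribe_cmd` and `publish_cmd` are not part of Pigeon; they only assemble the usage strings shown above.

```shell
#!/bin/sh
# Sketch only: every value below is a hypothetical placeholder.
TOPIC="job1-done"                    # assumed topic name
JOB_SCRIPT="./run_job2.sh"           # assumed script that launches the next MapReduce job
BROKER_URL="tcp://localhost:61616"   # assumes ActiveMQ's default OpenWire port
EVENT_MESSAGE="job1-finished"        # assumed event message text

# Assemble the subscriber command line: topic, job script, broker URL.
subscribe_cmd() {
  printf './pigeon.sh subscribe %s %s %s\n' "$1" "$2" "$3"
}

# Assemble the publisher command line: topic, event message, broker URL.
publish_cmd() {
  printf './pigeon.sh publish %s %s %s\n' "$1" "$2" "$3"
}

# Print the commands so they can be reviewed before being run by hand
# from the Pigeon bin directory.
subscribe_cmd "$TOPIC" "$JOB_SCRIPT" "$BROKER_URL"
publish_cmd "$TOPIC" "$EVENT_MESSAGE" "$BROKER_URL"
```

Both commands use the same topic name and broker URL; in this sketch that shared topic is what ties the publisher's event to the subscriber's job script.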
