Repository housing the artifacts to deploy the Hadoop clusters on Azure for my learning.
Structure of the repository.
-
artifacts
contains the datasets or any other required artifacts for the learning.
-
deploy
contains the code required to deploy the HDInsight cluster on Azure (evolving).
-
notebooks
contains vairous jupyter notebooks.
-
scripts
collection of scripts required for bootstrapping HDInsight cluster.
-
tests
contains Infrastructure tests to validate the cluster is ready.
- Follow the Infra as Code approach to setting up the clusters.
- Add inra tests to make sure the cluster is operational.