This repository has been archived by the owner on Oct 11, 2023. It is now read-only.

Prerequisites and setup steps

In this section of the tutorial on pixel-level land classification from aerial imagery, we describe the steps needed to create an Azure Batch AI cluster with access to all necessary files to complete this tutorial. Once you have completed this section, you'll be ready to train a model from scratch using our sample data and provided scripts.

Prerequisites

Azure Subscription

This tutorial will require an Azure subscription with sufficient quota to create a storage account and two NC6 (single-GPU) VMs as a Batch AI cluster. This tutorial will likely take two hours to complete on the first pass.

Files from this repository

You will need local copies of the .json files included in this git repository. We recommend that you download or clone the full repository locally, but you can also download each file individually. (If you choose that approach, be careful to download the "raw" files -- it's common to accidentally save GitHub's HTML previews of the files instead.)

Utilities

This tutorial requires two programs: the Azure CLI (invoked as az) and AzCopy (invoked as azcopy).

Both programs are available for Windows and Linux. If you prefer not to install them locally, you may instead provision an Azure Data Science Virtual Machine. (Both programs are pre-installed on these VMs and available on the system path.) The commands included in this tutorial were written and tested in Windows, but readers will likely find it straightforward to adapt them for Linux.

Once these programs are installed, open a command line interface and check that the binaries are available on the system path by issuing the commands below:

az
azcopy
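The same check can be scripted. The following is a minimal Python sketch, not part of the original tutorial, that reports which of the two binaries cannot be found on the system path. The helper name missing_tools is hypothetical.

```python
import shutil

# Binary names taken from the two commands above.
REQUIRED_TOOLS = ["az", "azcopy"]


def missing_tools(names):
    """Return the names that cannot be located on the system path."""
    return [name for name in names if shutil.which(name) is None]


missing = missing_tools(REQUIRED_TOOLS)
if missing:
    print("Not found on the system path:", ", ".join(missing))
else:
    print("All required programs are available.")
```

shutil.which performs the same lookup that the command line interface does, so an empty result means both commands should run from any directory.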

Prepare to use the Azure CLI

In your command line interface, execute the following command. The output will contain a URL that you must visit and a code that you must enter there to authenticate your login.

az login

You will now indicate which Azure subscription should be charged for the resources you create in this tutorial. List all Azure subscriptions associated with your account:

az account list

Identify the subscription of interest in the JSON-formatted output. Copy its "id" value into the bracketed expression in the command below, then issue the command to set the current subscription.

az account set -s [subscription id]
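If your account has many subscriptions, picking the right "id" by eye can be error-prone. The sketch below is an illustration only: it assumes the output of az account list has been captured as a JSON string and that each entry carries "id" and "name" fields. The function name and the sample subscriptions are hypothetical.

```python
import json


def find_subscription_id(account_list_json, subscription_name):
    """Return the "id" of the single subscription whose "name" matches."""
    subscriptions = json.loads(account_list_json)
    matches = [s["id"] for s in subscriptions if s.get("name") == subscription_name]
    if len(matches) != 1:
        raise ValueError(
            f"expected exactly one subscription named {subscription_name!r}, "
            f"found {len(matches)}"
        )
    return matches[0]


# Hand-written sample shaped like the CLI output (values are placeholders).
sample_output = """[
  {"id": "00000000-0000-0000-0000-000000000001", "name": "Example Subscription A"},
  {"id": "00000000-0000-0000-0000-000000000002", "name": "Example Subscription B"}
]"""

print(find_subscription_id(sample_output, "Example Subscription B"))
```

The returned value is what you would paste into the bracketed expression of the az account set command.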

Register the Batch and Batch AI resource providers, then grant Batch AI "Network Contributor" access on your subscription using the following commands. Before executing the last command, copy your subscription's id into its bracketed expression.

az provider register -n Microsoft.Batch
az provider register -n Microsoft.BatchAI
az role assignment create --scope /subscriptions/[subscription id] --role "Network Contributor" --assignee 9fcb3732-5f52-4135-8c08-9d4bbaf203ea

It may take ~10 minutes for the provider registration process to complete. You may proceed with the tutorial in the meantime.
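To find out whether registration has finished, one option is to inspect the provider's state. The sketch below assumes you have captured the JSON printed by az provider show -n Microsoft.BatchAI (a command not used elsewhere in this tutorial) and that it contains a "registrationState" field whose value becomes "Registered" on completion. The function name and sample strings are hypothetical.

```python
import json


def provider_is_registered(provider_show_json):
    """Return True when the provider JSON reports the "Registered" state."""
    provider = json.loads(provider_show_json)
    # Assumption: the CLI reports progress in a "registrationState" field.
    return provider.get("registrationState") == "Registered"


# Hand-written samples shaped like the CLI output (not real output).
still_registering = '{"namespace": "Microsoft.BatchAI", "registrationState": "Registering"}'
registered = '{"namespace": "Microsoft.BatchAI", "registrationState": "Registered"}'

print(provider_is_registered(still_registering))
print(provider_is_registered(registered))
```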

Create the necessary Azure resources

Create an Azure resource group

We will create all resources for this tutorial in a single resource group, so that you may easily delete them when finished. Choose a name for your resource group and insert it into the bracketed expression below, then issue the commands:

set AZURE_RESOURCE_GROUP=[resource group name]
az group create --name %AZURE_RESOURCE_GROUP% --location eastus

Create an Azure storage account and populate it with files

We will create an Azure storage account to hold training and evaluation data, scripts, and output files. Choose a unique name for this storage account and insert it into the bracketed expression below. Then, issue the following commands to create your storage account and store its randomly-assigned access key:

set STORAGE_ACCOUNT_NAME=[storage account name]
az storage account create --name %STORAGE_ACCOUNT_NAME% --sku Standard_LRS --resource-group %AZURE_RESOURCE_GROUP% --location eastus
for /f "delims=" %a in ('az storage account keys list --account-name %STORAGE_ACCOUNT_NAME% --resource-group %AZURE_RESOURCE_GROUP% --query "[0].value"') do @set STORAGE_ACCOUNT_KEY=%a
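Storage account creation fails if the chosen name breaks Azure's naming rules: names must be 3 to 24 characters long and may contain only lowercase letters and digits. The following sketch checks a candidate name against those rules before you run the commands above. It cannot tell whether the name is already taken by another account, and the function name is hypothetical.

```python
import re

# 3 to 24 characters, lowercase letters and digits only.
STORAGE_NAME_PATTERN = re.compile(r"[a-z0-9]{3,24}")


def is_valid_storage_account_name(name):
    """Return True if the name satisfies the length and character rules."""
    return STORAGE_NAME_PATTERN.fullmatch(name) is not None


for candidate in ["landclassdemo01", "Land-Class", "ab"]:
    verdict = "ok" if is_valid_storage_account_name(candidate) else "invalid"
    print(candidate, "->", verdict)
```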

With the commands below, we will create an Azure File Share to hold setup and job-specific logs, as well as an Azure Blob container for fast file I/O during model training and evaluation. Then, we'll use AzCopy to copy the necessary data files for this tutorial to your own storage account. Note that we will copy over only a subset of the available data, to save time and resources.

az storage share create --account-name %STORAGE_ACCOUNT_NAME% --name batchai
az storage container create --account-name %STORAGE_ACCOUNT_NAME% --name blobfuse
AzCopy /Source:https://aiforearthcollateral.blob.core.windows.net/imagesegmentationtutorial /SourceSAS:"?st=2018-01-16T10%3A40%3A00Z&se=2028-01-17T10%3A40%3A00Z&sp=rl&sv=2017-04-17&sr=c&sig=KeEzmTaFvVo2ptu2GZQqv5mJ8saaPpeNRNPoasRS0RE%3D" /Dest:https://%STORAGE_ACCOUNT_NAME%.blob.core.windows.net/blobfuse /DestKey:%STORAGE_ACCOUNT_KEY% /S

Expect the copy step to take 5-10 minutes.
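The /SourceSAS value in the AzCopy command is a shared access signature: a URL query string whose fields state what the token allows. As an illustration, the sketch below decodes the fields of the token used above. Here "st" and "se" are the start and expiry times, "sp" lists the permissions (r for read, l for list), "sv" is the storage service version, and "sr" is the resource type (c for container). The "sig" field is left out because it is not needed for this illustration, and the function name is hypothetical.

```python
from urllib.parse import parse_qs

# Copied from the /SourceSAS value above, without the "sig" field.
SOURCE_SAS = (
    "?st=2018-01-16T10%3A40%3A00Z&se=2028-01-17T10%3A40%3A00Z"
    "&sp=rl&sv=2017-04-17&sr=c"
)


def sas_fields(token):
    """Decode a SAS query string into a {field name: value} dictionary."""
    parsed = parse_qs(token.lstrip("?"))
    return {name: values[0] for name, values in parsed.items()}


for name, value in sas_fields(SOURCE_SAS).items():
    print(f"{name} = {value}")
```

The decoded fields show a read-and-list token scoped to a container, which is why the copy can read the source data but cannot modify it.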

Create an Azure Batch AI cluster

We will create an Azure Batch AI cluster containing two NC6 Ubuntu DSVMs. This two-GPU cluster will be used to train our model and then apply it to previously-unseen data. Before executing the command below, ensure that the cluster.json file provided in this repository (which specifies the Python packages that should be installed during setup) has been downloaded to your computer and is located in your current working directory, since the command refers to it by file name only. We also recommend that you change the username and password to credentials of your choice.

az batchai cluster create -n batchaidemo --user-name lcuser --password lcpassword --afs-name batchai --image UbuntuDSVM --vm-size STANDARD_NC6 --max 2 --min 2 --storage-account-name %STORAGE_ACCOUNT_NAME% --container-name blobfuse --container-mount-path blobfuse -c cluster.json --resource-group %AZURE_RESOURCE_GROUP% --location eastus

This command will create a cluster whose credentials are a username-password pair. For increased security, we highly encourage the use of an SSH key as credential: for more information, see the Batch AI documentation and the output of the az batchai cluster create -h command.

It will take approximately ten minutes for cluster creation to complete. You can check the progress of provisioning using the command below. When provisioning is complete, you should see that the "errors" field is null and that your cluster has two "idle" nodes.

az batchai cluster show -n batchaidemo --resource-group %AZURE_RESOURCE_GROUP%
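The readiness condition described above can also be checked programmatically. The sketch below assumes you have captured the JSON printed by the command above. The "errors" field is the one named in this tutorial. The "nodeStateCounts" and "idleNodeCount" field names are assumptions about the output layout and should be verified against your own output. The function name and sample strings are hypothetical.

```python
import json


def cluster_is_ready(cluster_show_json, expected_idle_nodes=2):
    """Return True when "errors" is null and all expected nodes are idle."""
    cluster = json.loads(cluster_show_json)
    no_errors = "errors" in cluster and cluster["errors"] is None
    # Assumption: idle nodes are reported under nodeStateCounts.idleNodeCount.
    node_counts = cluster.get("nodeStateCounts") or {}
    idle_nodes = node_counts.get("idleNodeCount", 0)
    return no_errors and idle_nodes == expected_idle_nodes


# Hand-written samples shaped like the assumed layout (not real output).
provisioning = '{"errors": null, "nodeStateCounts": {"idleNodeCount": 0, "preparingNodeCount": 2}}'
ready = '{"errors": null, "nodeStateCounts": {"idleNodeCount": 2, "preparingNodeCount": 0}}'

print(cluster_is_ready(provisioning))
print(cluster_is_ready(ready))
```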

Next steps

You have now completed all of the setup steps required for this tutorial. We recommend proceeding to the model training section of this repository.

Click here to return to the main page of this repository for more information.

Cleanup

When you have completed all sections of interest to you in this repository, be sure to delete the resources you created with the following command:

az group delete -n %AZURE_RESOURCE_GROUP%