Azure Infiniband Updated Ubuntu Versions #462

Merged on Oct 14, 2024 (19 commits)
2 changes: 1 addition & 1 deletion source/cloud/azure/azureml.md
@@ -32,7 +32,7 @@ The compute instance provides an integrated Jupyter notebook service, JupyterLab

Sign in to [Azure Machine Learning Studio](https://ml.azure.com/) and navigate to your workspace on the left-side menu.

-Select **Compute** > **+ New** > choose a [RAPIDS compatible GPU](https://medium.com/dropout-analytics/which-gpus-work-with-rapids-ai-f562ef29c75f) VM size (e.g., `Standard_NC12s_v3`)
+Select **Compute** > **+ New** (Create compute instance) > choose a [RAPIDS compatible GPU](https://medium.com/dropout-analytics/which-gpus-work-with-rapids-ai-f562ef29c75f) VM size (e.g., `Standard_NC12s_v3`)

![Screenshot of create new notebook with a gpu-instance](../../images/azureml-create-notebook-instance.png)

12 changes: 8 additions & 4 deletions source/examples/rapids-azureml-hpo/notebook.ipynb
@@ -12,7 +12,7 @@
]
},
"source": [
-"# Train and Hyperparameter-Tune with RAPIDS"
+"# Train and Hyperparameter-Tune with RAPIDS on AzureML"
]
},
{
@@ -97,12 +97,16 @@
"from azure.ai.ml import MLClient\n",
"from azure.identity import DefaultAzureCredential\n",
"\n",
+"subscription_id = \"FILL IN WITH YOUR AZURE ML CREDENTIALS\"\n",
+"resource_group_name = \"FILL IN WITH YOUR AZURE ML CREDENTIALS\"\n",
+"workspace_name = \"FILL IN WITH YOUR AZURE ML CREDENTIALS\"\n",
+"\n",
"# Get a handle to the workspace\n",
"ml_client = MLClient(\n",
" credential=DefaultAzureCredential(),\n",
-" subscription_id=\"fc4f4a6b-4041-4b1c-8249-854d68edcf62\",\n",
-" resource_group_name=\"rapidsai-deployment\",\n",
-" workspace_name=\"rapids-aml-cluster\",\n",
+" subscription_id=subscription_id,\n",
+" resource_group_name=resource_group_name,\n",
+" workspace_name=workspace_name,\n",
")\n",
"\n",
"print(\n",
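Review note: instead of editing the placeholder strings in the notebook cell by hand, the three values could be read from the environment. The sketch below is an assumption and not part of this PR. The environment variable names and the `load_workspace_settings` helper are hypothetical; only the three keyword names (`subscription_id`, `resource_group_name`, `workspace_name`) come from the notebook.

```python
import os

# Placeholder string used in the notebook cell.
PLACEHOLDER = "FILL IN WITH YOUR AZURE ML CREDENTIALS"

# Hypothetical environment variable names; pick whatever suits your setup.
ENV_NAMES = {
    "subscription_id": "AZURE_SUBSCRIPTION_ID",
    "resource_group_name": "AZURE_RESOURCE_GROUP",
    "workspace_name": "AZURE_ML_WORKSPACE",
}


def load_workspace_settings(env=None):
    """Return the three workspace keyword arguments, read from `env`.

    Raises ValueError if a value is unset, empty, or still the placeholder.
    """
    env = os.environ if env is None else env
    settings = {}
    missing = []
    for arg, var in ENV_NAMES.items():
        value = env.get(var, "").strip()
        if not value or value == PLACEHOLDER:
            missing.append(var)
        settings[arg] = value
    if missing:
        raise ValueError("Missing workspace settings: " + ", ".join(missing))
    return settings
```

In the notebook, the returned dictionary could then be unpacked into the existing call, for example `MLClient(credential=DefaultAzureCredential(), **load_workspace_settings())`, which keeps real identifiers out of the committed file.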
20 changes: 13 additions & 7 deletions source/guides/azure/infiniband.md
@@ -13,8 +13,8 @@ for demonstration.
- Select `East US` region.
- Change `Availability options` to `Availability set` and create a set.
- If building multiple instances put additional instances in the same set.
-- Use the 2nd Gen Ubuntu 20.04 image.
-- Search all images for `Ubuntu Server 20.04` and choose the second one down on the list.
+- Use the 2nd Gen Ubuntu 24.04 image.
+- Search all images for `Ubuntu Server 24.04` and choose the second one down on the list.
- Change size to `ND40rs_v2`.
- Set password login with credentials.
- User `someuser`
@@ -39,8 +39,8 @@ The commands below should work for Ubuntu. See the [CUDA Toolkit documentation](
```shell
sudo apt-get install -y linux-headers-$(uname -r)
distribution=$(. /etc/os-release;echo $ID$VERSION_ID | sed -e 's/\.//g')
-wget https://developer.download.nvidia.com/compute/cuda/repos/$distribution/x86_64/cuda-keyring_1.0-1_all.deb
-sudo dpkg -i cuda-keyring_1.0-1_all.deb
+wget https://developer.download.nvidia.com/compute/cuda/repos/$distribution/x86_64/cuda-keyring_1.1-1_all.deb
+sudo dpkg -i cuda-keyring_1.1-1_all.deb
sudo apt-get update
sudo apt-get -y install cuda-drivers
```
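Review note: the `distribution` variable in the block above is built by concatenating `ID` and `VERSION_ID` from `/etc/os-release` and removing the dots, so the image change in this PR also changes the repository path in the download URL. The sketch below mirrors that derivation in Python for illustration only. The helper names `distribution_tag` and `keyring_url` are hypothetical, and the sample `os-release` text is an assumed minimal excerpt rather than output captured from a real VM.

```python
def distribution_tag(os_release_text):
    """Mimic: . /etc/os-release; echo $ID$VERSION_ID | sed -e 's/\\.//g'"""
    fields = {}
    for line in os_release_text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and anything that is not KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Values may be wrapped in single or double quotes.
        fields[key] = value.strip().strip('"').strip("'")
    return (fields["ID"] + fields["VERSION_ID"]).replace(".", "")


def keyring_url(tag, version="1.1-1"):
    """Build the keyring URL using the pattern from the wget command."""
    return (
        "https://developer.download.nvidia.com/compute/cuda/repos/"
        f"{tag}/x86_64/cuda-keyring_{version}_all.deb"
    )


# Assumed minimal excerpt of /etc/os-release on an Ubuntu 24.04 image.
sample = 'ID=ubuntu\nVERSION_ID="24.04"\n'
tag = distribution_tag(sample)
print(tag)
print(keyring_url(tag))
```

Printing the tag before running `wget` is a quick way to check which repository directory the shell commands will request.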
@@ -118,11 +118,11 @@ Mon Nov 14 20:32:39 2022

### InfiniBand Driver

-On Ubuntu 20.04
+On Ubuntu 24.04

```shell
sudo apt-get install -y automake dh-make git libcap2 libnuma-dev libtool make pkg-config udev curl librdmacm-dev rdma-core \
-libgfortran5 bison chrpath flex graphviz gfortran tk dpatch quilt swig tcl ibverbs-utils
+libgfortran5 bison chrpath flex graphviz gfortran tk quilt swig tcl ibverbs-utils
```

Check install
@@ -247,7 +247,13 @@ wget https://github.com/conda-forge/miniforge/releases/latest/download/Mambaforg
bash Mambaforge-Linux-x86_64.sh
```

-Accept the default and allow conda init to run. Then start a new shell.
+Accept the default and allow conda init to run.
+
+```shell
+~/mambaforge/bin/conda init
+```
+
+Then start a new shell.

Create a conda environment (see [UCX-Py](https://ucx-py.readthedocs.io/en/latest/install.html) docs)
