- [Docs] Minor edits
peterschmidt85 committed Oct 9, 2024
1 parent 543da6e commit 70fe369
Showing 3 changed files with 18 additions and 13 deletions.
11 changes: 6 additions & 5 deletions README.md
@@ -25,11 +25,12 @@ for AI workloads both in the cloud and on-prem, speeding up the development, tra

## Major news ✨

-- [2024/08] [dstack 0.18.11: AMD, encryption, and more](https://github.com/dstackai/dstack/releases/tag/0.18.11) (Release)
-- [2024/08] [dstack 0.18.10: Control plane UI](https://github.com/dstackai/dstack/releases/tag/0.18.10) (Release)
-- [2024/07] [dstack 0.18.7: Fleets, RunPod volumes, dstack apply, and more](https://github.com/dstackai/dstack/releases/tag/0.18.7) (Release)
-- [2024/05] [dstack 0.18.4: Google Cloud TPU, and more](https://github.com/dstackai/dstack/releases/tag/0.18.4) (Release)
-- [2024/05] [dstack 0.18.2: On-prem clusters, private subnets, and more](https://github.com/dstackai/dstack/releases/tag/0.18.2) (Release)
+- [2024/10] [dstack 0.18.17: on-prem AMD GPUs, AWS EFA, and more](https://github.com/dstackai/dstack/releases/tag/0.18.17)
+- [2024/08] [dstack 0.18.11: AMD, encryption, and more](https://github.com/dstackai/dstack/releases/tag/0.18.11)
+- [2024/08] [dstack 0.18.10: Control plane UI](https://github.com/dstackai/dstack/releases/tag/0.18.10)
+- [2024/07] [dstack 0.18.7: Fleets, RunPod volumes, dstack apply, and more](https://github.com/dstackai/dstack/releases/tag/0.18.7)
+- [2024/05] [dstack 0.18.4: Google Cloud TPU, and more](https://github.com/dstackai/dstack/releases/tag/0.18.4)
+- [2024/05] [dstack 0.18.2: On-prem clusters, private subnets, and more](https://github.com/dstackai/dstack/releases/tag/0.18.2)

## Installation

11 changes: 6 additions & 5 deletions docs/docs/concepts/fleets.md
@@ -49,10 +49,11 @@ are both acceptable).
to the specified parameters.

!!! info "Network"
-Set `placement` to `cluster` if the nodes should be interconnected
-(e.g. if you'd like to use them for [multi-node tasks](../reference/dstack.yml/task.md#distributed-tasks)).
-In that case, `dstack` will provision all nodes in the same backend and region and configure the optimal
-connectivity via availability zones, placement groups, etc.
+To ensure the nodes of the fleet are interconnected (e.g., if you'd like to use them for
+[multi-node tasks](../reference/dstack.yml/task.md#distributed-tasks)),
+set `placement` to `cluster`.
+In this case, `dstack` will provision all nodes in the same backend and region and configure optimal
+inter-node connectivity.

??? info "AWS"
`dstack` automatically enables [Elastic Fabric Adapter :material-arrow-top-right-thin:{ .external }](https://aws.amazon.com/hpc/efa/){:target="_blank"}
@@ -62,7 +63,7 @@ are both acceptable).
`g6.16xlarge`, `g6.24xlarge`, `g6.48xlarge`, `g6.8xlarge`, `gr6.8xlarge`

Currently, only one EFA interface is enabled regardless of the maximum number of interfaces supported by the instance type.
-This limitation will be resolved in the future.
+This limitation will be lifted once [this issue :material-arrow-top-right-thin:{ .external }](https://github.com/dstackai/dstack/issues/1804){:target="_blank"} is fixed.

Note that cloud fleets aren't supported for the `kubernetes`, `vastai`, and `runpod` backends.

9 changes: 6 additions & 3 deletions docs/docs/reference/dstack.yml/task.md
@@ -258,9 +258,12 @@ resources:
</div>

If you run the task, `dstack` first provisions the master node and then runs the other nodes of the cluster.
-All nodes are provisioned in the same region. It is recommended to use a [fleet](../../concepts/fleets.md),
-as in this case all nodes are provisioned into a cluster placement group for better connectivity if the backend
-supports this feature.
+
+??? info "Network"
+To ensure all nodes are provisioned into a cluster placement group and to enable the highest level of inter-node
+connectivity, it is recommended to manually create a [fleet](../../concepts/fleets.md) before running a task.
+This won’t be needed once [this issue :material-arrow-top-right-thin:{ .external }](https://github.com/dstackai/dstack/issues/1805){:target="_blank"}
+is fixed.

> `dstack` is easy to use with `accelerate`, `torchrun`, and other distributed frameworks. All you need to do
is pass the corresponding environment variables such as `DSTACK_GPUS_PER_NODE`, `DSTACK_NODE_RANK`, `DSTACK_NODES_NUM`,
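
As an illustration of the note above, here is a minimal Python sketch of how a launcher script might turn the three variables named there into `torchrun` arguments. The helper name `torchrun_args` is hypothetical and not part of `dstack`. The sketch assumes each variable holds an integer as a string, and it leaves out rendezvous settings (such as the master address) because the hunk is cut off before the remaining variables are listed.

```python
import os


def torchrun_args(env=None):
    """Build the node-topology part of a torchrun command line.

    `env` defaults to the process environment; passing a dict makes the
    helper easy to try outside a dstack task. (Hypothetical helper, for
    illustration only.)
    """
    env = os.environ if env is None else env
    return [
        "torchrun",
        # Number of worker processes to start on this node (one per GPU).
        f"--nproc_per_node={env['DSTACK_GPUS_PER_NODE']}",
        # Rank of this node within the multi-node task.
        f"--node_rank={env['DSTACK_NODE_RANK']}",
        # Total number of nodes participating in the task.
        f"--nnodes={env['DSTACK_NODES_NUM']}",
    ]
```

Inside a running task, `torchrun_args()` with no argument reads the live environment. The training script path and any rendezvous flags would still need to be appended before the list is handed to something like `subprocess.run`.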