dstack is an open-source tool that allows running reproducible ML workflows independently of the environment (locally or in the cloud), and collaborating around data and models.

Docs • Quick start • Basics • Slack
dstack is an open-source tool for running reproducible ML workflows independently of the environment. Workflows can run locally or remotely (e.g. in a configured cloud account). Additionally, dstack facilitates the versioning and reuse of artifacts (such as data and models) across teams.

In brief, dstack simplifies the process of establishing ML training pipelines that are independent of a particular vendor, and facilitates collaboration within teams on data and models.
- Define workflows via YAML
- Run workflows locally via CLI
- Track and reuse artifacts across workflows
- Run workflows remotely (in any configured cloud) via CLI
- Version and share artifacts across teams
Use pip to install the dstack CLI:

```shell
pip install dstack --upgrade
```
Here's an example from the Quick start.
```yaml
workflows:
  - name: mnist-data
    provider: bash
    commands:
      - pip install torchvision
      - python mnist/mnist_data.py
    artifacts:
      - path: ./data

  - name: train-mnist
    provider: bash
    deps:
      - workflow: mnist-data
    commands:
      - pip install torchvision pytorch-lightning tensorboard
      - python mnist/train_mnist.py
    artifacts:
      - path: ./lightning_logs
```
With workflows defined in this manner, dstack allows for effortless execution either locally or in a configured cloud account, while also enabling reuse of artifacts.
Use the dstack CLI to run workflows locally:

```shell
dstack run mnist-data
```
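Since the `train-mnist` workflow above declares `mnist-data` as a dependency via `deps`, it can be run locally the same way once the data workflow has completed, and it will reuse the `./data` artifact:

```shell
dstack run train-mnist
```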
To run workflows remotely (e.g. in the cloud) or share artifacts outside your machine, you must configure your remote settings using the dstack config command:

```shell
dstack config
```
This command will ask you to choose an AWS profile (which will be used for AWS credentials), an AWS region (where workflows will be run), and an S3 bucket (to store remote artifacts and metadata).
```shell
AWS profile: default
AWS region: eu-west-1
S3 bucket: dstack-142421590066-eu-west-1
EC2 subnet: none
```
For more details on how to configure a remote, check the installation guide.
Once a remote is configured, use the --remote flag with the dstack run command to run the workflow in the configured cloud:

```shell
dstack run mnist-data --remote
```
You can configure the required resources to run the workflows either via the resources property in YAML or via the dstack run command's arguments, such as --gpu, --gpu-name, etc.:

```shell
dstack run train-mnist --remote --gpu 1
```
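Equivalently, the GPU requirement can be declared in the workflow YAML itself. The sketch below is an assumption based on the `resources` property mentioned above: the exact schema (a `gpu` count nested under `resources`) is illustrative, not confirmed by this README, so check the docs for the authoritative syntax:

```yaml
workflows:
  - name: train-mnist
    provider: bash
    deps:
      - workflow: mnist-data
    commands:
      - pip install torchvision pytorch-lightning tensorboard
      - python mnist/train_mnist.py
    artifacts:
      - path: ./lightning_logs
    # Assumed schema: request one GPU for the remote run
    resources:
      gpu: 1
```

With resources pinned in YAML, the workflow can be launched remotely without repeating the hardware flags on every dstack run invocation.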
When you run a workflow remotely, dstack automatically creates resources in the configured cloud, and releases them once the workflow is finished.
For additional information and examples, see the following links: