- Added the Leveraging spot instances effectively guide to /learn
- Added a usage example to `/docs/reference/api/python/index.md`
- Updated some URLs within `learn` (to make them consistent and shorter)
peterschmidt85 committed Dec 10, 2023
1 parent c4a32e3 commit 9bb8e21
Showing 16 changed files with 181 additions and 70 deletions.
17 changes: 9 additions & 8 deletions README.md
@@ -23,19 +23,20 @@ Train and deploy generative AI on any cloud
[![PyPI - License](https://img.shields.io/pypi/l/dstack?style=flat-square&color=blue)](https://github.com/dstackai/dstack/blob/master/LICENSE.md)
</div>

`dstack` simplifies training, fine-tuning, and deploying
generative AI models on any cloud.
`dstack` simplifies training, fine-tuning, and deployment of generative AI models on any cloud.

Supported providers: AWS, GCP, Azure, Lambda, TensorDock, and Vast.ai.

## Latest news ✨

- [2023/11] [Access the GPU marketplace with Vast.ai](https://dstack.ai/blog/2023/11/21/vastai/) (Release)
- [2023/10] [Use world's cheapest GPUs with TensorDock](https://dstack.ai/blog/2023/10/31/tensordock/) (Release)
- [2023/09] [RAG with Llama Index and Weaviate](https://dstack.ai/learn/llama-index-weaviate) (Example)
- [2023/08] [Deploying Stable Diffusion using FastAPI](https://dstack.ai/learn/stable-diffusion-xl) (Example)
- [2023/07] [Deploying LLMs using TGI](https://dstack.ai/learn/text-generation-inference) (Example)
- [2023/07] [Deploying LLMs using vLLM](https://dstack.ai/learn/vllm) (Example)
- [2023/12] [Leveraging spot instances effectively](https://dstack.ai/learn/spot) (Learn)
- [2023/11] [Access the GPU marketplace with Vast.ai](https://dstack.ai/blog/2023/11/21/vastai/) (Blog)
- [2023/10] [Use world's cheapest GPUs with TensorDock](https://dstack.ai/blog/2023/10/31/tensordock/) (Blog)
- [2023/09] [RAG with Llama Index and Weaviate](https://dstack.ai/learn/llama-index) (Learn)
- [2023/08] [Fine-tuning Llama 2 using QLoRA](https://dstack.ai/learn/qlora) (Learn)
- [2023/08] [Deploying Stable Diffusion using FastAPI](https://dstack.ai/learn/sdxl) (Learn)
- [2023/07] [Deploying LLMs using TGI](https://dstack.ai/learn/tgi) (Learn)
- [2023/07] [Deploying LLMs using vLLM](https://dstack.ai/learn/vllm) (Learn)

## Installation

5 changes: 4 additions & 1 deletion docs/assets/stylesheets/landing.css
@@ -254,10 +254,13 @@
max-width: 500px;
margin-top: 1.75em;
margin-bottom: 2.5em;
letter-spacing: -1.5px;
}

.tx-landing__highlights_text h2 .gradient {
background: -webkit-linear-gradient(45deg, #0048ff, #ce00ff);
-webkit-background-clip: text;
-webkit-text-fill-color: transparent;
letter-spacing: -1.5px;
}

.tx-landing__highlights_grid .feature-cell {
4 changes: 2 additions & 2 deletions docs/blog/posts/simplified-cloud-setup.md
@@ -40,7 +40,7 @@ projects:
</div>
Regions and other settings are optional. Learn more on what credential types are supported
via [Clouds](../../docs/configuration/server.md).
via [Clouds](../../docs/config/server.md).
## Enhanced API
@@ -98,7 +98,7 @@ This means you'll need to delete `~/.dstack` and configure `dstack` from scratch

1. `pip install "dstack[all]==0.12.0"`
2. Delete `~/.dstack`
3. Configure clouds via `~/.dstack/server/config.yml` (see the [new guide](../../docs/configuration/server.md))
3. Configure clouds via `~/.dstack/server/config.yml` (see the [new guide](../../docs/config/server.md))
4. Run `dstack server`

The [documentation](../../docs/index.md) and [examples](../../examples/index.md) are updated.
File renamed without changes.
2 changes: 1 addition & 1 deletion docs/docs/index.md
@@ -33,7 +33,7 @@ or use the cloud version (which provides GPU out of the box).
If you have default AWS, GCP, or Azure credentials on your machine, the `dstack` server will pick them up automatically.

Otherwise, you need to manually specify the cloud credentials in `~/.dstack/server/config.yml`.
For further details, refer to [server configuration](configuration/server.md).
For further details, refer to [server configuration](config/server.md).

### Start the server

2 changes: 1 addition & 1 deletion docs/docs/installation/docker.md
@@ -14,7 +14,7 @@ $ docker run --name dstack -p <port-on-host>:3000 \

!!! info "Configure clouds"
Upon startup, the server sets up the default project called `main`.
Prior to using `dstack`, make sure to [configure clouds](../configuration/server.md).
Prior to using `dstack`, make sure to [configure clouds](../config/server.md).

## Environment variables

2 changes: 1 addition & 1 deletion docs/docs/installation/pip.md
@@ -15,4 +15,4 @@ The server is available at http://127.0.0.1:3000?token=b934d226-e24a-4eab-eb92b3

!!! info "Configure clouds"
Upon startup, the server sets up the default project called `main`.
Prior to using `dstack`, make sure to [configure clouds](../configuration/server.md).
Prior to using `dstack`, make sure to [configure clouds](../config/server.md).
63 changes: 36 additions & 27 deletions docs/docs/reference/api/python/index.md
@@ -2,6 +2,42 @@

The Python API allows for running tasks, services, and managing runs programmatically.

#### Usage example

```python
import sys

from dstack.api import Task, GPU, Client, Resources

client = Client.from_config()

task = Task(
    image="ghcr.io/huggingface/text-generation-inference:latest",
    env={"MODEL_ID": "TheBloke/Llama-2-13B-chat-GPTQ"},
    commands=[
        "text-generation-launcher --trust-remote-code --quantize gptq",
    ],
    ports=["80"],
)

run = client.runs.submit(
    run_name="my-awesome-run",  # (Optional) If not specified, a random name is assigned
    configuration=task,
    resources=Resources(gpu=GPU(memory="24GB")),
)

run.attach()

try:
    for log in run.logs():
        sys.stdout.buffer.write(log)
        sys.stdout.buffer.flush()
except KeyboardInterrupt:
    run.stop(abort=True)
finally:
    run.detach()
```
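
The example above submits a task that serves `TheBloke/Llama-2-13B-chat-GPTQ` with Hugging Face's `text-generation-inference`
on an instance with at least 24GB of GPU memory, attaches to the run, and streams its logs until you press `Ctrl+C`,
at which point the run is aborted.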

## `dstack.api` { #dstack.api data-toc-label="dstack.api" }

### `dstack.api.Client` { #dstack.api.Client data-toc-label="Client" }
@@ -30,33 +66,6 @@ The Python API allows for running tasks, services, and managing runs programmatically.
show_root_toc_entry: false
heading_level: 4

### `dstack.api.FineTuningTask` { #dstack.api.FineTuningTask data-toc-label="FineTuningTask" }

::: dstack.api.FineTuningTask
options:
show_bases: false
show_root_heading: false
show_root_toc_entry: false
heading_level: 4

### `dstack.api.CompletionService` { #dstack.api.CompletionService data-toc-label="CompletionService" }

::: dstack.api.CompletionService
options:
show_bases: false
show_root_heading: false
show_root_toc_entry: false
heading_level: 4

### `dstack.api.CompletionTask` { #dstack.api.CompletionTask data-toc-label="CompletionTask" }

::: dstack.api.CompletionTask
options:
show_bases: false
show_root_heading: false
show_root_toc_entry: false
heading_level: 4

### `dstack.api.Run` { #dstack.api.Run data-toc-label="Run" }

::: dstack.api.Run
File renamed without changes.
File renamed without changes.
File renamed without changes.
76 changes: 76 additions & 0 deletions docs/learn/spot.md
@@ -0,0 +1,76 @@
# Leveraging spot instances effectively

Cloud instances come in three types: `reserved` (for long-term commitments at a discounted rate), `on-demand` (used as needed but
more expensive), and `spot` (the cheapest, provided when spare capacity is available, but can be reclaimed when someone else requests it).

Among the cloud providers supported by `dstack`, three offer spot instances: AWS, GCP, and Azure.
Once you've [configured](../docs/config/server.md) any of these, you can use spot instances
for [dev environments](../docs/guides/dev-environments.md), [tasks](../docs/guides/tasks.md), and
[services](../docs/guides/services.md).

!!! info "Quotas"
    Note that before you can use spot instances with AWS, GCP, and Azure, make sure to request the necessary quota via a support
    ticket.

## Setting a spot policy

For dev environments, `dstack` uses on-demand instances by default. For
tasks and services, `dstack` tries to use spot instances if they are available, falling back to on-demand instances.

The `dstack run` command allows you to override the default behavior.
To use spot instances only, pass `--spot`. To use spot instances if they are available (falling back
to on-demand instances otherwise), pass `--spot-auto`.

<div class="termy">

```shell
$ dstack run . --gpu 24GB --spot-auto
Max price -
Max duration 6h
Spot policy auto
Retry policy no

# BACKEND REGION RESOURCES SPOT PRICE
1 gcp us-central1 4xCPU, 16GB, L4 (24GB) yes $0.223804
2 gcp us-east1 4xCPU, 16GB, L4 (24GB) yes $0.223804
3 gcp us-west1 4xCPU, 16GB, L4 (24GB) yes $0.223804
...

Continue? [y/n]:
```

</div>

## Setting a retry policy

If the requested instance is unavailable, the `dstack run` command will fail unless you specify a retry policy.
The policy can be specified via `--retry-limit`:

<div class="termy">

```shell
$ dstack run . --gpu 24GB --spot --retry-limit 1h
```

</div>

In this case, `dstack` will keep trying to find a spot instance for up to one hour.

!!! info "NOTE:"
    Note that if you've set a retry duration and the spot instance is reclaimed before your run finishes,
    `dstack` will restart the run from scratch.

If you run a service using spot instances, the default retry duration is set to infinity.

## Tips and tricks

1. The `--spot-auto` policy allows for the automatic use of spot instances when available, seamlessly reverting to
on-demand instances if spots aren't accessible. You can enable it via `dstack run` or
via [`profiles.yml`](../docs/reference/profiles.yml.md).
2. You can use multiple cloud providers (incl. AWS, GCP, and Azure) and regions to increase the likelihood of
obtaining a spot instance. However, in doing so, beware of data transfer costs if large volumes of data
need to be loaded.
3. When using spot instances for training, save checkpoints regularly and load the latest one when the run is restarted
   after an interruption. A minimal sketch of this pattern is shown below.
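
The sketch below illustrates tip 3, assuming a PyTorch-style training loop; the checkpoint path, saving interval, toy model,
and objective are illustrative placeholders rather than anything prescribed by `dstack`.

```python
import os

import torch
from torch import nn, optim

CHECKPOINT_PATH = "checkpoints/state.pt"

model = nn.Linear(10, 1)
optimizer = optim.SGD(model.parameters(), lr=0.01)
start_step = 0

# If a checkpoint exists (i.e. the run was restarted after a spot interruption),
# restore the model, the optimizer, and the step counter before continuing.
if os.path.exists(CHECKPOINT_PATH):
    state = torch.load(CHECKPOINT_PATH)
    model.load_state_dict(state["model"])
    optimizer.load_state_dict(state["optimizer"])
    start_step = state["step"] + 1

for step in range(start_step, 1_000):
    optimizer.zero_grad()
    loss = model(torch.randn(32, 10)).pow(2).mean()  # placeholder objective
    loss.backward()
    optimizer.step()

    # Save a checkpoint every 100 steps so that at most ~100 steps are lost on interruption.
    if step % 100 == 0:
        os.makedirs(os.path.dirname(CHECKPOINT_PATH), exist_ok=True)
        torch.save(
            {"model": model.state_dict(), "optimizer": optimizer.state_dict(), "step": step},
            CHECKPOINT_PATH,
        )
```

In practice, write the checkpoint to storage that outlives the instance (for example, a mounted volume or an object store)
so that a restarted run can pick it up.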


File renamed without changes.
8 changes: 4 additions & 4 deletions docs/overrides/home.html
@@ -100,7 +100,7 @@ <h1>An <span class="gradient">easier way</span> to train and deploy <span

<p>
dstack simplifies training, fine-tuning, and deploying
generative AI models, leveraging the open-source ecosystem.
generative AI models, while leveraging the open-source ecosystem.
</p>
</div>

@@ -145,7 +145,7 @@ <h1>An <span class="gradient">easier way</span> to train and deploy <span
</div>

<div class="tx-landing__integrations_text">
Use your own cloud account
Use your own cloud accounts
</div>
</div>

@@ -226,7 +226,7 @@ <h2>Services</h2>

<div class="tx-landing__plans">
<div class="tx-landing__highlights_text">
<h2>Use our cloud GPU or utilize your own cloud account</h2>
<h2>Opt for <span class="gradient">our cloud GPU</span> or use <span class="gradient">your cloud accounts</span></h2>
</div>

<div class="tx-landing__plans_cards">
@@ -307,7 +307,7 @@ <h2>Use our cloud GPU or utilize your own cloud account</h2>
<div class="plans_card__buttons">
<a href="/docs" target="_blank"
class="md-button md-button-secondary small">Install open-source</a>
<div class="plans_card__buttons_subtitle">Use your own cloud account</div>
<div class="plans_card__buttons_subtitle">Use your own cloud accounts</div>
</div>
</div>
</div>
46 changes: 33 additions & 13 deletions docs/overrides/learn.html
@@ -9,9 +9,25 @@ <h2>Learning center</h2>
</div>

<div class="tx-landing__highlights_grid">
<a href="/learn/llama-index-weaviate" class="feature-cell">
<a href="/learn/spot" class="feature-cell">
<h3>
RAG with Llama Index and Weaviate
Leveraging spot instances effectively
</h3>

<p>
Use spot instances with <strong>AWS</strong>, <strong>GCP</strong>, and
<strong>Azure</strong> to significantly reduce the cost of cloud GPUs.
</p>

<div class="feature-tags">
<div class="feature-tag">Guide</div>
<div class="feature-tag">GPU</div>
</div>
</a>

<a href="/learn/llama-index" class="feature-cell">
<h3>
Building RAG with Llama Index
</h3>

<p>
@@ -20,12 +36,12 @@ <h3>
</p>

<div class="feature-tags">
<div class="feature-tag">Examples</div>
<div class="feature-tag">Example</div>
<div class="feature-tag">RAG</div>
</div>
</a>

<a href="/learn/finetuning-llama-2" class="feature-cell">
<a href="/learn/qlora" class="feature-cell">
<h3>
Fine-tuning LLMs with QLoRA
</h3>
@@ -36,12 +52,13 @@ <h3>
</p>

<div class="feature-tags">
<div class="feature-tag">Examples</div>
<div class="feature-tag">Example</div>
<div class="feature-tag">Fine-tuning</div>
<div class="feature-tag">LLMs</div>
</div>
</a>

<a href="/learn/text-generation-inference" class="feature-cell">
<a href="/learn/tgi" class="feature-cell">
<h3>
Deploying LLMs with TGI
</h3>
@@ -52,12 +69,13 @@ <h3>
</p>

<div class="feature-tags">
<div class="feature-tag">Examples</div>
<div class="feature-tag">Deployment</div>
<div class="feature-tag">Example</div>
<div class="feature-tag">Inference</div>
<div class="feature-tag">LLMs</div>
</div>
</a>

<a href="/learn/stable-diffusion-xl" class="feature-cell">
<a href="/learn/sdxl" class="feature-cell">
<h3>
Deploying SDXL with FastAPI
</h3>
@@ -70,8 +88,9 @@ <h3>
</p>

<div class="feature-tags">
<div class="feature-tag">Examples</div>
<div class="feature-tag">Deployment</div>
<div class="feature-tag">Example</div>
<div class="feature-tag">Inference</div>
<div class="feature-tag">Image generation</div>
</div>
</a>

@@ -88,8 +107,9 @@ <h3>
</p>

<div class="feature-tags">
<div class="feature-tag">Examples</div>
<div class="feature-tag">Deployment</div>
<div class="feature-tag">Example</div>
<div class="feature-tag">Inference</div>
<div class="feature-tag">LLMs</div>
</div>
</a>
</div>