- Added the Leveraging spot instances effectively guide to /learn
- Added a usage example to `/docs/reference/api/python/index.md`
- Updated some URLs within `learn` (to make them consistent and shorter)
peterschmidt85 committed Dec 10, 2023
1 parent c4a32e3 commit 9bb8e21
Showing 16 changed files with 181 additions and 70 deletions.
17 changes: 9 additions & 8 deletions README.md
@@ -23,19 +23,20 @@ Train and deploy generative AI on any cloud
[![PyPI - License](https://img.shields.io/pypi/l/dstack?style=flat-square&color=blue)](https://github.com/dstackai/dstack/blob/master/LICENSE.md)
</div>

`dstack` simplifies training, fine-tuning, and deploying
generative AI models on any cloud.
`dstack` simplifies training, fine-tuning, and deployment of generative AI models on any cloud.

Supported providers: AWS, GCP, Azure, Lambda, TensorDock, and Vast.ai.

## Latest news ✨

- [2023/11] [Access the GPU marketplace with Vast.ai](https://dstack.ai/blog/2023/11/21/vastai/) (Release)
- [2023/10] [Use world's cheapest GPUs with TensorDock](https://dstack.ai/blog/2023/10/31/tensordock/) (Release)
- [2023/09] [RAG with Llama Index and Weaviate](https://dstack.ai/learn/llama-index-weaviate) (Example)
- [2023/08] [Deploying Stable Diffusion using FastAPI](https://dstack.ai/learn/stable-diffusion-xl) (Example)
- [2023/07] [Deploying LLMs using TGI](https://dstack.ai/learn/text-generation-inference) (Example)
- [2023/07] [Deploying LLMs using vLLM](https://dstack.ai/learn/vllm) (Example)
- [2023/12] [Leveraging spot instances effectively](https://dstack.ai/learn/spot) (Learn)
- [2023/11] [Access the GPU marketplace with Vast.ai](https://dstack.ai/blog/2023/11/21/vastai/) (Blog)
- [2023/10] [Use world's cheapest GPUs with TensorDock](https://dstack.ai/blog/2023/10/31/tensordock/) (Blog)
- [2023/09] [RAG with Llama Index and Weaviate](https://dstack.ai/learn/llama-index) (Learn)
- [2023/08] [Fine-tuning Llama 2 using QLoRA](https://dstack.ai/learn/qlora) (Learn)
- [2023/08] [Deploying Stable Diffusion using FastAPI](https://dstack.ai/learn/sdxl) (Learn)
- [2023/07] [Deploying LLMs using TGI](https://dstack.ai/learn/tgi) (Learn)
- [2023/07] [Deploying LLMs using vLLM](https://dstack.ai/learn/vllm) (Learn)

## Installation

5 changes: 4 additions & 1 deletion docs/assets/stylesheets/landing.css
@@ -254,10 +254,13 @@
max-width: 500px;
margin-top: 1.75em;
margin-bottom: 2.5em;
letter-spacing: -1.5px;
}

.tx-landing__highlights_text h2 .gradient {
background: -webkit-linear-gradient(45deg, #0048ff, #ce00ff);
-webkit-background-clip: text;
-webkit-text-fill-color: transparent;
letter-spacing: -1.5px;
}

.tx-landing__highlights_grid .feature-cell {
4 changes: 2 additions & 2 deletions docs/blog/posts/simplified-cloud-setup.md
@@ -40,7 +40,7 @@ projects:
</div>
Regions and other settings are optional. Learn more on what credential types are supported
via [Clouds](../../docs/configuration/server.md).
via [Clouds](../../docs/config/server.md).
## Enhanced API
@@ -98,7 +98,7 @@ This means you'll need to delete `~/.dstack` and configure `dstack` from scratch

1. `pip install "dstack[all]==0.12.0"`
2. Delete `~/.dstack`
3. Configure clouds via `~/.dstack/server/config.yml` (see the [new guide](../../docs/configuration/server.md))
3. Configure clouds via `~/.dstack/server/config.yml` (see the [new guide](../../docs/config/server.md))
4. Run `dstack server`

The [documentation](../../docs/index.md) and [examples](../../examples/index.md) are updated.
File renamed without changes.
2 changes: 1 addition & 1 deletion docs/docs/index.md
@@ -33,7 +33,7 @@ or use the cloud version (which provides GPU out of the box).
If you have default AWS, GCP, or Azure credentials on your machine, the `dstack` server will pick them up automatically.

Otherwise, you need to manually specify the cloud credentials in `~/.dstack/server/config.yml`.
For further details, refer to [server configuration](configuration/server.md).
For further details, refer to [server configuration](config/server.md).

### Start the server

2 changes: 1 addition & 1 deletion docs/docs/installation/docker.md
@@ -14,7 +14,7 @@ $ docker run --name dstack -p <port-on-host>:3000 \

!!! info "Configure clouds"
Upon startup, the server sets up the default project called `main`.
Prior to using `dstack`, make sure to [configure clouds](../configuration/server.md).
Prior to using `dstack`, make sure to [configure clouds](../config/server.md).

## Environment variables

2 changes: 1 addition & 1 deletion docs/docs/installation/pip.md
@@ -15,4 +15,4 @@ The server is available at http://127.0.0.1:3000?token=b934d226-e24a-4eab-eb92b3

!!! info "Configure clouds"
Upon startup, the server sets up the default project called `main`.
Prior to using `dstack`, make sure to [configure clouds](../configuration/server.md).
Prior to using `dstack`, make sure to [configure clouds](../config/server.md).
63 changes: 36 additions & 27 deletions docs/docs/reference/api/python/index.md
@@ -2,6 +2,42 @@

The Python API allows for running tasks, services, and managing runs programmatically.

#### Usage example

```python
import sys

from dstack.api import Task, GPU, Client, Resources

client = Client.from_config()

task = Task(
    image="ghcr.io/huggingface/text-generation-inference:latest",
    env={"MODEL_ID": "TheBloke/Llama-2-13B-chat-GPTQ"},
    commands=[
        "text-generation-launcher --trust-remote-code --quantize gptq",
    ],
    ports=["80"],
)

run = client.runs.submit(
    run_name="my-awesome-run",  # (Optional) If not specified, a random name is assigned
    configuration=task,
    resources=Resources(gpu=GPU(memory="24GB")),
)

run.attach()

try:
    for log in run.logs():
        sys.stdout.buffer.write(log)
        sys.stdout.buffer.flush()
except KeyboardInterrupt:
    run.stop(abort=True)
finally:
    run.detach()
```
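
The example above submits a task that serves `TheBloke/Llama-2-13B-chat-GPTQ` with Hugging Face's `text-generation-inference`
on an instance with at least 24GB of GPU memory, attaches to the run, and streams its logs until you press `Ctrl+C`,
at which point the run is aborted.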

## `dstack.api` { #dstack.api data-toc-label="dstack.api" }

### `dstack.api.Client` { #dstack.api.Client data-toc-label="Client" }
@@ -30,33 +66,6 @@ The Python API allows for running tasks, services, and managing runs programmatically.
show_root_toc_entry: false
heading_level: 4

### `dstack.api.FineTuningTask` { #dstack.api.FineTuningTask data-toc-label="FineTuningTask" }

::: dstack.api.FineTuningTask
options:
show_bases: false
show_root_heading: false
show_root_toc_entry: false
heading_level: 4

### `dstack.api.CompletionService` { #dstack.api.CompletionService data-toc-label="CompletionService" }

::: dstack.api.CompletionService
options:
show_bases: false
show_root_heading: false
show_root_toc_entry: false
heading_level: 4

### `dstack.api.CompletionTask` { #dstack.api.CompletionTask data-toc-label="CompletionTask" }

::: dstack.api.CompletionTask
options:
show_bases: false
show_root_heading: false
show_root_toc_entry: false
heading_level: 4

### `dstack.api.Run` { #dstack.api.Run data-toc-label="Run" }

::: dstack.api.Run
File renamed without changes.
File renamed without changes.
File renamed without changes.
76 changes: 76 additions & 0 deletions docs/learn/spot.md
@@ -0,0 +1,76 @@
# Leveraging spot instances effectively

Cloud instances come in three types: `reserved` (for long-term commitments at a discounted rate), `on-demand` (used as needed but
more expensive), and `spot` (the cheapest, provided when spare capacity is available, but can be reclaimed when someone else requests it).

Among the cloud providers supported by `dstack`, three offer spot instances: AWS, GCP, and Azure.
Once you've [configured](../docs/config/server.md) any of these, you can use spot instances
for [dev environments](../docs/guides/dev-environments.md), [tasks](../docs/guides/tasks.md), and
[services](../docs/guides/services.md).

!!! info "Quotas"
    Note that before you can use spot instances with AWS, GCP, and Azure, make sure to request the necessary quota via a support
    ticket.

## Setting a spot policy

For dev environments, `dstack` uses on-demand instances by default. For
tasks and services, `dstack` tries to use spot instances if they are available, falling back to on-demand instances.

The `dstack run` command allows you to override the default behavior.
To use spot instances only, pass `--spot`. To use spot instances if they are available (falling back
to on-demand instances otherwise), pass `--spot-auto`.

<div class="termy">

```shell
$ dstack run . --gpu 24GB --spot-auto
Max price -
Max duration 6h
Spot policy auto
Retry policy no

# BACKEND REGION RESOURCES SPOT PRICE
1 gcp us-central1 4xCPU, 16GB, L4 (24GB) yes $0.223804
2 gcp us-east1 4xCPU, 16GB, L4 (24GB) yes $0.223804
3 gcp us-west1 4xCPU, 16GB, L4 (24GB) yes $0.223804
...

Continue? [y/n]:
```

</div>

## Setting a retry policy

If the requested instance is unavailable, the `dstack run` command will fail unless you specify a retry policy.
The policy can be specified via `--retry-limit`:

<div class="termy">

```shell
$ dstack run . --gpu 24GB --spot --retry-limit 1h
```

</div>

In this case, `dstack` will keep trying to find a spot instance for up to one hour.

!!! info "NOTE:"
    Note that if you've set a retry duration and the spot instance is reclaimed before your run finishes,
    `dstack` will restart the run from scratch.

If you run a service using spot instances, the default retry duration is set to infinity.

## Tips and tricks

1. The `--spot-auto` policy allows for the automatic use of spot instances when available, seamlessly reverting to
on-demand instances if spots aren't accessible. You can enable it via `dstack run` or
via [`profiles.yml`](../docs/reference/profiles.yml.md).
2. You can use multiple cloud providers (incl. AWS, GCP, and Azure) and regions to increase the likelihood of
obtaining a spot instance. However, in doing so, beware of data transfer costs if large volumes of data
need to be loaded.
3. When using spot instances for training, save checkpoints regularly and load the latest one when the run is restarted
   after an interruption. A minimal sketch of this pattern is shown below.
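
The sketch below illustrates tip 3, assuming a PyTorch-style training loop; the checkpoint path, saving interval, toy model,
and objective are illustrative placeholders rather than anything prescribed by `dstack`.

```python
import os

import torch
from torch import nn, optim

CHECKPOINT_PATH = "checkpoints/state.pt"

model = nn.Linear(10, 1)
optimizer = optim.SGD(model.parameters(), lr=0.01)
start_step = 0

# If a checkpoint exists (i.e. the run was restarted after a spot interruption),
# restore the model, the optimizer, and the step counter before continuing.
if os.path.exists(CHECKPOINT_PATH):
    state = torch.load(CHECKPOINT_PATH)
    model.load_state_dict(state["model"])
    optimizer.load_state_dict(state["optimizer"])
    start_step = state["step"] + 1

for step in range(start_step, 1_000):
    optimizer.zero_grad()
    loss = model(torch.randn(32, 10)).pow(2).mean()  # placeholder objective
    loss.backward()
    optimizer.step()

    # Save a checkpoint every 100 steps so that at most ~100 steps are lost on interruption.
    if step % 100 == 0:
        os.makedirs(os.path.dirname(CHECKPOINT_PATH), exist_ok=True)
        torch.save(
            {"model": model.state_dict(), "optimizer": optimizer.state_dict(), "step": step},
            CHECKPOINT_PATH,
        )
```

In practice, write the checkpoint to storage that outlives the instance (for example, a mounted volume or an object store)
so that a restarted run can pick it up.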


File renamed without changes.
8 changes: 4 additions & 4 deletions docs/overrides/home.html
@@ -100,7 +100,7 @@ <h1>An <span class="gradient">easier way</span> to train and deploy <span

<p>
dstack simplifies training, fine-tuning, and deploying
generative AI models, leveraging the open-source ecosystem.
generative AI models, while leveraging the open-source ecosystem.
</p>
</div>

@@ -145,7 +145,7 @@ <h1>An <span class="gradient">easier way</span> to train and deploy <span
</div>

<div class="tx-landing__integrations_text">
Use your own cloud account
Use your own cloud accounts
</div>
</div>

@@ -226,7 +226,7 @@ <h2>Services</h2>

<div class="tx-landing__plans">
<div class="tx-landing__highlights_text">
<h2>Use our cloud GPU or utilize your own cloud account</h2>
<h2>Opt for <span class="gradient">our cloud GPU</span> or use <span class="gradient">your cloud accounts</span></h2>
</div>

<div class="tx-landing__plans_cards">
@@ -307,7 +307,7 @@ <h2>Use our cloud GPU or utilize your own cloud account</h2>
<div class="plans_card__buttons">
<a href="/docs" target="_blank"
class="md-button md-button-secondary small">Install open-source</a>
<div class="plans_card__buttons_subtitle">Use your own cloud account</div>
<div class="plans_card__buttons_subtitle">Use your own cloud accounts</div>
</div>
</div>
</div>
46 changes: 33 additions & 13 deletions docs/overrides/learn.html
@@ -9,9 +9,25 @@ <h2>Learning center</h2>
</div>

<div class="tx-landing__highlights_grid">
<a href="/learn/llama-index-weaviate" class="feature-cell">
<a href="/learn/spot" class="feature-cell">
<h3>
RAG with Llama Index and Weaviate
Leveraging spot instances effectively
</h3>

<p>
Use spot instances with <strong>AWS</strong>, <strong>GCP</strong>, and
<strong>Azure</strong> to significantly reduce the cost of cloud GPUs.
</p>

<div class="feature-tags">
<div class="feature-tag">Guide</div>
<div class="feature-tag">GPU</div>
</div>
</a>

<a href="/learn/llama-index" class="feature-cell">
<h3>
Building RAG with Llama Index
</h3>

<p>
@@ -20,12 +36,12 @@ <h3>
</p>

<div class="feature-tags">
<div class="feature-tag">Examples</div>
<div class="feature-tag">Example</div>
<div class="feature-tag">RAG</div>
</div>
</a>

<a href="/learn/finetuning-llama-2" class="feature-cell">
<a href="/learn/qlora" class="feature-cell">
<h3>
Fine-tuning LLMs with QLoRA
</h3>
@@ -36,12 +52,13 @@ <h3>
</p>

<div class="feature-tags">
<div class="feature-tag">Examples</div>
<div class="feature-tag">Example</div>
<div class="feature-tag">Fine-tuning</div>
<div class="feature-tag">LLMs</div>
</div>
</a>

<a href="/learn/text-generation-inference" class="feature-cell">
<a href="/learn/tgi" class="feature-cell">
<h3>
Deploying LLMs with TGI
</h3>
@@ -52,12 +69,13 @@ <h3>
</p>

<div class="feature-tags">
<div class="feature-tag">Examples</div>
<div class="feature-tag">Deployment</div>
<div class="feature-tag">Example</div>
<div class="feature-tag">Inference</div>
<div class="feature-tag">LLMs</div>
</div>
</a>

<a href="/learn/stable-diffusion-xl" class="feature-cell">
<a href="/learn/sdxl" class="feature-cell">
<h3>
Deploying SDXL with FastAPI
</h3>
@@ -70,8 +88,9 @@ <h3>
</p>

<div class="feature-tags">
<div class="feature-tag">Examples</div>
<div class="feature-tag">Deployment</div>
<div class="feature-tag">Example</div>
<div class="feature-tag">Inference</div>
<div class="feature-tag">Image generation</div>
</div>
</a>

@@ -88,8 +107,9 @@ <h3>
</p>

<div class="feature-tags">
<div class="feature-tag">Examples</div>
<div class="feature-tag">Deployment</div>
<div class="feature-tag">Example</div>
<div class="feature-tag">Inference</div>
<div class="feature-tag">LLMs</div>
</div>
</a>
</div>