Workflow - update-children and more #379

Merged
merged 13 commits into from
May 24, 2023
add section on optimizing for workflows
Alex Heneveld committed May 24, 2023
commit cfb63e2d949c12f876ce48046d18c25069ec3ef5
21 changes: 20 additions & 1 deletion guide/blueprints/workflow/settings.md
@@ -10,7 +10,7 @@ Some of the common properties permitted on [steps](common.md) also apply to work
including `condition`, `timeout`, and `on-error`.

The rest of this section describes the remaining properties for more advanced use cases
including mutex locking and resilient workflows with replay points.
including mutex locking and resilient workflows with replay points, and some tips on optimizing.


## Locks and Mutual Exclusion Behavior
@@ -477,3 +477,22 @@ on-error:
- workflow retention parent
```

## Optimizing for Workflows

Workflows can generate a large amount of data, which can affect memory usage, persistence, and the UI.
The REST API and UI do some filtering (e.g. in the body of the `internal` sensors used by workflow),
but when working with large `ssh` `output` and `http` `content` payloads, and with `update-children`,
performance can be dramatically improved by following these tips:

* Optimize external calls to return the minimal amount of information needed
  * Use `jq` to filter when using `ssh` or `container` steps
  * Pass filter arguments to `http` endpoints that accept them
  * Use small page sizes with `retry from` steps

* Optimize the data which is stored
  * Override the `output` on `ssh` and `http` steps to remove unnecessary objects;
    for example `http` returns several `content*` fields, and often just one is needed.
    Simply setting `output: { content: ${content} }` will achieve this.
  * Set `retention: 1` or `retention: 0` on workflows that use a large amount of information
    and can simply be replayed from the start
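
As a rough illustration, the tips above might be combined in a workflow definition along the following lines. This is only a sketch: the URL, the `fields` query parameter, the file path, and the `jq` filter are hypothetical placeholders, and it assumes the `step:` shorthand form is used so that `output` can be set alongside it.

```yaml
# Fragment of a workflow definition (hypothetical; names and URLs are placeholders)

# keep only the most recent run, on the assumption that this workflow
# can simply be replayed from the start
retention: 1

steps:
  # ask the endpoint for only the fields needed
  # (assumes this endpoint accepts a `fields` filter argument)
  - step: http https://example.com/api/items?fields=id,name
    # store a single content field instead of all the `content*` fields
    output:
      content: ${content}

  # filter on the remote side so that only the needed data comes back
  - ssh cat /tmp/report.json | jq -c '[.items[] | {id, name}]'
```

Whether `retention: 0` or `retention: 1` is the better choice probably depends on whether the most recent run is still wanted for troubleshooting.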