This repository has been archived by the owner on Jul 7, 2020. It is now read-only.

Improve readability of clusters-jobs section #276

Merged 1 commit on Mar 5, 2018

4 changes: 2 additions & 2 deletions docs/source/concepts/clusters-jobs.rst
@@ -28,9 +28,9 @@ Each cluster has a set of Commands. Commands reference shell scripts (usually w

Where ``job-task.sh`` is just a plain old shell script (in this case it's mostly just a collection of JVM flags). The node and job information in brackets is templated in, see the section on :ref:`sharding <shards-pipelines>` for the details.

-Each job is defined by it's command, the configuration you provide, number of :ref:`shards <shards-pipelines>`, how long it should run, etc. Jobs are broken apart into a number of *tasks*. These tasks are then allocated among the minions. A minion may end with multiple tasks from the same job (if that is best for cluster health as a whole, or if that job has more tasks than there are total minions) but usually they are spread around.
+Each job is defined by its command, the configuration you provide, the number of :ref:`shards <shards-pipelines>`, how long the job should run, etc. Jobs are divided into a number of *tasks*. These tasks are then allocated among the minions. A minion may end with multiple tasks from the same job (if that is best for cluster health as a whole, or if that job has more tasks than the total minions) but usually tasks are spread around.

-If your job is interesting you probably want to keep it running so you find out what's new every day. Jobs can be scheduled to run again or *kick* periodically. Spawn will take care making sure your job is submitted again within however many minutes you specify. Keep in mind there is no way to guarantee that your job will not have to wait in line behind other jobs. To keep a (perhaps inadvertently) greedy job from taking all of the clusters resources (keeping *you* from running your totally rad new job) it is good practice to set a maximum time for your job to run before others get a turn. You may also give a job priority that will cause it to wait in line for less time, but being a good cluster citizen is preferable to creating a complex ontology of job priority.
+If your job is interesting, you probably want to keep it running so you find out what's new every day. Jobs can be scheduled to run again or *kick* periodically. Spawn will take care making sure your job is submitted again within however many minutes you specify. Keep in mind there is no way to guarantee that your job will not have to wait in line behind other jobs. To keep a (perhaps inadvertently) greedy job from taking all of the clusters resources (keeping *you* from running your totally rad new job), it is a good practice to set a maximum time for your job to run before others get a turn. You may also give a job priority that will cause it to wait in line for less time, but being a good cluster citizen is preferable to creating a complex ontology of job priority.

To support pipelines you can have jobs trigger the execution of other jobs on successful completion.
