Merge branch 'flow'
Michael Drogalis committed Feb 22, 2015
2 parents 43ab6c0 + 3802595 commit 2fcdf27
Showing 6 changed files with 85 additions and 22 deletions.
1 change: 0 additions & 1 deletion doc/user-guide/apis.md
@@ -14,7 +14,6 @@ Onyx ships with three distinct APIs to accommodate different needs. A descriptio
- [`shutdown-peer`](#shutdown-peer)
- [`shutdown-env`](#shutdown-env)
- [Task Lifecycle API](#task-lifecycle-api)
-- [`start-lifecycle?`](#start-lifecycle)
- [`start-lifecycle?`](#start-lifecycle)
- [`inject-lifecycle-resources`](#inject-lifecycle-resources)
- [`inject-temporal-resources`](#inject-temporal-resources)
14 changes: 14 additions & 0 deletions doc/user-guide/concepts.md
@@ -10,6 +10,7 @@ We'll take a quick overview of some terms you'll see in the rest of this user gu
- [Task](#task)
- [Workflow](#workflow)
- [Catalog](#catalog)
- [Flow Conditions](#flow-conditions)
- [Segment](#segment)
- [Function](#function)
- [Plugin](#plugin)
@@ -142,6 +143,19 @@ Example:
:onyx/doc "A HornetQ output stream"}]
```

#### Flow Conditions

Whereas a workflow describes every route data may take, flow conditions decide, segment by segment, which of those routes a particular segment follows. The decision is made by predicate functions, which makes it possible to process a segment conditionally based on its content.

Example:

```clojure
[{:flow/from :input-stream
  :flow/to [:process-adults]
  :flow/predicate :my.ns/adult?
  :flow/doc "Emits segment if this segment is an adult."}]
```
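
The `:flow/predicate` keyword names an ordinary Clojure function. A minimal sketch of what `:my.ns/adult?` might look like follows. It assumes the predicate receives a context map followed by the segment, and that segments carry an `:age` key; both the argument list and the key are assumptions made for illustration, and the Predicate Function Signatures section of the flow conditions guide is the place to confirm the actual contract.

```clojure
(ns my.ns)

;; Hypothetical predicate. The [context segment] argument list and the
;; :age key are assumptions, not taken from the Onyx documentation.
(defn adult?
  [_context segment]
  ;; Segments without an :age are treated as age 0, i.e. not adults.
  (>= (:age segment 0) 18))

(adult? {} {:name "Mike" :age 23}) ;; => true
```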

#### Segment

A segment is the unit of data in Onyx, and it's represented by a Clojure map. Segments represent the data flowing through the cluster.
44 changes: 39 additions & 5 deletions doc/user-guide/flow-conditions.md
@@ -2,6 +2,22 @@

This section covers flow conditions. Flow conditions are used for isolating logic about whether or not segments should pass through different tasks in a workflow.

<!-- START doctoc generated TOC please keep comment here to allow auto update -->
<!-- DON'T EDIT THIS SECTION, INSTEAD RE-RUN doctoc TO UPDATE -->
**Table of Contents** *generated with [DocToc](http://doctoc.herokuapp.com/)*

- [Flow Conditions](#flow-conditions)
- [Summary](#summary)
- [Motivating Example](#motivating-example)
- [Predicate Function Signatures](#predicate-function-signatures)
- [Predicate Parameters](#predicate-parameters)
- [Key Exclusion](#key-exclusion)
- [Predicate Composition](#predicate-composition)
- [Match All/None](#match-allnone)
- [Short Circuiting](#short-circuiting)

<!-- END doctoc generated TOC please keep comment here to allow auto update -->

### Summary

Workflows specify the structure of your computation as a directed, acyclic graph. A workflow describes all *possible* routes that a segment can take as it enters your workflow. On the other hand, we often need to articulate how an *individual* segment moves through your workflow. Many times, a segment conditionally moves from one task to another. Onyx separates this concern into its own concept, independent of the rest of your computation: Flow Conditions.
@@ -11,10 +27,10 @@ Workflows specify the structure of your computation as a directed, acyclic graph
The easiest way to learn how to use flow conditions is to see an example. Suppose we have the following workflow snippet:

```clojure
-[[:input-stream :children]
-[:input-stream :adults]
-[:input-stream :western-females]
-[:input-stream :everyone]
+[[:input-stream :process-children]
+[:input-stream :process-adults]
+[:input-stream :process-western-females]
+[:input-stream :process-everyone]
...]
```

@@ -79,6 +95,23 @@ Predicate functions can take parameters at runtime. In this first flow condition

Sometimes, the decision of whether to allow a segment to pass through to the next task depends on side effects that resulted from the original segment transformation. Onyx allows you to handle this case by adding extra keys to the segment that comes out of the transformation function. These extra keys are visible in your predicate function, and are then stripped off before the segment is sent to the next task. You can indicate these "extra keys" by setting `:flow/exclude-keys` to a vector of keys.

For example, if we had the following transformation function:

```clojure
(defn my-function [x]
  (assoc x :result 42 :side-effects-result :blah))
```

Our predicate for flow conditions might need to use the `:side-effects-result` to make a decision. We don't want to actually send that information out to the next task, though - so we set `:flow/exclude-keys` to `[:side-effects-result]` to make it disappear after the predicate result has been realized.

```clojure
{:flow/from :input-stream
 :flow/to [:process-adults]
 :flow/predicate :my.ns/adult?
 :flow/exclude-keys [:side-effects-result]
 :flow/doc "Emits segment if this segment is an adult."}
```
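
The effect of key exclusion on a segment can be pictured with plain Clojure. The helper below is a hypothetical illustration, not an Onyx function: it only shows what the next task would see once the excluded keys have been removed.

```clojure
;; Hypothetical helper, not part of Onyx: drop the excluded keys from a
;; segment once the predicate has been evaluated.
(defn strip-excluded-keys
  [segment exclude-keys]
  (apply dissoc segment exclude-keys))

;; A segment as produced by a transformation like my-function above.
(def transformed {:result 42 :side-effects-result :blah})

(strip-excluded-keys transformed [:side-effects-result])
;; => {:result 42}
```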

### Predicate Composition

One extraordinarily powerful feature of Flow Conditions is composition. Predicates can be composed with logical `and`, `or`, and `not`. We use composition in `[:and :my.ns/female? :my.ns/western?]` to check that the segment is both female and living in the West. Logical function calls must be surrounded with brackets, and may be nested arbitrarily. Functions inside of logical operator calls may be parameterized, as in `[:and :my.ns/female? [:my.ns/western? :my/state-param]]`. Parameters *may not* specify logical functions.
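
To make the nesting rules concrete, here is a rough sketch of how a composed predicate expression could be evaluated. It is a hypothetical evaluator, not Onyx's implementation: `preds` is an assumed lookup table from predicate keywords to one-argument functions, the `:gender` and `:state` keys are invented for the example, and parameterized predicates are left out.

```clojure
;; Hypothetical evaluator, not Onyx's implementation.
;; `preds` maps predicate keywords to one-argument functions (an assumption).
(defn eval-predicate
  [preds expr segment]
  (if (keyword? expr)
    ;; A bare keyword names a predicate: look it up and call it.
    ((get preds expr) segment)
    ;; A vector is a logical call: the operator followed by sub-expressions.
    (let [[op & args] expr]
      (case op
        :and (every? #(eval-predicate preds % segment) args)
        :or  (boolean (some #(eval-predicate preds % segment) args))
        :not (not (eval-predicate preds (first args) segment))))))

(def preds
  {:my.ns/female?  #(= :female (:gender %))
   :my.ns/western? #(contains? #{"CA" "OR" "WA"} (:state %))})

(eval-predicate preds
                [:and :my.ns/female? :my.ns/western?]
                {:gender :female :state "OR"})
;; => true
```

Because each sub-expression is evaluated recursively, calls such as `[:or [:not :my.ns/female?] :my.ns/western?]` nest to any depth.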
@@ -109,4 +142,5 @@ If a flow condition specifies `:all` as its `:flow/to`, it must come before any

### Short Circuiting

-If multiple flow condition entries evaluate to a true predicate, their `:flow/to` values are combined, as well as their `:flow/exclude-keys`. Sometimes you don't want this behavior, and you want to specify exactly the downstream tasks to emit to - and not check any more flow condition entries. You can do this with `:flow/short-circuit?` set to `true`. Any entry that has `:flow/short-circuit?` set to `true` must come before any entries for a task that have it set to `false` or `nil`.
+If multiple flow condition entries evaluate to a true predicate, their `:flow/to` values are unioned (so duplicates are collapsed), as well as their `:flow/exclude-keys`. Sometimes you don't want this behavior, and you want to specify exactly the downstream tasks to emit to - and not check any more flow condition entries. You can do this with `:flow/short-circuit?` set to `true`. Any entry that has `:flow/short-circuit?` set to `true` must come before any entries for a task that have it set to `false` or `nil`.
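
The union and short-circuit rules can be sketched in a few lines of Clojure. This is a hypothetical routing function, not Onyx's implementation: predicates are plain one-argument functions rather than keywords, `:flow/to` is assumed to be a vector (no `:all` or `:none`), and entries are assumed to follow the ordering rule above, with short-circuiting entries first.

```clojure
;; Hypothetical routing sketch, not Onyx's implementation.
;; Assumes :flow/predicate holds a one-argument function and :flow/to a vector.
(defn route
  [entries segment]
  (reduce
    (fn [acc entry]
      (if ((:flow/predicate entry) segment)
        ;; Matching entries contribute their targets and excluded keys.
        (let [acc (-> acc
                      (update :to into (:flow/to entry))
                      (update :exclude-keys into (:flow/exclude-keys entry)))]
          ;; A matching short-circuit entry stops evaluation immediately.
          (if (:flow/short-circuit? entry)
            (reduced acc)
            acc))
        acc))
    {:to #{} :exclude-keys #{}}
    entries))

(def entries
  [{:flow/to [:process-children]
    :flow/predicate #(< (:age %) 18)
    :flow/short-circuit? true}
   {:flow/to [:process-adults]
    :flow/predicate #(>= (:age %) 18)}
   {:flow/to [:process-everyone :process-adults]
    :flow/predicate (constantly true)}])

;; An adult matches the second and third entries; the repeated
;; :process-adults target appears once in the resulting set.
(:to (route entries {:age 30}))

;; A child matches the short-circuit entry, so later entries are skipped.
(:to (route entries {:age 9}))
```

Using sets for the accumulated targets and keys is what gives the union behavior: a task named by several matching entries is emitted to only once.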

17 changes: 16 additions & 1 deletion doc/user-guide/information-model.md
@@ -1,6 +1,6 @@
## Information Model

-This chapter specifies what a valid catalog and workflow look like, as well as how the underlying ZooKeeper representation is realized.
+This section specifies what a valid catalog, workflow, and flow conditions look like.

<!-- START doctoc generated TOC please keep comment here to allow auto update -->
<!-- DON'T EDIT THIS SECTION, INSTEAD RE-RUN doctoc TO UPDATE -->
@@ -13,6 +13,7 @@ This chapter specifies what a valid catalog and workflow look like, as well as h
- [Maps with `:onyx/type` set to `:input` or `:output` must have these keys](#maps-with-onyxtype-set-to-input-or-output-must-have-these-keys)
- [Maps with `:onyx/type` set to `:function` must have these keys](#maps-with-onyxtype-set-to-function-must-have-these-keys)
- [Maps with `:onyx/type` set to `:function` may optionally have these keys](#maps-with-onyxtype-set-to-function-may-optionally-have-these-keys)
- [Flow Conditions](#flow-conditions)

<!-- END doctoc generated TOC please keep comment here to allow auto update -->

@@ -65,3 +66,17 @@ This chapter specifies what a valid catalog and workflow look like, as well as h
|`:onyx/group-by-key`| `keyword` | `any`
|`:onyx/group-by-fn` | `keyword` | `any`


### Flow Conditions

- a single Clojure vector which is EDN serializable/deserializable
- all elements in the vector must be Clojure maps

| key name |type | optional?| default
|----------------------|------------------------------|----------|--------
|`:flow/from` |`keyword` | no |
|`:flow/to` |`:all`, `:none` or `[keyword]`| no |
|`:flow/predicate` |`keyword` or `[keyword]` | no |
|`:flow/exclude-keys` |`[keyword]` | yes | `[]`
|`:flow/short-circuit?`|`boolean` | yes |`false`
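
The required keys in the table can be checked mechanically. The functions below are a hypothetical validation sketch, not Onyx's own validation code; they cover only the three required keys and accept any vector for `:flow/predicate`, since composed predicates may nest.

```clojure
;; Hypothetical validator, not Onyx's own. Checks only the required keys.
(defn valid-flow-condition?
  [entry]
  (let [{from :flow/from to :flow/to pred :flow/predicate} entry]
    (boolean
      (and (map? entry)
           (keyword? from)
           ;; :flow/to is :all, :none, or a vector of task keywords.
           (or (contains? #{:all :none} to)
               (and (vector? to) (every? keyword? to)))
           ;; :flow/predicate is a keyword or a (possibly nested) vector.
           (or (keyword? pred) (vector? pred))))))

(defn valid-flow-conditions?
  [flow-conditions]
  ;; The whole structure must be a vector whose elements are all valid maps.
  (and (vector? flow-conditions)
       (every? valid-flow-condition? flow-conditions)))

(valid-flow-conditions?
  [{:flow/from :input-stream
    :flow/to [:process-adults]
    :flow/predicate :my.ns/adult?}])
;; => true
```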

6 changes: 4 additions & 2 deletions doc/user-guide/peer-config.md
@@ -9,12 +9,14 @@ The chapter describes the options available to configure the Virtual Peers.
- [Coordinator & Peer](#coordinator-&-peer)
- [Base Configuration](#base-configuration)
- [`:hornetq/mode`](#hornetqmode)
- [VM](#vm)
- [Standalone](#standalone)
- [UDP Configuration](#udp-configuration)
- [JGroups Configuration](#jgroups-configuration)
- [Environment Only](#environment-only)
-- [`:hornetq/server`](#hornetqserver)
+- [`:hornetq/server?`](#hornetqserver)
- [`:hornetq.server/type`](#hornetqservertype)
-- [`:zookeeper/server`](#zookeeperserver)
+- [`:zookeeper/server?`](#zookeeperserver)
- [Embedded Configuration](#embedded-configuration)
- [`:hornetq.embedded/config hq-servers`](#hornetqembeddedconfig-hq-servers)
- [Peer Only](#peer-only)
25 changes: 12 additions & 13 deletions doc/user-guide/scheduling.md
@@ -6,19 +6,18 @@ Onyx offers fine-grained control of how many peers are allocated to particular j
<!-- DON'T EDIT THIS SECTION, INSTEAD RE-RUN doctoc TO UPDATE -->
**Table of Contents** *generated with [DocToc](http://doctoc.herokuapp.com/)*

-- [Scheduling](#scheduling)
-- [Allocating Peers to Jobs and Tasks](#allocating-peers-to-jobs-and-tasks)
-- [Job Schedulers](#job-schedulers)
-- [Greedy Job Scheduler](#greedy-job-scheduler)
-- [Round Robin Job Scheduler](#round-robin-job-scheduler)
-- [Round Robin Rebalancing Strategy](#round-robin-rebalancing-strategy)
-- [Percentage Job Scheduler](#percentage-job-scheduler)
-- [Percentage Rebalancing Strategy](#percentage-rebalancing-strategy)
-- [Task Schedulers](#task-schedulers)
-- [Greedy Task Scheduler](#greedy-task-scheduler)
-- [Round Robin Task Scheduler](#round-robin-task-scheduler)
-- [Percentage Task Scheduler](#percentage-task-scheduler)
-- [Examples](#examples)
+- [Allocating Peers to Jobs and Tasks](#allocating-peers-to-jobs-and-tasks)
+- [Job Schedulers](#job-schedulers)
+- [Greedy Job Scheduler](#greedy-job-scheduler)
+- [Round Robin Job Scheduler](#round-robin-job-scheduler)
+- [Round Robin Rebalancing Strategy](#round-robin-rebalancing-strategy)
+- [Percentage Job Scheduler](#percentage-job-scheduler)
+- [Percentage Rebalancing Strategy](#percentage-rebalancing-strategy)
+- [Task Schedulers](#task-schedulers)
+- [Greedy Task Scheduler](#greedy-task-scheduler)
+- [Round Robin Task Scheduler](#round-robin-task-scheduler)
+- [Percentage Task Scheduler](#percentage-task-scheduler)
+- [Examples](#examples)

<!-- END doctoc generated TOC please keep comment here to allow auto update -->
