Merge branch 'flow'

ilovejs · Feb 22, 2015 · 2fcdf27 · 2fcdf27
2 parents 43ab6c0 + 3802595
commit 2fcdf27
Show file tree

Hide file tree

Showing 6 changed files with 85 additions and 22 deletions.
diff --git a/doc/user-guide/apis.md b/doc/user-guide/apis.md
@@ -14,7 +14,6 @@ Onyx ships with three distinct APIs to accommodate different needs. A descriptio
     - [`shutdown-peer`](#shutdown-peer)
     - [`shutdown-env`](#shutdown-env)
 - [Task Lifecycle API](#task-lifecycle-api)
-    - [`start-lifecycle?`](#start-lifecycle)
     - [`start-lifecycle?`](#start-lifecycle)
     - [`inject-lifecycle-resources`](#inject-lifecycle-resources)
     - [`inject-temporal-resources`](#inject-temporal-resources)

diff --git a/doc/user-guide/concepts.md b/doc/user-guide/concepts.md
@@ -10,6 +10,7 @@ We'll take a quick overview of some terms you'll see in the rest of this user gu
   - [Task](#task)
   - [Workflow](#workflow)
   - [Catalog](#catalog)
+  - [Flow Conditions](#flow-conditions)
   - [Segment](#segment)
   - [Function](#function)
   - [Plugin](#plugin)
@@ -142,6 +143,19 @@ Example:
  :onyx/doc "A HornetQ output stream"}]
 ```
 
+#### Flow Conditions
+
+In contrast to a workflow, flow conditions specify on a segment-by-segment basis which direction data should flow determined by predicate functions. This is helpful for conditionally processing a segment based off of its content.
+
+Example:
+
+```clojure
+[{:flow/from :input-stream
+  :flow/to [:process-adults]
+  :flow/predicate :my.ns/adult?
+  :flow/doc "Emits segment if this segment is an adult."}
+```
+
 #### Segment
 
 A segment is the unit of data in Onyx, and it's represented by a Clojure map. Segments represent the data flowing through the cluster.

diff --git a/doc/user-guide/flow-conditions.md b/doc/user-guide/flow-conditions.md
@@ -2,6 +2,22 @@
 
 This section covers flow conditions. Flow conditions are used for isolating logic about whether or not segments should pass through different tasks in a workflow.
 
+<!-- START doctoc generated TOC please keep comment here to allow auto update -->
+<!-- DON'T EDIT THIS SECTION, INSTEAD RE-RUN doctoc TO UPDATE -->
+**Table of Contents**  *generated with [DocToc](http://doctoc.herokuapp.com/)*
+
+- [Flow Conditions](#flow-conditions)
+  - [Summary](#summary)
+  - [Motivating Example](#motivating-example)
+  - [Predicate Function Signatures](#predicate-function-signatures)
+  - [Predicate Parameters](#predicate-parameters)
+  - [Key Exclusion](#key-exclusion)
+  - [Predicate Composition](#predicate-composition)
+  - [Match All/None](#match-allnone)
+  - [Short Circuiting](#short-circuiting)
+
+<!-- END doctoc generated TOC please keep comment here to allow auto update -->
+
 ### Summary
 
 Workflows specify the structure of your computation as a directed, acyclic graph. A workflow describes all *possible* routes that a segment can take as it enters your workflow. On the other hand, we often have the need to articulate how an *individual* segment moves throughout your workflow. Many times, a segment conditionally moves from one task to another. This is a concept that Onyx takes apart and turns into its own idea, independent of the rest of your computation. They're called Flow Conditions.
@@ -11,10 +27,10 @@ Workflows specify the structure of your computation as a directed, acyclic graph
 The easiest way to learn how to use flow conditions is to see an example. Suppose we have the following workflow snippet:
 
 ```clojure
-[[:input-stream :children]
- [:input-stream :adults]
- [:input-stream :western-females]
- [:input-stream :everyone]
+[[:input-stream :process-children]
+ [:input-stream :process-adults]
+ [:input-stream :process-western-females]
+ [:input-stream :process-everyone]
  ...]
 ```
 
@@ -79,6 +95,23 @@ Predicate functions can take parameters at runtime. In this first flow condition
 
 Sometimes, the decision of whether to allow a segment to pass through to the next task depends on some side effects that were a result of the original segment transformation. Onyx allows you to handle this case by adding extra keys to your segment that comes out of the transformation function. These extra keys are visible in your predicate function, and then stripped off before being sent to the next task. You can indicate these "extra keys" by the setting `:onyx/exclude-keys` to a vector of keys.
 
+For example, if we had the following transformation function:
+
+```clojure
+(defn my-function [x]
+  (assoc x :result 42 :side-effects-result :blah))
+```
+
+Our predicate for flow conditions might need to use the `:side-effects-result` to make a decision. We don't want to actually send that information over out to the next task, though - so we `:flow/exclude-keys` on `:side-effects-results` to make it disappear after the predicate result has been realized.
+
+```clojure
+{:flow/from :input-stream
+ :flow/to [:process-adults]
+ :flow/predicate :my.ns/adult?
+ :flow/exclude-keys [:side-effects-result]
+ :flow/doc "Emits segment if this segment is an adult."}
+```
+
 ### Predicate Composition
 
 One extraordinarily powerful feature of Flow Conditions is its composition characters. Predicates can be composed with logical `and`, `or`, and `not`. We use composition to check if the segment is both female and western living in `[:and :my.ns/female? :my.ns/western?]`. Logical function calls must be surrounded with brackets, and may be nested arbitrarily. Functions inside of logical operator calls may be parameterized, as in `[:and :my.ns/female? [:my.ns/western? :my/state-param]]` Parameters *may not* specify logical functions.
@@ -109,4 +142,5 @@ If a flow condition specifies `:all` as its `:flow/to`, it must come before any
 
 ### Short Circuiting
 
-If multiple flow condition entries evaluate to a true predicate, their `:flow/to` values are combined, as well as their `:flow/exclude-keys`. Sometimes you don't want this behavior, and you want to specify exactly the downstream tasks to emit to - and not check any more flow condition entries. You can do this with `:flow/short-circuit?` set to `true`. Any entry that has `:flow/short-circuit?` set to `true` must come before any entries for an task that have it set to `false` or `nil`.
+If multiple flow condition entries evaluate to a true predicate, their `:flow/to` values are unioned (duplicates aren't acknowledged), as well as their `:flow/exclude-keys`. Sometimes you don't want this behavior, and you want to specify exactly the downstream tasks to emit to - and not check any more flow condition entries. You can do this with `:flow/short-circuit?` set to `true`. Any entry that has `:flow/short-circuit?` set to `true` must come before any entries for an task that have it set to `false` or `nil`.
+
diff --git a/doc/user-guide/information-model.md b/doc/user-guide/information-model.md
@@ -1,6 +1,6 @@
 ## Information Model
 
-This chapter specifies what a valid catalog and workflow look like, as well as how the underlying ZooKeeper representation is realized.
+This section specifies what a valid catalog, workflow, and flow conditions look like.
 
 <!-- START doctoc generated TOC please keep comment here to allow auto update -->
 <!-- DON'T EDIT THIS SECTION, INSTEAD RE-RUN doctoc TO UPDATE -->
@@ -13,6 +13,7 @@ This chapter specifies what a valid catalog and workflow look like, as well as h
   - [Maps with `:onyx/type` set to `:input` or `:output` must have these keys](#maps-with-onyxtype-set-to-input-or-output-must-have-these-keys)
   - [Maps with `:onyx/type` set to `:function` must have these keys](#maps-with-onyxtype-set-to-function-must-have-these-keys)
   - [Maps with `:onyx/type` set to `:function` may optionally have these keys](#maps-with-onyxtype-set-to-function-may-optionally-have-these-keys)
+- [Flow Conditions](#flow-conditions)
 
 <!-- END doctoc generated TOC please keep comment here to allow auto update -->
 
@@ -65,3 +66,17 @@ This chapter specifies what a valid catalog and workflow look like, as well as h
 |`:onyx/group-by-key`| `keyword`  | `any`
 |`:onyx/group-by-fn` | `keyword`  | `any`
 
+
+### Flow Conditions
+
+- a single Clojure vector which is EDN serializable/deserializable
+- all elements in the vector must be Clojure maps
+
+| key name             |type                          | optional?| default
+|----------------------|------------------------------|----------|--------
+|`:flow/from`          |`keyword`                     | no       |
+|`:flow/to`            |`:all`, `:none` or `[keyword]`| no       |
+|`:flow/predicate`     |`keyword` or `[keyword]`      | no       |
+|`:flow/exclude-keys`  |`[keyword]`                   | yes      | `[]`
+|`:flow/short-circuit?`|`boolean`                     | yes      |`false`
+
diff --git a/doc/user-guide/peer-config.md b/doc/user-guide/peer-config.md
@@ -9,12 +9,14 @@ The chapter describes the options available to configure the Virtual Peers.
 - [Coordinator & Peer](#coordinator-&-peer)
   - [Base Configuration](#base-configuration)
   - [`:hornetq/mode`](#hornetqmode)
+  - [VM](#vm)
+  - [Standalone](#standalone)
   - [UDP Configuration](#udp-configuration)
   - [JGroups Configuration](#jgroups-configuration)
 - [Environment Only](#environment-only)
-    - [`:hornetq/server`](#hornetqserver)
+    - [`:hornetq/server?`](#hornetqserver)
     - [`:hornetq.server/type`](#hornetqservertype)
-    - [`:zookeeper/server`](#zookeeperserver)
+    - [`:zookeeper/server?`](#zookeeperserver)
   - [Embedded Configuration](#embedded-configuration)
     - [`:hornetq.embedded/config hq-servers`](#hornetqembeddedconfig-hq-servers)
 - [Peer Only](#peer-only)

diff --git a/doc/user-guide/scheduling.md b/doc/user-guide/scheduling.md
@@ -6,19 +6,18 @@ Onyx offers fine-grained control of how many peers are allocated to particular j
 <!-- DON'T EDIT THIS SECTION, INSTEAD RE-RUN doctoc TO UPDATE -->
 **Table of Contents**  *generated with [DocToc](http://doctoc.herokuapp.com/)*
 
-- [Scheduling](#scheduling)
-  - [Allocating Peers to Jobs and Tasks](#allocating-peers-to-jobs-and-tasks)
-    - [Job Schedulers](#job-schedulers)
-      - [Greedy Job Scheduler](#greedy-job-scheduler)
-      - [Round Robin Job Scheduler](#round-robin-job-scheduler)
-      - [Round Robin Rebalancing Strategy](#round-robin-rebalancing-strategy)
-      - [Percentage Job Scheduler](#percentage-job-scheduler)
-      - [Percentage Rebalancing Strategy](#percentage-rebalancing-strategy)
-    - [Task Schedulers](#task-schedulers)
-      - [Greedy Task Scheduler](#greedy-task-scheduler)
-      - [Round Robin Task Scheduler](#round-robin-task-scheduler)
-      - [Percentage Task Scheduler](#percentage-task-scheduler)
-    - [Examples](#examples)
+- [Allocating Peers to Jobs and Tasks](#allocating-peers-to-jobs-and-tasks)
+  - [Job Schedulers](#job-schedulers)
+    - [Greedy Job Scheduler](#greedy-job-scheduler)
+    - [Round Robin Job Scheduler](#round-robin-job-scheduler)
+    - [Round Robin Rebalancing Strategy](#round-robin-rebalancing-strategy)
+    - [Percentage Job Scheduler](#percentage-job-scheduler)
+    - [Percentage Rebalancing Strategy](#percentage-rebalancing-strategy)
+  - [Task Schedulers](#task-schedulers)
+    - [Greedy Task Scheduler](#greedy-task-scheduler)
+    - [Round Robin Task Scheduler](#round-robin-task-scheduler)
+    - [Percentage Task Scheduler](#percentage-task-scheduler)
+  - [Examples](#examples)
 
 <!-- END doctoc generated TOC please keep comment here to allow auto update -->