Onyx

distributed computation for the cloud

Data Driven API

Onyx programs are described using immutable data structures, putting a powerful force in the hands of the developer to cross language and machine boundaries at runtime. Onyx knows how to speak to many storage solutions, including those below.

{
    "onyx/name": "in",
    "onyx/medium": "kafka",
    "onyx/plugin": "onyx.plugin.kafka/input",
    "onyx/type": "input",
    "onyx/language": "clojure",
    "onyx/batch-size": 20,
    "onyx/doc": "Reads segments from a Kafka topic",
    "kafka/topic": "tweets",
    "kafka/zookeeper": "127.0.0.1:2181",
    "kafka/group-id": "onyx-consumer"
}

Powerful Streaming Primitives

Advancing windowing APIs split up a potentially unbounded data set into finite, possibly overlapping portions. Aggregations over distinct portions of a stream are as simple as a data structure.

fixed-windows sliding-windows session-windows global-windows
aggregations timer-triggers segment-triggers refinement-modes
{
    "window/id": "sum-transactions",
    "window/task": "parse-price",
    "window/type": "sliding",
    "window/aggregation":
      ["onyx.windowing.aggregation/sum", "price"],
    "window/window-key": "event-time",
    "window/range": [5, "minutes"],
    "window/slide": [1, "minute"],
    "window/doc": "Sum the price field into sliding windows"
}

Native Abstractions

Onyx keeps its own abstractions to an absolute minimum. Instead, it leverages the abstractions that already exist in languages for data structures and functional transformations. When in Clojure, write plain Clojure functions. When in Java, write plain Java classes. No cruft, no inheritance insanity, no ill-defined interfaces. Data in, data out.

(defn split-into-words [min-chars {:keys [tweet]}]
  (->> (clojure.string/split tweet #"\s")
       (filter (fn [word] (> (count word) min-chars)))
       (map (fn [word] {:word word}))))