I’m excited to reveal one of the most significant usability enhancements yet to the Onyx Platform. Onyx version 0.9.6 contains a revamped module for performing static analysis at job submission time. In this post, we’ll look at how Onyx uses static analysis, what the benefits are, and how we’ve significantly improved on our previous work.

Data Driven APIs: A Double Edged Sword

Onyx’s dominant quality is that it uses a purely data driven information model as its first class API. That is, developers use data structures to describe the vast majority of the distributed programming model that Onyx exposes. This attribute, coupled with judicious use of Clojure code to express behavior, is a tremendously powerful weapon for creating ultra-flexible distributed applications. Onyx’s qualities make it malleable enough to feel like a Lisp for programming distributed systems.
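
As a rough sketch of what "data as the API" looks like in practice, here is a small job described with plain Clojure data structures. The task names and the `:my.app/increment` function are hypothetical, and a real job would typically need more (lifecycles, for example), so treat this as an illustration rather than a complete, submittable job.

```clojure
;; Workflow: a vector of [from to] edges describing how segments flow between tasks.
(def workflow
  [[:in :inc]
   [:inc :out]])

;; Catalog: one map per task, describing what the task is and how it runs.
(def catalog
  [{:onyx/name :in
    :onyx/plugin :onyx.plugin.core-async/input
    :onyx/type :input
    :onyx/medium :core.async
    :onyx/batch-size 20
    :onyx/max-peers 1}

   {:onyx/name :inc
    :onyx/fn :my.app/increment          ; hypothetical user function
    :onyx/type :function
    :onyx/batch-size 20}

   {:onyx/name :out
    :onyx/plugin :onyx.plugin.core-async/output
    :onyx/type :output
    :onyx/medium :core.async
    :onyx/batch-size 20
    :onyx/max-peers 1}])

;; The job itself is just another map tying the pieces together.
(def job
  {:workflow workflow
   :catalog catalog
   :task-scheduler :onyx.task-scheduler/balanced})
```

Nothing here is checked by a compiler: a misspelled key or a wrongly typed value is still perfectly valid Clojure data, which is exactly the trade-off discussed next.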

Flexibility, of course, always comes with a cost. One strong advantage of using a code-level API is that the implementer can make use of the programming language’s type checker to statically look for some classes of errors. For languages with sophisticated type systems, end users instantly benefit from the work that the language designers already put in place.

Interfaces for pure data typically don’t receive this perk. There are a limited number of data structures - maps, sets, lists, vectors, to name a few. A dynamic runtime, particularly Clojure’s, gives you very little in the way of static analysis - by design. Developers often compensate for this by using Schema. Schema can be thought of as a light-weight type checker. It allows users to adorn data with a contractual specification of types and semantics, and generally enforces that the “shape” of a data structure is consistent throughout its usage at run-time.
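
To make that concrete, here is a minimal sketch of describing and checking the shape of a single catalog entry with Schema. It assumes the `prismatic/schema` library is on the classpath, and `CatalogEntry` is a deliberately simplified, hypothetical schema, not the one Onyx ships with.

```clojure
(require '[schema.core :as s])

;; Hypothetical, heavily simplified schema for one catalog entry.
;; The trailing `s/Keyword s/Any` pair permits any additional keyword keys.
(def CatalogEntry
  {:onyx/name s/Keyword
   :onyx/type (s/enum :input :function :output)
   :onyx/batch-size s/Int
   s/Keyword s/Any})

(def good-entry
  {:onyx/name :inc
   :onyx/type :function
   :onyx/batch-size 20})

(def bad-entry
  {:onyx/name :inc
   :onyx/type :function
   :onyx/batch-size "20"})   ; a string where an integer is expected

;; s/check returns nil when the data conforms to the schema...
(s/check CatalogEntry good-entry)

;; ...and a data structure describing the mismatch when it does not.
(s/check CatalogEntry bad-entry)

;; s/validate throws instead; the exception carries the failure as data.
(try
  (s/validate CatalogEntry bad-entry)
  (catch clojure.lang.ExceptionInfo e
    (ex-data e)))
```

The important point is that the failure comes back as a nested data structure mirroring the input, which is precise but not especially friendly to read.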

Shortly after Onyx’s open source release, we, too, adopted Schema. Schema is a great step forward for analyzing Onyx jobs for syntactic and semantic errors, such as a malformed workflow, or an incomplete catalog. Schema’s error messages, however, leave much to be desired. When a constraint is violated, Schema throws an exception packed with a data structure describing the failure. You surely recognize something like this:

I’ve been using Schema for a long time. I can decipher what this means, and I’ve gotten used to it. But that doesn’t mean we should get used to it. Onyx has been designed to be the luxury vehicle for distributed data processing. We knew we could improve the usability of displaying and explaining errors if we were careful.

Onyx 0.9.6 is bundled with a new module for static analysis that overlays Schema’s error messages with ones that are significantly more readable. Instead of a stack trace, you’ll now receive a print-out as seen below (and a reference to the original exception object that Schema threw, too):

Now when you make an error, you’ll receive a print-out of the erroneous data structure, console highlighting with a pointer directly to the problematic area, an explanation of what went wrong, and, if possible, the corresponding documentation for the malformed information.

Here are a few more examples. Notice that under some circumstances, we’ll try to make a guess about what you meant when things are close to right:
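
The "did you mean" style of guess described above can be approximated with a plain edit-distance comparison against the set of known keys. This is only a sketch of the general technique, not Onyx's actual implementation; `edit-distance`, `suggest-key`, and the `max-distance` threshold are hypothetical names chosen for illustration.

```clojure
(defn edit-distance
  "Levenshtein distance between strings a and b, computed one row at a time."
  [a b]
  (let [first-row (vec (range (inc (count b))))]
    (peek
     (reduce
      (fn [prev-row [i ca]]
        (reduce
         (fn [row [j cb]]
           (conj row
                 (min (inc (peek row))               ; insertion
                      (inc (nth prev-row (inc j)))    ; deletion
                      (+ (nth prev-row j)             ; substitution (free on a match)
                         (if (= ca cb) 0 1)))))
         [(inc i)]
         (map-indexed vector b)))
      first-row
      (map-indexed vector a)))))

(defn suggest-key
  "Returns the known key closest to `unknown`, or nil when no known key
   is within `max-distance` edits."
  [known-keys unknown max-distance]
  (when (seq known-keys)
    (let [[best dist] (apply min-key second
                             (map (fn [k] [k (edit-distance (str k) (str unknown))])
                                  known-keys))]
      (when (<= dist max-distance) best))))

;; A misspelled key is one edit away from a real one, so it is suggested.
(suggest-key #{:onyx/name :onyx/type :onyx/batch-size} :onyx/batch-sze 3)
;; => :onyx/batch-size
```

Capping the distance matters: without a threshold, every unknown key would produce a suggestion, however implausible.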

All of Onyx’s job data structures use the new static analyzer module. We’ll soon be transitioning all configuration options over too, including the environment and peer configuration data structures. We also plan to extract all of this work into a stand-alone library that can be used against arbitrary data structures.

Thanks for using Onyx. We hope you enjoy the upgrade as much as we do.