Formats

This section contains the data model definition for super-structured data along with a set of concrete formats that all implement this same data model, providing a unified approach to row, columnar, and human-readable formats:

  • Super (SUP) is a human-readable format for super-structured data. All JSON documents are SUP values as the SUP format is a strict superset of the JSON syntax.
  • Super Binary (BSUP) is a row-based, binary representation somewhat like Avro but leveraging the super data model to represent a sequence of arbitrarily-typed values.
  • Super Columnar (CSUP) is columnar like Parquet, ORC, or Arrow but for super-structured data.

Because all of the formats conform to the same super-structured data model, conversions between a human-readable form, a row-based binary form, and a columnar form can be carried out with no loss of information. This provides the best of both worlds: the same data can be expressed in a human-friendly, easy-to-program text form and converted to and from efficient row and columnar formats.
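
The idea of lossless conversion between layouts that share one data model can be sketched with a toy example. This is only an illustration and does not read or write SUP, BSUP, or CSUP: it assumes a simplified model (a sequence of records that all have the same fields), uses line-delimited JSON as a stand-in for the text form, and all function names are hypothetical.

```python
import json

# Toy stand-in for a shared data model: a sequence of records.
# Assumption: every record has the same fields (real super-structured
# data is not limited in this way).
rows = [
    {"id": 1, "name": "alpha", "tags": ["a", "b"]},
    {"id": 2, "name": "beta", "tags": []},
]


def rows_to_text(rows):
    """Human-readable form: one JSON document per line."""
    return "\n".join(json.dumps(r) for r in rows)


def text_to_rows(text):
    """Parse the text form back into row-oriented records."""
    return [json.loads(line) for line in text.splitlines()]


def rows_to_columns(rows):
    """Columnar form: one list of values per field."""
    fields = list(rows[0]) if rows else []
    return {f: [r[f] for r in rows] for f in fields}


def columns_to_rows(columns):
    """Rebuild row-oriented records from the columnar form."""
    fields = list(columns)
    n = len(columns[fields[0]]) if fields else 0
    return [{f: columns[f][i] for f in fields} for i in range(n)]


# text -> rows -> columns -> rows, ending with the values we started with
text = rows_to_text(rows)
columns = rows_to_columns(text_to_rows(text))
round_tripped = columns_to_rows(columns)
assert round_tripped == rows
```

Because each layout here is just a different arrangement of the same values, the round trip gives back the original records. The formats described in this section apply the same principle to a much richer data model.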