SuperDB - Rich Data Types, Simple SQL Queries, No Tables Required

What is SuperDB?

SuperDB is a fast, analytical, open-source SQL database that eliminates entire classes of difficult problems with its revolutionary “Super-Structured” data model. With this model, your data is both:

Strongly typed like databases
Dynamically typed like JSON

A Superset of JSON and Relational Models

How do I Use It?

The super executable enables you to:

Run SQL queries directly on local JSON/Parquet/CSV files
Convert output into a variety of common formats
Spin up a local data lake to persist data in super-structured format

# Analyze local files
super -c 'SELECT count() FROM data.json'

# Start a lake process, create a pool, load data, and query it
super db serve
super db create my_data
super db load -use my_data data.json
super db query "SELECT count() FROM my_data"

Static Tables Not Required

Your data can be loaded and queried immediately because SuperDB’s underlying model is based on types, not schemas. Input is treated as a sequence of values where each value has a fully-specified type.

To demonstrate what “fully-specified type” means, this example query will return the type of each value in the input. The angle brackets in the results represent a type. SuperDB allows you to work with any of these types in the same way. There’s no need to force everything into a column.

Query

typeof(this)

Input

100
3.14
5m
true
"hello world"
2024-02-01T00:00:00Z
<string>
{project: "superdb", stars: 1400}
[1, "two", 3.0]
error("divided by 0")
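As a rough analogy for "each value has a fully-specified type", the Python sketch below derives a type description from each value on its own, with no table definition declared up front. It is not SuperDB code: `type_of` is a hypothetical helper, and the type names and bracket notation are only loosely modeled on the examples on this page, not guaranteed to match what `super` prints.

```python
from datetime import datetime, timedelta


def type_of(value):
    """Return a type description computed from the value itself (hypothetical helper)."""
    # bool must be checked before int because bool is a subclass of int in Python.
    if isinstance(value, bool):
        return "bool"
    if isinstance(value, int):
        return "int64"
    if isinstance(value, float):
        return "float64"
    if isinstance(value, str):
        return "string"
    if isinstance(value, timedelta):
        return "duration"
    if isinstance(value, datetime):
        return "time"
    if isinstance(value, dict):
        # A record's type is built from the types of its fields.
        fields = ",".join(f"{k}:{type_of(v)}" for k, v in value.items())
        return "{" + fields + "}"
    if isinstance(value, list):
        # Collect the distinct element types in first-seen order.
        seen = []
        for item in value:
            t = type_of(item)
            if t not in seen:
                seen.append(t)
        return "[" + "|".join(seen) + "]"
    raise TypeError(f"unsupported value: {value!r}")


values = [100, 3.14, True, "hello world",
          {"project": "superdb", "stars": 1400}, [1, "two", 3.0]]
types = [type_of(v) for v in values]
for v, t in zip(values, types):
    print(f"{v!r} -> <{t}>")
```

The point of the sketch is that mixed values can share one input stream because the type travels with each value rather than with a column.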

JSON, JSON, JSON

SQL’s static type system is loved by data practitioners for good reason. JSON’s dynamic type system is just as popular, particularly as a data output format. Other databases are trying to fit JSON’s dynamic type system into SQL’s static architecture.

DB	Dynamic Type
DuckDB	JSON
Postgres	jsonb
ClickHouse	JSON
Snowflake	variant
Databricks	variant

These databases now have two type systems, and you must often use an entirely different set of operators on the dynamic columns.

It’s not going so well.

SuperDB turns the problem inside out by making the dynamic type system the architectural foundation.

This has huge implications for how we work with data. You now have control over the strictness of your schema policies. When necessary, schemas can reject non-conforming data, but they can also accept it and flag it for later inspection and repair. No more lost data. No more brittle pipelines.
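The difference between a strict and a lenient schema policy can be sketched in a few lines of Python. This is a conceptual illustration only, not SuperDB's implementation or API: `conforms`, `apply_policy`, the `strict` switch, and the shape of the flagged value are all assumptions made for the example.

```python
def conforms(record, schema):
    """True when record has exactly the schema's fields with the expected Python types."""
    return (
        isinstance(record, dict)
        and record.keys() == schema.keys()
        and all(isinstance(record[k], t) for k, t in schema.items())
    )


def apply_policy(records, schema, strict):
    """Split input into accepted records and flagged values.

    strict=True  -> raise on the first non-conforming record (reject).
    strict=False -> keep the record, wrapped in a flagged value, for later repair.
    """
    accepted, flagged = [], []
    for record in records:
        if conforms(record, schema):
            accepted.append(record)
        elif strict:
            raise ValueError(f"non-conforming record: {record!r}")
        else:
            # Nothing is dropped: the original record travels with the flag.
            flagged.append({"error": "schema mismatch", "value": record})
    return accepted, flagged


schema = {"name": str, "age": int}
rows = [{"name": "ada", "age": 36}, {"name": "bob", "age": "unknown"}]
accepted, flagged = apply_policy(rows, schema, strict=False)
print(accepted)
print(flagged)
```

With `strict=False` the second row is kept alongside a reason instead of being lost, which is the "accept it and flag it for later inspection and repair" behavior described above. With `strict=True` the same input is rejected outright.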

Classic SQL or Modern Pipeline Syntax

The SuperSQL query language enables authoring queries in classic SQL for familiarity and backward compatibility, but also provides an extended pipe syntax and lots of fun shortcuts to perform data operations that are difficult or impossible in classic SQL.

from users
| where created_at > 2024-01-01
| group by department

Start Simple then Scale Up

Query Parquet/CSV/JSON files directly for one-shot analysis. If you need more speed, convert the files into optimized binary/columnar formats. If you need persistence and organization, spin up a local data lake.

All with the same super command.

Open Formats for Humans and Machines

SuperDB uses data formats that conform to the super-structured data model. Data loaded into a lake is stored in these formats, which allows compute to be efficiently separated from storage.

Super .sup
Super Binary .bsup
Super Columnar .csup

Official Desktop Application

SuperDB Desktop is the official graphical user interface for managing and querying SuperDB data lakes. It is also hands down one of the best apps for viewing and inspecting nested JSON data.

Coming soon!

Rich Type System

SuperDB supports 30 primitive types and 7 complex types that can be combined to represent any data you throw at it. These include:

Records (structs)
Arrays
Sets
Maps
Unions
Errors

Errors Are Data

In SuperDB, errors are queryable values, just like your valid data. They can even contain other types of data to create structured and stacked errors.

from ingest_pool | is(<error>)

error("divide by 0")
error({message: "not able to ingest", src_row: {...}})
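To make the idea of structured and stacked errors concrete, here is a small Python analogy. It is a sketch under stated assumptions, not SuperDB code: the `Error` class, `safe_divide`, `ingest`, and the `cause` field are hypothetical names introduced for illustration, while `message` and `src_row` mirror the example above.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Error:
    """An error held as an ordinary value; `value` may be any data, including another Error."""
    value: Any


def safe_divide(a, b):
    # Return an error value instead of raising, so the result stays in the data stream.
    if b == 0:
        return Error("divide by 0")
    return a / b


def ingest(row):
    result = safe_divide(row["total"], row["count"])
    if isinstance(result, Error):
        # Stack the errors: wrap the underlying cause together with the source row.
        return Error({"message": "not able to ingest", "src_row": row, "cause": result})
    return {**row, "mean": result}


rows = [{"total": 10, "count": 2}, {"total": 7, "count": 0}]
results = [ingest(r) for r in rows]

# Errors sit in the same list as valid results and can be filtered like any other value.
errors = [r for r in results if isinstance(r, Error)]
print(errors)
```

Because the failed row and its cause are carried inside the error value, they can be inspected and repaired later instead of being discarded when the failure happens.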

What Users Are Saying

We are currently using SuperDB for zero-downtime database transitions with several hundred customers.

– VP, Municipal IT Services Company

Fantastic for JSON Data.

– Network Security Analyst

The best open-source language I’ve found for log querying so far.

– Programmer, Financial Services

I don’t think there’s anything that lets you interact with JSON data the way this does.

– Network Security Analyst

The more I’ve read - it’s very clear this isn’t just another one-off cli tool.

– Data Engineer

You guys are onto something Big.

– Digital Forensics Analyst

Development Roadmap

SuperDB is still under development, so there is no GA release yet. You’re welcome to try it out in its early form, and we’d love to hear your feedback!

Even in these early days, SuperDB is already being used in production environments by a number of early users. Our current development focus includes utilizing vectors to speed up operators, parallelizing our file readers, and becoming more compatible with the SQL spec. We are committed to growing and improving SuperDB until it one day powers our cloud data lake offerings.

Follow our GitHub repository to keep pace with our progress.

Join the Community

Stay updated and share your experiences by following our social accounts and joining our public Slack workspace.