Classic SQL or Modern Pipeline Syntax
Queries can be authored in SuperSQL for familiarity and backward compatibility, or with SuperPipe, a modern pipeline syntax.
from users
|> where created_at > 2024-01-01
|> group by department
Rich Types. Simple Queries.
No Tables Required
The analytical SQL database that puts JSON and relational tables on equal footing.
brew install brimdata/tap/super
SuperDB is a fast, analytical, open-source SQL database that eliminates entire classes of difficult problems with its revolutionary “Super-Structured” data model. With this model, your data is both:
Strongly typed like databases
Dynamically typed like JSON
A Superset of JSON and Relational Models
The super executable enables you to:
# Analyze local files
super -c 'SELECT count() FROM data.json'
# Start a lake process, create a pool, load data, and query it
super db serve
super db create my_data
super db load -use my_data data.json
super db query "SELECT count() FROM my_data"
Your data can be loaded and queried immediately because SuperDB’s underlying model is based on types, not schemas. Input is treated as a sequence of values where each value has a fully-specified type.
To demonstrate what “fully-specified type” means, this example query returns the type of each value in the input. The angle brackets in the results represent a type. SuperDB lets you work with any of these types in the same way, so there’s no need to force everything into a column.
Query:
typeof(this)
Input:
100
3.14
5m
true
"hello world"
2024-02-01T00:00:00Z
<string>
{project: "superdb", stars: 1400}
[1, "two", 3.0]
error("divided by 0")
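To make the idea of "every value carries its own complete type" more concrete, here is a rough Python analogy. It is not how SuperDB is implemented, and the type names and bracket notation it produces are assumptions chosen to resemble the example above, not SuperDB's actual output. The helper name type_of is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def type_of(value):
    """Return a type description for a Python value (illustrative names only)."""
    # bool must be checked before int because bool is a subclass of int.
    if isinstance(value, bool):
        return "bool"
    if isinstance(value, int):
        return "int64"
    if isinstance(value, float):
        return "float64"
    if isinstance(value, str):
        return "string"
    if isinstance(value, datetime):
        return "time"
    if isinstance(value, timedelta):
        return "duration"
    if isinstance(value, dict):
        # A record's type is built from the types of its fields.
        fields = ",".join(f"{k}:{type_of(v)}" for k, v in value.items())
        return "{" + fields + "}"
    if isinstance(value, list):
        # Collect the distinct element types in first-seen order.
        elem_types = list(dict.fromkeys(type_of(v) for v in value))
        return "[" + "|".join(elem_types) + "]"
    raise TypeError(f"unsupported value: {value!r}")


# A heterogeneous sequence of values, with no table or schema declared up front.
values = [
    100,
    3.14,
    timedelta(minutes=5),
    True,
    "hello world",
    datetime(2024, 2, 1, tzinfo=timezone.utc),
    {"project": "superdb", "stars": 1400},
    [1, "two", 3.0],
]
types = [type_of(v) for v in values]
```

The point of the sketch is that the type is derived from each value on its own, including nested records and mixed arrays, rather than being looked up in a schema declared ahead of time.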
SQL’s static type system is loved by data practitioners for good reason. JSON’s dynamic type system is equally popular as a data output format. Other databases are trying to put the dynamic type system of JSON into the static SQL architecture.
DB | Dynamic Type |
---|---|
DuckDB | JSON |
Postgres | jsonb |
ClickHouse | dynamic |
Snowflake | variant |
Databricks | variant |
These databases now have two type systems, and you must use an entirely different set of operators on the dynamic columns.
It’s not going so well.
SuperDB turns the problem inside out by making the dynamic type system the architectural foundation.
This has huge implications for how we work with data. You now have control over the strictness of your schema policies. When necessary, schemas can reject non-conforming data, but they can also accept it and flag it for later inspection and repair. No more lost data. No more brittle pipelines.
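The difference between rejecting non-conforming data and accepting it with a flag can be sketched in a few lines of Python. This is only an illustration of the policy choice described above, not SuperDB's API: the Flagged class, the EXPECTED shape, and the ingest function are all hypothetical names invented for this example.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class Flagged:
    """A non-conforming row kept alongside the reason it failed."""
    message: str
    src_row: Any


# Hypothetical expected shape: field name -> required Python type.
EXPECTED = {"name": str, "age": int}


def conforms(row):
    """True if the row has exactly the expected fields with the expected types."""
    return (
        isinstance(row, dict)
        and row.keys() == EXPECTED.keys()
        and all(isinstance(row[k], t) for k, t in EXPECTED.items())
    )


def ingest(rows, strict=False):
    """Strict policy rejects bad rows; lenient policy keeps them, flagged."""
    out = []
    for row in rows:
        if conforms(row):
            out.append(row)
        elif strict:
            raise ValueError(f"non-conforming row: {row!r}")
        else:
            out.append(Flagged("not able to ingest", row))
    return out


rows = [{"name": "ada", "age": 36}, {"name": "bob", "age": "unknown"}]
loaded = ingest(rows)  # lenient: nothing is dropped
flagged = [r for r in loaded if isinstance(r, Flagged)]
```

With the lenient policy, both rows land in the output and the bad one can be found and repaired later. Calling ingest(rows, strict=True) raises on the second row instead.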
Query Parquet/CSV/JSON files directly for one-shot analysis. If you need more speed, convert the files into optimized binary/columnar formats. If you need persistence and organization, spin up a local data lake.
All with the same super command.
SuperDB uses data formats that conform to the super-structured data model. Data loaded into a lake is stored in these formats, which allows compute to be efficiently separated from storage.
.jsup
.bsup
.csup
SuperDB Desktop is the official graphical user interface for managing and querying SuperDB data lakes. It is also hands down one of the best apps for viewing and inspecting nested JSON data.
SuperDB supports 30 primitive types and 7 complex types that can be combined to represent any data you throw at it.
In SuperDB, errors can be queryable values just like your valid data. They can even contain other types of data to create structured and stacked errors.
from ingest_pool |> is(<error>)
error("divide by 0")
error({message: "not able to ingest", src_row: {...}})
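As a loose analogy in Python, an error that is "just a value" can sit in the same stream as valid data, be filtered like anything else, and wrap another error to form a stack. This sketch is not SuperDB's implementation: the ErrorValue class, the messages helper, and the "cause" field name are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class ErrorValue:
    """An error treated as ordinary data; the payload can be any value."""
    payload: Any


def messages(err):
    """Walk a chain of stacked errors and collect messages, outermost first."""
    out = []
    while isinstance(err, ErrorValue):
        payload = err.payload
        if isinstance(payload, dict):
            # Structured error: a message plus an optional nested error.
            out.append(payload.get("message"))
            err = payload.get("cause")  # "cause" is a made-up field name
        else:
            # Simple error: the payload itself is the message.
            out.append(payload)
            err = None
    return out


inner = ErrorValue("divide by 0")
outer = ErrorValue({"message": "not able to ingest", "cause": inner})

# Errors travel in the same stream as valid values and can be selected by type.
stream = [1, "ok", inner, {"a": 1}, outer]
errors = [v for v in stream if isinstance(v, ErrorValue)]
```

Here messages(outer) walks from the ingest failure down to the underlying arithmetic error, which is the kind of inspection that stacked errors make possible.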
We are currently using SuperDB for zero-downtime database transitions with several hundred customers.
– VP, Municipal IT Services Company
Fantastic for JSON Data.
– Network Security Analyst
The best open-source language I’ve found for log querying so far.
– Programmer, Financial Services
I don’t think there’s anything that lets you interact with JSON data the way this does.
– Network Security Analyst
The more I’ve read - it’s very clear this isn’t just another one-off cli tool.
– Data Engineer
You guys are onto something Big.
– Digital Forensics Analyst
SuperDB is already being used in production environments by a number of early users, but there is still work to be done. The current focus is on using vectors to speed up operators, parallelizing our file readers, and becoming more compatible with the SQL spec. We are committed to growing and improving SuperDB until it one day powers our cloud data lake offerings.
Follow our GitHub repository to keep pace with our progress.