SELECT
A SELECT query has the form
SELECT [ DISTINCT | ALL ] <expr>|<star> [ AS <column> ] [ , <expr>|<star> [ AS <column> ]... ]
[ FROM <table-expr> [ , <table-expr> ... ] ]
[ WHERE <predicate> ]
[ GROUP BY <expr>|<ordinal> [ , <expr>|<ordinal> ... ]]
[ HAVING <predicate> ]
where
<expr>is an expression,<star>is a column pattern,<column>is an identifier,<table-expr>is an input as defined in the FROM clause,<predicate>is a Boolean-valued expression, and<ordinal>is a column number as defined in GROUP BY.
The list of expressions followed the SELECT keyword is called
the projection and the column names derived from the AS clauses
are referred to as the column aliases.
A SELECT query may be used as a building block in more complex queries as it
is a <sql-body> in the structure of a
<sql-op>.
Likewise, it may be
prefixed by a WITH clause
defining one or more CTEs and/or
followed by optional
ORDER BY and LIMIT clauses.
Since a <sql-body> is also a <sql-op> and any
<sql-op> is a pipe operator,
a SELECT query may be used anywhere a pipe operator may appear.
Note
Grouping sets are not yet available in SuperSQL.
Execution Steps
A SELECT query performs its computation by
- forming an input table indicated by its FROM clause,
- optionally filtering the input table with its WHERE clause,
- optionally grouping rows into aggregates, one for each unique set of values of the grouping expressions specified by the GROUP BY clause, or grouping the entire input into a single aggregate row when there are aggregate functions present,
- optionally filtering aggregated rows with its HAVING clause, and finally
- producing an output table based on the list of
expressions or column patterns
in the
SELECTclause.
A SELECT query typically specifies its input using one or more
tables specified in the FROM clause, but when the
FROM clause is omitted, the query takes its input from the
parent pipe operator.
If there is no parent operator and FROM is omitted, then the
default input is a single null value.
A FROM clause may also take input from its parent when using
an f-string as its input table.
In this case, the input table is dynamically typed.
Column Patterns
A column pattern, as indicated by <star> above,
uses the * notation to match multiple columns
from the input table. In its standalone form, it matches all columns
in the input table, e.g.,
SELECT * FROM table1 CROSS JOIN table2
matches all columns from table1 and table2.
A column pattern may be prefixed with a table name as in table.* as in
SELECT table2.* FROM table1 CROSS JOIN table2
which matches only the columns from the specified table.
The Projection
The output of the SELECT query, called the projection,
is a set of rows formed from the list of expressions following
the SELECT keyword where each rows is represented
by a record.
The record fields correspond to the columns of the table
and the field names and positions are fixed over the entire
result set. The type of a column may vary from row to row when the
SELECT expressions produce values of varying types.
The names of the columns are specified by each AS clause. When the
AS clause is absent, the column name is
derived
from the expression in the same way field names are derived from
expression in record expressions.
[!NOTE] Column names currently must be unique as the underlying record type requires distinct field names. Names are automatically deduplicated when there are conflicts. SuperSQL will support duplicate column names in a future release.
The projection may be grouped or non-grouped.
Grouped Projection
A grouped projection occurs when either or both occur:
- there is a GROUP BY clause, or
- there is at least one reference to an
aggregate function in the projection,
in a
HAVINGclause, or in anORDER BYclause.
In a grouped projection, the HAVING clause, ORDER BY clause, and
the projection may refer only to inputs that are aggregate functions
(where the function arguments are bound to the input scope and colum
aliases) or to expressions or combination of expressions that appear
in the GROUP BY clause.
Aggregate functions may be organized into expressions as any other function but they may not appear anywhere inside of a argument to another aggregate function.
There is one output row for each unique set of values of the
grouping expressions and the arguments for each instance of
each aggregate function are evaluated over the grouped set of values
optionally filtered with an aggregate function FILTER clause.
Non-grouped Projection
A non-grouped projection occurs when there are no references to
aggregate functions and there is no GROUP BY clause. In this case,
there cannot be a HAVING clause.
The projection formed here consists of the SELECT expressions
evaluated once for each row from the input table that is not
filtered by the WHERE clause.
Examples
Hello world
# spq
SELECT 'hello, world' AS message
# input
# expected output
{message:"hello, world"}
Reference to this to see default input is null
# spq
SELECT this
# input
# expected output
{that:null}
Mix alias and inferred column names
# spq
SELECT upper(s), upper(s[0:1])||s[1:] AS mixed
# input
{s:"foo"}
{s:"bar"}
# expected output
{upper:"FOO",mixed:"Foo"}
{upper:"BAR",mixed:"Bar"}
Column names (currently) must be unique and are deduplicated
# spq
SELECT s, s
# input
{s:"foo"}
{s:"bar"}
# expected output
{s:"foo",s_1:"foo"}
{s:"bar",s_1:"bar"}
Distinct values sorted
# spq
SELECT DISTINCT s ORDER BY s
# input
{s:"foo"}
{s:"bar"}
{s:"foo"}
# expected output
{s:"bar"}
{s:"foo"}
Select entire rows as records using a table reference
# spq
WITH T(x,y) AS (
VALUES (1,1), (2,2), (3,2)
)
SELECT T
FROM T
# input
# expected output
{T:{x:1,y:1}}
{T:{x:2,y:2}}
{T:{x:3,y:2}}
Select entire rows as records using this
# spq
WITH T(x,y) AS (
VALUES (1,1), (2,2), (3,2)
)
SELECT this as table
FROM T
# input
# expected output
{table:{x:1,y:1}}
{table:{x:2,y:2}}
{table:{x:3,y:2}}