from

Operator

from — source data from pools, files, or URIs

Synopsis

from <pool>[@<commitish>]
from <pattern>
file <path> [format <format>]
get <uri> [format <format>]
from (
   pool <pool>[@<commitish>] [ => <branch> ]
   pool <pattern>
   file <path> [format <format>] [ => <branch> ]
   get <uri> [format <format>] [ => <branch> ]
   pass
   ...
)

Description

The from operator identifies one or more data sources and transmits their data to its output. A data source can be

  • the name of a data pool in a SuperDB lake, with optional commitish;
  • the names of multiple data pools, expressed as a regular expression or glob pattern;
  • a path to a file;
  • an HTTP, HTTPS, or S3 URI; or
  • the pass operator, to treat the upstream pipeline branch as a source.

Note: File paths and URIs may be followed by an optional format specifier.

Sourcing data from pools is only possible when querying a lake, such as via the super db command or SuperDB lake API. Sourcing data from files is only possible with the super command.

When a single pool name is specified without @-referencing a commit or ID, or when using a pool pattern, the tip of the main branch of each pool is accessed.
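The defaulting rule above can be modeled in a few lines of Python. This is only an illustrative sketch, not SuperDB's implementation: the helper name `resolve_sources` and the in-memory list of pool names are assumptions, and only glob patterns (not regular expressions) are handled.

```python
from fnmatch import fnmatchcase

def resolve_sources(ref, pool_names):
    """Return (pool, commitish) pairs for a `from` argument.

    Hypothetical helper: it models only the rule that a missing
    @commitish means the tip of the main branch.
    """
    pool, sep, commitish = ref.partition("@")
    if sep:
        # Explicit reference such as "coinflips@trial".
        return [(pool, commitish)]
    # Plain name or glob pattern: every matching pool resolves to main.
    return [(name, "main") for name in pool_names if fnmatchcase(name, pool)]

pools = ["coinflips", "numbers"]  # pool names used by the examples on this page
print(resolve_sources("coinflips", pools))
print(resolve_sources("coinflips@trial", pools))
print(resolve_sources("*", pools))
```

In this model a pattern never carries a commitish, which mirrors the `pool <pattern>` form in the synopsis.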

In the first four forms, a single source is connected to a single output. In the fifth form, multiple sources are accessed in parallel and may be joined, combined, or merged.

A pipeline can be split with the fork operator as in

from PoolOne |> fork (
  => op1 |> op2 |> ...
  => op1 |> op2 |> ...
) |> merge ts |> ...

Or multiple pools can be accessed and, for example, joined:

from (
  pool PoolOne => op1 |> op2 |> ...
  pool PoolTwo => op1 |> op2 |> ...
) |> join on key=key |> ...

Similarly, data can be routed to different pipeline branches with replication using the switch operator:

from ... |> switch color (
  case "red" => op1 |> op2 |> ...
  case "blue" => op1 |> op2 |> ...
  default => op1 |> op2 |> ...
) |> ...
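
As a rough mental model of the fork and merge ts pipeline shown earlier in this section, the Python sketch below sends every input record to every branch and then merges the branch outputs in ts order. It is not how SuperDB executes queries: the function `fork_merge`, the two branch functions, and the sample records are all hypothetical, and each branch is assumed to preserve the ts ordering of its input.

```python
import heapq

def fork_merge(records, branches, key):
    """Give every branch the full input, then merge branch outputs by `key`.

    Assumes each branch emits records already ordered by `key`.
    """
    outputs = [list(branch(records)) for branch in branches]
    return list(heapq.merge(*outputs, key=lambda rec: rec[key]))

def doubled(recs):
    # Hypothetical first branch: double the value field.
    for r in recs:
        yield {"ts": r["ts"], "v": r["v"] * 2}

def tagged(recs):
    # Hypothetical second branch: replace the value with a marker.
    for r in recs:
        yield {"ts": r["ts"], "tag": "seen"}

records = [{"ts": 1, "v": 10}, {"ts": 2, "v": 20}]  # made-up input, sorted by ts

for rec in fork_merge(records, [doubled, tagged], "ts"):
    print(rec)
```

Each input record appears once per branch in the merged output, which is the replication that fork provides.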

Input Data

The examples below assume the existence of the SuperDB lake created and populated by the following commands:

{flip:1,result:"heads"}
{flip:2,result:"tails"}

The lake then contains two pools, coinflips and numbers, both of which are used in the examples below.

The following file hello.jsup is also used.

{greeting:"hello world!"}

Examples

Source structured data from a local file

file hello.jsup |> yield greeting

Source data from a local file, but in line format

file hello.jsup format line

Source structured data from a URI

super -z -c 'get https://raw.githubusercontent.com/brimdata/zui-insiders/main/package.json
       |> yield productName'

=>

"Zui - Insiders"

Source data from the main branch of a pool

from coinflips

Source data from a specific branch of a pool

from coinflips@trial

Count the number of values in the main branch of all pools

from * |> count()

Join the data from multiple pools

super db -lake example query -z '
  from coinflips |> sort flip
  |> join (
    from numbers |> sort number
  ) on flip=number word'
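
To make the intent of this join concrete, here is a small Python model of it. It assumes inner-join semantics in which the trailing `word` copies that field from each matching right-hand record onto the left-hand record. The helper `join_on` is hypothetical, and the contents of the numbers pool are invented for illustration because they are not shown on this page.

```python
def join_on(left, right, left_key, right_key, right_field):
    """Inner join: copy `right_field` from each matching right record onto the left one."""
    index = {}
    for r in right:
        index.setdefault(r[right_key], []).append(r)
    out = []
    for l in left:
        # Left records with no match are dropped (inner join).
        for r in index.get(l[left_key], []):
            out.append({**l, right_field: r[right_field]})
    return out

coinflips = [{"flip": 1, "result": "heads"}, {"flip": 2, "result": "tails"}]
# Hypothetical contents for the numbers pool; the real records are not shown here.
numbers = [{"number": 1, "word": "one"}, {"number": 3, "word": "three"}]

print(join_on(coinflips, numbers, "flip", "number", "word"))
```

With these made-up inputs only flip 1 finds a matching number, so the flip 2 record is absent from the result.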

Use pass to combine our join output with data from yet another source

super db -lake example query -z '
  from coinflips |> sort flip
  |> join (
    from numbers |> sort number
  ) on flip=number word
  |> from (
    pass
    pool coinflips@trial =>
      c:=count()
      |> yield f"There were {int64(c)} flips"
  ) |> sort this'