TSV Logs
SuperDB can read both of the common Zeek log formats. This section provides guidance for what to expect when reading logs of these formats using the super command.
Zeek TSV
Zeek TSV is Zeek’s default output format for logs. super reads this format
automatically (i.e., no -i command line flag is necessary to indicate the
input format).
The following example shows a TSV conn.log being read via super and
output as Super (SUP).
conn.log
#separator \x09
#set_separator ,
#empty_field (empty)
#unset_field -
#path conn
#open 2019-11-08-11-44-16
#fields ts uid id.orig_h id.orig_p id.resp_h id.resp_p proto service duration orig_bytes resp_bytes conn_state local_orig local_resp missed_bytes history orig_pkts orig_ip_bytes resp_pkts resp_ip_bytes tunnel_parents
#types time string addr port addr port enum string interval count count string bool bool count string count count count count set[string]
1521911721.255387 C8Tful1TvM3Zf5x8fl 10.164.94.120 39681 10.47.3.155 3389 tcp - 0.004266 97 19 RSTR - - 0 ShADTdtr 10 730 6 342 -
Example
super -S -c 'head 1' conn.log
Output
{
_path: "conn",
ts: 2018-03-24T17:15:21.255387Z::(time|null),
uid: "C8Tful1TvM3Zf5x8fl"::(string|null),
id: {
orig_h: 10.164.94.120::(ip|null),
orig_p: 39681::(port=uint16)::(port|null),
resp_h: 10.47.3.155::(ip|null),
resp_p: 3389::port::(port|null)
},
proto: "tcp"::=zenum::(zenum|null),
service: null::(string|null),
duration: 4.266ms::(duration|null),
orig_bytes: 97::uint64::(uint64|null),
resp_bytes: 19::uint64::(uint64|null),
conn_state: "RSTR"::(string|null),
local_orig: null::(bool|null),
local_resp: null::(bool|null),
missed_bytes: 0::uint64::(uint64|null),
history: "ShADTdtr"::(string|null),
orig_pkts: 10::uint64::(uint64|null),
orig_ip_bytes: 730::uint64::(uint64|null),
resp_pkts: 6::uint64::(uint64|null),
resp_ip_bytes: 342::uint64::(uint64|null),
tunnel_parents: null::(null||[string|null]|)
}
Since Zeek provides a rich type system, such records typically need no adjustment to their data types once they’ve been read in as is. The Zeek Type Compatibility document provides further detail on how the rich data types in Zeek TSV map to the equivalent super-structured types.
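The type information comes from the header of the TSV file itself, where the #fields and #types lines pair each column name with a Zeek type. The following Python sketch illustrates that pairing only; it is not how super is implemented. The function name parse_zeek_header and the shortened sample header are assumptions made for the example, and the sample uses real tab characters between values, as Zeek writes them.

```python
# Illustrative sketch: pair each column in a Zeek TSV header with its Zeek type.
# parse_zeek_header and SAMPLE_HEADER are hypothetical names for this example.

SAMPLE_HEADER = [
    "#separator \\x09",
    "#path\tconn",
    "#fields\tts\tuid\tid.orig_h\tid.orig_p\tproto",
    "#types\ttime\tstring\taddr\tport\tenum",
]


def parse_zeek_header(lines):
    """Return (path, [(field, zeek_type), ...]) from Zeek TSV header lines."""
    separator = "\t"
    path, fields, types = None, [], []
    for line in lines:
        line = line.rstrip("\n")
        if not line.startswith("#"):
            break  # the first data row ends the header
        if line.startswith("#separator"):
            # The separator is written as an escape such as \x09 and is
            # set off from the key by a single space.
            escaped = line.split(" ", 1)[1]
            separator = escaped.encode("ascii").decode("unicode_escape")
            continue
        key, *values = line.split(separator)
        if key == "#path":
            path = values[0]
        elif key == "#fields":
            fields = values
        elif key == "#types":
            types = values
    return path, list(zip(fields, types))


if __name__ == "__main__":
    log_path, schema = parse_zeek_header(SAMPLE_HEADER)
    print(log_path)
    for field, zeek_type in schema:
        print(f"{field}: {zeek_type}")
```

Each (field, type) pair produced here corresponds to one typed field in the SUP output above, for example ts with Zeek type time.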
The Role of _path
Zeek’s _path field plays an important role in differentiating between its
different log types (conn, dns, etc.). For instance, shaping Zeek JSON relies
on the value of the _path field to know which type to apply to an input JSON
record.
If reading Zeek TSV logs or logs generated by the JSON Streaming Logs
package, this _path value is provided within the Zeek logs. However, if the
log was generated by Zeek’s built-in ASCII logger when using the
redef LogAscii::use_json = T; configuration, the value that would be used for
_path is present in the log file name but is not in the JSON log
records. In this case, you can adjust your Zeek configuration by following the
Log Extension Fields example
from the Zeek docs. If you enter path in the locations where the example
shows stream, the _path field will be populated just as it is in the
JSON Streaming Logs output.
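When the Zeek configuration cannot be changed, the same information can in principle be recovered from the file name, since that is where the would-be _path value lives. The Python sketch below is one possible workaround under that assumption, not a SuperDB or Zeek feature. The function name add_path_field and the sample record are hypothetical, and the sketch assumes file names of the form conn.log, where the text before the first dot is the log type.

```python
# Illustrative workaround: derive _path from the log file name for Zeek JSON
# records that lack it. add_path_field is a hypothetical helper.
import json
from pathlib import Path


def add_path_field(json_line, log_file_name):
    """Parse one JSON log line and set _path from the file name if missing."""
    record = json.loads(json_line)
    # Assumption: the log type is the file name up to the first dot,
    # e.g. "/logs/conn.log" gives "conn".
    log_type = Path(log_file_name).name.split(".", 1)[0]
    # Keep an existing _path untouched; only fill it in when absent.
    record.setdefault("_path", log_type)
    return record


if __name__ == "__main__":
    line = '{"uid": "C8Tful1TvM3Zf5x8fl", "proto": "tcp"}'
    print(json.dumps(add_path_field(line, "conn.log")))
```

Records that already carry _path, such as those from the JSON Streaming Logs package, pass through unchanged.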