timestats
Type of operation: Aggregate
Description
Aggregate columns at every point in time, based on (optional) grouping columns. For example, if you have a resource with the primary key “switch id, port id” and you want to calculate values by “switch id” while retaining full temporal resolution, use timestats rather than timechart, which buckets data into evenly sized bins.
If groupby is not specified, the default grouping will be used. The default
grouping for timestats is the set of primary key columns. This means that the
count of events on the default grouping per point in time will usually be one,
so you will usually want to use a grouping other than the default.
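The switch and port case from the description can be sketched as follows. This is a minimal illustration, not a tested query: the column names switch_id and port_id and the output name PortCount are assumptions, and the syntax mirrors the group_by form used in the examples on this page.

```
// assume input is a resource with primary key:
// - switch_id
// - port_id
// (hypothetical column names, for illustration only)
timestats PortCount:count(1), group_by(switch_id)
```

Grouping by switch_id alone, rather than by the default primary key (switch_id, port_id), lets each point in time aggregate over several ports, so the count can be greater than one.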
Usage
timestats [ groupby_1, groupby_2, ... ], groupOrAggregateFunction_1, groupOrAggregateFunction_2, ...
Argument | Type | Optional | Repeatable | Restrictions |
---|---|---|---|---|
groupby | storable | yes | yes | column |
groupOrAggregateFunction | expression | no | yes | none |
Accelerable
timestats is always accelerable if the input is accelerable. A dataset that only uses accelerable verbs can be accelerated, making queries on the dataset respond faster.
Examples
// assume input is Process, with primary key:
// - server_name
// - process_id
timestats ProcessCount:count(1), group_by(server_name)
Calculate the number of processes for each point in time per server name, returning a dataset with the 4 columns ‘valid_from’, ‘valid_to’, ‘server_name’, and ‘ProcessCount’. Unlike timechart, which calculates aggregates per fixed bucket, timestats calculates values that can change at any point in time.
// assume input is events, with geographic regions:
// - timestamp
// - action
// - geo_region
timestats count(), group_by(geo_region)
Calculate the number of events for each point in time per geographic region, returning a dataset with the 3 columns ‘timestamp’ (hidden), ‘count’, and ‘geo_region’.