Aggregation Windows

<< Click to Display Table of Contents >>

Navigation:  Understanding Streaming SQL Concepts > Streaming Data and Windows >

Aggregation Windows

Previous pageReturn to chapter overviewNext page

The nature of streaming data requires that calculations, such as SUM, AVERAGE, population variance, and so on, be calculated within a "window"--that is, a specified interval of time or other numeric value, such as "every ten minutes" or "the last thirty seconds" or "every millimeter," and so on.

SQLstream Blaze supports a wide range of windows types, each with distinct advantages, which can be nested and joined to create almost any window eviction policy you would like. For even more customized window types, you can employ User Defined Transforms and User Defined Functions.

SQLstream Blaze offers the following window processing semantics: INNER JOIN, UNION, UNION ALL, INTERSECT, MINUS, EXCEPT

Window Types

SQLstream supports 30 types of aggregation window semantics. SQLstream automatically detects the type of windowing operator to deploy based on the SQL logic utilized and compiles into the appropriate, optimized streaming operator code.

Tumbling Time Windows

Tumbling windows are expressed using the GROUP BY clause of SQL queries. Tumbling windows accumulate a batch of rows of the input to produce a set of aggregated result rows. To implement tumbling time windows in streaming queries, it is required to have at least one group by expression needs to evaluate as a monotonic timestamp value.

Example:

GROUP BY FLOOR(s.ROWTIME TO MINUTE);

 

Tumbling ROW Windows

Instead of timestamp based monotonic expression, one may specify a "monotonic" function based on any arbitrary column to perform the GROUP BY operation. Following example shows a tumbling window that performs aggregation for a batch of 1000 rows. The function MONOTONIC() simply declares to the query planner that the expression passed as a parameter to the function is a monotonic expression.

Example:

GROUP BY MONOTONIC(rowNumber / 1000);

 

Sliding Time Windows

Time based sliding windows are based on maintaining the timebound history of the stream.

Example:

OVER(RANGE INTERVAL '10' MINUTE PRECEDING);

 

Sliding ROW Windows

Sliding windows are expressed using the OVER clause on aggregate functions. Unlike tumbling windows, aggregations based on sliding windows produce a row for each streaming input row without any latency. There are two types of sliding windows. ROW based windows compute aggregation results based the specified number of rows from the history of the stream for each partition key.

Example:

OVER(ROWS 10 PRECEDING);

 

Snapping Time Windows

Snapping time windows start with a zero size and grow until the top of a given time interval. These windows shrink back to zero size at the top of the time interval.

Example:

OVER(... ORDER BY FLOOR(s.ROWTIME TO MINUTE) RANGE INTERVAL '0' MINUTE PRECEDING);

 

Snapping ROW Windows

Snapping ROW windows start with a zero size and grow until n rows then shrink back to zero size and then repeat.

Unbounded Time Windows

Windows that perform aggregations over the entire history of the stream.

Hopping Time Windows

Time windows that hop forward for a given time interval with one row in and one row out with the same aggregation results until the hop interval is reached.

Example:

OVER(... ORDER BY FLOOR(o.ROWTIME TO MINUTE) -- hop per min

    RANGE INTERVAL '60 MINUTE PRECEDING);

 

Tumbling Time Distinct Value Windows

Delivers only distinct values from the input in a given time interval.

Example:

SELECT STREAM DISTINCT FLOOR(s.ROWTIME TO MINUTE), ...

   FROM s;

   -- OR

   SELECT STREAM COUNT(DISTINCT val)

   FROM s

   GROUP BY FLOOR(s.ROWTIME TO MINUTE);

 

Tumbling Time Aggregated Distincts Windows

Delivers aggregations performed on distinct values in a given time interval.

Example:

SELECT STREAM DISTINCT MONOTONIC(rowNumber / 100), ...

   FROM s;

   -- OR

   SELECT STREAM COUNT(DISTINCT val)

   FROM s

   GROUP BY MONOTONIC(rowNumber / 100);

 

Tumbling ROWS Distinct Value Windows

Delivers only distinct values from the input for a given number of rows.

Example:

SELECT STREAM DISTINCT MONOTONIC(rowNumber / 100), ...

   FROM s;

   -- OR

   SELECT STREAM COUNT(DISTINCT val)

   FROM s

   GROUP BY MONOTONIC(rowNumber / 100);

 

Tumbling ROWS Aggregated Distincts Windows

Delivers aggregations performed on distinct values for a given number of rows.

Sliding Time Aggregated Distincts Windows

Delivers aggregations performed on distinct values in a sliding time window with an output row for each input row.

Example:

SELECT STREAM *,

       COUNT(DISTINCT val) OVER (RANGE INTERVAL '1' MINUTE PRECEDING)

   FROM s;

 

Sliding ROWS Aggregated Distincts Windows

Delivers aggregations performed on distinct values in a sliding ROWS window with an output row for each input row.

Example:

SELECT STREAM *,

       COUNT(DISTINCT val) OVER (ROWS 100 PRECEDING)

   FROM s;

 

Hopping Time Aggregated Distincts Windows

Delivers aggregations performed on distinct values in a hopping time window with an output row for each input row.

Example:

SELECT STREAM *,

       COUNT(DISTINCT val) OVER (ORDER BY FLOOR(s.ROWTIME TO MINUTE) RANGE INTERVAL '10' MINUTE PRECEDING)

   FROM s;

 

Partitioned Sliding Time Windows

Time based sliding windows are based on maintaining the timebound history of the stream for each partition key.

Example:

SELECT STREAM *,

       COUNT(val) OVER (PARTITION BY p ROW 100 PRECEDING)

   FROM s;

 

Partitioned Sliding ROW Windows

ROWS based sliding windows are based on maintaining the history of the the given number of rows from the stream for each partition key.

Example:

SELECT STREAM *,

       COUNT(val) OVER (PARTITION BY p ROW 100 PRECEDING)

   FROM s;

 

Partitioned Unbounded Windows

Delivers aggregations on the entire history of the stream for each partition key.

Example:

SELECT STREAM *,

       COUNT(val) OVER (PARTITION BY p RANGE UNBOUNDED PRECEDING)

   FROM s;

 

Partitioned Hopping Time Windows

Time windows that hop forward for a given time interval with one row in and one row out with the same aggregation results for each partition key until the hop interval is reached.

Example:

SELECT STREAM *,

       COUNT(val) OVER (PARTITION BY p ORDER BY FLOOR(s.ROWTIME TO MINUTE) RANGE INTERVAL '10' MINUTE PRECEDING)

   FROM s;

 

Partitioned Tumbling Time Aggregated Distincts Windows

Delivers aggregations performed on distinct values in a given time interval for each partition key.

Example:

SELECT STREAM COUNT(DISTINCT val)

   FROM s

   GROUP BY FLOOR(s.ROWTIME TO MINUTE), p;

 

Partitioned Tumbling ROWS Aggregated Distincts Windows

Delivers aggregations performed on distinct values for a given number of rows for each partition key.

Example:

SELECT STREAM COUNT(DISTINCT val)

   FROM s

   GROUP BY FLOOR(s.ROWTIME TO MINUTE), p;

 

Partitioned Sliding Time Aggregated Distincts Windows

Delivers aggregations performed on distinct values for each partition key in a sliding time window with an output row for each input row.

Example:

SELECT STREAM *,

       COUNT(DISTINCT val) OVER (PARTITION BY p RANGE INTERVAL '1' MINUTE PRECEDING)

   FROM s;

 

Partitioned Sliding ROWS Aggregated Distincts Windows

Delivers aggregations performed on distinct values for each partition key in a sliding ROWS window with an output row for each input row.

Example:

SELECT STREAM COUNT(DISTINCT val)

   FROM s

   GROUP BY MONOTONIC(rowNumber / 100), p;

 

Partitioned Hopping Time Aggregated Distincts Windows

Delivers aggregations performed on distinct values for each partition key in a hopping time window with an output row for each input row.

Example:

SELECT STREAM *,

       COUNT(DISTINCT val) OVER (PARTITION BY p ORDER BY FLOOR(s.ROWTIME TO MINUTE) RANGE INTERVAL '10' MINUTE PRECEDING)

   FROM s;

 

K-Sorted Tumbling ROW Windows

Delivers ordered aggregation results on given number of rows.

T-Sorted Sliding Time Windows

Reorders records in a specified time interval into time order.

Example:

SELECT STREAM event_time AS ROWTIME, *

   FROM s

   ORDER BY event_time WITHIN INTERVAL '1' MINUTE;

 

K-Sorted Partitioned Sliding ROW Windows

Delivers ordered aggregation results on given number of rows for each partition key.

Ranking Time Windows

SQLstream supports GROUP_RANK() UDX to rank a set of rows ordered by a specified "ranking column". For more details, see Using the Group Rank UDX in the Integration Guide.

Edge Detection Windows

Tags a row according to the edge detection logic (continuously comparing current and previous nth or n rows).

Partitioned Edge Detection Windows

Tags a row according to the edge detection logic over a partition key (continuously comparing current and previous nth or n rows).