WINDOW clause

<< Click to Display Table of Contents >>

Navigation:  Streaming SQL Reference Guide > SELECT statement  >

WINDOW clause

Previous pageReturn to chapter overviewNext page

The WINDOW clause lets you define a named window you can then use in analytic function calls and streaming JOIN clauses elsewhere in the query. In most cases, you define windows using ROWTIME, which s-Server uses as the "clock" for the stream. Current ROWTIME is known as the "high watermark" of a stream.

Windows are an important part of streaming analysis, since they let you define the set of sliding or tumbling data that you wish to analyze.

Offset Windows

By default, windows have an upper bound of the current row. For some aggregations, you will want to exclude the current row by using an offset window. Otherwise, a given aggregation may miss additional rows that come in with the same ROWTIME as the current row, since s-Server begins emitting output rows as soon as the current row hits its system. Using offset windows ensures that you have captured all rows with the same ROWTIIME in an aggregation.

However, offset windows introduce some latency, so in cases where it is not important to aggregate rows with the same rowtime, you can use the default upper bound.

The windows so defined are inherited by sub-queries of the current query. Windows are not supported for queries on tables.

<window-clause> :=

   WINDOW <window-definition> { , <window-definition> }...

<window-definition> :=

   <window-name> AS <window-specification>

<window-specification> :=

   <window-name>

 | <query_partition_clause>

 | ORDER BY <order_by_clause>

 | <windowing_clause>

<query_partition_clause> :=

  PARTITION BY <expression> { , <expression> }...

 |  ( <expression> { , <expression> }... )

 

5_2_indicator Version 5.2 Feature

s-Server now features offset and hopping windows.

Window Clause and Endpoints

CURRENT ROW is the default upper bound of a window frame in the WINDOW clause. As of version 5.2, s-Server supports "offset" windows, where the upper bound of a window frame can precede CURRENT ROW.

Windows can have, then, two upper bounds:

CURRENT ROW - The upper bound of the window has an offset 0. This is the default upper bound.

PRECEDING - The upper bound of the window has a negative offset relative to the current row.

Note: s-Server does not support FOLLOWING.

Streaming SQL follows the SQL Standard for windows over a range. This means, for example that the syntax

WINDOW HOUR AS (RANGE INTERVAL '1' HOUR PRECEDING)

 

will include the end points of the hour.

To ensure that the the endpoint of the previous hour is not included, you need to use the following syntax for the window:

WINDOW HOUR AS (RANGE INTERVAL '59:59.999' MINUTE TO SECOND(3) PRECEDING);

 

See Allowed and Disallowed Window Specifications for more details.

Syntax Charts for Window Statement and Window Specification

The chart for Window Statement includes the window-specification, query-partition, and windowing-clause charts that directly follow below it.

WINDOW <window_name> AS (

   [ PARTITION BY <partitionCols> ]

   [ ORDER BY <MonotonicExpression> ]

   -- Window frame

   RANGE [ BETWEEN ] INTERVAL 'x' <timeunit> PRECEDING -- lower bound

         [ AND INTERVAL 'y' <timeunit> PRECEDING ] -- upper bound

 

Window Statement

window_stmt

Window Specification

The "window-specification" is also referenced in the stream-or-table-reference chart that appears within the select-clause chart.

windowSpecnReally

The order_by_clause chart appears after the window examples that appear below.

Syntax Chart for the query-partition clause

query_partition

Syntax Chart for the windowing-clause

In addition to being part of the window-specification, the windowing-clause appears in the FROM clause and the analytic clause of an analytic function.

windowSpecn

Note: Partitions are evaluated before windows.

WINDOW w AS (

   [ PARTITION BY <partitionCols> ]

   [ ORDER BY <MonotonicExpression> ]

   -- Window frame

   RANGE [ BETWEEN ] INTERVAL 'x' <timeunit> PRECEDING -- lower bound

         [ AND INTERVAL 'y' <timeunit> PRECEDING ] -- upper bound

 

Example:

 

WINDOW w AS (ORDER BY FLOOR(s.ROWTIME TO HOUR)

    RANGE BETWEEN INTERVAL '10' HOUR PRECEDING

          AND INTERVAL '1' HOUR PRECEDING)

 

Offset windows enable the exclusion of the “current” row (or the current time interval as expressed in the order by clause) from computation of aggregates such as AVG, VAR, or STDDEV. This makes statistical tools like Bollinger bands, naive bayes classifier, statistically valid.

Hopping Windows

When the ORDER BY clause uses a monotonic expression based on ROWTIME, the window is described as a hopping window. Hopping windows slide forward by an interval of time expressed in the form of a monotonic expression on ROWTIME, such as ORDER BY FLOOR(s.ROWTIME TO HOUR. In this example, the window will slide forward at the top of every hour, that is, "hopping" the window forward by 1 hour every hour.

When you use a hopping window with COUNT, for example, counts will go up for the entire hour, then start at zero at the beginning of the new hour. Data from the previous hour is "dropped" from aggregation.

Hopping windows have significant benefits.

They reduce the amount of system resources required to maintain the window, since the window only needs to maintain aggregated history for each time interval (1 hour in the example above).
When used with an offset, such as RANGE BETWEEN INTERVAL '60' MINUTE PRECEDING AND INTERVAL '1' MINUTE PRECEDING, the results for each input row are computed on a window that excludes data the current row. This is highly desirable since any anomaly in streaming data does not adversely impact computation of Bollinger bands or Naive Bayes classifiers.
When used without an offset, results for hopping windows are computed for each input row with zero latency.

Example

Below is an example based on Bollinger bands. Bollinger Bands are used for sensor readings in oil wells and must be computed with Offset Windows and ORDER BY <MonotonicExpression> to be correct. The example below excludes current sensor readings from the computation of the Bollinger Band.

SELECT STREAM *

FROM

(

  SELECT STREAM *,

      AVG(sensor_value) OVER w +

      STDDEV(sensor_value) OVER w * 2 AS upper_bb

  FROM sensor_readings

  WINDOW w AS (PARTITION BY vendor_id

               ORDER BY FLOOR(s.ROWTIME TO HOUR)

               RANGE BETWEEN INTERVAL '24' HOUR PRECEDING

                         AND INTERVAL '1' HOUR PRECEDING)

)

WHERE sensor_value > upper_bb;

 

 

Allowed and Disallowed Window Specifications

SQLstream s-Server supports nearly all windows that end with the current row or a specified interval preceding the current row.

You cannot define an infinite window, a negative-sized window, or use negative integers in the window specification.

Infinite windows are windows with no bounds. Typically these point into the future, which for streams is infinite. For example "ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING" is not supported, because in a streaming context such a query would not produce a result, since streams are continually expanding as new data arrives. All uses of UNBOUNDED FOLLOWING are unsupported.
Negative windows . For example, "ROWS BETWEEN 0 PRECEDING AND 4 PRECEDING" is a window of negative size and is therefore illegal. Instead, you would use: "ROWS BETWEEN 4 PRECEDING AND 0 PRECEDING" in this case.
Offset windows that end with FOLLOWING are supported only for tables. For example, "ROWS BETWEEN UNBOUNDED PRECEDING AND 4 FOLLOWING" is not supported. (Window spans CURRENT ROW rather than starting or ending there.)
Windows defined with negative integers. For example,  "ROWS BETWEEN -4 PRECEDING AND CURRENT ROW" is invalid because negative integers are disallowed.

Also, the special case of ... 0 PRECEDING (and ... 0 FOLLOWING) cannot be used for windowed aggregation; instead, the synonym CURRENT ROW can be used.

For windowed aggregation, partitioned windows are allowed, but ORDER BY must not be present.

For windowed join, partitioned windows are NOT allowed, but ORDER BY can be present if it sorts by the ROWTIME column of one of the inputs.

Window Examples

The following examples show a sample input data set, the definitions for several windows, and the contents of those windows at various times after 10:00, the time data starts to arrive for this example.

The windows are defined as follows:

SELECT STREAM

 ticker,

 sum(amount) OVER lastHour,

 count(*) OVER lastHour

 sum(amount) OVER lastThree

FROM Trades

WINDOW

 lastHour AS (RANGE INTERVAL '1' HOUR PRECEDING),

 lastThree AS (ROWS 3 PRECEDING),

 lastZeroRows AS (ROWS CURRENT ROW),

 lastZeroSeconds AS (RANGE CURRENT ROW),

 lastTwoSameTicker AS (PARTITION BY ticker ROWS 2 PRECEDING),

 lastHourSameTicker AS (PARTITION BY ticker RANGE INTERVAL '1' HOUR PRECEDING)

 

First Example: time-based windows versus row-based windows

As shown on the right side of the figure below, the time-based lastHour window contains varying numbers of rows, because window membership is defined by time range.

window-examples

Examples of windows containing rows

The row-based lastThree window generally contains four rows: the three preceding and the current row. However for the row 10:10 IBM, it only contains two rows, because there is no data before 10:00.

A row-based window can contain several rows whose ROWTIME value is the same, though they arrive at different times (wall-clock times). The order of such a row in the row-based window depends on its arrival time. The row's arrival time can determine which window includes it.

For example, the middle lastThree window in Figure 1 shows the arrival of a YHOO trade with ROWTIME 11:15 (and the last three trades before it). However, this window excludes the next trade, for IBM, whose ROWTIME is also 11:15 but which must have arrived later than the YHOO trade. This 11:15 IBM trade is included in the 'next' window, as is the 11:15 YHOO trade, its immediate predecessor.

Second Example: zero width windows, row-based and time-based

Figure 2: Examples of zero-width windows shows row-based and time-based windows of zero width. The row-based window lastZeroRows includes just the current row, and therefore always contains precisely one row. Note that ROWS CURRENT ROW is equivalent to ROWS 0 PRECEDING.

The time-based window lastZeroSeconds contains all rows with the same timestamp, of which there may be several. Note that RANGE CURRENT ROW is equivalent to RANGE INTERVAL '0' SECOND PRECEDING.

window-examples2

Third Example: Partitioning applied to row-based and time-based windows

Figure 3 shows windows that are similar to those in Figure 1 but with a PARTITION BY clause. For time-based window lastTwoSameTicker and the row-based window lastHourSameTicker, the window contains rows that meet the window criteria and have the same value of the ticker column. Note: Partitions are evaluated before windows.

window-examples3