Pivot

<< Click to Display Table of Contents >>

Navigation:  Using StreamLab > StreamLab Guides Overview > Guide Commands >

Pivot

Previous pageReturn to chapter overviewNext page

StreamLab's Pivot analytics let you unfold and fold streams of data to make them simpler to process. (See http://en.wikipedia.org/wiki/Pivot_table for more information on pivot tables.) Because streams continually update, in working with streaming data, you always need to choose a window over which to query.

Note: pivot operations are currently experimental. At the moment they will only work if there's an input row containing each of the key values (the columns in the unfolded result) that arrives during the time window. The input should contain all of the keys each time window.

Unfold

Unfold reshapes a table into columns of key-value sets. Selected rows map to keys, and selected columns map to values.

The unfold operation proceeds as follows.

Given a stream of the form:

Rowtime

Browser

Count

1:01

IE

110

1:02

Firefox

230

1:03

Chrome

290

1:04

Safari

100

2:01

IE

90

2:02

Firefox

220

2:03

Chrome

300

2:04

Safari

90

Generates columns like the following:

rowtime window

IE

Firefox

Chrome

Safari

<hour 1>

110

230

290

100

<hour 2>

90

220

300

90

In the example above, "user-agent" serves as the key column, and "count" serves as the value column. For each value in the key column, StreamLab creates a new column and fills in this column with values from the value column. Because you may have multiple values for each key, you need to use an operation to aggregate those values to produce a single value for the unfolded row. For example, if incoming values are counts (as in the example above), you'd want to SUM over the time window. If they're measurements of a continuous value, like a current temperature reading from a sensor, you would likely want to use AVG.

As rows come in, the StreamLab scrutinizer watches the input stream and lists unique values in the key column (in this example, the user-agent column). Once you select the Unfold operation and pick user-agent as the key column, it will make a suggestion that fills in the 'key list' parameter with the list of unique values observed in the stream. You can then set the rest of the parameters either using the widgets up top or by selecting the fields in the suggestion. You can also enter the key list manually.

A window on rowtime is a common use case.

To implement an unfold operation:

1.Select a column to be used as a key column. This column has all the column names for the new columns. This column must contain text. StreamLab will automatically identify values from this column for use as keys (new column names).
2.Select a column to be used as a value column. This column must be numerical. This column provides values for the new columns. This might be counts, temperature readings, pressure readings, and so on.
3.If desired, enter a column suffix for the new columns. This suffix will make it easy to identify the new columns.
4.Select an operation to aggregate values for the new columns. This might be an average temperature, or a running total. See Operations Used in Commands for more details on these operations.
5.Select a length and unit for the aggregation window, such as "1 hour" or "15 minutes."
6.Choose rolling or tumbling window type. See below for more information on these.
7.If desired, either enter the keys manually or let StreamLab fill them in.
8.Click the + icon to add the command to the Guide script.
9.The results of the script appear in the Output View window.

sl_pivot_unfold

Sliding Window

A sliding or rolling window emits an output row with every input row and flushes a small number (often times 1) of rows out of the window for the next calculation. Aside from the small number flushed, sliding windows continue to accumulate rows. Sliding windows are used for windowed aggregation. In contrast, a tumbling window is a collection of rows that are aggregated to produce a fewer number of output rows.  The window is "tumbling" because after the output rows are emitted, the tumbling window is cleared and a new set of future rows are accumulated for the next output. As a result, the total number of output rows are limited.

Tumbling Window

A window that specifies a time frame that continually updates over time. For example, a tumbling window might specify a following hour.

In contrast to a sliding or rolling window,  a tumbling window is a collection of rows that are aggregated to produce a fewer number of output rows.  The window is "tumbling" because after the output rows are emitted, the tumbling window is cleared and a new set of future rows are accumulated for the next output. As a result, the total number of output rows are limited. Tumbling windows can be used with streaming aggregation (SELECT STREAM ... GROUP BY).  In contrast, a sliding or rolling window emits an output row with every input row and flush a small number (often times 1) row out of the window for the next calculation. The main difference between the two is that sliding or rolling windows continue to accumulate rows.  These are used for windowed aggregation

Fold

Fold lets you convert multiple columns with like values into a single row. For fold, select multiple columns with like values (for example, a series of columns that contain counts for different categories, such as product types or employees). Fold creates two new columns: a key column with values drawn from the column names of selected columns, and a value column with values drawn from selected column values.

To implement a fold:

1.Enter a column name for the key column. This will contain the selected columns' names.
2.Enter a column name for the value column. This column will contain values drawn from the selected columns.
3.Choose or enter source columns. These columns will be combined into the new column. All of these columns should contain the same type of values (counts, temperatures, and so on).
4.If desired, enter a column suffix for the new columns. This suffix will make it easy to identify the new columns.
5.Click the + icon to add the command to the Guide script.
6.The results of the script appear in the Output View window.

sl_pivot_fold