<< Click to Display Table of Contents >> Aggregate Functions 
An aggregate function returns a result calculated not from a single row but rather aggregated from data contained in a finite set of rows, or from information about a finite set of rows. An aggregate function may appear in
•  the <selection list> portion of a SELECT clause, 
•  an ORDER BY clause, or 
•  a HAVING clause. 
An aggregate function is different from an analytic function, which are always evaluated relative to a window that must be specified, and so cannot appear in a HAVING clause. Other differences are described in the next topic.
Aggregate functions operate slightly differently in aggregate queries on tables than when used in aggregate queries on streams:
If an aggregate query on tables contains a GROUP BY clause, then the aggregate function returns one result per group in the set of input rows. Lacking an explicit GROUP BY clause is equivalent to GROUP BY (), and returns only one result for the entire set of input rows.
On streams, an aggregate query must contain an explicit GROUP BY clause on a monotonically increasing expression based on rowtime. Without one, the sole group is the whole stream, which never ends, preventing any result from being reported. Adding a GROUP BY clause based on a monotonically increasing expression breaks the stream into finite sets of rows, contiguous in time, and each such set can then be aggregated and reported.
Whenever a row arrives that changes the value of the monotonically increasing grouping expression, a new group is started and the previous group is considered complete. The sServer then outputs the value of the aggregate functions. Note that the GROUP BY clause may also include other nonmonotonicallyincreasing expressions, in which case more than one result per set of rows may be produced.
Performing an aggregate query on streams is often referred to as streaming aggregation, as distinct from the 'windowed aggregation' discussed here. Both are discussed in the topic Application Design Considerations in the sServer Concepts Guide. Streamtostream joins are discussed at Stream to Stream Joins; Timebased and rowbased windows are discussed in the Glossary.
If an input row contains a null in a column used as an input to a data analysis function, the data analysis function ignores the row (except for COUNT).
Differences Between Aggregate and Analytic Functions
Function Type 
Outputs 
Rows or Windows Used 
Notes 
Aggregate Functions 
One output row per group of input rows 
All output columns are calculated over the same window or same group of rows 
COUNT DISTINCT is not allowed in streaming aggregation. That is, statements of the following type are not allowed: SELECT COUNT(DISTINCT x) ... FROM ... GROUP BY ... 
One output row for each input row. 
Each output column may be calculated using a different window or partition 
COUNT DISTINCT can be used as an analytic function and in windowed aggregation. 
Streaming Aggregation and Rowtime bounds
Normally, an aggregate query generates a result when a row arrives that changes the value of the monotonically increasing expression in the GROUP BY. For example, if the query is grouped by FLOOR( rowtime TO MINUTE ), and the rowtime of the current row is 9:59.30, then a new row with a rowtime of 10:00.00 will trigger the result.
Alternately, a rowtime bound can be used to advance the monotonically increasing expression and enable the query to return a result. For example, if the query is grouped by FLOOR( rowtime TO MINUTE ), and the rowtime of the current row is 9:59.30, then an incoming rowtime bound of 10:00.00 will also trigger the result.
See below for examples of rowtime bounds in streaming aggregation, and the Concepts Guide for more information on rowtime bounds in general.
Aggregate function list
SQLstream sServer supports the following aggregate functions:
•  AVG 
•  MAX 
•  MIN 
•  SUM 
See also the Statistical Variance and Deviation Functions.
Examples: Aggregate queries on tables
Assume the following data in table SALES.EMPS:
NAME 
DEPTNO 
AGE 
GENDER 
Amy 
10 
30 
F 
Beth 
20 
25 
F 
Charles 
30 
40 
M 
David 
10 
20 
M 
Eric 
10 
40 
M 
Fiona 
30 


Gregory 
40 
60 
M 
Harriet 
40 
55 
F 
The following SQL uses the AVG aggregate function as part of a query to find the average age of all employees:
SELECT
AVG(AGE) AS AVERAGE_AGE
FROM SALES.EMPS;
Result:
AVERAGE_AGE 
38 
To find the average age of employees in each department, we can add an explicit GROUP BY clause to the query:
SELECT
DEPTNO,
AVG(AGE) AS AVERAGE_AGE
FROM SALES.EMPS
GROUP BY DEPTNO;
Returns:
DEPTNO 
AVERAGE_AGE 
10 
30 
20 
25 
30 
40 
40 
57 
Examples: Aggregate Queries on Streams (Streaming Aggregation)
Assume the following information flowing through the stream WEATHERSTREAM:
ROWTIME 
CITY 
TEMP 
20181101 01:00:00.0 
Denver 
29 
20181101 01:00:00.0 
Anchorage 
2 
20181101 06:00:00.0 
Miami 
65 
20181101 07:00:00.0 
Denver 
32 
20181101 09:00:00.0 
Anchorage 
9 
20181101 13:00:00.0 
Denver 
50 
20181101 17:00:00.0 
Anchorage 
10 
20181101 18:00:00.0 
Miami 
71 
20181101 19:00:00.0 
Denver 
43 
20181102 01:00:00.0 
Anchorage 
4 
20181102 01:00:00.0 
Denver 
39 
20181102 07:00:00.0 
Denver 
46 
20181102 09:00:00.0 
Anchorage 
3 
20181102 13:00:00.0 
Denver 
56 
20181102 17:00:00.0 
Anchorage 
2 
20181102 19:00:00.0 
Denver 
50 
20181103 01:00:00.0 
Denver 
36 
20181103 01:00:00.0 
Anchorage 
1 
If you want to find the minimum and maximum temperature recorded anywhere each day (globally regardless of city), the minimum and maximum temperature can be calculated using the aggregate functions MIN and MAX respectively. To indicate that we want this information on a perday basis (and to provide a monotonically increasing expression as the argument of the GROUP BY clause), we use the FLOOR function to round each row's rowtime down to the nearest day:
SELECT STREAM
FLOOR(WEATHERSTREAM.ROWTIME to DAY) AS FLOOR_DAY,
MIN(TEMP) AS MIN_TEMP,
MAX(TEMP) AS MAX_TEMP
FROM WEATHERSTREAM
GROUP BY FLOOR(WEATHERSTREAM.ROWTIME TO DAY);
The result of the aggregate query is:
FLOOR_DAY 
MIN_TEMP 
MAX_TEMP 
20181101 00:00:00.0 
2 
71 
20181102 00:00:00.0 
2 
56 
Note that there is no row for 20181103, even though the example data does include temperature measurements on that day. This is because the rows for 20181103 cannot be aggregated until all rows for that day are known to have arrived, and that will only happen when either a row with a rowtime of 20181104 00:00:00.0 (or later) or a rowtime bound of 20181104 00:00:00.0 (or later) arrives. If and when either did arrive, the next result would be:
FLOOR_DAY 
MIN_TEMP 
MAX_TEMP 
20181103 00:00:00.0 
1 
36 
Let's say that instead of finding the global minimum and maximum temperatures each day, we want to find the minimum, maximum, and average temperature for each city each day. To do this, we use the SUM and COUNT aggregate functions to compute the average, and add CITY to the GROUP BY clause:
SELECT STREAM
FLOOR(WEATHERSTREAM.ROWTIME TO DAY) AS FLOOR_DAY,
CITY,
MIN(TEMP) AS MIN_TEMP,
MAX(TEMP) AS MAX_TEMP,
SUM(TEMP)/COUNT(TEMP) AS AVG_TEMP
FROM WEATHERSTREAM
GROUP BY FLOOR(WEATHERSTREAM.ROWTIME TO DAY), CITY;
The result of the aggregate query is:
FLOOR_DAY 
CITY 
MIN_TEMP 
MAX_TEMP 
AVG_TEMP 
20181101 00:00:00.0 
Anchorage 
2 
10 
7 
20181101 00:00:00.0 
Denver 
29 
50 
38 
20181101 00:00:00.0 
Miami 
65 
71 
68 
20181102 00:00:00.0 
Anchorage 
2 
4 
3 
20181102 00:00:00.0 
Denver 
39 
56 
47 
In this case, the arrival of rows for a new day's temperature measurements triggers the aggregation of the previous day's data, grouped by CITY, which then results in one row being produced per city included in the day's measurements.
Here again, a rowtime bound 20181104 00:00:00.0 could be used to prompt a result for 20181103 prior to any actual measurements for 20181104 coming in:
FLOOR_DAY 
CITY 
MIN_TEMP 
MAX_TEMP 
AVG_TEMP 
20181103 00:00:00.0 
Anchorage 
1 
1 
1 
20181103 00:00:00.0 
Denver 
36 
36 
36 