EXPLAIN PLAN

Navigation:  Streaming SQL Reference Guide >

EXPLAIN PLAN shows how a query will be evaluated.

Syntax

<explain> :=

    EXPLAIN PLAN

        [ <explain-detail-level> ]

        [ <explain-depth> ]

        [ AS XML ]

        FOR <query-or-DML>

<explain-detail-level> :=

    INCLUDING [ ALL ] ATTRIBUTES

  | EXCLUDING ATTRIBUTES

<explain-depth> :=

  WITH IMPLEMENTATION

  | WITH TYPE

  | WITHOUT IMPLEMENTATION

<query-or-DML> :=

    <query>

  | <insert>

  | <delete>

 

EXPLAIN PLAN FOR <query-or-DML> shows the optimizer plan.

Defaults

INCLUDING ATTRIBUTES

WITH IMPLEMENTATION

Options

INCLUDING ALL ATTRIBUTES provides additional attributes, for example, cost.

EXCLUDING ATTRIBUTES shows relational operator names only.

WITH IMPLEMENTATION, the default, yields an optimized plan with all physical operators.

WITHOUT IMPLEMENTATION yields the unoptimized abstract relational plan.

AS XML yields detailed output in element-oriented XML.

Results

The default mode is WITH IMPLEMENTATION, for which the optimizer produces a final physical plan.

Specifying WITHOUT IMPLEMENTATION causes EXPLAIN PLAN to stop after translating the SQL statement into logical relational algebra and to return a pure algebraic expression. The explanation is returned as a normal result set with a single VARCHAR column, each row representing a plan node of some kind.

Specifying WITH TYPE causes EXPLAIN PLAN to print the names and datatypes of the result columns (including nullability).

If AS XML is specified, then the plan structure is deeper: nodes may represent individual attributes of plan operators. If AS XML is not specified, then each line represents either

a physical execution stream node (for WITH IMPLEMENTATION)

or

a relational algebra operator (for WITHOUT IMPLEMENTATION).

Specifying EXCLUDING ATTRIBUTES instead causes EXPLAIN PLAN to display only operator names. The default detail level is INCLUDING ATTRIBUTES, whereas specifying INCLUDING ALL ATTRIBUTES adds additional attributes to the output, such as cost.

Examples

Let's start by seeing how SQL gets converted into relational algebra. If you're using sqlline, you may want to maximize your shell window, because lines of the plan can be very long; setting the output format to csv lets you see full lines of output (over-long lines fold onto the next line). The output appears immediately below the EXPLAIN PLAN command, with each line set off by single quotes.

0: jdbc:sqlstream://myhostname:9922> !set outputformat csv

0: jdbc:sqlstream://myhostname:9922> explain plan without implementation for

select depts.name as dname,emps.name as ename

from sales.emps inner join sales.depts

on emps.deptno=depts.deptno

order by 1,2;

'column0'

'SortRel(sort0=[$0], sort1=[$1], dir0=[Ascending], dir1=[Ascending])'

'  ProjectRel(DNAME=[$11], ENAME=[$1])'

'    JoinRel(condition=[=($2, $10)], joinType=[inner])'

'      TableAccessRel(table=[[LOCALDB, SALES, EMPS]])'

'      TableAccessRel(table=[[LOCALDB, SALES, DEPTS]])'

5 rows selected

 

So far, this is fairly easy to read. The query tree structure is rendered via indentation level, so the two instances of TableAccessRel are the leaves feeding into the JoinRel. The attributes of the join are the join condition (=($2, $10), that is, columns 2 and 10 are equal) and the join type (inner, left outer, etc.).
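Since nesting depth is encoded purely by indentation, the operator tree can be reconstructed mechanically. The following small Python parser is purely illustrative (it is not part of any SQLstream tooling):

```python
def parse_plan(lines, indent=2):
    """Reconstruct the operator tree from EXPLAIN PLAN output, where
    nesting depth is encoded as leading spaces (2 per level)."""
    root, stack = None, []   # stack[d] = most recent node seen at depth d
    for line in lines:
        name = line.lstrip()
        depth = (len(line) - len(name)) // indent
        node = {"name": name.split("(")[0], "children": []}
        if depth == 0:
            root = node
        else:
            stack[depth - 1]["children"].append(node)
        stack[depth:] = [node]
    return root

plan = [
    "SortRel(sort0=[$0], sort1=[$1])",
    "  ProjectRel(DNAME=[$11], ENAME=[$1])",
    "    JoinRel(condition=[=($2, $10)], joinType=[inner])",
    "      TableAccessRel(table=[[LOCALDB, SALES, EMPS]])",
    "      TableAccessRel(table=[[LOCALDB, SALES, DEPTS]])",
]
tree = parse_plan(plan)
# The two TableAccessRel leaves hang off the JoinRel:
join = tree["children"][0]["children"][0]
print([c["name"] for c in join["children"]])  # ['TableAccessRel', 'TableAccessRel']
```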

The join condition can be a bit confusing. The dollar-sign expressions refer to 0-based column ordinals within the input to the join, in this case corresponding to the DEPTNO column on either side. The join input is the result of concatenating the row from the left side of the join with the corresponding row from right side. Since the first table in the join is EMPS, $2 refers to the third column in the concatenated input, which is the third column in EMPS: emps.deptno. Since there are 10 columns in EMPS, $10 refers to the 11th column of the concatenated input, which is the first column in DEPTS: depts.deptno. The following code shows the table definitions:

CREATE TABLE depts(

    deptno INTEGER NOT NULL PRIMARY KEY,

    name VARCHAR(128) NOT NULL CONSTRAINT depts_unique_name UNIQUE);

CREATE TABLE emps(

    empno INTEGER NOT NULL,

    name VARCHAR(128) NOT NULL,

    deptno INTEGER NOT NULL,

    gender CHAR(1) DEFAULT 'M',

    city VARCHAR(128),

    empid INTEGER NOT NULL UNIQUE,

    age INTEGER,

    public_key VARBINARY(50),

    slacker BOOLEAN,

    manager BOOLEAN NOT NULL,

    PRIMARY KEY (deptno,empno));

CREATE INDEX emps_ux ON emps(name);

 

The same dollar-sign convention is used throughout the plan.
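The arithmetic behind those ordinals can be sketched in Python. The column lists below are copied from the table definitions above; the resolve helper is purely illustrative:

```python
# Columns in declaration order, taken from the CREATE TABLE statements above.
EMPS = ["empno", "name", "deptno", "gender", "city",
        "empid", "age", "public_key", "slacker", "manager"]
DEPTS = ["deptno", "name"]

# The join input is the left row concatenated with the right row,
# so an ordinal $n indexes into this combined column list.
JOIN_INPUT = EMPS + DEPTS

def resolve(ordinal):
    """Map a 0-based join-input ordinal back to (table, column)."""
    if ordinal < len(EMPS):
        return ("EMPS", EMPS[ordinal])
    return ("DEPTS", DEPTS[ordinal - len(EMPS)])

# The join condition =($2, $10) compares these two columns:
print(resolve(2))   # ('EMPS', 'deptno')
print(resolve(10))  # ('DEPTS', 'deptno')
```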

Producing a physical plan

Now, let's tell the optimizer to go all the way and produce a physical plan:

0: jdbc:sqlstream://myhostname:9922> !set outputformat csv

0: jdbc:sqlstream://myhostname:9922> explain plan for

select depts.name as dname,emps.name as ename

from sales.emps inner join sales.depts

on emps.deptno=depts.deptno

order by 1,2;

'column0'

'NetworkRel'

'  FennelSortRel(key=[[0, 1]], discardDuplicates=[false])'

'    AspenCalcRel(expr#0..11=[{inputs}], DNAME=[$t11], ENAME=[$t1])'

'      FtrsIndexSearchRel(table=[[LOCALDB, SALES, DEPTS]], projection=[*],

           index=[SYS$CONSTRAINT_INDEX$SYS$PRIMARY_KEY$DEPTS], uniqueKey=[true],

              preserveOrder=[false], outer=[false], inputKeyProj=[[2]],

                 inputJoinProj=[[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]], inputDirectiveProj=[[]])'

'        FtrsIndexScanRel(table=[[LOCALDB, SALES, EMPS]], projection=[*],

              index=[SYS$CONSTRAINT_INDEX$SYS$PRIMARY_KEY$EMPS], preserveOrder=[false])'

5 rows selected

 

Now the plan is no longer a tree; instead, it's a linear chain of operators. That's because the optimizer found a plan involving an index join. Reading up from the bottom, the lowest level (FtrsIndexScanRel) scans EMPS, and feeds each row into an index search (FtrsIndexSearchRel) on DEPTS. Matches will append the joined attributes and mismatches will be filtered out. After that, a calculator (AspenCalcRel) projects out only the desired attributes, which are fed into the sorter (FennelSortRel). (Those variables denoted by $t in the calculator line will be explained later on.)
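The shape of this plan is essentially an index nested-loop join. As a rough Python sketch of the idea, with hypothetical sample rows (this is not s-Server's implementation):

```python
# Hypothetical sample rows: (empno, name, deptno) and (deptno, name).
emps = [(100, "Fred", 10), (110, "Eric", 20), (120, "Wilma", 20)]
depts = [(10, "Sales"), (20, "Marketing")]

# FtrsIndexSearchRel probes an index on DEPTS; model it as a dict keyed on deptno.
depts_index = {deptno: (deptno, name) for deptno, name in depts}

# FtrsIndexScanRel streams EMPS rows; each probe either appends the matching
# DEPTS row (inner join) or drops the row when there is no match.
joined = []
for emp in emps:
    match = depts_index.get(emp[2])
    if match is not None:
        joined.append(emp + match)

# AspenCalcRel then projects DNAME and ENAME, and FennelSortRel sorts the result.
result = sorted((dept_name, emp_name) for (_, emp_name, _, _, dept_name) in joined)
print(result)  # [('Marketing', 'Eric'), ('Marketing', 'Wilma'), ('Sales', 'Fred')]
```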

You can see that there are a lot of additional physical attributes, such as which indexes are used, whether unique-key search optimizations are warranted, etc. If you don't want to be bothered with all that, you can get just the bare bones:

0: jdbc:sqlstream://myhostname:9922> !set outputformat csv

0: jdbc:sqlstream://myhostname:9922> explain plan excluding attributes for

select depts.name as dname,emps.name as ename

from sales.emps inner join sales.depts

on emps.deptno=depts.deptno

order by 1,2;

'column0'

'NetworkRel'

'  FennelSortRel'

'    AspenCalcRel'

'      FtrsIndexSearchRel'

'        FtrsIndexScanRel'

5 rows selected

Getting more information

If you want even more information, such as cost, you can get that too:

0: jdbc:sqlstream://myhostname:9922> !set outputformat csv

0: jdbc:sqlstream://myhostname:9922> explain plan including all attributes for

select depts.name as dname,emps.name as ename

from sales.emps inner join sales.depts

on emps.deptno=depts.deptno

order by 1,2;

'column0'

'NetworkRel: rowcount = 100.0, cumulative cost = {1400.0 rows, 13460.51701859881 cpu, 10556.0 io}'

'  FennelSortRel(key=[[0, 1]], discardDuplicates=[false]): rowcount = 100.0,

                cumulative cost = {1300.0 rows, 13460.51701859881 cpu, 10300.0 io}'

'    AspenCalcRel(expr#0..11=[{inputs}], DNAME=[$t11], ENAME=[$t1]): rowcount = 100.0,

                          cumulative cost = {1200.0 rows, 13000.0 cpu, 10200.0 io}'

'      FtrsIndexSearchRel(table=[[LOCALDB, SALES, DEPTS]], projection=[*],

         index=[SYS$CONSTRAINT_INDEX$SYS$PRIMARY_KEY$DEPTS], uniqueKey=[true],

             preserveOrder=[false], outer=[false], inputKeyProj=[[2]],

                 inputJoinProj=[[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]], inputDirectiveProj=[[]]):

                     rowcount = 100.0, cumulative cost = {1100.0 rows, 10200.0 cpu, 10200.0 io}'

'        FtrsIndexScanRel(table=[[LOCALDB, SALES, EMPS]], projection=[*],

             index=[SYS$CONSTRAINT_INDEX$SYS$PRIMARY_KEY$EMPS],

                 preserveOrder=[false]): rowcount = 100.0,

                      cumulative cost = {1000.0 rows, 10000.0 cpu, 10000.0 io}'

5 rows selected

Revisiting CalcRel in a new query

Now, let's revisit CalcRel by issuing a new query:

0: jdbc:sqlstream://myhostname:9922> !set outputformat csv

0: jdbc:sqlstream://myhostname:9922> explain plan for

select 2*(deptno+1), upper(name)

from sales.depts

where deptno+1 > 5;

'column0'

'NetworkRel'

'  IteratorToFennelConverter'

'    IterCalcRel(expr#0..1=[{inputs}], expr#2=[1], expr#3=[+($t0, $t2)], expr#4=[5],

      expr#5=[>($t3, $t4)], expr#6=[2], expr#7=[*($t6, $t3)], expr#8=[UPPER($t1)],

         EXPR$0=[$t7], EXPR$1=[$t8], $condition=[$t5])'

'       AspenOldToIteratorConverter'

'        FtrsIndexScanRel(table=[[LOCALDB, SALES, DEPTS]], projection=[*],

            index=[SYS$CONSTRAINT_INDEX$SYS$PRIMARY_KEY$DEPTS], preserveOrder=[false])'

5 rows selected

 

This time, for its own mysterious reasons, the optimizer chose to convert from Fennel to Java early and use a Java iterator implementation of the calculator (IterCalcRel). Almost the same plan rendering is used for both calculator implementations; let's take a closer look. Every calculator program has four parts:

input row (expr#0 and expr#1 in the plan)

common sub-expressions (expr#2 through expr#8 in the plan)

output row (EXPR$0 and EXPR$1 in the plan). For the Fennel calculator, these show up as $f variables instead.

optional condition ($condition in the plan)

The processing semantics are:

1. Evaluate common sub-expressions in terms of the current input.

2. Evaluate the condition.

3. If the condition passes, evaluate the output expressions and produce an output row.

(Calculator implementations may reorder operations to optimize processing as long as these semantics are preserved.)

Within a program, a $t expression refers to the earlier expression with the corresponding number. So in expr#8=[UPPER($t1)], the $t1 refers to expr#1, which is the second input column (NAME). You can see common-subexpression elimination in action for the expression DEPTNO+1. This is computed only once, by expr#3, and reused in both expr#5 (referenced from $condition) and expr#7 (referenced from output EXPR$0).
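To make the semantics concrete, here is a Python sketch that evaluates the example's calculator program by hand. The calc function is illustrative, not part of any SQLstream API:

```python
def calc(row):
    """Evaluate the IterCalcRel program from the example:
    expr#2=[1], expr#3=[+($t0,$t2)], expr#4=[5], expr#5=[>($t3,$t4)],
    expr#6=[2], expr#7=[*($t6,$t3)], expr#8=[UPPER($t1)],
    EXPR$0=[$t7], EXPR$1=[$t8], $condition=[$t5]."""
    t = list(row)            # expr#0..1: the input row (DEPTNO, NAME)
    t.append(1)              # expr#2: literal 1
    t.append(t[0] + t[2])    # expr#3: DEPTNO + 1 (computed once, reused below)
    t.append(5)              # expr#4: literal 5
    t.append(t[3] > t[4])    # expr#5: DEPTNO + 1 > 5
    t.append(2)              # expr#6: literal 2
    t.append(t[6] * t[3])    # expr#7: 2 * (DEPTNO + 1)
    t.append(t[1].upper())   # expr#8: UPPER(NAME)
    if t[5]:                 # $condition: emit a row only if it passes
        return (t[7], t[8])  # output row: EXPR$0, EXPR$1
    return None

print(calc((10, "marketing")))  # (22, 'MARKETING')
print(calc((3, "sales")))       # None: 3 + 1 > 5 is false
```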

Other plan node interpretations

Finally, let's cover a couple of other plan nodes that require special interpretation.

Index scans

0: jdbc:sqlstream://myhostname:9922> !set outputformat csv

0: jdbc:sqlstream://myhostname:9922> explain plan for

select name from sales.depts where deptno > 20;

'column0'

'NetworkRel'

' IteratorToFennelConverter'

'  IterCalcRel(expr#0..1=[{inputs}], expr#2=[20], expr#3=[>($t0, $t2)], NAME=[$t1], $condition=[$t3])'

'   AspenOldToIteratorConverter'

'    FtrsIndexScanRel(table=[[LOCALDB, SALES, DEPTS]], projection=[*],

        index=[SYS$CONSTRAINT_INDEX$SYS$PRIMARY_KEY$DEPTS], preserveOrder=[false])'

5 rows selected

 

This plan is straightforward: the bottom FtrsIndexScanRel scans all of DEPTS via its primary-key index, and the IterCalcRel above it applies the WHERE clause. In the calculator program, expr#2=[20] is the literal 20, expr#3=[>($t0, $t2)] tests DEPTNO > 20, $condition=[$t3] discards rows that fail the test, and NAME=[$t1] projects the output column.

When the optimizer instead chooses a bounded index search, the calculator feeding the search produces a search key and associated directives, which require special interpretation. Directives use mathematical interval notation:

a '(' directive denotes an open lower bound (strictly greater than) and '[' a closed one; similarly ')' and ']' bound the upper end;

each bound directive is accompanied by its search key coordinate (for example, 20 for DEPTNO > 20);

a '+' directive represents positive infinity for the upper bound; its coordinate is null, since infinity has no associated coordinate. If the WHERE clause also had AND DEPTNO < 30, the upper directive would be ')' with coordinate 30 instead.
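A Python sketch of how such a bound pair could be evaluated follows. The in_interval helper is hypothetical, and the '-' directive for an unbounded lower end is an assumption here:

```python
def in_interval(value, lower_directive, lower_key, upper_directive, upper_key):
    """Check a value against search bounds expressed with interval directives:
    '(' / ')' are open bounds, '[' / ']' are closed bounds, and
    '-' / '+' (assumed notation) mean unbounded, with a null coordinate."""
    if lower_directive == '(':
        ok_low = value > lower_key
    elif lower_directive == '[':
        ok_low = value >= lower_key
    else:                       # '-': unbounded below
        ok_low = True
    if upper_directive == ')':
        ok_high = value < upper_key
    elif upper_directive == ']':
        ok_high = value <= upper_key
    else:                       # '+': unbounded above
        ok_high = True
    return ok_low and ok_high

# DEPTNO > 20 becomes the interval (20, +infinity):
print(in_interval(20, '(', 20, '+', None))  # False: the lower bound is open
print(in_interval(21, '(', 20, '+', None))  # True
```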

Aggregates

0: jdbc:sqlstream://myhostname:9922> !set outputformat csv

0: jdbc:sqlstream://myhostname:9922> explain plan without implementation for

select deptno, sum(age), count(*)

from sales.emps

group by deptno;

'column0'

'AggregateRel(groupCount=[1], EXPR$1=[SUM(1)], EXPR$2=[COUNT()])'

'  ProjectRel(DEPTNO=[$2], AGE=[$6])'

'    TableAccessRel(table=[[LOCALDB, SALES, EMPS]])'

3 rows selected

 

This query can be expressed in English as "for each department, calculate the sum of the ages of the employees, as well as the number of employees in that department." The lower ProjectRel picks out DEPTNO and AGE from the EMPS table. These are fed into the AggregateRel, which has the following semantics:

The groupCount attribute specifies how many attributes from the input row are to be grouped on. These GROUP BY attributes must come first in the input row, so the groupCount just specifies the prefix size. Besides controlling aggregation, these attributes are also passed through into the output in the same positions.

Zero or more aggregate expressions (EXPR$1 and EXPR$2 in this example) define the rest of the AggregateRel output row, appended after the grouping attributes. Since groupCount=1 and there are two aggregate expressions, the output row has three attributes.

Aggregate expressions refer to inputs by 0-based ordinal. Confusingly, they don't use the usual $x notation. So SUM(1) means sum over the second input column (AGE). COUNT(*) does not actually depend on any value from the input, so the aggregate expression is just COUNT().
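Here is a Python sketch of the AggregateRel semantics for this example, using hypothetical (DEPTNO, AGE) rows; the aggregate function is illustrative only:

```python
def aggregate(rows, group_count):
    """Group on the first group_count columns, then compute
    SUM over input column 1 (AGE) and COUNT() per group, mirroring
    AggregateRel(groupCount=[1], EXPR$1=[SUM(1)], EXPR$2=[COUNT()])."""
    groups = {}
    for row in rows:
        key = row[:group_count]      # the grouping prefix
        groups.setdefault(key, []).append(row)
    # Each output row = grouping prefix + aggregate values, in order.
    return [key + (sum(r[1] for r in rs), len(rs))
            for key, rs in sorted(groups.items())]

# Hypothetical (DEPTNO, AGE) rows from the lower ProjectRel:
rows = [(10, 30), (10, 40), (20, 50)]
print(aggregate(rows, 1))  # [(10, 70, 2), (20, 50, 1)]
```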

EXPLAIN PLAN WITH TYPE

The WITH TYPE option just prints the names and types (including nullability) of the columns returned by a query. For example,

0: jdbc:sqlstream://myhostname:9922> !set outputformat csv

0: jdbc:sqlstream://myhostname:9922> explain plan with type for

select * from sales.emps;

'column0'

'EMPNO INTEGER NOT NULL,'

'NAME VARCHAR(128) CHARACTER SET "ISO-8859-1" COLLATE "ISO-8859-1$en_US$primary" NOT NULL,'

'DEPTNO INTEGER NOT NULL,'

'GENDER CHAR(1) CHARACTER SET "ISO-8859-1" COLLATE "ISO-8859-1$en_US$primary",'

'CITY VARCHAR(128) CHARACTER SET "ISO-8859-1" COLLATE "ISO-8859-1$en_US$primary",'

'EMPID INTEGER NOT NULL,'

'AGE INTEGER,'

'PUBLIC_KEY VARBINARY(50),'

'SLACKER BOOLEAN,'

'MANAGER BOOLEAN NOT NULL'

10 rows selected

Explaining a streaming query --- examples

Here are a few examples using streams:

Example one: a simple stream select

0: jdbc:sqlstream:> EXPLAIN PLAN

    FOR

       SELECT STREAM

              BIDS."ticker",

              BIDS."price",

              BIDS."shares"

       FROM BIDS;

'column0'

'NetworkRel'

'  AspenCalcRel(expr#0..6=[{inputs}], ticker=[$t2], price=[$t4], shares=[$t3])'

'    StreamSourcePortRel(table=[[LOCALDB, SALES, BIDS]])'

3 rows selected

 

This query uses the SALES schema in the standard SQLstream s-Server software distribution. SALES contains the streams BIDS and ASKS. LOCALDB is the built-in database.

Example two: a stream join to find orders of a certain size

0: jdbc:sqlstream:> EXPLAIN PLAN

    FOR

       SELECT STREAM

              BIDS."ticker",

              BIDS."price",

              BIDS."shares"

       FROM BIDS

       JOIN ASKS

              OVER (RANGE INTERVAL '1' MINUTE PRECEDING) AS TX

              ON BIDS."ticker" = TX."ticker"

       WHERE ( BIDS."price"  > 1000);

'column0'

'NetworkRel'

'  AspenCalcRel(expr#0..14=[{inputs}], ticker=[$t3], price=[$t5], shares=[$t4])'

'    AspenWindowedJoinRel(leftKeys=[[2]], rightKeys=[[2]], joinType=[INNER],

        leftWindow=[(ORDER BY `ROWTIME` ROWS BETWEEN CURRENT ROW AND CURRENT ROW)],

        rightWindow=[(ORDER BY `TX`.`ROWTIME` RANGE BETWEEN

                                              INTERVAL '1' MINUTE PRECEDING AND CURRENT ROW)])'

'      AspenCalcRel(expr#0..6=[{inputs}], expr#7=[1000], expr#8=[>($t4, $t7)],

        proj#0..6=[{exprs}], $condition=[$t8])'

'        StreamSourcePortRel(table=[[LOCALDB, SALES, BIDS]])'

'      StreamSourcePortRel(table=[[LOCALDB, SALES, ASKS]])'

6 rows selected

 

This query matches each BIDS row with a price above 1000 against the ASKS rows for the same ticker seen over the preceding minute.
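The effect of the windowed join can be approximated in Python. The rows below are hypothetical, and this loop is only a sketch of the window semantics, not s-Server's streaming implementation:

```python
from datetime import datetime, timedelta

# Hypothetical rows: (rowtime, ticker, price)
asks = [
    (datetime(2024, 1, 1, 9, 0, 10), "ORCL", 990),
    (datetime(2024, 1, 1, 9, 0, 30), "IBM", 1010),
]
bids = [
    (datetime(2024, 1, 1, 9, 1, 0), "IBM", 1020),
    (datetime(2024, 1, 1, 9, 1, 0), "MSFT", 1500),
    (datetime(2024, 1, 1, 9, 1, 0), "ORCL", 900),   # fails the price filter
]

window = timedelta(minutes=1)
matches = []
for bid_time, bid_ticker, bid_price in bids:
    if bid_price <= 1000:            # the AspenCalcRel filter below the join
        continue
    for ask_time, ask_ticker, _ in asks:
        # rightWindow: ASKS rows within the trailing 1-minute range
        if bid_time - window <= ask_time <= bid_time and ask_ticker == bid_ticker:
            matches.append((bid_ticker, bid_price))
print(matches)  # [('IBM', 1020)]
```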

Example three: a stream join to detect possible fraudulent logins

This code uses the schema and streams defined in the demo directory of the standard SQLstream software distribution under the name "edamame."

> EXPLAIN PLAN

>    FOR

>       SELECT STREAM

>              TX."accountNumber",

>              "loginFailureCount",

>              "transactionType",

>               "amount"

>       FROM "SuspectLoginFailures" AS LF

>       JOIN "Transactions"

>              OVER (RANGE INTERVAL '1' MINUTE PRECEDING) AS TX

>              ON LF."accountNumber" = TX."accountNumber"

>       WHERE ("isDebit");

 IterCalcRel(expr#0..8=[{inputs}], accountNumber=[$t2], loginFailureCount=[$t8],

             transactionType=[$t5], amount=[$t3])

   AspenOldToIteratorConverter

     AspenWindowedJoinRel(leftKeys=[[1]], rightKeys=[[1]], joinType=[INNER],

             leftWindow=[(ORDER BY `TX`.`ROWTIME` RANGE BETWEEN INTERVAL

                         '1' MINUTE PRECEDING AND CURRENT ROW)],

             rightWindow=[(ORDER BY `ROWTIME` ROWS BETWEEN CURRENT ROW AND CURRENT ROW)])

       FennelCalcRel(expr#0..4=[{inputs}], proj#0..4=[{exprs}], $condition=[$t3])

         StreamSourcePortRel(table=[[LOCALDB, TESTL, Transactions]])

       StreamSourcePortRel(table=[[LOCALDB, TESTL, SuspectLoginFailures]])

 

The two StreamSourcePortRels describe the two input streams. The FennelCalcRel keeps only rows where "isDebit" is true. The AspenWindowedJoinRel describes the windowed join. The AspenOldToIteratorConverter marks the boundary between the Fennel and Java implementations.