URI Parse UDX

<< Click to Display Table of Contents >>

Navigation:  Integrating SQLstream Blaze with Other Systems > Transforming Data in s-Server >

URI Parse UDX

Previous pageReturn to chapter overviewNext page

The URI Parse UDX parses decorated URIs into name-value pairs. URIParse is called with an input stream and the name of the column containing the decorated URIs. The SQL function declaration's output rowtype determines which input columns and which parsed named values are placed in the output stream.

Parsed values are copied to output columns specified with the same case-sensitive name as the supplied parameter. Output columns whose names do not exactly match any parameter value are discarded.
Pass-through input: Each input column whose name exactly matches (case-sensitive) a specified output column name is copied as a "pass-through".
Input columns that do not exactly match output columns are discarded.

The URI Parse UDX is implemented externally in a Java JAR file, located in the SQLSTREAM_HOME/plugin directory. This external JAR file is loaded into SQLstream s-Server using CREATE JAR syntax.

The following options are passed as arguments to the SQL function invocation:

Parameter

Description

inputRows

The input stream of rows

parseColName

(Optional) Name (case-sensitive) of the column to be parsed.

If this parameter is not supplied, then the first non-ROWTIME column is used.

querySplit

The regular expression to be used for splitting the request and decoration. request refers everything before the first querySplit, usually a question mark, "\?", while decoration refers to everything after the first querySplit, usually a question mark, "\?"

paramSplit

The regular expression to be used for splitting the param-value strings from decoration; paramSplit is usually the character ampersand, "&", or semicolon, ";"

pairSplit

The regular expression to be used for splitting the param/value pairs from param-value strings.

--- pairSplit is usually the character equals, "=", or colon, ":".

results

Name of the output stream of rows

Note: Because the xxxxSplit arguments are used as Java regular expressions (regex), be sure to escape any split character that is also a regex metacharacter. For example, the querySplit is usually a question mark, "?". Since "?" is the conditional-match regex metacharacter, it must be escaped, "\?", when passed in from the SQL.

Consider a stream named "s" with a single varchar column named "Message" where the contents of the column contain:

http://sqlstream.com/index.jsp?fruit=<value>&id=<value>&state=<value>.

Creating a function to call uriParse

The following code calls the function uriParse by creating a function called "uriParseStream" and then defining what the URI will look like:

CREATE FUNCTION "uriParseStream"(

    inputRows CURSOR,

    parseColName VARCHAR(32),

    querySplit VARCHAR(3),

    paramSplit VARCHAR(3),

    pairSplit  VARCHAR(3))

    RETURNS TABLE( "fruit"     varchar(16),

                   "id"     int,

                   "state"     varchar(2)

                  )

    LANGUAGE JAVA

    PARAMETER STYLE SYSTEM DEFINED JAVA

    NO SQL

    EXTERNAL NAME 'class com.sqlstream.plugin.uriparse.UriParse.parseUri';

Use the "uriParseStream" UDX

You then call the function through the following code:

SELECT STREAM "a", "b" ,"c"

FROM STREAM("uriParseStream"(

    CURSOR( SELECT STREAM '\?' || MESSAGE AS "message" FROM "s"), -- input rows

    'message', -- parseColName

    '\?',      -- querySplit

    '\|',      -- paramSplit

    '='));     -- pairSplit.

 

which would return columns for a, b, and c:

 

fruit

id

state

apple

6789

NJ

pear

6791

TX

nectarine

6784

FL

 

and so on.