<< Click to Display Table of Contents >>

Navigation:  Integrating SQLstream Blaze with Other Systems > Transforming Data in s-Server >


Previous pageReturn to chapter overviewNext page

The XMLParser UDX is a User Defined Transform (UDX) that uses an XPATH (the XML Path Language) definition to transform the contents of a single column into distinct columns for each attribute. The basic function is to match XML attributes to SQL column names, extract the data values, and place each attribute as an individual column value.

The inputs to the XMLParse UDX are an input stream supplied as a CURSOR argument, the name of a column to be parsed by the XMLParse, and the XPATH definition of the XML.

The XMLParse output is a new stream containing the individual attribute values as column values.

XMLParse.jar UDX file

The XMLParse UDX is implemented in a Java JAR file located in the $SQLSTREAM_HOME/plugin directory. As with other jar files, it is loaded with the CREATE JAR statement.

Note: $SQLSTREAM_HOME refers to the installation directory for s-Server, such as /opt/sqlstream/5.0.XXX.


 SET SCHEMA '"test"';

 CREATE JAR "test"."XmlParse"

     LIBRARY 'file:plugin/XmlParse.jar'



Alternately, you can create a link to the XMLParse UDX in the autocp directory.

cd $SQLSTREAM_HOME/plugin/autocp

ln -s ../XmlParse.jar


Creating a SQL Function to use the XMLParse UDX

The XMLParse input parameters specify the following information:

The set of input rows,

The column name of the column containing the XML data,

The file name of the XPATH configuration file. The location of this file is

 $SQLSTREAM_HOME/plugin/jndi/xmlparse/<configuration file name>.properties


The Table returned by the XMLParse UDX is a set of columns. Each column in the input stream named in the list of output Table columns receives the corresponding value passed through to the output. The column name of the column containing the XML data is parsed according to the XPATH configuration and any attributes corresponding to XML tags appear in the output for each column named in the list of output columns. XML tags that have no matching column names in the list of output Table columns are silently dropped. If the XML data has an array of repeating values, each repeating value will correspond to one UDX output row.

 CREATE FUNCTION "test"."ParseXml"

           (c CURSOR,

            parseColName VARCHAR(20),

            confName VARCHAR(20))



                 -- Input values passed to the output

                  "location" VARCHAR(30),

                  "city" VARCHAR(32),

                  "region" VARCHAR(32),

                 -- Output columns parsed from the XML

                  id INTEGER,

                  name VARCHAR(60),

                  title VARCHAR(60),

                  years VARCHAR(60)








Calling the Function

The following code calls the UDX.

 CREATE STREAM "InputStream" (

 "location" VARCHAR(30),

 "city" VARCHAR(32),

 "region" VARCHAR(32),

 id int,

      XML_DATA VARCHAR(32000));


     FROM TABLE("test"."ParseXml"(

         -- First parameter is input stream

            CURSOR(SELECT STREAM * FROM "InputStream"),

         -- Second parameter is the name of the column in the input data that

            contains the XML data


         -- Third parameter is the name of the XPath definition file.



Example Properties File and Input

The content of the Properties file passed as the third parameter of the function above is written in XPath. XPath is used for selecting parts of an XML document and is defined by the World Wide Web Consortium, W3C.

An example of the content of the properties file which could be used for the above code would be:





Some Sample Input

 SET SCHEMA '"test"';

 INSERT INTO "InputStream" ("location", "city", "region", id, xml_data)

   VALUES ('US', 'Oakland', 'California', 12345,

    '<testxml><row><name>joe</name><jobfunctions><function title="truck driver" years="1"/>

     <function title="clerk" years="5"/></jobfunctions></row></testxml>');



Set the appropriate trace level in /var/log/sqlstream/Trace.propertiesTrace.properties for:





Error Conditions which disturb successful processing




Logs the mapping input columns and output columns


Logs the mapping between XML attributes and output columns




The maximum size of the XML document is constrained by the maximum row and column size of 65536.