REGEX_LOG_PARSE

<< Click to Display Table of Contents >>

Navigation:  Integrating SQLstream Blaze with Other Systems > Appendix A: Legacy Adapters and Agents > Log File Adapter >

REGEX_LOG_PARSE

Previous pageReturn to chapter overviewNext page

Note: This topic describes a legacy adapter. Unless you have used these in the past, we recommend using the Extensible Common Data Adapter and Agent instead of these legacy adapters and agents.

Parses a character string based on Java Regular Expression patterns as defined in java.util.regex.pattern.

Columns are based on match groups defined in the regex-pattern. Each group defines a column, and the groups are processed from left to right. Failure to match produces a NULL value result: If the regular expression does not match the the string passed as the first parameter, NULL is returned.

Syntax:

REGEX_LOG_PARSE (<character-expression>,<regex-pattern>,<columns>)<regex-pattern> ::= <character-expression>[OBJECT] <columns> ::= <columnname> [ <datatype> ] {, <columnname> <datatype> }*

The columns returned will be COLUMN1 through COLUMNn, where n is the number of groups in the regular expression. The columns will be of type varchar(1024).

Example 1

The following code returns two columns with zero or more of [0-9] of the string 'abcde111fghij22klm'

SELECT t.r."COLUMN1", t.r."COLUMN2" from

(values (REGEX_LOG_PARSE('abcde111fghij22klm', '([^0-9]*)1*([^0-9]*)2*([^0-9]*)'))) t(r);

 

         +----------+-----------+

         | COLUMN1  | COLUMN2   |

         +----------+-----------+

         | 111      | 22        |

         +----------+-----------+

         1 row selected

 

Example 2

The following code returns three columns:

one, labeled "Amount," with one or more digits

one, labeled "Item," with one or more non-whitespace characters, followed by whitespace, followed by one or more non-whitespace characters

one, labeled "Ship Date," with one or more non-whitespace characters

SELECT t.r."Amount", t.r."Item", t.r."Ship Date" from

(values (REGEX_LOG_PARSE('445 light bulbs should be in the basket. The order will ship on 5/4/2014.', '(^(\d+) (\S+\s\S+) should be in the basket. The order will ship on (\S+))'))) t(r);

 

         +----------+-------------+-------------+

         | Amount   |   Item      |  Ship Date  |

         +----------+-------------+-------------+

         | 445      | light bulbs |  5/4/2014   |

         +----------+-------------+-------------+

         1 row selected

 

See also the topic FAST_REGEX_LOG_PARSER in this guide, as well as the topic LogFileAdapter.

Quick Regex Reference

For full details on Regex, see java.util.regex.pattern

[xyz]        Find single character of: x, y or z

[^abc]        Find any single character except: x, y, or z

[r-z]        Find any single character between r-z

[r-zR-Z]        Find any single character between r-z or R-Z

^        Start of line

$        End of line

\A        Start of string

\z        End of string

.        Any single character

\s        Find any whitespace character

\S        Find any non-whitespace character

\d        Find any digit

\D        Find any non-digit

\w        Find any word character (letter, number, underscore)

\W        Find any non-word character

\b        Find any word boundary

(...)        Capture everything enclosed

(x|y)        Find x or y (also works with symbols such as \d or \s)

x?        Find zero or one of x (also works with symbols such as \d or \s)

x*        Find zero or more of x (also works with symbols such as \d or \s)

x+        Find one or more of x (also works with symbols such as \d or \s)

x{3}        Find exactly 3 of x (also works with symbols such as \d or \s)

x{3,}        Find 3 or more of x (also works with symbols such as \d or \s)

x{3,6}        Find between 3 and 6 of x (also works with symbols such as \d or \s)