CREATE FOREIGN STREAM


A foreign stream is an instance of a foreign data wrapper that provides access in s-Server to a flow of data either from or to an external system. Foreign streams are often the site where data enters s-Server.

For example, the Extensible Common Data Adapter supports the creation of a foreign stream that will take rows (from a local stream) and send them by email to a recipient.

Syntax

CREATE [OR REPLACE] FOREIGN STREAM <foreign-stream-name> (<column_list>) SERVER <server_name> [OPTIONS (<options_specification>)] [DESCRIPTION <string_literal>]


Note: The OPTIONS required to create a foreign stream depend on the server that the foreign stream references. Servers are often defined for an adapter, such as the SQL/MED Adapter or the Extensible Common Data Adapter. See the s-Server Integrations page in the Integration Guide for more details.

For detailed examples of using CREATE FOREIGN STREAM and stream options, see the topics Reading from Other Sources and Writing to Other Sources in the Integration Guide.

Example

The following example uses the ECDA file writer to write CSV data to the file system. The foreign stream requires a server, so we create that first. It also requires a schema, which we create next. This foreign stream declares columns explicitly, but in some cases you can derive column names from the external source.

See the topic Writing to the File System in the Integration Guide for more information about this example.

CREATE OR REPLACE SERVER "FileWriterServer" TYPE 'FILE'
FOREIGN DATA WRAPPER ECDA;

 

Note: ECD Adapter server definitions need to reference the ECD foreign data wrapper. You do so with the syntax FOREIGN DATA WRAPPER ECDA.

CREATE OR REPLACE SCHEMA "FileWriterSchema";
SET SCHEMA '"FileWriterSchema"';

 

CREATE OR REPLACE FOREIGN STREAM "FileWriterStream"
--Columns for the new stream
("recNo" INTEGER,
"ts" TIMESTAMP,
"accountNumber" INTEGER,
"loginSuccessful" BOOLEAN,
"sourceIP" VARCHAR(32),
"destIP" VARCHAR(32),
"customerId" INTEGER)
SERVER "FileWriterServer"
OPTIONS
(directory 'myDirectory',
--directory for the file
formatter 'CSV',
filename_pattern 'myRecord.csv',
--regex for filename pattern to look for
character_encoding 'UTF-8',
skip_header 'true');

 

To start writing, you need to create a pump to insert data into the foreign stream. You do so using code along the following lines. See the topic CREATE PUMP in this guide for more information on pumps.

CREATE OR REPLACE SCHEMA "Pumps";
SET SCHEMA '"Pumps"';

 

CREATE OR REPLACE PUMP "writerPump" STOPPED AS
--We recommend creating pumps as stopped,
--then using ALTER PUMP "Pumps"."writerPump" START to start them
INSERT INTO "FileWriterSchema"."FileWriterStream"
SELECT STREAM * FROM "MyStream";
--where "MyStream" is a currently existing stream

 

To start writing data, use the following code:

ALTER PUMP "Pumps"."writerPump" START;

Options

You can specify an OPTIONS clause to provide parameters for a foreign stream or for any of its columns. Each OPTIONS specification is an option-value pair that names the option and gives its intended value. No two options in the same clause may have the same name.

Options vary greatly depending on the kind of source or sink for which you are setting up the foreign stream. The uses of foreign streams are not limited to the list below, but these do represent common uses for foreign streams in s-Server.

Options for Reading Data: File System, Sockets, AMQP, Kafka topics, HTTP, IBM MQ, Web Feed

Options for Writing Data: File System, Sockets, AMQP, HTTP, Web Socket, Kinesis streams, Kafka topics, IBM MQ, Snowflake

Options for Parsing Files: Discovery parser, CSV, XML, Key Pair, Avro, ProtoBuf, VCLP, FastRegex, W3C

Options for Formatting Files: CSV, XML, JSON

Foreign Stream Options for Reading from the File System

See the topic Reading from the File System in the Integration Guide for more details.

Option

Description

DIRECTORY

Directory in which the files to be read reside.

FILENAME_PATTERN

Input only. Java regular expression defining which files to read. See https://docs.oracle.com/javase/8/docs/api/java/util/regex/Pattern.html for more information on Java regular expressions.

STATIC_FILES

Defaults to false. Setting this to true indicates that no files will be added or changed after the file reader is started; the file reader exits after the last file is read. This lets you use the file reader as a foreign table, which is finite (as opposed to a foreign stream, which is infinite and handles files that are continually written to).
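The following sketch shows how these options might fit together. It is illustrative only: the server, stream, and column names are assumptions, and the authoritative server definition is covered in Reading from the File System in the Integration Guide.

CREATE OR REPLACE SERVER "FileReaderServer" TYPE 'FILE'
FOREIGN DATA WRAPPER ECDA;

CREATE OR REPLACE FOREIGN STREAM "FileReaderStream"
("ts" TIMESTAMP,
"accountNumber" INTEGER)
SERVER "FileReaderServer"
OPTIONS
(DIRECTORY '/path/to/input',
--Java regex; matches files such as trades-1.csv
FILENAME_PATTERN 'trades-.*\.csv',
--false (the default): keep watching for new files
STATIC_FILES 'false',
PARSER 'CSV',
CHARACTER_ENCODING 'UTF-8');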

Foreign Stream Options for Writing to the File System

See the topic Writing to the File System in the Integration Guide for more details.

Option

Description

DIRECTORY

Directory to which you are writing. s-Server needs permission to write to this directory.

ORIGINAL_FILENAME

Name of file to write before rotation. This will be the name of the file that s-Server is currently writing.

ESCAPE_<column name>

True or false; defaults to true. Causes strings in the named column to be escaped before being written.

FILENAME_PREFIX

Prefix of the final output file name, such as "test-".

FILENAME_DATE_FORMAT

Java time format to use in the final output file name, for example yyyy-MM-dd_HH:mm:ss. Uses java.text.SimpleDateFormat.

FILENAME_SUFFIX

Suffix of the final output file name. Include the leading period if desired, e.g., ".csv".

MAX_BYTES_PER_FILE

Maximum number of bytes to write to a file before rotation. Either MAX_BYTES_PER_FILE or MAX_TIME_DELTA_PER_FILE must be specified for an output file set.

MAX_TIME_DELTA_PER_FILE

Maximum difference in rowtime in a specific file. Either MAX_BYTES_PER_FILE or MAX_TIME_DELTA_PER_FILE must be specified for an output file set.

POSTPROCESS_CMD

The POSTPROCESS_CMD option lets you run a script after each file is written. To use this option, enter the path to the script along with any parameters, using <input> as a placeholder for the file name. When the file is written, the script executes with those parameters, and <input> is replaced by the name of the file.

Example: 'scripts/runStandaloneSystemML.sh scripts/algorithms/l2-svm-predict.dml -nvargs X=<input> Y=data/haberman.test.labels.csv model=data/l2-svm-model.csv fmt="csv" confusion=data/l2-svm-confusion.csv'
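As an illustration of how the rotation options interact, the following hypothetical writer rotates after roughly 1 MB and names each finished file trades-<timestamp>.csv. All names and values here are assumptions, not prescriptions.

CREATE OR REPLACE FOREIGN STREAM "RotatingFileStream"
("ts" TIMESTAMP,
"amount" DOUBLE)
SERVER "FileWriterServer"
OPTIONS
(DIRECTORY '/path/to/output',
FORMATTER 'CSV',
--name of the file currently being written
ORIGINAL_FILENAME 'current.csv',
FILENAME_PREFIX 'trades-',
FILENAME_DATE_FORMAT 'yyyy-MM-dd_HH-mm-ss',
FILENAME_SUFFIX '.csv',
--rotate once the file reaches about 1 MB
MAX_BYTES_PER_FILE '1048576');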

Foreign Stream Options for Reading from Sockets

The ECD framework can act as a server or a client. When it acts as a client, set REMOTE_HOST and REMOTE_PORT. When it acts as a server, set SERVER_PORT and, if desired, SERVER_HOST. See the topic Reading from Network Sockets in the Integration Guide for more details.

Name

Description

IS_IPV6

Whether or not the socket uses IPV6. Default is false.

IS_TCP

Whether the socket uses TCP (True) or UDP (False). Default is false (UDP).

REMOTE_HOST

Hostname to send tuples to or receive tuples from when ECDA is acting as a client. You can set this to 'LOCALHOST' to listen only to local connections, or to a specific IP address, such as 168.212.226.204. Specifying REMOTE_HOST and REMOTE_PORT tells the ECD socket code to start the network connection as a client.

REMOTE_PORT

Port to send tuples to or receive tuples from when ECDA is acting as a client. The REMOTE_* and SERVER_* options tell ECDA's socket code how to start the network connection (as a server or a client).

SERVER_HOST

The hostname or IP address to listen upon to send/receive tuples when ECDA is acting as a server (defaults to 0.0.0.0). Specifying SERVER_HOST and SERVER_PORT tells the ECD socket code to start the network connection as a server.

SERVER_PORT

Port to listen upon to send/receive tuples when ECDA is acting as a server.
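For instance, a reader listening as a TCP server on port 5555 might be declared as follows. This is a sketch: the server name is an assumption, and the exact ECDA socket server definition is covered in Reading from Network Sockets.

CREATE OR REPLACE FOREIGN STREAM "SocketReaderStream"
("ts" TIMESTAMP,
"msg" VARCHAR(256))
SERVER "SocketReaderServer"
OPTIONS
--SERVER_* options make ECDA listen as a server
(SERVER_PORT '5555',
SERVER_HOST '0.0.0.0',
--TCP rather than the UDP default
IS_TCP 'true',
PARSER 'CSV');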

Foreign Stream Options for Writing to Sockets

The ECD framework can act as a server or a client. When it acts as a client, set REMOTE_HOST and REMOTE_PORT. When it acts as a server, set SERVER_PORT and, if desired, SERVER_HOST. See the topic Writing to Network Sockets in the Integration Guide for more details.

Name

Description

IS_IPV6

Whether or not the socket uses IPV6. Default is false.

IS_TCP

Whether the socket uses TCP (True) or UDP (False). Default is false (UDP).

REMOTE_HOST

Hostname to send tuples to or receive tuples from when ECDA is acting as a client. You can set this to 'LOCALHOST' to listen only to local connections, or to a specific IP address, such as 168.212.226.204. Specifying REMOTE_HOST and REMOTE_PORT tells the ECD socket code to start the network connection as a client.

REMOTE_PORT

Port to send tuples to or receive tuples from when ECDA is acting as a client. The REMOTE_* and SERVER_* options tell ECDA's socket code how to start the network connection (as a server or a client).

SERVER_HOST

The hostname or IP address to listen upon to send/receive tuples when ECDA is acting as a server (defaults to 0.0.0.0). Specifying SERVER_HOST and SERVER_PORT tells the ECD socket code to start the network connection as a server.

SERVER_PORT

Port to listen upon to send/receive tuples when ECDA is acting as a server.
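Conversely, a writer acting as a client sets the REMOTE_* options instead, as in this illustrative sketch (host, port, and names are assumptions):

CREATE OR REPLACE FOREIGN STREAM "SocketWriterStream"
("ts" TIMESTAMP,
"msg" VARCHAR(256))
SERVER "SocketWriterServer"
OPTIONS
--REMOTE_* options make ECDA connect out as a client
(REMOTE_HOST 'host.example.com',
REMOTE_PORT '5555',
IS_TCP 'true',
FORMATTER 'CSV');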

Foreign Stream Options for Reading from AMQP

See the topic Reading from AMQP in the Integration Guide for more details.

AMQP 0.9.1 vs 1.0

There are distinct differences between the way AMQP works up to version 0.9.1 and the way AMQP 1.0 works. Roughly, AMQP 1.0 provides only a wire-level network protocol for the exchange of messages, while versions up to 0.9.1 employed a system of exchanges, queues, and bindings to exchange messages. As a result of this change, you configure the connection URL differently for AMQP 1.0 than for earlier versions.

Name

Description

CONNECTION_URL

Required. Connection URL for the AMQP server. This includes the server's hostname, user, password, port, and so on. This will differ depending on whether you are using AMQP 1.0 or a legacy version.

AMQP_LEGACY (AMQP protocol Version 0.9, e.g., RabbitMQ)

Format: amqp://<username>:<password>@<clientid>/<profile>?brokerlist='tcp://<hostname>:<portNumber>'&[ <optionsList> ]

Example: amqp://guest:guest@clientid/default?brokerlist='tcp://localhost:5672'

AMQP10 (AMQP protocol Version 1.0) - connectionfactory.localhost:

Format: amqp://<username>:<password>@<hostname>:<portNumber>?<optionsList>

Example: amqp://guest:guest@localhost:5672?clientid=test-client&remote-host=default

 

Single quotes must be doubled for SQL formatting. You may need to do additional URL escaping depending on your AMQP implementation. The site https://azure.microsoft.com/en-us/documentation/articles/service-bus-java-how-to-use-jms-api-amqp/ offers an example of formatting a connection URL.

DESTINATION

Required. AMQP 0.9 queue or topic identifier.

Version 5.2 feature: this can be in the form <destination prefix>{PARTITION}<destination suffix>.

PARTITION_EXPRESSION

Use this only if DESTINATION includes "{PARTITION}". This should be a dk.brics regular expression, such as <0-3>.

PARSER_QUEUE_SIZE

Queue size for the parser. Reading only. Defaults to 2. In most cases, you will not want to change this number.

ACKNOWLEDGE_MODE

Optional. Acknowledgment mode that ECDA communicates to the AMQP 1.0 server. Options are AUTO, MANUAL, or NONE; defaults to AUTO. Details on these modes are available at http://docs.spring.io/spring-integration/reference/html/amqp.html#amqp-inbound-ack

Roughly, AUTO asks the container on the AMQP server to acknowledge once the message is completely delivered. MANUAL means that delivery tags and channels are provided in message headers, and NONE means no acknowledgments.
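Combining these options, a hypothetical AMQP 0.9.1 reader might look like the following. Note the doubled single quotes inside CONNECTION_URL, required for SQL formatting; the server and queue names are assumptions.

CREATE OR REPLACE FOREIGN STREAM "AmqpReaderStream"
("ts" TIMESTAMP,
"payload" VARCHAR(512))
SERVER "AMQPReaderServer"
OPTIONS
(CONNECTION_URL 'amqp://guest:guest@clientid/default?brokerlist=''tcp://localhost:5672''',
DESTINATION 'myQueue',
ACKNOWLEDGE_MODE 'AUTO',
PARSER 'CSV');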

Foreign Stream Options for Writing to AMQP

See the topic Writing to AMQP in the Integration Guide for more details.

AMQP 0.9.1 vs 1.0

There are distinct differences between the way AMQP works up to version 0.9.1 and the way AMQP 1.0 works. Roughly, AMQP 1.0 provides only a wire-level network protocol for the exchange of messages, while versions up to 0.9.1 employed a system of exchanges, queues, and bindings to exchange messages. As a result of this change, you configure the connection URL differently for AMQP 1.0 than for earlier versions.

Name

Description

CONNECTION_URL

Required. Connection URL for the AMQP server. This includes the server's hostname, user, password, port, and so on. This will differ depending on whether you are using AMQP 1.0 or a legacy version.

AMQP_LEGACY (AMQP protocol Version 0.9, e.g., RabbitMQ)

Format: amqp://<username>:<password>@<clientid>/<profile>?brokerlist='tcp://<hostname>:<portNumber>'&[ <optionsList> ]

Example: amqp://guest:guest@clientid/default?brokerlist='tcp://localhost:5672'

AMQP10 (AMQP protocol Version 1.0) - connectionfactory.localhost:

Format: amqp://<username>:<password>@<hostname>:<portNumber>?<optionsList>

Example: amqp://guest:guest@localhost:5672?clientid=test-client&remote-host=default

Single quotes must be doubled for SQL formatting. You may need to do additional URL escaping depending on your AMQP implementation.

The site https://azure.microsoft.com/en-us/documentation/articles/service-bus-java-how-to-use-jms-api-amqp/ offers an example of formatting a connection URL.

DESTINATION

Required. AMQP 1.0 queue or topic identifier. In general, the destination qualification syntax may be specific to the AMQP message broker implementation. The examples here are specific to ActiveMQ.

You can fully qualify the AMQP destination by identifying the destination as a topic or a queue. ActiveMQ supports such qualification.

For a topic: "DESTINATION" 'topic://testTopic',

For a queue: "DESTINATION" 'queue://testTopic',

ActiveMQ treats an unqualified destination as a queue. In other words, for ActiveMQ, DESTINATION 'foo' is equivalent to DESTINATION 'queue://foo'.

See http://camel.apache.org/activemq.html for more details.

Version 5.3 feature: In s-Server 5.3, DESTINATION can be either an absolute destination or be in the form:

<destination prefix>{PARTITION}<destination suffix>

Example:

/new/ConsumerGroups/$Default/Partitions/{PARTITION}

ACKNOWLEDGE_MODE

Optional. Acknowledgment mode that ECDA communicates to the AMQP 1.0 server. Options are AUTO, MANUAL, or NONE; defaults to AUTO. Details on these modes are available at http://docs.spring.io/spring-integration/reference/html/amqp.html#amqp-inbound-ack

Roughly, AUTO asks the container on the AMQP server to acknowledge once the message is completely delivered. MANUAL means that delivery tags and channels are provided in message headers, and NONE means no acknowledgments.

DELIVERY_MODE

Optional. Delivery mode for messages that ECDA communicates to the AMQP 1.0 server. Options are NON-PERSISTENT or PERSISTENT; defaults to NON-PERSISTENT. Details on these modes are available at http://docs.spring.io/spring-integration/reference/html/amqp.html#amqp-outbound-channel-adapter-xml-7a

Version 5.3 feature.

PARTITION_EXPRESSION

Use only if DESTINATION includes "{PARTITION}". This should be a dk.brics regular expression, such as <0-3>.
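A corresponding AMQP 1.0 writer sketch, with illustrative names and an ActiveMQ-style qualified destination:

CREATE OR REPLACE FOREIGN STREAM "AmqpWriterStream"
("ts" TIMESTAMP,
"payload" VARCHAR(512))
SERVER "AMQPWriterServer"
OPTIONS
(CONNECTION_URL 'amqp://guest:guest@localhost:5672?clientid=test-client&remote-host=default',
--qualified as a topic; unqualified names default to queues on ActiveMQ
DESTINATION 'topic://testTopic',
DELIVERY_MODE 'NON-PERSISTENT',
FORMATTER 'CSV');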

Foreign Stream Options for Reading from Kafka

See the topic Reading from Kafka in the Integration Guide for more details.

Option

Description

TOPIC

Required. Kafka topic.

STARTING_TIME

The time from which to start reading the specified topic. Options are LATEST, EARLIEST, or a long int representing a timestamp (milliseconds since epoch). Defaults to LATEST.

STARTING_OFFSET

Offset from which to start reading, as a long int (defaults to -1).

SEED_BROKERS

Name of the Kafka host. Defaults to localhost.

PORT

Port for Kafka server. Defaults to 9092.

PARTITION

Partition number to read from.

If reading from a Kafka topic with multiple partitions and PARTITION is omitted or blank, all partitions will be read.

You can specify a single partition with PARTITION <partition number>, or a range with PARTITION <first partition number>-<last partition number>.

Note: Partition numbers are zero-based.

PARTITION_OFFSET_QUERY

Returns the starting offsets for all partitions. Any partition for which the query does not return anything will use either STARTING_OFFSET or STARTING_TIME to determine where to start.

Should be something like "SELECT PARTITION, OFFSET FROM [TABLE NAME]". PARTITION should be of type INTEGER; OFFSET should be of type BIGINT.

BUFFER_SIZE

Buffer size in bytes. Defaults to 1048576.

FETCH_SIZE

Fetch size. Defaults to 1000000.

MAX_WAIT

Maximum number of milliseconds to wait on a read attempt. Defaults to 2000. Affects the time needed to cancel a query on the Kafka topic.

CLIENT_ID

Client key for Yammer metrics. CLIENT_ID defaults to CLIENT_{TOPIC}, or CLIENT_{TOPIC}_{PARTITION} if PARTITION is specified.

CLIENT_ID and METRICS_PER_PARTITION affect Kafka Yammer metrics reporting. CLIENT_ID does not apply unless METRICS_PER_PARTITION is set to true.

See http://docs.confluent.io/1.0/kafka/monitoring.html for more information on Kafka Yammer metrics.

METRICS_PER_PARTITION

True or False.

If METRICS_PER_PARTITION is false, CLIENT_ID will be used as the client key for all Yammer metrics reporting for that adapter. If METRICS_PER_PARTITION is true, the actual partition number will be appended to each client_id (and finer-grained metrics will be reported).

OPTIONS_QUERY

Lets you query a table to update adapter options at runtime. You can use this, for example, to set the STARTING_OFFSET option from a table that contains the last offset, as in SELECT lastOffset AS STARTING_OFFSET FROM TEST.committedOffset.
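For example, a reader that consumes CSV-formatted messages from partition 0 of a topic, starting from the earliest available message, might be declared like this (server, topic, and column names are illustrative):

CREATE OR REPLACE FOREIGN STREAM "KafkaReaderStream"
("ts" TIMESTAMP,
"accountNumber" INTEGER)
SERVER "KafkaServer"
OPTIONS
(TOPIC 'transactions',
STARTING_TIME 'EARLIEST',
SEED_BROKERS 'localhost',
PORT '9092',
--read only partition 0; omit to read all partitions
PARTITION '0',
PARSER 'CSV');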

Foreign Stream Options for Reading Over HTTP

See the topic Reading Over HTTP in the Integration Guide for more details.

Option

Description

URL

URL for HTTP feed.

POLL_IN_MILLIS

How often to request new data, in milliseconds.

HEADER_<name_of_header>

Tells HTTP reader to look for a header called <name_of_header>.

PARAMETER_<parameter>

Sends the parameter <parameter> with the request.
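A sketch of an HTTP reader that polls a feed every five seconds; the URL, header, and parameter here are hypothetical:

CREATE OR REPLACE FOREIGN STREAM "HttpReaderStream"
("ts" TIMESTAMP,
"body" VARCHAR(1024))
SERVER "HTTPReaderServer"
OPTIONS
(URL 'http://feeds.example.com/data',
POLL_IN_MILLIS '5000',
--hypothetical header and query parameter
HEADER_Authorization 'Bearer my-token',
PARAMETER_limit '100',
PARSER 'JSON');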

Foreign Stream Options for Reading Over Web Sockets

See the topic Reading Over Web Sockets in the Integration Guide for more details.

Option

Description

URL

URL for web socket.

POLL_IN_MILLIS

How often to request new data, in milliseconds.

HEADER_<name_of_header>

Tells Web Socket reader to look for a header called <name_of_header>.

Foreign Stream Options for Writing to Kafka

The following options, which are passed to the Kafka broker, are described in more detail at http://kafka.apache.org/documentation.html. Aside from defaults, the information in this table is drawn from that page. See the topic Writing to Kafka in the Integration Guide for more details.

Option name

Description

TOPIC

Kafka topic

metadata.broker.list

hostname:port of the Kafka broker. Defaults to localhost:9092. Used for getting metadata (topics, partitions and replicas). Actual socket connections are established using the information from this metadata. Use commas to separate multiple brokers.

partitioner.class

Fully qualified Java classname of the Kafka partitioner. Defaults to com.sqlstream.aspen.namespace.common.KafkaOutputSink.RoundRobinPartitioner.

serializer.class

Fully qualified Java classname of the Kafka serializer. Defaults to kafka.serializer.DefaultEncoder.

key.serializer.class

Names a serializer class for keys. If no class is given, Kafka uses serializer.class.

producer.type

Specifies whether messages are sent asynchronously in a background thread. Async mode lets requests be batched; this helps throughput but increases the possibility that a failed client machine results in unsent data. Defaults to async.

compression.codec

Specifies the compression codec for generated data: "none", "gzip", or "snappy".

compressed.topics

If you have specified a compression.codec (other than "none"), this option lets you limit compression to those topics listed in this option. Empty means apply compression to all topics.

message.send.max.retries

If enabled, the producer will automatically retry a failed send request for a set number of retries.

Note: using this option may, according to the Kafka documentation, "lead to duplicates in the case of network errors that cause a message to be sent but the acknowledgment to be lost."

retry.backoff.ms

The producer refreshes metadata to see whether a new leader has been elected. This option specifies the amount of time to wait before refreshing.

request.required.acks

The number of acknowledgments the producer requires the leader to have received before considering a request complete: 0 (never wait for an acknowledgment from the broker), 1 (wait for the leader replica), or -1 (wait for all in-sync replicas).

request.timeout.ms

When request.required.acks is enabled, this lets you specify how long (in milliseconds) the broker should wait, trying to meet the acknowledgment requirement, before sending back an error to the client.

topic.metadata.refresh.interval.ms

By default, the producer refreshes topic metadata along two lines:

First, at regular intervals, which defaults to every 600000 ms, or 10 minutes.

Second, any time there is a failure, such as a partition missing or a leader not being available.

This setting lets you change the regular polling interval by specifying a new interval. If you set this number to a negative value, the producer only refreshes on failure. If you set it to zero, the producer refreshes every time a message is sent (this is not recommended).

Note: the refresh happens only AFTER the message is sent, so if the producer never sends a message, the metadata is never refreshed.

queue.buffering.max.ms

If async is enabled in producer.type, this sets a maximum time for buffering data, in milliseconds. For example, "100" means "try to batch together 100 ms of messages." Like most buffering, this improves throughput but adds message delivery latency.

queue.buffering.max.messages

If async is enabled in producer.type, this lets you specify the maximum number of unsent messages to queue. Once this number is reached, either the producer must be blocked or data must be dropped.

 

queue.enqueue.timeout.ms

If async is enabled in producer.type, this lets you specify an amount of time to block before dropping messages when the buffer has reached the value specified in queue.buffering.max.messages. If you set this option to 0, events will be enqueued immediately or dropped if the queue is full. If you set it to -1, the producer will block indefinitely.

batch.num.messages

If async is enabled in producer.type, this lets you specify the number of messages to send in one batch. The producer will wait until either this number of messages is ready to send or queue.buffering.max.ms is reached.

send.buffer.bytes

Socket write buffer size.

client.id

Using this option, you can specify a string to help you identify the application making calls to the Kafka server.
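To illustrate, the following sketch passes a few of these broker options through to the producer. Option names containing dots must be double-quoted; all names here are illustrative.

CREATE OR REPLACE FOREIGN STREAM "KafkaWriterStream"
("ts" TIMESTAMP,
"amount" DOUBLE)
SERVER "KafkaWriterServer"
OPTIONS
(TOPIC 'transactions',
"metadata.broker.list" 'localhost:9092',
"producer.type" 'async',
"compression.codec" 'snappy',
FORMATTER 'CSV');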

Foreign Stream Options for Reading from IBM MQ

See the topic Reading from IBM MQ in the Integration Guide for more details.

Option

Description

QUEUE_MANAGER_NAME

Required. Name of IBM MQ Queue Manager.

HOSTNAME

Required. IBM MQ hostname. Defaults to localhost.

PORT

Optional. Port for IBM MQ server. Defaults to 1414.

USERID

User name for IBM MQ server.

PASSWORD

Password for IBM MQ server.

CHANNEL_NAME

Server connection channel for the IBM MQ server.

TOPIC_NAME

Name of the IBM MQ topic from which you are receiving data. You must specify either QUEUE_NAME or TOPIC_NAME, but not both.

TOPIC_OBJECT

Any IBM MQ administrative topic objects for the topic. See http://pic.dhe.ibm.com/infocenter/wmqv7/v7r0m0/topic/com.ibm.mq.amqnar.doc/ps12490_.htm for more details.

QUEUE_NAME

Name of the queue from which you are receiving data. You must specify either QUEUE_NAME or TOPIC_NAME, but not both.
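Putting these options together, a hypothetical reader pulling from an IBM MQ queue might look like this (queue manager, channel, and queue names are assumptions):

CREATE OR REPLACE FOREIGN STREAM "MQReaderStream"
("ts" TIMESTAMP,
"msg" VARCHAR(256))
SERVER "IBMMQReaderServer"
OPTIONS
(QUEUE_MANAGER_NAME 'QM1',
HOSTNAME 'localhost',
PORT '1414',
CHANNEL_NAME 'DEV.APP.SVRCONN',
--specify QUEUE_NAME or TOPIC_NAME, not both
QUEUE_NAME 'DEV.QUEUE.1',
PARSER 'CSV');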

Foreign Stream Options for Writing to IBM MQ

Option

Description

QUEUE_MANAGER_NAME

Required. Name of IBM MQ Queue Manager.

HOSTNAME

Required. IBM MQ hostname. Defaults to localhost.

PORT

Optional. Port for IBM MQ server. Defaults to 1414.

USERID

User name for IBM MQ server.

PASSWORD

Password for IBM MQ server.

CHANNEL_NAME

Server connection channel for the IBM MQ server.

TOPIC_NAME

Name of the IBM MQ topic to which you are sending data. You must specify either QUEUE_NAME or TOPIC_NAME, but not both.

TOPIC_OBJECT

Any IBM MQ administrative topic objects for the topic. See http://pic.dhe.ibm.com/infocenter/wmqv7/v7r0m0/topic/com.ibm.mq.amqnar.doc/ps12490_.htm for more details.

QUEUE_NAME

Name of the queue to which you are sending data. You must specify either QUEUE_NAME or TOPIC_NAME, but not both.

Additional Options for IBM MQ

Option

Description

OUTPUT_OPEN_OPTIONS

These options control the opening of a topic and can be passed through to the MQSeries server. You pass these through as integers, per the instructions in the MQSeries documentation. Valid options are as follows:

CMQC.MQOO_ALTERNATE_USER_AUTHORITY

CMQC.MQOO_BIND_AS_Q_DEF

CMQC.MQOO_FAIL_IF_QUIESCING

CMQC.MQOO_OUTPUT

CMQC.MQOO_PASS_ALL_CONTEXT

CMQC.MQOO_PASS_IDENTITY_CONTEXT

CMQC.MQOO_SET_ALL_CONTEXT

CMQC.MQOO_SET_IDENTITY_CONTEXT

OUTPUT_PUT_OPTIONS

These options control the MQQueue.get() statement and can be passed through to the MQSeries server. Valid options are as follows:

MQC.MQGMO_WAIT

MQC.MQGMO_NO_WAIT

MQC.MQGMO_SYNCPOINT

MQC.MQGMO_NO_SYNCPOINT (default)

MQC.MQGMO_BROWSE_FIRST

MQC.MQGMO_BROWSE_NEXT

MQC.MQGMO_BROWSE_MSG_UNDER_CURSOR

MQC.MQGMO_MSG_UNDER_CURSOR

MQC.MQGMO_LOCK

MQC.MQGMO_UNLOCK

MQC.MQGMO_ACCEPT_TRUNCATED_MSG

MQC.MQGMO_FAIL_IF_QUIESCING

MQC.MQGMO_CONVERT

Foreign Stream Options for Writing to a Kinesis Stream

See the topic Writing to Kinesis Streams in the Integration Guide for more details.

Option Name

Description

KINESIS_STREAM_NAME

Required. Name of Kinesis stream to write to. No default.

AWS_REGION

Required. Region ID of the Kinesis region. See http://docs.aws.amazon.com/general/latest/gr/rande.html#ak_region for more details.

AWS_PROFILE_PATH

See "Setting Up an AWS Profile Path" in the topic Reading from Kinesis Streams in the Integration Guide. Must point to a credential file on the s-Server file system with the following format:

[default]
aws_access_key_id = xxx
aws_secret_access_key = yyy

This defaults to '', which resolves to ~/.aws/credentials.

Note: You need to have an AWS profile set up, and a configuration file stored on your system, in order to read from or write to Kinesis. See "Setting Up an AWS Profile Path" in the topic Reading from Kinesis Streams in the Integration Guide.

AWS_PROFILE_NAME

Optional. Profile name to use within the credentials file. Amazon supports multiple named profiles within a configuration file. If you have a named profile, you can reference it here. Defaults to default. See http://docs.aws.amazon.com/cli/latest/userguide/cli-multiple-profiles.html

INITIAL_BACKOFF

Optional. If an insert fails due to throttling, how many milliseconds to back off initially. Defaults to 20.

MAX_BACKOFF

Optional. If an insert fails due to throttling, the maximum number of milliseconds to back off. Defaults to 20480.

MAX_RETRIES

Optional. If an insert fails due to throttling, how many retries should be made. The backoff is doubled for each retry, up to the maximum. Defaults to 10.

BUFFER_SIZE

Optional. Maximum number of bytes per update request. Defaults to 4194304.

MAX_RECORDS_PER_REQUEST

Optional. Maximum number of records per update request. Defaults to 500.

REPORT_FREQUENCY

Optional. How often (in milliseconds) to log statistics. Defaults to 0, which means "never."

KINESIS_DEFAULT_PARTITION_ID

Optional. Partition id of shard to write to. Defaults to ''.
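A hypothetical Kinesis writer combining the credential and throttling options (stream name and region are illustrative):

CREATE OR REPLACE FOREIGN STREAM "KinesisWriterStream"
("ts" TIMESTAMP,
"amount" DOUBLE)
SERVER "KinesisWriterServer"
OPTIONS
(KINESIS_STREAM_NAME 'myStream',
AWS_REGION 'us-west-2',
--'' resolves to ~/.aws/credentials
AWS_PROFILE_PATH '',
MAX_RETRIES '10',
FORMATTER 'CSV');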

Foreign Stream Options for Writing to a Snowflake Warehouse

Version 5.2 feature.

See the topic Writing to Snowflake in the Integration Guide for more details.

Option Name

Description

ACCOUNT

The name assigned to your account by Snowflake.

USER

The user name for your Snowflake account.

PASSWORD

The password for your Snowflake account.

DB

The database to write to in Snowflake. This should be an existing database for which user/password has privileges.

SCHEMA

The schema to write to in the Snowflake database. This should be an existing schema for which user/password has privileges.

WAREHOUSE

The Snowflake warehouse to write to. This should be an existing warehouse for which user/password has privileges.

DTABLE

The table to write to in Snowflake. This should be an existing table for which user/password has privileges.
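A sketch of a Snowflake writer follows. Every value is illustrative and must name existing Snowflake objects for which the user has privileges; USER is double-quoted here because it is a SQL reserved word.

CREATE OR REPLACE FOREIGN STREAM "SnowflakeWriterStream"
("ts" TIMESTAMP,
"amount" DOUBLE)
SERVER "SnowflakeWriterServer"
OPTIONS
(ACCOUNT 'myaccount',
"USER" 'myuser',
PASSWORD 'mypassword',
DB 'MYDB',
SCHEMA 'PUBLIC',
WAREHOUSE 'MYWH',
DTABLE 'TRANSACTIONS');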

Options for Writing to a MongoDB Collection

Version 5.2 feature.

See the topic Writing to MongoDB in the Integration Guide for more details.

Option

Definition

URL

Fully qualified URL starting with mongodb:// and including, at minimum, a host name (or IP address or UNIX domain socket). The URL can also include a username and password (these are passed to the MongoDB instance) and a port number. See https://docs.mongodb.com/manual/reference/connection-string/ for more information.

COLLECTION

MongoDB collection to which data will be written.
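A minimal MongoDB writer sketch; the URL and collection are illustrative:

CREATE OR REPLACE FOREIGN STREAM "MongoWriterStream"
("ts" TIMESTAMP,
"amount" DOUBLE)
SERVER "MongoWriterServer"
OPTIONS
(URL 'mongodb://myuser:mypassword@localhost:27017',
COLLECTION 'transactions');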

Options for Discovery Parser

Option

Definition

MAX_EXAMPLE_BYTES

Limits scanning of sample data in case data files are too big. For example, if MAX_EXAMPLE_BYTES is set to 65536, only 64 KB of data is read from data files to run the discovery process.

DISCOVERY_TIMEOUT

In milliseconds, the amount of time that the discovery parser will collect data from the source. For example, if DISCOVERY_TIMEOUT = 5000, the discovery parser will collect data from the source for up to 5 seconds. The discovery parser stops collecting sample data once either the MAX_EXAMPLE_BYTES or the DISCOVERY_TIMEOUT limit has been met.

AVRO_SCHEMA_FILE

If needed, specifies the path for the schema file to be used for parsing the Avro payload.

 

Foreign Stream Options for Parsing CSV Data

Option

Definition

SKIP_HEADER

True or false; defaults to false. Specifies if the parser should skip the first row of input files. Usually, you would do this if the first row of data contains column headers.

QUOTE_CHARACTER

Lets you specify an expected quotation character. There is no default quote character.

ROW_SEPARATOR

Lets you specify a character that splits lines in the source. Defaults to \n.

For example, when reading from a DOS-format file where line separators are CR/LF rather than just LF, you would need to specify the following:

row_separator U&'\000D\000A'

(This is the hexadecimal Unicode value for CR/LF. See http://unicode.org/standard/reports/tr13/tr13-5.html.)

Foreign Stream Options for Writing CSV Data

Option

Definition

FORMATTER

This needs to be CSV.

WRITE_HEADER

Whether to write the column names into a header row. True or False.

CUSTOM_FORMATTER_<column_name>

Allows overriding of an individual column's formatting. Specifies a fully qualified Java classname that implements com.sqlstream.aspen.namespace.common.TypeFormatter<T>, where T is the Java type that matches the SQL type of the column, such as java.lang.String for VARCHAR/CHAR and java.lang.Double for DOUBLE.

 

Foreign Stream Options for Parsing XML

Option

Definition

PARSER

This needs to be XML.

PARSER_XML_ROW_TAGS

An absolute XPATH query that finds the XML element that becomes one row. No default.

<column_name>_XPATH

An XPATH that finds the text that becomes the value of the column of the same name. Examples include:

"RetailStoreID_XPATH" '/POSLog/Body/tri:RetailTransaction/RetailStoreID',
"WorkstationID_XPATH" '/POSLog/Body/tri:RetailTransaction/WorkstationID'

PARSER_XML_USE_ATTRIBUTES

True/false (default is false). Specifies the default XPATH query used to find each column's data in a tuple. If false, the column value comes from a child element whose tag name matches the column name. If true, the default is @<column_name>, meaning an attribute of the XML row element.

CUSTOM_TYPE_PARSER_<column_name>

Allows overriding of an individual column's parsing. Specifies a fully qualified Java classname that implements com.sqlstream.aspen.namespace.common.TypeParser.

Foreign Stream Options for Writing XML

Option name

Description

FORMATTER

This needs to be XML.

DOC_ELEMENTS

Specifies a list of elements, separated by slashes (/), to use as the root of the XML document being written. Defaults to "batch".

ROW_ELEMENTS

Specifies a list of elements, separated by slashes (/), to add for each row of the XML document's DOM. Defaults to "row".

DATA_ELEMENTS

Specifies a list of elements, separated by slashes (/), to add for each datum in a row/tuple. You must define DATA_ELEMENTS or DATA_ATTRIBUTES, but not both; using both will produce inconsistent XML.

DATA_ATTRIBUTES

Specifies the name of an attribute to add for each datum in a row/tuple. You must define DATA_ELEMENTS or DATA_ATTRIBUTES, but not both; using both will produce inconsistent XML.

<col_name>_ELEMENTS

Specifies a list of elements, separated by slashes (/), to add for a specific column's datum in each row/tuple.

<col_name>_ATTRIBUTES

Specifies the name of an attribute to add for a specific column's datum in each row/tuple.

Foreign Stream Options for Parsing JSON Data

Option

Definition

PARSER

This needs to be JSON.

ROW_PATH

This is the JSON path for the row to be found. The JsonPath parser uses a row path to find JSON objects containing a row, then the path for each column is used to find the value to be extracted.

<COLUMN_NAME>_PATH

Optional for each column in the ECDA column set. Defaults to $..<COLUMN_NAME>. For example, a column named FOO would have an option named FOO_PATH defaulting to $..FOO, which returns the first property named FOO under the JSON object found by the ROW_PATH.
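For instance, given JSON messages like {"order":{"id":17,"total":10.5}}, a hypothetical stream could map its columns with explicit paths (the source server and all names here are assumptions):

CREATE OR REPLACE FOREIGN STREAM "JsonReaderStream"
("id" INTEGER,
"total" DOUBLE)
SERVER "FileReaderServer"
OPTIONS
(DIRECTORY '/path/to/json',
FILENAME_PATTERN '.*\.json',
PARSER 'JSON',
--each "order" object becomes one row
ROW_PATH '$.order',
"id_PATH" '$.order.id',
"total_PATH" '$.order.total');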

Foreign Stream Options for Writing JSON

Option

Definition

FORMATTER

This needs to be JSON.

CUSTOM_FORMATTER_<column_name>

Allows overriding of an individual column's formatting. Specifies a fully qualified Java classname that implements com.sqlstream.aspen.namespace.common.TypeFormatter<T>, where T is the Java type that matches the SQL type of the column, such as java.lang.String for VARCHAR/CHAR and java.lang.Double for DOUBLE.

Foreign Stream Options for Parsing Key-Value Data

Option

Definition

PARSER

This needs to be KV.

QUOTE_CHARACTER

Lets you specify a different quote character, such as a single quote ('). Default is double quote (").

KEY_VALUE_SEPARATOR_CHARACTER

Lets you specify the character that connects keys and values. Default is the equals symbol (=).

SEPARATOR

Lets you specify a character that separates key-value pairs. Default is comma (,).

ROW_SEPARATOR

Lets you specify a character that splits lines in the key-value source. Default is \n.

Foreign Stream Options for Parsing Avro Data

Version 5.2 feature.

Option

Definition

PARSER

This needs to be AVRO.

AVRO_SCHEMA_FILE

Required option to specify the path for the schema file to be used for parsing the Avro payload.

SCHEMA_HEADER

Required Boolean option indicating whether the Avro schema is embedded in the Avro data. Set this option to FALSE for data sources like Kafka or AMQP, where each message can be one or more serialized Avro records without a schema.

ROW_PATH

This is the Avro path for the row to be found. The Avro parser uses a row path to find Avro objects containing a row, then the path for each column is used to find the value to be extracted.

<COLUMN_NAME>_PATH

Path for each column in the ECDA column set. Defaults to $..<COLUMN_NAME>. For example, a column named FOO would have an option named FOO_PATH defaulting to $..FOO, which returns the first property named FOO under the Avro object found by the ROW_PATH.

Foreign Stream Options for Parsing ProtoBuf

Option

Definition

PARSER

Required. This needs to be PROTOBUF; indicates that the ECD parser will parse files as protobuf.

SCHEMA_JAR

Required. Jar containing compiled java classes created with the Google protocol buffer compiler (protoc command), such as unitsql/concurrent/plugins/common/protobufData/protobufpackage.jar. Locations are relative to $SQLSTREAM_HOME. See https://developers.google.com/protocol-buffers/docs/javatutorial for more details.

Note: $SQLSTREAM_HOME refers to the installation directory for s-Server, such as /opt/sqlstream/5.0.XXX/s-Server.

SCHEMA_CLASS

Required. Class name of the outer protobuf record created with the Google protocol buffer compiler (protoc command), such as protobuf.PackageProto$protobufPackage. See https://developers.google.com/protocol-buffers/docs/javatutorial for more details.

<column name>_PATH

Optional. Lets you specify a path within the schema that maps to a column in the foreign stream. If not specified, s-Server populates the column using a field with the same name in the outer-level record.

MESSAGE_STREAM

True or false (defaults to false). Indicates whether to treat messages as a continuous stream. If MESSAGE_STREAM is true, protobuf messages are sent to s-Server as they're received. This can matter for sources, such as files or sockets, that don't necessarily deliver data in complete chunks.

Note: If MESSAGE_STREAM is true, all outer fields used must have an index less than any repeated field used.

MESSAGE_LENGTH_PREFIXED

True or false (default is false). Indicates whether all records are prefixed with a length field. Must be specified if MESSAGE_STREAM is set.

Foreign Stream Options for Parsing VCLP Files

Note: Unless you are using the VCLP parser for legacy purposes, the CSV option will most likely suit your needs better. Because you can use any separator with the CSV parser (not only commas), the CSV option generally offers more flexibility than the VCLP parser.

Option

Definition

DIRECTORY

Directory in which file resides.

CHARACTER_ENCODING

Any Java Supported Encoding. Default: UTF-8

SEPARATOR

Defines the characters used to delimit the columns in your data. Any character in the list will be used as a delimiter. For example, after specifying SEPARATOR ',;:', every comma, semicolon, and colon in a log record starts a new column.

PARSER_DELIMITERS

Indicates delimiter characters, such as a comma or a tab. The file is separated into columns using these delimiters. s-Server supports Unicode character literals; a tab can be specified using a Unicode escape, e.g., u&'\0009', a string consisting only of a tab character.

Example: ','

DELIMITERS

Example: ',\n',

TERMINATION_STRING

String that indicates end of file.

Example: '\n',

PARSER_QUOTE

Lets you include quoted strings. Quoted strings appear in one field, even if they contain a delimiter.

Example: '"'

SLEEP_INTERVAL

Sleep interval between tail reads, in milliseconds. Default is 1000.

MAX_UNCHANGED_STATS

Number of sleep intervals without a change in the log file before tail checks for rotation.

Foreign Stream Options for Parsing Files Using FastRegex

Option

Value

PARSER

This needs to be FASTREGEX.

CHARACTER_ENCODING

Any Java Supported Encoding. Default: UTF-8

FILTER_COLUMN

Name of the column to use as a filter.

FILTER_PATTERN

Regex pattern to filter for this column.

FILTER_EXCLUDE_PATTERN

Regex pattern to exclude for column.

FILTER_ALLOWS_NULLS

Boolean true or false.

 

Foreign Stream Options for Parsing W3C Files

Option

Definition

FORMAT

Format specification, such as '%h %l %u %t "%r" %>s %b'. See http://httpd.apache.org/docs/current/mod/mod_log_config.html

Web Feed Options for Foreign Streams

Under OPTIONS, you define the list of xpath expressions to be found in the feed. These expressions have the suffix "_xpath". This list of expressions needs to map onto the list of columns that you define in the foreign stream. The following examples show how a column, its xpath expression, and the corresponding XML node relate.

Foreign stream column: "title" VARCHAR(100)
xpath expression (as defined in server options): "title_xpath" 'title'
XML node from feed: <title>Dog Bites Man</title>

Foreign stream column: "link" VARCHAR(1024)
xpath expression (as defined in server options): "link_xpath" 'link'
XML node from feed: <link>http://news.google.com/news/url?dog-bites-man/2520159/</link>

Foreign stream column: "category" VARCHAR(20)
xpath expression (as defined in server options): "category_xpath" 'category'
XML node from feed: <category>Top Stories</category>