To read CSV formatted data, you need to specify a separator character (such as a comma or a pipe) for values. You can also indicate whether or not the file has a header. When the Extensible Common Data Framework parses CSV, each row becomes a row in the stream. Columns are identified by the separator character, which can be a comma, a pipe, or any other character you designate.
s-Server matches data types according to its Ingestion Rules for CSV.
Note: You can also input data in larger chunks and parse it later using the Parser UDX. This UDX calls the parsers listed above in a function. For more information on using functions, see the topic Transforming Data in s-Server in this guide.
The following example sets up a server called "CSVReaderServer", then creates and sets a schema called "SCHEMA_CSV". It then sets up a foreign stream that will parse columns called ts, accountNumber, sourceIP, destIP, and customerID from a file in /tmp/ called transactions.log. To parse CSV over other input/output systems, such as IBM MQ or AMQP, you would need to specify options for these formats. See Reading from Other Sources for more details.
Note: The examples below use the file system as an input system. To parse CSV over other systems, such as IBM MQ or AMQP, you would need to specify options for these formats. See Reading from Other Sources for more details.
create or replace server "CSVReaderServer" type 'FILE'
foreign data wrapper ECDA;
create or replace schema "SCHEMA_CSV";
set schema 'SCHEMA_CSV';
CREATE OR REPLACE FOREIGN STREAM "CSVReaderStream"
"ts" TIMESTAMP NOT NULL,
--Columns for the new stream
To actually begin reading from the file, use a statement along the lines of
select stream * from CSVReaderStream;
To parse CSV files with the ECD Agent, configure the options above using the ECD Agent property file with properties similar to the following:
ROWTYPE=RECORDTYPE(VARCHAR(2040) id, VARCHAR(2040) reported_at, VARCHAR(2040) shift_no, VARCHAR(2040) trip_no, VARCHAR(2040) route_variant_id)