Using the ECD Agent with Hadoop

Using the hdfs option for the ECD Agent, you can automatically upload a file to a designated Hadoop file system location. (This option is not available for the ECD Adapter.) To do so, you invoke the ECD Agent with the hdfs option (--io hdfs on the command line).

When you do so, the ECD Agent formats the file and uploads it to the location designated by the hdfs_output_dir property.

This file can be formatted as CSV, XML, or JSON. See Writing to Files with the ECD Adapter and Agent for more details. These files do not require further formatting in order to be integrated with Hadoop. (Once in the Hadoop file system, these files can be accessed by second-level systems such as Hive. See https://cwiki.apache.org/confluence/display/Hive/Home for more details on Hive.)

Agent Properties Settings for Hadoop

You configure adapter options through foreign streams/tables. You configure agent options through the ECD Agent property file.

Property Name      Description (example value)
hdfs_output_dir    hdfs://storm-s3.disruptivetech.com:8020/user/sqlstream/
hdfs_input_dir     hdfs://storm-s3.disruptivetech.com:8020/user/sqlstream

Sample Property File for ECD Agent

Note: Before using the ECD Agent, you need to create a source stream for it. For the example below, you would need to create the stream named by the SCHEMA_NAME and TABLE_NAME properties, that is, "HADOOPWRITERTEST"."HDFS_STREAM".

 

SEPARATOR=^A

DIRECTORY=/home/sqlstream/customers/sample

FILENAME_PREFIX=test-

FILENAME_DATE_FORMAT=yyyy-MM-dd-HH-mm-ss

MAX_BYTES_PER_FILE=25000000

FILENAME_SUFFIX=.csv

CHARACTER_ENCODING=US-ASCII

ROWTYPE=RecordType(INTEGER COL1, INTEGER COL2, VARCHAR(256) COL3, VARCHAR(256) COL4)

SCHEMA_NAME=HADOOPWRITERTEST

TABLE_NAME=HDFS_STREAM

HDFS_OUTPUT_DIR=hdfs://storm-s3.disruptivetech.com:8020/user/sample/

NOCLOBBER=false

 

To Invoke the Agent

From the directory $SQLSTREAM_HOME/../clienttools/EcdaAgent/, enter the following at the command line:

$ ./commondataagent.sh --output --props sample.test.properties --io hdfs
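
Once the agent is running, you may want to confirm that files are arriving in the Hadoop location named in the property file. The following is a minimal sketch, not part of the ECD Agent itself. It assumes the standard Hadoop client (the hdfs command) is installed and on your PATH. The function names hdfs_output_dir and list_uploaded_files are hypothetical helpers introduced here for illustration, and the property key is matched without regard to case only because this page writes it both as hdfs_output_dir and HDFS_OUTPUT_DIR.

```shell
#!/bin/sh
# Hypothetical helper: print the HDFS_OUTPUT_DIR value from an ECD Agent
# property file. Matches the key case-insensitively and prints everything
# after the first "=" on the first matching line.
hdfs_output_dir() {
    grep -i '^hdfs_output_dir=' "$1" | head -n 1 | cut -d= -f2-
}

# Hypothetical helper: list the files in that HDFS location.
# Assumes the Hadoop client ("hdfs") is installed; returns 1 otherwise.
list_uploaded_files() {
    target=$(hdfs_output_dir "$1")
    if [ -z "$target" ]; then
        echo "HDFS_OUTPUT_DIR is not set in $1" >&2
        return 1
    fi
    if ! command -v hdfs >/dev/null 2>&1; then
        echo "hdfs client not found; cannot list $target" >&2
        return 1
    fi
    # "hdfs dfs -ls" is the Hadoop file system shell's directory listing.
    hdfs dfs -ls "$target"
}
```

After sourcing these functions in a second terminal, you could run list_uploaded_files sample.test.properties to see which files have been uploaded so far.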