Big Query Connector
The SQLstream Google Big Query Connector is a bulk load script installed as part of Client Tools by either the s-Server Installer or the separate Client Tools Installer. Note: SQLstream can work with you to modify this script for use with other platforms such as ParAccel or Kognitio.
The Big Query Connector integrates SQLstream with Google Big Query for continuous ETL (extract, transform, load). Such an integration consists of several parts.
|1.||Use the SQLstream Logfile Agent or Logfile Adapter to continuously process incoming data, condition it, and transform it into a form consistent with the table definitions in Google Big Query.|
|2.||Use the options for Writing to the File System to create a set of rotating CSV-formatted files that match the Google Big Query table definition. You can also use the FileWriter Adapter or FileWriter Agent (both are legacy components).|
An annotated properties file for FileWriterAgent and a sample FileWriterAdapter configuration are available in the installed Big Query Connector directory. The annotations describe the desired settings for the File Writer configuration to be compatible with the provided load script.
|3.||Use the Big Query Connector load script to push the rotated files to Big Query. Clone the load script and modify it to set two parameters:|
|a.||BQTABLE is set to the name of the Big Query table.|
|b.||FNB is set to the same value as FILENAME_BASE in the FileWriterAgent or FileWriterAdapter configuration. FNB should end with a trailing underscore (_) character.|
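The two parameters in step 3 can be sanity-checked before a load is attempted. The following is a minimal Python sketch, not part of the provided load script: the LoadSettings class, its problems method, and the matches_base helper are hypothetical names introduced only for illustration, and the table name in the usage note is invented.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class LoadSettings:
    """Hypothetical holder for the two values set in the load script."""
    bqtable: str  # mirrors BQTABLE: name of the Big Query table
    fnb: str      # mirrors FNB: same value as FILENAME_BASE

    def problems(self) -> List[str]:
        """Return a list of human-readable issues; empty if none found."""
        issues = []
        if not self.bqtable:
            issues.append("BQTABLE is empty")
        if not self.fnb.endswith("_"):
            issues.append("FNB should end with a trailing underscore")
        return issues


def matches_base(filename: str, settings: LoadSettings) -> bool:
    """True if a rotated file name starts with the configured base."""
    return filename.startswith(settings.fnb)
```

For example, LoadSettings("mydataset.salesbids", "salesbids_") reports no problems, and a file named salesbids_20120704_120000 matches its base.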
The Big Query Connector load script looks for newly rotated files, compresses each one with gzip, and then uses the Google Big Query command line tool (bq) to append the contents of the file to the Big Query table.
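The find, compress, and append sequence described above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the provided load script: all function names are hypothetical, the bq invocation (bq load with --source_format=CSV, which is assumed to append when no replace option is given) is a guess at what the script runs, and the run parameter exists only so the command can be replaced with a stub during testing.

```python
import gzip
import shutil
import subprocess
from pathlib import Path
from typing import Callable, List, Tuple


def find_rotated(destination: Path, fnb: str) -> List[Path]:
    """List uncompressed files in destination whose names start with fnb."""
    return sorted(
        p for p in destination.iterdir()
        if p.is_file() and p.name.startswith(fnb) and not p.name.endswith(".gz")
    )


def compress(path: Path) -> Path:
    """Gzip a file next to the original, then remove the original."""
    gz_path = path.with_name(path.name + ".gz")
    with open(path, "rb") as src, gzip.open(gz_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    path.unlink()
    return gz_path


def bq_load_command(bqtable: str, gz_path: Path) -> List[str]:
    """Build the assumed bq command line for appending one file."""
    return ["bq", "load", "--source_format=CSV", bqtable, str(gz_path)]


def load_all(destination: Path, fnb: str, bqtable: str,
             run: Callable = subprocess.run) -> List[Tuple[Path, bool]]:
    """Compress and load every rotated file; report success per file."""
    results = []
    for path in find_rotated(destination, fnb):
        gz_path = compress(path)
        done = run(bq_load_command(bqtable, gz_path),
                   capture_output=True, text=True)
        results.append((gz_path, done.returncode == 0))
    return results
```

Passing a stub for run makes it possible to check which files would be picked up without contacting Big Query.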
The load script expects to find the rotated files in <current directory>/destination.
Note: You first need to create the destination folder under $SQLSTREAM_HOME/../clienttools/BQConnector/BQConnector/
After each file is loaded, it is moved to the <current directory>/loaded directory, and the results of each load are written to the <current directory>/logs directory.
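The working directory layout described above (destination, loaded, and logs under the current directory) can be illustrated with a short Python sketch. The helper names ensure_layout and archive are hypothetical. This page states only that the destination folder must be created by hand; whether the provided script creates loaded and logs itself is not stated, so creating all three here is an assumption.

```python
import shutil
from pathlib import Path
from typing import Dict

# Directory names taken from the description above.
WORK_DIRS = ("destination", "loaded", "logs")


def ensure_layout(base: Path) -> Dict[str, Path]:
    """Create the three working directories under base if they are missing."""
    dirs = {name: base / name for name in WORK_DIRS}
    for directory in dirs.values():
        directory.mkdir(parents=True, exist_ok=True)
    return dirs


def archive(loaded_file: Path, loaded_dir: Path) -> Path:
    """Move a file that has been loaded into the loaded directory."""
    target = loaded_dir / loaded_file.name
    shutil.move(str(loaded_file), str(target))
    return target
```

Calling ensure_layout(Path.cwd()) before the first run would avoid a missing destination folder.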
Before running the load script, perform a bq init to set the proper permissions for loading data.
The bq command line tool is located in the BQConnector directory at clienttools/BQConnector/lib/bq. You should check the Google Big Query download site for a newer version.
For applications that query the Google Big Query tables, the Big Query Connector provides the Java API library in clienttools/BQConnector/lib/javaapi. You should check the Google Big Query download site for the newest version.
For example, after a file named salesbids_20120704_120000 is loaded, it is named salesbids_20120704_120000.gz and located in the loaded directory.
The results of the load are stored in a text file named salesbids_20120704_120000.201207041201500.success or salesbids_20120704_120000.<load time>.failed, depending on the outcome of the load request.
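The naming pattern for loaded files and result files can be expressed as two small Python functions. This is a sketch only: the function names are hypothetical, and the load time format (year, month, day, hour, minute, second with no separators) is an assumption, since the exact format written by the load script is not specified here.

```python
from datetime import datetime


def loaded_name(data_file: str) -> str:
    """Name of the compressed file as it appears in the loaded directory."""
    return data_file + ".gz"


def result_name(data_file: str, load_time: datetime, succeeded: bool) -> str:
    """Name of the result file in the logs directory.

    Pattern: <data file>.<load time>.success or <data file>.<load time>.failed.
    The timestamp format below is an assumption.
    """
    stamp = load_time.strftime("%Y%m%d%H%M%S")
    suffix = "success" if succeeded else "failed"
    return f"{data_file}.{stamp}.{suffix}"
```

A monitoring job could use the .failed suffix to find loads that need to be retried.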