Hbase Agent



There are currently two flavors of agent:

1. HBaseAgent.HBaseAgent, which reads from HBase and inserts into a stream.

2. HBaseAgent.Agent, which selects from a stream and writes to HBase.

The following files are provided:

HBaseAgent.sh                     --script to run HBaseAgent.HBaseAgent
HBA.sh                            --script to run HBaseAgent.sh
Test.sh                           --script to run HBaseAgent.Agent
lib/HBaseAgent.jar                --jar file containing both HBaseAgent and Agent
sqlstream.application.properties  --see Properties below
sqlstream.execution.properties    --see Properties below
sqlstream.system.properties       --see Properties below

sqlstream.application.properties

This properties file defines the properties for a particular instance of HBaseAgent.Agent.

Properties are defined as <name>=<value> pairs. Two groups of defaults apply when a property is missing:

default.execution.<property>=<value> These are the default values for missing properties when running an instance of the program.

default.server.<property>=<value> These are the default values used when server.<servername>.<property> is not defined.

More generally, when looking up any property a.b.c, a missing value might be available as a default named default.a.c.
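The fallback rule above can be sketched in a few lines of Python. This is only an illustration of the a.b.c to default.a.c lookup described here, not the agent's actual code; the function name lookup, the override value 2000, and the handling of names with more than three parts are assumptions.

```python
def lookup(props, key):
    """Return props[key]; if missing, try the matching default.* property."""
    if key in props:
        return props[key]
    parts = key.split(".")
    if len(parts) >= 3:
        # a.b.c -> default.a.c: drop the logical instance name (second part).
        fallback = ".".join(["default", parts[0]] + parts[2:])
        return props.get(fallback)
    return None


props = {
    "default.server.port": "5570",
    "default.server.reconnectwait": "10000",
    # Hypothetical per-server override, for illustration only.
    "server.sqlstreamd.reconnectwait": "2000",
}

print(lookup(props, "server.sqlstreamd.reconnectwait"))  # explicit value wins
print(lookup(props, "server.sqlstreamd.port"))           # falls back to default.server.port
```

Keeping the fallback in one place means every per-server property automatically inherits its default without having to be listed twice.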

Option

Option meaning

server.sqlstreamd.reconnectwait=10000

Time in milliseconds to wait between connection retries

Default: 10000

server.sqlstreamd.reconnectwaitmax=120000

Appears to set the upper limit, in milliseconds, on the wait between connection retries.

Default: 120000

server.reconnectwait=10000

Specifies wait time (<value>) in milliseconds between activity checks.

Default: 10000

default.server.querytimeout=0

server.sqlstreamd.username=myusername

Specifies username for connection to SQLstream s-Server.

server.sqlstreamd.password=password

Specifies password for connection to SQLstream s-Server.

server.sqlstreamd.uri=jdbc:sqlstream:sdp

Specifies the JDBC URI for SQLstream s-Server, such as jdbc:sqlstream:sdp://host:port

server.sqlstreamd.host=localhost

server.sqlstreamd.port=5570

The rest of the file is organized around four comma-separated lists of logical names. Each name in a list is then configured through properties that carry that name:

servers=<comma separated list of logical names for SQLstream servers>
server.<server name from servers>.<property>=<value>

streams=<comma separated list of logical names for SQLstream streams>
stream.<stream name from streams>.<property>=<value>

hbases=<comma separated list of logical names for hbase clusters>
hbase.<hbase name from hbases>.<property>=<value>

agents=<comma separated list of logical names for agents connecting an hbase table to a SQLstream stream>
agent.<agent name from agents>.<property>=<value>

SQLstream Servers Definition

Define the list of SQLstream servers (comma-separated list):

servers=sqlstreamd

Connection Info for Server

Define the connection attributes for each server defined in the servers definition:

server.sqlstreamd.autocommit=false

server.sqlstreamd.autoreconnect=true

server.sqlstreamd.reconnectlimit=262080

server.sqlstreamd.querytimeout=0

SQLstream Streams Definition

Define the list of SQLstream streams (comma-separated list):

streams=FirefoxStreamForProductCounters,FirefoxStreamForServerCounter,FirefoxStreamForLocaleCounters,FirefoxStreamForLocationCounters,FirefoxStreamForMultiDimensionCounters
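To make the "list of logical names plus per-name properties" convention concrete, the following Python sketch groups properties by the names declared in a list such as servers. It is an illustration under assumptions: the helper group_by_name is hypothetical, and silently ignoring properties for undeclared names is a choice made here, not documented agent behavior.

```python
def group_by_name(props, list_key, prefix):
    """Group <prefix>.<name>.<property> entries by the names listed in list_key."""
    names = [n.strip() for n in props.get(list_key, "").split(",") if n.strip()]
    grouped = {name: {} for name in names}
    for key, value in props.items():
        head, _, rest = key.partition(".")
        if head != prefix:
            continue
        name, _, prop = rest.partition(".")
        # Only names declared in the list are kept (an assumption of this sketch).
        if name in grouped and prop:
            grouped[name][prop] = value
    return grouped


props = {
    "servers": "sqlstreamd",
    "server.sqlstreamd.autocommit": "false",
    "server.sqlstreamd.autoreconnect": "true",
    "server.undeclared.port": "5570",  # hypothetical entry, not in the servers list
}

print(group_by_name(props, "servers", "server"))
```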

Define Individual Streams

For each stream named in the streams definition, define enough attributes to run SELECT STREAM * FROM <stream> against a specific instance of a SQLstream server:

stream.FirefoxStreamForProductCounters.schema=mozilla-dw

stream.FirefoxStreamForProductCounters.stream=FirefoxStreamForProductCounters

stream.FirefoxStreamForProductCounters.server=sqlstreamd
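As a rough illustration of why schema, stream, and server are sufficient, this Python sketch assembles the SELECT statement from the three properties above. The helper select_statement is hypothetical, and quoting the identifiers (needed here because mozilla-dw contains a hyphen) is an assumption about how such a query would be written, not a description of the agent's internals.

```python
def quote(identifier):
    """Wrap an identifier in double quotes, doubling any embedded quotes."""
    return '"' + identifier.replace('"', '""') + '"'


def select_statement(props, stream_name):
    """Return (server name, SQL text) for one logical stream."""
    prefix = "stream." + stream_name + "."
    schema = props[prefix + "schema"]
    stream = props[prefix + "stream"]
    server = props[prefix + "server"]
    sql = "SELECT STREAM * FROM " + quote(schema) + "." + quote(stream)
    return server, sql


props = {
    "stream.FirefoxStreamForProductCounters.schema": "mozilla-dw",
    "stream.FirefoxStreamForProductCounters.stream": "FirefoxStreamForProductCounters",
    "stream.FirefoxStreamForProductCounters.server": "sqlstreamd",
}

server, sql = select_statement(props, "FirefoxStreamForProductCounters")
print(server)
print(sql)
```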

Hbase cluster hbase definition

Define the attributes of each hbase instance named in the hbases definition, sufficient to connect to any of the hbase tables listed in its hbasetables definition:

hbases=hbase

hbase.hbase.hbasedefault=./plugin/hbase/hbase-default.xml

hbase.hbase.hbasesite=./plugin/hbase/hbase-site.xml

hbase.hbase.hbasetables=dmo_metrics_realtime,dmo_metrics_hourly

hbase.hbase.autoreconnect=true

hbase.hbase.reconnectwait=10000

hbase.hbase.reconnectlimit=10
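The three reconnect properties suggest a simple retry loop. The Python sketch below shows one plausible reading: retry up to reconnectlimit times, waiting reconnectwait milliseconds between attempts, and only when autoreconnect is true. The exact semantics are not specified here, so treat the function connect_with_retry, its parameters, and the counting of attempts as assumptions.

```python
import time


def connect_with_retry(connect, autoreconnect=True, reconnectwait_ms=10000,
                       reconnectlimit=10, sleep=time.sleep):
    """Call connect(); on ConnectionError retry up to reconnectlimit times."""
    failures = 0
    while True:
        try:
            return connect()
        except ConnectionError:
            failures += 1
            if not autoreconnect or failures > reconnectlimit:
                raise
            sleep(reconnectwait_ms / 1000.0)  # property is in milliseconds


# Demonstration with a fake connection that succeeds on the third attempt.
calls = {"n": 0}


def flaky_connect():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("hbase not reachable")
    return "connected"


waits = []  # record the waits instead of actually sleeping
result = connect_with_retry(flaky_connect, sleep=waits.append)
print(result, waits)
```

Passing the sleep function as a parameter keeps the sketch testable without real delays.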

Agent Definition

Define each agent in terms of the associated stream that populates its foreign stream, and of the mappings between the hbase columns and the stream columns.

Agent dmo_product_counters definition

Define Associated Stream

Define the associated stream that populates the foreign stream for this agent. The definition must be sufficient to determine the source, the destination, and whether the agent writes to or reads from hbase.

agent.dmo_product_counters.source.stream=FirefoxStreamForProductCounters

agent.dmo_product_counters.destination.hbase=hbase

agent.dmo_product_counters.destination.htable=dmo_metrics_realtime

 

#this next one belongs in the execution properties file

agent.dmo_product_counters.function=write

 

Define Mapping to Hbase Table for Write Agent

Define the mapping to the hbase table for this agent in write mode.

Choices for rowkey, columnName, and columnValue are:

literal: a constant
{column}: the value of the source column of the same name
{+column}: add the source column to the value (columnValue only)
{++column}: increment the source column value by 1 (columnValue only)

putmax indicates how often HBase values are flushed. It is ignored (but still required) when columnValue uses increment.

agent.dmo_product_counters.write.putmax=128

agent.dmo_product_counters.write.columnfamily=product

agent.dmo_product_counters.write.rowkey={product_name}:{download_type}:{product_major_version}:{ROWTIME}

agent.dmo_product_counters.write.rowkey={product_name}::{product_major_version}:{ROWTIME}

agent.dmo_product_counters.write.columnName=:{product_name}:{product_major_version}:{product_version}:{product_os}:{product_rebuild}:{upgrade_from}

agent.dmo_product_counters.write.columnName={product_name}:{download_type}:{product_major_version}:{product_version}:{product_os}

agent.dmo_product_counters.write.columnName={product_name}::{product_major_version}:{product_version}:{product_os}

agent.dmo_product_counters.write.columnValue={+count}
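The placeholder syntax used in rowkey, columnName, and columnValue can be illustrated with a short Python sketch. It is a sketch under assumptions, not the agent's implementation: the function names are hypothetical, the sample row values are invented, and {++column} is read here as "add 1 to the stored cell value".

```python
import re


def expand_key(template, row):
    """Replace each {column} with the row's value; other text is literal."""
    return re.sub(r"\{(\w+)\}", lambda m: str(row[m.group(1)]), template)


def apply_column_value(spec, row, current=0):
    """Return the new cell value for a columnValue spec."""
    match = re.fullmatch(r"\{(\+{0,2})(\w+)\}", spec)
    if match is None:
        return spec                    # literal constant
    op, column = match.groups()
    if op == "++":
        return current + 1             # assumed meaning of {++column}
    if op == "+":
        return current + row[column]   # add source column to stored value
    return row[column]                 # plain {column}


# Invented sample row, for illustration only.
row = {
    "product_name": "firefox",
    "product_major_version": "3.6",
    "ROWTIME": "2011-03-14T09:26:00.000",
    "count": 7,
}

print(expand_key("{product_name}::{product_major_version}:{ROWTIME}", row))
print(apply_column_value("{+count}", row, current=5))
```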

 

# define transformation on timestamps

agent.dmo_product_counters.write.rowkey_dateformat.ROWTIME=yyyy-MM-dd'T'HH:mm:00.000
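The rowkey_dateformat value looks like a Java SimpleDateFormat pattern in which the seconds and milliseconds are written as the literal text 00.000, so every timestamp is rendered at minute resolution. The Python sketch below reproduces that effect with strftime; the function name format_rowtime and the sample timestamp are assumptions made for illustration.

```python
from datetime import datetime


def format_rowtime(ts):
    """Render a timestamp like yyyy-MM-dd'T'HH:mm:00.000 (seconds zeroed)."""
    return ts.strftime("%Y-%m-%dT%H:%M:00.000")


# Two events in the same minute produce the same rowkey component.
first = format_rowtime(datetime(2011, 3, 14, 9, 26, 5))
second = format_rowtime(datetime(2011, 3, 14, 9, 26, 53, 589000))
print(first, second, first == second)
```

Rows that fall in the same minute therefore map to the same rowkey, which presumably is what lets {+count} accumulate per-minute totals.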

 

For write, define the mapping from source columns to destination columns and the data type of each column:

agent.dmo_product_counters.write.dimensions=ROWTIME,download_type,product_name,product_major_version,product_version,product_os,product_rebuild,upgrade_from,count

agent.dmo_product_counters.write.dimensions=ROWTIME,product_name,download_type,product_major_version,product_version,product_os,count

agent.dmo_product_counters.write.column.ROWTIME=TIMESTAMP

agent.dmo_product_counters.write.column.download_type=VARCHAR(15)

agent.dmo_product_counters.write.column.product_name=VARCHAR(12)

agent.dmo_product_counters.write.column.product_major_version=VARCHAR(12)

agent.dmo_product_counters.write.column.product_version=VARCHAR(12)

agent.dmo_product_counters.write.column.product_os=VARCHAR(10)

agent.dmo_product_counters.write.column.product_rebuild=VARCHAR(10)

agent.dmo_product_counters.write.column.upgrade_from=VARCHAR(12)

agent.dmo_product_counters.write.column.count=BIGINT
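A quick consistency check between dimensions and the column.<name> type declarations can catch configuration mistakes early. The Python sketch below shows one way to do it; the helper column_types is hypothetical, and whether the agent itself rejects a dimension without a declared type is not stated here.

```python
def column_types(props, agent):
    """Return [(column, type)] in dimensions order; fail on undeclared columns."""
    prefix = "agent." + agent + ".write."
    dims = [d.strip() for d in props[prefix + "dimensions"].split(",")]
    missing = [d for d in dims if prefix + "column." + d not in props]
    if missing:
        raise ValueError("no data type declared for: " + ", ".join(missing))
    return [(d, props[prefix + "column." + d]) for d in dims]


# Shortened version of the configuration above, for illustration.
props = {
    "agent.dmo_product_counters.write.dimensions": "ROWTIME,product_name,count",
    "agent.dmo_product_counters.write.column.ROWTIME": "TIMESTAMP",
    "agent.dmo_product_counters.write.column.product_name": "VARCHAR(12)",
    "agent.dmo_product_counters.write.column.count": "BIGINT",
}

print(column_types(props, "dmo_product_counters"))
```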

 

# next two are execution time

#agent.dmo_product_counters.source.startRow=

#agent.dmo_product_counters.source.stopRow=

Defining a Read Agent

For a read agent, define the inverse of the write configuration: the source is an hbase table and the destination is a stream.

#agent.dmo_product_counters.source.Htable=

#agent.dmo_product_counters.source.hbase=

#agent.dmo_product_counters.destination.schema=

#agent.dmo_product_counters.destination.stream=

# next one is execution time

#agent.dmo_product_counters.function=read

sqlstream.system.properties

This file comes with predefined default properties. You can change these settings to configure the Hbase Agent for your environment by editing the sqlstream.application.properties file.

default.execution.progress=10000

default.execution.timing=yes

default.execution.verbose=no

default.execution.trace=

default.execution.agent=<unknown>

default.server.uri=jdbc:sqlstream:sdp

default.server.host=localhost

default.server.port=5570

default.server.username=myusername

default.server.password=mypassword

default.server.autocommit=false

default.server.autoreconnect=true

default.server.reconnectwait=10000

default.server.reconnectwaitmax=120000

default.server.reconnectlimit=262080

default.server.querytimeout=0

Application Tracing

To facilitate debugging, Hbase Agent creates log files in a sub-directory named "trace" in the Hbase Agent directory tree. Configuration is located in the file trace/trace.properties. The properties file is similar in use and content to the trace.properties file used by the SQLstream s-Server. The Hbase Agent supports application tracing using the Java Logging package java.util.logging.

For detailed information on what is provided at the various tracing levels:

See the topic Configuring System Parameters in the Administrator Guide.
See the topic Error Handling in the s-Server Streaming SQL Reference Guide.

The name and location of the Hbase Agent trace log are specified by the value set in sqlstream.execution.properties

in the trace.properties file located at

<client tools installation directory>/Hbase/trace/trace.properties

The trace directory is located relative to the directory from which you run the Hbase Agent, and this directory must exist before you run the agent.
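Because the trace directory must exist before the agent starts, a small pre-flight check can be useful. The Python sketch below is only an illustration: the function check_trace_setup is hypothetical, and it assumes the layout described above (a trace sub-directory containing trace.properties under the directory the agent is run from).

```python
from pathlib import Path


def check_trace_setup(run_dir):
    """Return a list of problems with the trace layout under run_dir."""
    trace_dir = Path(run_dir) / "trace"
    problems = []
    if not trace_dir.is_dir():
        problems.append("missing directory: " + str(trace_dir))
    elif not (trace_dir / "trace.properties").is_file():
        problems.append("missing file: " + str(trace_dir / "trace.properties"))
    return problems


# Check the current directory; an empty list means the layout looks usable.
print(check_trace_setup("."))
```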