Log File Agent

The Log File Agent lets you process logs on a remote server and send processed output through JDBC to s-Server. It reduces bandwidth requirements by moving processing to the location of the log file. The SQLstream Log File Agent is supplied either as part of the distributed SQLstream product or as part of the ClientTools download from the SQLstream website (via SQLstream-5.1.0-clienttools-linux.run or SQLstream-client-tools-5.1.0-windows.exe).

Like the Log File Adapter, the Log File Agent never considers a log file "closed" until a new file appears. This is because log files often have multiple loggers, any of which can append another line to the log file at any time.

The Agent installs as a script, either logfileagent.sh (Linux) or logfileagent.cmd (Windows). Configuration works the same way on both platforms.

Logs that are accessible directly from the server that hosts SQLstream can be input to streams directly through the use of the SQLstream Log File Adapter.
Logs that are generated from other servers, termed remote servers, can be input to streams using the SQLstream Log File Agent as described here.

The Log File Agent is installed on a remote server to supply log file records generated on that server as input to a stream on the s-Server.

The diagram below shows the configuration for using the Log File Agent.

 

[Diagram: Log File Agent configuration (logfileAgentDwg03at75)]

Note: On Linux, you can minimize bandwidth usage by supplying only a subset of the log records being generated. To do so, filter the log on the remote server and have the Agent read the filtered records from standard input. A sample filtering command would be

tail logfile_name | grep record_spec | LogFileAgent

 

Requirements

The remote server hosting the Log File Agent must have a Java Runtime Environment (or a JDK) installed.

Directory Structure

The LogFileAgent is installed in the clienttools folder. After installation, the Log File Agent directory contains a jar, a script, and a properties file:

LogFileAgent/
    logfileagent.sh (Linux) OR logfileagent.cmd (Windows)
    lfaparams.sh (Linux) OR lfaparams.cmd (Windows)
    lib/
        LogFileAgent.jar
        LogFileAgentPath.jar
    trace/
        trace.properties

Configuring and Using the Log File Agent

The Log File Agent is invoked with the logfileagent.sh (Linux) or logfileagent.cmd (Windows) script and can be configured by using the parameters described below. Among other things, parameters specify the location and name of the logfile from which you intend to create a stream, as well as the location and name of that stream.

Options are specified using the following general structure:

logfileagent [options] stream-name [log-file-path]

 

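where stream-name is the name of the target stream, which is qualified by the schema as schema.stream-name, and log-file-path is an optional positional argument that names either a directory or a single file.

Default for log-file-path: null (the Agent then reads from STDIN).

Note that options must appear before the stream-name. The short form of each option is preceded by a single dash; the long form listed under Expansion is preceded by a double dash. For example: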

 -mu 5
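
The ordering rule above (options first, then the stream name, then the optional log file path) can be illustrated with a small Python sketch that assembles an argument list. The helper name build_agent_command is hypothetical and is not part of the product; the flags and values are the ones used elsewhere on this page.

```python
from typing import Dict, List, Optional


def build_agent_command(script: str,
                        options: Dict[str, Optional[str]],
                        stream_name: str,
                        log_file_path: Optional[str] = None) -> List[str]:
    """Return an argument list: script, options, stream name, optional path.

    Hypothetical helper for illustration only. A value of None marks a
    flag that takes no argument, such as -v or -se.
    """
    argv = [script]
    for flag, value in options.items():
        argv.append(flag)
        if value is not None:
            argv.append(value)
    # Positional arguments always come after every option.
    argv.append(stream_name)
    if log_file_path is not None:
        argv.append(log_file_path)
    return argv


cmd = build_agent_command(
    "logfileagent.sh",
    {"-sn": "sa", "-sp": "pwd",
     "-su": "jdbc:sqlstream:sdp://streamserver",
     "-schema": "TEST", "-mu": "5"},
    "LOGSTREAM",
    "/var/log/syslog",
)
print(" ".join(cmd))
```

Omitting the last argument of build_agent_command leaves the log file path off the command line, which corresponds to reading from STDIN.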

Sample Stream

The stream into which Log File Agent writes should consist of one column, MESSAGE, of format VARCHAR(512):

CREATE OR REPLACE STREAM "RawData" (
   "MESSAGE" VARCHAR(512) NOT NULL
) ;

Sample Command

A sample command on Linux would look like the following:

logfileagent.sh -sn sa -sp pwd \
      -su jdbc:sqlstream:sdp://streamserver --schema TEST LOGSTREAM /var/log/syslog

 

where

   sa/pwd specifies the username/password

   streamserver in the URI is the name of a server.  

   --schema TEST LOGSTREAM specifies the schema (TEST) and stream name (LOGSTREAM)

   /var/log/syslog specifies the full path to the log file to be tailed, in this case syslog.

 

If no log file is specified, STDIN is used, which can be filtered as in the following example:

tail -f /var/log/syslog | grep --line-buffered MARK | logfileagent.sh -sn sa -sp pwd \
                                -su jdbc:sqlstream:sdp://streamserver --schema TEST LOGSTREAM

 

In the sample command above (folded for display), the tail program follows the syslog and pipes its output to grep, which selects lines containing the word "MARK". Matching lines are piped in turn to logfileagent. The "--line-buffered" option makes grep flush each matching line as soon as it is found instead of holding it in an output buffer, which provides correct startup timing between grep and the Log File Agent. (See the topic SQLstream JDBC Driver in this guide for further details on the URI, default port, and so on.)
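
To show what line-buffered filtering means in practice, here is a minimal Python sketch of a filter that forwards each matching line and flushes immediately, so that a downstream reader sees the line without delay. It is an illustration of the idea only, not a replacement for grep; the function name forward_matching_lines is hypothetical, and it matches a literal substring rather than a regular expression.

```python
import io
from typing import Iterable, TextIO


def forward_matching_lines(lines: Iterable[str], needle: str, out: TextIO) -> int:
    """Write every line containing `needle` to `out`, flushing after each one.

    Returns the number of lines forwarded.
    """
    forwarded = 0
    for line in lines:
        if needle in line:
            out.write(line if line.endswith("\n") else line + "\n")
            out.flush()  # the equivalent of grep's --line-buffered behavior
            forwarded += 1
    return forwarded


# Demonstration with in-memory streams; in a real pipeline the arguments
# would be sys.stdin and sys.stdout.
demo_in = io.StringIO("boot ok\n-- MARK --\ncron run\nMARK again\n")
demo_out = io.StringIO()
forwarded = forward_matching_lines(demo_in, "MARK", demo_out)
print(forwarded, "lines forwarded")
```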

| Option | Expansion | Option meaning |
|---|---|---|
| -cw | --connection-wait <value> | Time in milliseconds to wait between connection retries. Default: 10000 |
| -en | --encoding | Specifies the character set encoding of the log file data. Default: "UTF-8". For the full list of encodings supported by Java 1.7, see http://docs.oracle.com/javase/7/docs/technotes/guides/intl/encoding.doc.html. |
| -id | --source-id <columnName:idString> | Specifies source identification. Default: null |
| -r | --retries | One integer: the number of times to retry the connection after a network failure. If the value is 0 or less, the Agent retries forever. If it cannot reconnect, the Log File Agent exits. |
| -fc | --filename-column <columnName> | Optional column into which the name of the current log file is inserted. |
| -preprocess | --preprocess-command <filename> | Shell script and parameters to be invoked before processing. |
| -postprocess | --postprocess-command <filename> | Shell script and parameters to be invoked after processing. |
| -mu | --max-unchanged_stats <value> | Specifies the number of times (<value>) the activity check fails before the Agent checks for file rotation. Default: 5 |
| -si | --sleep-interval <value> | Specifies the wait time (<value>) in milliseconds between activity checks. Default: 1000 |
| -sn | --stream-username <username> | Specifies the username for the connection to SQLstream s-Server. |
| -sp | --stream-password <password> | Specifies the password for the connection to SQLstream s-Server. |
| -su | --stream-uri <uri> | Specifies the JDBC URI for SQLstream s-Server, such as jdbc:sqlstream:sdp://host:port |
| -schema | --schema <case-sensitive schema name> | Specifies the schema where the stream name exists. |
| -se | --start-eof | Optional. When enabled, the --start-eof setting tells the Log File Agent to seek to the end of the specified file, skipping all existing data and waiting for new data. If this option is not specified, the Agent seeks to the start of the file and streams any existing data before waiting for new data. Note: This option is not available for the Log File Adapter. |
| -t | --timing | Optional. Enables timing output. |
| -v | --verbose | Optional. Enables verbose output. |
| -pattern | --pattern <file-pattern> | Optional. A set of files defined by <file-pattern> to be found in the [log-file-path] directory. Default: null |
| -ap | --agent-properties <properties file name> | Path of the properties file for parsing CSV files. |
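
The interaction between the retry count (-r) and the connection wait (-cw) can be sketched in Python as follows. This is an illustrative model of the behavior described for those two options, not the Agent's actual code. The function name connect_with_retries is hypothetical, and the sketch assumes that the first connection attempt does not count as a retry.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def connect_with_retries(connect: Callable[[], T],
                         retries: int,
                         connection_wait_ms: int = 10000,
                         sleep: Callable[[float], None] = time.sleep) -> T:
    """Call connect() until it succeeds.

    retries <= 0 means retry forever; otherwise give up (re-raise) once
    `retries` retries have failed. Waits connection_wait_ms between tries.
    """
    failures = 0
    while True:
        try:
            return connect()
        except ConnectionError:
            failures += 1
            if retries > 0 and failures > retries:
                raise  # the Agent would exit at this point
            sleep(connection_wait_ms / 1000.0)


# Demonstration: a fake connection that fails twice and then succeeds.
attempts = []
waits = []


def flaky_connect() -> str:
    attempts.append(1)
    if len(attempts) < 3:
        raise ConnectionError("server unreachable")
    return "connected"


result = connect_with_retries(flaky_connect, retries=5, sleep=waits.append)
print(result, "after", len(attempts), "attempts")
```

Passing a recording function as sleep keeps the demonstration instantaneous; with the default time.sleep the sketch would pause 10 seconds between attempts, matching the -cw default.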

Sample Script to Launch the LogFileAgent on Linux

This is an example of text that could be put in a bash script to launch the LogFileAgent:

#!/bin/bash
# Run multiple Sample Logs through LogFile Agent
../logfileagent/logfileagent.sh \
   -su jdbc:sqlstream:sdp://localhost \
   -sn sa -sp mumble \
   -si 500 -mu 4 \
   -id "SourceAgent:LFAg@$HOSTNAME" \
   -fc SourceLog \
   -pattern ".*_.*\.log" \
   -ap parser.properties \
   -schema MYSCHEMA INCOMING_LOGS \
   ../MySqlStreamAPP/sample_logs \
   "$@"
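
Before launching the Agent with a -pattern value, it can help to check which file names the expression selects. The Python sketch below does this for the pattern used in the script above. It assumes that the pattern is a regular expression that must match the whole file name; that assumption is not confirmed on this page, and the file names are made up for illustration.

```python
import re
from typing import Iterable, List


def matching_logs(file_names: Iterable[str], pattern: str = r".*_.*\.log") -> List[str]:
    """Return the names that the pattern matches in full, sorted by name."""
    regex = re.compile(pattern)
    return sorted(name for name in file_names if regex.fullmatch(name))


# Hypothetical directory listing.
candidates = [
    "web_2024-01-01.log",
    "web.log",
    "db_errors.log",
    "db_errors.log.gz",
    "notes_old.txt",
]
selected = matching_logs(candidates)
print(selected)
```

Under the whole-name assumption, only names that contain an underscore and end in ".log" are selected. The sort here is by name purely to make the output deterministic; it does not model the order in which the Agent processes files.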

 

Log File Agent, sleep processing, and new files

The Log File Agent reads data continuously from the current log file as long as there are new lines to process. When there is no new data, Log File Agent sleeps for an interval defined with the Sleep Interval option ("-si"). When the agent wakes, it does a Linux "stat" or the Windows equivalent to see whether the file length has changed since before the sleep. If the length is unchanged, it sleeps again. This sleep/stat cycle repeats up to the number of times defined with the Max Unchanged Stats option ("-mu"). After the maximum number of sleep/stat iterations, Log File Agent takes inventory of the files that match the pattern and looks for the oldest as-yet-unprocessed file. If there is no new file to begin tailing, Log File Agent resumes monitoring the current log file.
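
The sleep/stat cycle described above can be modeled with a short Python generator. This is a simplified sketch under stated assumptions, not the Agent's implementation: the name tail_until_idle is hypothetical, the generator simply returns when the unchanged-check limit is reached (the point at which the Agent would look for a newer file), and partially written lines are not handled. The start_eof parameter mirrors the -se option.

```python
import os
import tempfile
import time
from typing import Callable, Iterator


def tail_until_idle(path: str,
                    sleep_interval_ms: int = 1000,
                    max_unchanged_stats: int = 5,
                    start_eof: bool = False,
                    sleep: Callable[[float], None] = time.sleep) -> Iterator[str]:
    """Yield lines from `path` until its size is unchanged for
    max_unchanged_stats consecutive checks."""
    with open(path, "r", encoding="utf-8") as log:
        if start_eof:
            log.seek(0, os.SEEK_END)  # skip existing data, as with -se
        unchanged = 0
        while unchanged < max_unchanged_stats:
            line = log.readline()
            if line:
                unchanged = 0  # new data resets the idle counter
                yield line.rstrip("\n")
                continue
            size_before = os.stat(path).st_size
            sleep(sleep_interval_ms / 1000.0)
            if os.stat(path).st_size == size_before:
                unchanged += 1
            else:
                unchanged = 0


# Demonstration on a temporary file, with sleeping replaced by a recorder.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "app_1.log")
    with open(demo_path, "w", encoding="utf-8") as f:
        f.write("first\nsecond\n")
    naps = []
    demo_lines = list(tail_until_idle(demo_path, 500, 4, sleep=naps.append))
    eof_lines = list(tail_until_idle(demo_path, 500, 4, start_eof=True,
                                     sleep=lambda seconds: None))

print(demo_lines, len(naps), eof_lines)
```

In the demonstration the file never grows, so the generator yields the two existing lines, performs four idle checks, and stops. With start_eof=True it yields nothing because the existing data is skipped.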

As a result, the product of the "-si" sleep interval and the "-mu" max unchanged stats count is approximately the time it takes Log File Agent to switch to a new log file after the current one stops growing. With the defaults (1000 milliseconds and 5), that is about 5 seconds.
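
As a quick worked example of this product, the following Python sketch computes the approximate delay for the default settings and for the values used in the sample launch script on this page (-si 500 -mu 4). The function name rotation_check_delay_ms is hypothetical.

```python
def rotation_check_delay_ms(sleep_interval_ms: int = 1000,
                            max_unchanged_stats: int = 5) -> int:
    """Approximate time, in milliseconds, before the Agent checks for a
    newer log file once the current file stops growing."""
    return sleep_interval_ms * max_unchanged_stats


print(rotation_check_delay_ms())        # defaults: 5000
print(rotation_check_delay_ms(500, 4))  # sample script values: 2000
```

The result is a lower bound in practice, since the time spent on the stat calls themselves is ignored.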

You should set the sleep interval and max unchanged stats count to balance urgency (latency) against the volatility of the file and the background CPU load imposed by Log File Agent. For example, if your data provider is slow or spotty but you want to minimize latency when new data finally arrives, configure Log File Agent to check the file frequently, accepting the greater background CPU load that this incurs.

Note that the above sleep/stat processing does not apply if the Log File Agent continually finds new data available to be streamed.

Application Tracing

To facilitate debugging, Log File Agent creates log files in a sub-directory named "trace" in the Log File Agent directory tree. Logging configuration is located in the file trace/trace.properties. The properties file is similar in use and content to the trace.properties file used by the SQLstream s-Server. The Log File Agent supports application tracing using the Java Logging package java.util.logging.

For detailed information on what is provided at the various tracing levels:

See the topic Configuring System Parameters in the Administrator Guide.
See the topic Error Handling in the s-Server Streaming SQL Reference Guide.

The name and location of the Log File Agent trace log are specified by the value set in java.util.logging.FileHandler.pattern in the trace.properties file located at

<client tools installation directory>/LogFileAgent/trace/trace.properties

The trace directory is located relative to the directory from which you are running the Log File Agent, and this directory must exist before you run the Log File Agent.
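
Because the trace directory must exist before the Agent starts, a launch script might perform a pre-flight check. The Python sketch below reads the FileHandler pattern from the properties text and creates the parent directory relative to a given working directory. It is a minimal sketch: the function names are hypothetical, the properties parser handles only simple key=value lines, and FileHandler placeholders such as %h, %t, %u, and %g are not expanded.

```python
import os
import tempfile
from typing import Dict


def read_properties(text: str) -> Dict[str, str]:
    """Parse simple key=value lines, skipping blanks and comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")) or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def ensure_trace_dir(properties_text: str, working_dir: str) -> str:
    """Create the directory that will hold the trace log; return the log path."""
    pattern = read_properties(properties_text)["java.util.logging.FileHandler.pattern"]
    log_path = os.path.join(working_dir, pattern)
    os.makedirs(os.path.dirname(log_path), exist_ok=True)
    return log_path


sample = """
# LogFileAgent tracing properties.
handlers=java.util.logging.FileHandler
java.util.logging.FileHandler.pattern=trace/LogFileAgentTrace.log
"""

with tempfile.TemporaryDirectory() as work:
    trace_log = ensure_trace_dir(sample, work)
    trace_dir_created = os.path.isdir(os.path.join(work, "trace"))

print(os.path.basename(trace_log), trace_dir_created)
```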

Tracing channels and logging levels

The Log File Agent supports three tracing channels:

1. For Log File Agent activity, com.sqlstream.tools.logfileagent is the main channel.
2. For the JDBC driver:
   - com.sqlstream.aspen.vjdbc.AspenDriver. This channel traces JDBC driver activity. It can provide details on all communication activity between the Log File Agent and the SQLstream s-Server. JDBC connection and handshaking, query delivery to the server, and result delivery to the application are examples of information available from this tracing channel. For detailed information, see the topics mentioned above on tracing parameters and error logging.
   - com.sqlstream.aspen.sdp2.StreamingDataProtocol and com.sqlstream.aspen.sdp2.clients. These channels trace activity at the data transport level, providing details on low-level TCP connection handling and on interactions with the SQLstream s-Server. For detailed information, see the topics mentioned above on tracing parameters and error logging.
3. For the SQLstream s-Server, the correct SDP2 tracing level is called com.sqlstream.aspen.native.sdp.

Logging levels

Log File Agent uses two of the possible logging levels.

INFO: Reports startup and shutdown operations, including the operational parameters being used by the application, and tracks the network connection.

FINEST: Reports all data handled by the application. Caution: Using the FINEST level may generate a large amount of tracing information, since each line from the log being tailed is echoed to the application trace file.

Sample trace.properties File

The following is a sample trace.properties file for the Log File Agent.

# LogFileAgent tracing properties.
handlers=java.util.logging.FileHandler
java.util.logging.FileHandler.level=FINEST
java.util.logging.FileHandler.formatter=com.sqlstream.aspen.util.IndentFormatter
java.util.logging.FileHandler.pattern=trace/LogFileAgentTrace.log
java.util.logging.FileHandler.count=20
com.sqlstream.aspen.util.IndentFormatter.threads=true
com.sqlstream.aspen.util.IndentFormatter.timestamp=true

# Minimal JDBC driver tracing
com.sqlstream.aspen.jdbc.level=INFO

# suppress SDP client tracing - only warnings and severe failures
com.sqlstream.aspen.sdp.level=WARNING

# LogFileAgent tracing
com.sqlstream.tools.logfileagent.level=INFO

 

The first section at the top of the example describes to the Java Logging system what processing modules to use and where to direct the output. For more information see the documentation for java.util.logging.

The next three entries are the tracing control lines. They instruct the logging system to accept and process messages for the specified modules, and they set the minimum thresholds for message acceptance. For a list and description of the various logging levels, see the documentation for java.util.logging.Level.
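
To make the idea of a minimum threshold concrete, the Python sketch below models how java.util.logging decides whether a logger accepts a message: a logger without its own level inherits the level of its nearest configured ancestor, and a message passes when its level is at or above that threshold. The numeric values are those of java.util.logging.Level. This is a model for illustration, not the Java implementation; the child logger name com.sqlstream.aspen.sdp.client is hypothetical, and the root default of INFO is an assumption for loggers with no configured ancestor.

```python
from typing import Dict

# Integer values of the standard java.util.logging.Level constants.
LEVELS = {"SEVERE": 1000, "WARNING": 900, "INFO": 800, "CONFIG": 700,
          "FINE": 500, "FINER": 400, "FINEST": 300}


def effective_level(logger_name: str, configured: Dict[str, str],
                    root_level: str = "INFO") -> str:
    """Return the level of the nearest configured ancestor logger."""
    name = logger_name
    while name:
        if name in configured:
            return configured[name]
        name = name.rpartition(".")[0]  # move up one level in the hierarchy
    return root_level  # assumed default when nothing is configured


def is_logged(logger_name: str, message_level: str,
              configured: Dict[str, str]) -> bool:
    """True if a message at message_level passes the logger's threshold."""
    threshold = effective_level(logger_name, configured)
    return LEVELS[message_level] >= LEVELS[threshold]


# The three tracing control lines from the sample file.
sample_levels = {
    "com.sqlstream.aspen.jdbc": "INFO",
    "com.sqlstream.aspen.sdp": "WARNING",
    "com.sqlstream.tools.logfileagent": "INFO",
}

print(is_logged("com.sqlstream.tools.logfileagent", "INFO", sample_levels))
print(is_logged("com.sqlstream.tools.logfileagent", "FINEST", sample_levels))
print(is_logged("com.sqlstream.aspen.sdp.client", "INFO", sample_levels))
```

With the sample settings, the FileHandler level of FINEST lets through everything the loggers accept, so the per-channel levels are what actually limit the trace output.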

For the details of information provided at the various tracing levels, see Configuring System Parameters, especially Tracing Parameters, and Error Handling, especially Error Logging.