Connect to a log file

To begin working with data in StreamLab, first connect to a streaming data source. The kind of data source depends on your system's needs, and might include log files, network feeds, Kafka topics, and so on.

Stream Processing Task

In the following exercise, we will capture data from a streaming log file and use this data to identify buses that exceed the speed limit in the Sydney area. Because this data is streaming, speeders will be identified in real time. We will derive this data by connecting to a simulated streaming log file located on your local machine. In this case, the log file records messages sent out by buses in Sydney, Australia.

Starting StreamLab and the Buses Demo

You first need to start the StreamLab server. If you've installed StreamLab, there should be a folder on your desktop called StreamLab. Open this folder and click the Start StreamLab Server icon:

sbur_tutorial1_1

Now that the StreamLab server is running, open the SQLstream desktop folder and double-click the Stream Bus Data icon.

sbur_tutorial1_2

Once you click the application's icon, an information window opens.

sbur_information_bus_log

Keep this window open for the remainder of the exercise. As long as this window is open, sample data from buses in the Sydney area is flowing into a file located at /tmp/buses.log at 50 rows per second. This simulates a log file being continually updated. We're going to convert this log file into a stream.
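To make the idea of a continually updated log file concrete, here is a minimal Python sketch of following a growing file and yielding each complete line as it arrives. It is only an illustration, not how StreamLab or s-Server reads the file; the function name follow and its parameters poll_seconds and max_idle_polls are hypothetical.

```python
import time
from typing import Iterator


def follow(path: str, poll_seconds: float = 0.1, max_idle_polls: int = 50) -> Iterator[str]:
    """Yield complete lines from `path`, including lines appended later.

    Hypothetical helper for illustration only. It stops after
    `max_idle_polls` consecutive polls with no new data so the sketch
    terminates; a real tailer would usually keep polling.
    """
    idle = 0
    buffer = ""
    with open(path, "r", encoding="utf-8") as f:
        while idle < max_idle_polls:
            chunk = f.readline()
            if not chunk:
                # Nothing new yet: wait briefly and poll again.
                idle += 1
                time.sleep(poll_seconds)
                continue
            idle = 0
            buffer += chunk
            # Only emit a row once its trailing newline has been written.
            if buffer.endswith("\n"):
                yield buffer.rstrip("\n")
                buffer = ""
```

With the Stream Bus Data window open, iterating over follow("/tmp/buses.log") would hand back rows as the simulator appends them.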

This file features data in the following categories:

Logical column in file  Category                   Definition
1                       id                         Identification number for the bus.
2                       reported_at                Time the location was reported.
3                       shift_no                   Shift number for the bus's driver.
4                       trip_no                    Trip number for the bus.
5                       route_variant_id           ID number for the bus route.
6                       waypoint_id                ID number for the bus waypoint.
7                       last_known_location_state  Location state.
8                       lat                        Latitude of location.
9                       lon                        Longitude of location.
10                      speed                      Reported speed of the bus.
11                      bearing                    Navigational bearing for the bus.
12                      driver_no                  Identification number for the driver.
13                      prescribed                 The direction on the motorway, i.e. into Sydney or out of Sydney.
14                      highway                    Highway number, if available.
15                      created_at                 Timestamp for when this bus started reporting locations.
16                      updated_at                 Timestamp for when this bus last reported a location.

You will use StreamLab to capture information from this file.
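As a rough illustration of how the 16 logical columns listed above line up with one row of the file, the following Python sketch maps a split line onto the column names. The column names come from this page; the comma delimiter is an assumption, because this tutorial does not state which separator /tmp/buses.log uses, and parse_bus_line is a hypothetical helper, not part of StreamLab.

```python
from typing import Dict, List

# Column names in file order, as listed on this page.
BUS_COLUMNS: List[str] = [
    "id", "reported_at", "shift_no", "trip_no", "route_variant_id",
    "waypoint_id", "last_known_location_state", "lat", "lon", "speed",
    "bearing", "driver_no", "prescribed", "highway", "created_at",
    "updated_at",
]


def parse_bus_line(line: str, delimiter: str = ",") -> Dict[str, str]:
    """Map one log line onto the 16 logical columns.

    ASSUMPTION: fields are separated by `delimiter`; the real separator
    in /tmp/buses.log may differ. Values are kept as strings.
    """
    fields = line.rstrip("\n").split(delimiter)
    if len(fields) != len(BUS_COLUMNS):
        raise ValueError(
            f"expected {len(BUS_COLUMNS)} fields, got {len(fields)}"
        )
    return dict(zip(BUS_COLUMNS, fields))
```

Keeping every value as a string mirrors what a line-oriented source delivers; converting speed or lat to numbers would be a separate step.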

Launch StreamLab by entering localhost:5590 in your browser. We recommend Chrome on Windows, Safari on Mac, and Chromium on Linux.

Note: If you're running in a virtual machine, use the browser in your host OS for the best performance.

Creating a New Project in StreamLab

After StreamLab loads, click Projects in the top menu.

The StreamLab menu lets you switch between StreamLab projects and dashboards. It is available in the upper right corner of StreamLab.

sl_menu

Each installation of StreamLab features a starter project. The Copy Project button opens the Copy Project dialog box, where you enter a name for the new project. The name must not already be in use in StreamLab.

Enter "buses" for the name of the project and click Copy to create the new project.

Click Start. A help page opens. Click Show this help page so that this page won't be shown again.

Dismiss the help page.

Click Sources.

StreamLab sources are files (line, CSV, JSON, XML), streams, tables, log files, web feeds, and views. They capture data from sensors, message buses, network feeds, applications, databases, and other sources. You can parse local log files (that is, files reachable from s-Server) through StreamLab.

With your mouse, click and drag Line, then drop it in the dotted box.

Click StreamLab_Output.line_1.

A Line source reads whole lines from its input, in this case a set of files in a directory.

line source info

Enter:

Directory: /tmp
Filename Pattern: buses\.log

Note: The filename pattern is a regular expression that can match multiple files. Here you want "buses.log" exactly. In Java regular expressions, an unescaped period (".") matches any single character. As a result, you need to preface the period with a backslash, which tells StreamLab that you mean a literal period.
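The effect of escaping the period can be checked with a short sketch. It uses Python's re module as a stand-in for Java regular expressions (the two agree on how "." and "\." behave), and it assumes the pattern must match the whole filename; how StreamLab anchors the pattern is not stated here. The candidate filenames are made up for illustration.

```python
import re

escaped = re.compile(r"buses\.log")   # backslash makes the period literal
unescaped = re.compile(r"buses.log")  # bare period matches any one character

# Hypothetical filenames that might sit in the same directory.
candidates = ["buses.log", "busesXlog", "buses_log", "buses.log.1"]

escaped_matches = [name for name in candidates if escaped.fullmatch(name)]
unescaped_matches = [name for name in candidates if unescaped.fullmatch(name)]

print(escaped_matches)    # ['buses.log']
print(unescaped_matches)  # ['buses.log', 'busesXlog', 'buses_log']
```

Only the escaped pattern is limited to the file named buses.log; the unescaped one also accepts names that have some other character in place of the period.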

Test this configuration using the Sample 5 Rows from Source button.

StreamLab reads 5 rows from the source and stops. If no data arrives, check that the Stream Bus Data script is still running and that you have entered the parameters correctly, then click Sample 5 Rows from Source again.

Click Up to exit the source, then click Up again to exit the Sources page.