A streaming SQLstream application will feel very familiar to anyone with some basic knowledge of SQL and traditional RDBMS applications. SQLstream uses standards-based SQL, except that streaming SQL queries run forever, processing data as they arrive over specified time windows. This blog is the first in a series of tutorials for SQLstream developers, describing how to build a streaming SQL applications. Over the coming months, these tutorials will address the different components of streaming data applications, and provide worked examples and guidance.
Streaming Visualization, Part 1: Setting up
We’ll begin the series by looking at a typical streaming use case – displaying real-time sensor data on a map. We have a source of geo-located data flowing in SQLstream that we’d like to visualize. Using Google Earth and Ruby on Rails, I’ll demonstrate an easily-implemented solution with lots of room for expansion.
For this example, our approach is to connect the SQLstream pipeline to Google Earth using a staging database–a common deployment scenario. We’ll be using PostgreSQL for the staging database, but MySQL or any other database supported by Rails will work. A SQLstream pump will use TableUpdate to write a record of latitude, longitude, and description for each event to a display list in PostgreSQL. When Google Earth places a web request for data, Rails will service the request by rendering the contents of the display list as KML, Earth’s dialect of XML. We’ll start with SQLstream, Ruby, and PostgreSQL already installed and focus on what’s necessary to get them all talking to each other.
Getting the Data
With SQLstream installed, make sure all of the distributed plugins are installed (if you haven’t done this already) and start the server:
linux> cd $SQLSTREAM_HOME/plugin/autocp linux> ln -s ../*.jar . linux> cd $SQLSTREAM_HOME linux> bin/sqlstreamd
We’re going to get our data from a web feed of recent earthquakes provided by the US Geological Survey. In another shell:
linux> cd $SQLSTREAM_HOME/examples/webfeed linux> sqllineClient < webfeed.sql linux> sqllineClient < usgs.sql
We now have several streams available to us within SQLstream, the one we want to visualize is SmallQuakesDay, which includes columns containing the location (‘point’ as lat/lon) and magnitude (‘mag’) of the quake.
Creating the Display List
We’ll use Rails to do all of the work of creating the display list. If you don’t have Rails installed yet, start by installing Ruby’s Gem package management system (in Ubuntu, this is the rubygems package). You’ll also need the development files for Postgres installed (postgres-server-dev in Ubuntu). You can now use gem to install rails and associated tools with this command:
linux> gem install mongrel rails pg rubyzip
I recommend that you add gem’s bin directory to your path (on my system it’s /var/lib/gems/1.8/bin) so that the commands ‘rails’, ‘rake’, and ‘bundle’ are found. Create an empty directory to work in (we’ll call it ‘$QUAKE’ here), cd there, and create a new rails server in the sub-directory ‘quakekml’ with these commands:
linux> cd $QUAKE linux> rails new quakekml -d postgresql
You can test the server by starting it with these commands and visiting http://localhost:3000 in a web browser:
linux> cd $QUAKE/quakekml linux> rails server
Use ^C to shut the server down so we can configure the database access. Edit the file $QUAKE/quakekml/config/database.yml, it should already contain sections describing the development, test, and production databases. Edit the username and password settings in each section to match a user you’ve configured in Postgres who can create databases. The only database we’ll be using is ‘quakekml_development’, but Rails will create all three when you issue this command:
linux> rake db:create:all
Create a display list consisting of a timestamp, lat/lon, and magnitude for each quake with the commands:
linux> rails generate scaffold quake_event when:timestamp lat:float lon:float mag:float linux> rake db:migrate
You now not only have an empty table in Postgres, you also have a full web interface for viewing and editing that table. Start the server again and visit http://localhost:3000/quake_events to see it. Our next steps are to generate KML for Google Earth from this table, and to feed the table from SQLstream. The scaffolding created by Rails is a handy debugging tool we can use to inspect the table and manually add items to test the visualization.
Parts 2 and 3 of the visualization tutorial will be published over the coming weeks. Part 2 focuses on how to render streaming analytics in Google Earth, and the final part of the tutorial will discuss how to get the data flowing.