Introducing SQLstream

The Business Case

Reducing business latency is a key driver for many industries today. Examples include:

detecting credit card fraud at the point of transaction,
identifying the optimum time for execution of a hedge fund strategy, or
identifying behavior patterns in large volumes of real-time Internet clickstream data.

In addition, technology advances such as cloud computing and Web 2.0 are driving new and more sophisticated business models.

The combination of a need for low latency in data reporting and rapidly increasing data volumes over highly distributed systems has generated a need for real-time business models. Traditional integration and business intelligence solutions cannot address this need, because concepts such as streaming analytics and Relational Asynchronous Messaging (RAM) are central to the solution. Indeed, they are the core of the next generation of IT systems. SQLstream is at the forefront of this revolution, enabling business intelligence to be generated and integrated in real-time across the enterprise.

The real-time business paradigm

Technology and the Internet have changed our expectations of the world around us: On-line banking, e-commerce, and search engines are commonplace. However, the real-time paradigm extends the realm of the possible far beyond on-line transactions and information-on-demand.

For today's businesses, the real-time model means responding immediately to new information as it arrives. Examples include:

hedge fund trading,
real-time product pricing,
targeted Internet advertising,
fraud detection, and
real-time promotions, including loyalty rewards.

Automatically analyzing market conditions in real-time and reacting at the right time poses two challenges:

How to generate business intelligence directly from vast volumes of real-time data?
How to react in time, given the need to integrate business intelligence in real-time across the enterprise?

Improving business agility under the pressure of rapidly increasing data volume has driven performance advances in business intelligence, albeit at a price. Yet true agility and low business latency have proved elusive, since delivering more information faster from historical data does not address the real-time need, and the costs to

What is SQLstream?

The SQLstream s-Server is a distributed platform providing low latency and high performance processing of data and event streams by using continuous (streaming) SQL queries.

Query The Future™

SQLstream generates business answers as soon as input data become available, by querying and transforming the data directly on the wire without any prior staging in a database. SQLstream delivers its results continuously as streams of messages to applications, services, or end users throughout the enterprise. Whereas databases query past, stored data, SQLstream processes future, flowing data. We call our approach: Query The Future™.

Coexistence and synergy with traditional solutions

SQLstream is complementary to traditional ESB, EAI, business intelligence, and data warehousing solutions -- real-time processing and analysis are completed before data are stored. SQLstream therefore enhances your existing databases and data warehouses, by enabling dramatic reductions in the costs for querying and processing rapidly arriving data in real-time. SQLstream is standards-compliant, making existing in-house SQL skills immediately applicable to reducing real-time analysis and application integration headaches.

Similar benefits accrue to integrating real-time data into your mission critical applications. SQLstream's simplicity and power enable you to implement your business ideas much faster, to deliver more value, and to slash costs, not least through applying in-house SQL skills to real-time issues. SQLstream's revolutionary declarative technology also provides automatic optimization and ensures painless management and maintenance.

SQLstream concepts

SQLstream is designed for the low latency, high volume, rapid integration needs of today's real-time businesses. SQLstream processes transactions continuously, providing streaming analytics, generating answers as soon as input data become available. Complex, time-sensitive transformations and analytics are simple to configure, and they execute continuously across multiple input data sources.

SQLstream processes dynamic, flowing data using SQL queries, in contrast to traditional RDBMSs, which process static, stored data with repeated single-shot queries. Thus in SQLstream, integration cost and complexity are reduced dramatically, while retaining the intuitive power of SQL, enabling much faster implementation of business ideas.

Scalability and future-proofing are important considerations for business-critical systems. SQLstream is simple to configure using existing IT skills and provides future-proof scalability through its distributed architecture.

Streaming data

Streaming data arrive when they are ready, and are processed as a continuous flow. Sources of streaming data include application and business transactions, database and file updates, application inputs and outputs, and any of the data feeds processed or generated within an enterprise. Examples include financial trading data, internet clickstream data, sensor data, and exception events. SQLstream processes multiple input and output streams of data, for multiple publishers and subscribers. Data streams are represented as sequences of time-stamped messages.

Streaming queries

A streaming query is a continuous, standing query that executes over streaming data. SQLstream s-Server processes data streams using familiar SQL relational operators augmented to handle time sensitive data. Streaming queries are similar to database queries in how they analyze data; they differ by operating continuously on data as they arrive and by updating results in real-time. They can also be event-driven and can aggregate over rolling or periodic time windows.

Relational Asynchronous Messaging (RAM)

Streaming queries are implemented using a messaging engine that supports Relational Asynchronous Messaging (RAM). RAM is a data messaging approach that presents dynamic data as streams of time-stamped messages. Each message represents one row in a logically infinite stream of rows. For example:

SELECT STREAM * FROM Orders WHERE origin.city = 'New York'

 

is a streaming query that operates upon the Orders message stream, returning only orders that originate in New York.

Streaming queries deliver results continuously and in real-time, in contrast to a traditional RDBMS, wherein an analogous query would return all New York orders currently stored and then terminate. Orders arriving after that RDBMS query executed would not be seen or processed until the next such execution. In SQLstream, however, a streaming query is a standing query - it executes continuously, returning each New York message in the Orders stream immediately as it arrives.
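The contrast with a single-shot query can be illustrated outside SQLstream. The following is a minimal Python sketch, not SQLstream code: a generator stands in for the standing query and yields each matching order as soon as it is read from the feed. The names order_feed and new_york_orders, and the fields rowtime, id, and city, are hypothetical and chosen only for illustration.

```python
from typing import Dict, Iterable, Iterator

Order = Dict[str, object]


def new_york_orders(orders: Iterable[Order]) -> Iterator[Order]:
    """Standing filter: yields each matching order as soon as it is read."""
    for order in orders:
        if order["city"] == "New York":
            yield order


def order_feed() -> Iterator[Order]:
    """Hypothetical source. A real feed would be unbounded; this one is finite."""
    yield {"rowtime": 1, "id": 101, "city": "New York"}
    yield {"rowtime": 2, "id": 102, "city": "Boston"}
    yield {"rowtime": 3, "id": 103, "city": "New York"}


# The filter is lazy: the first result is available before the feed is exhausted.
results = new_york_orders(order_feed())
first_id = next(results)["id"]
remaining_ids = [order["id"] for order in results]
print(first_id, remaining_ids)  # 101 [103]
```

Because the generator never collects the whole input, it would behave the same way over a feed that never ends, which is the property that distinguishes a standing query from a query that runs once and terminates.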

One way to look at RAM is as a relational Publish/Subscribe technology. A message published to a stream is delivered to and processed by all queries subscribing to that stream. Queries can even be used to define further stream views, which in turn have their own subscribers.
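The publish/subscribe reading of RAM can be sketched as a toy model in Python. This is an illustration under stated assumptions, not the SQLstream engine or its API: the Stream class, its subscribe, publish, and view methods, and the stream names are all hypothetical.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]


class Stream:
    """Toy stream: every published row is delivered to every subscriber."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._subscribers: List[Callable[[Row], None]] = []

    def subscribe(self, callback: Callable[[Row], None]) -> None:
        self._subscribers.append(callback)

    def publish(self, row: Row) -> None:
        for callback in self._subscribers:
            callback(row)

    def view(self, name: str, predicate: Callable[[Row], bool]) -> "Stream":
        """Derived stream that republishes only the rows passing the predicate."""
        derived = Stream(name)

        def forward(row: Row) -> None:
            if predicate(row):
                derived.publish(row)

        self.subscribe(forward)
        return derived


orders = Stream("Orders")
ny_orders = orders.view("NewYorkOrders", lambda row: row["city"] == "New York")

all_seen: List[object] = []
ny_seen: List[object] = []
orders.subscribe(lambda row: all_seen.append(row["id"]))
ny_orders.subscribe(lambda row: ny_seen.append(row["id"]))

for row in [
    {"id": 1, "city": "New York"},
    {"id": 2, "city": "Boston"},
    {"id": 3, "city": "New York"},
]:
    orders.publish(row)

print(all_seen, ny_seen)  # [1, 2, 3] [1, 3]
```

The derived stream is itself a Stream, so it can have its own subscribers and its own views, mirroring the way queries can define further stream views.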

Rowtimes, the stream clock, and rowtime bounds

Every message carries a time value called a rowtime. The arrival of a row establishes the current time of the stream, informally called the 'stream clock.' The rowtime of a newly arriving row can equal or exceed the rowtimes of rows already received, but it can never be earlier. Since streaming queries are typically time-sensitive, these rowtimes often determine when processing can proceed based on thresholds or aggregation criteria. Queries that process multiple streams of input can encounter greatly varying data arrival rates on those input streams.

SQLstream enables producers to publish rowtime bounds that increase efficiency in processing multiple streams that may produce data at greatly varying rates. Publishing a rowtime bound promises that no subsequent row from this producer will have a rowtime earlier than the bound. That certainty frees queries and other processes to proceed with actions that might otherwise have waited to include such a row, that is, a row with a rowtime earlier than the now-known bound.
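The effect of a rowtime bound can be seen in a small simulation. The Python sketch below is an assumption-laden illustration, not SQLstream's implementation: a hypothetical RowtimeMerger combines two inputs into one rowtime-ordered output and may release a pending row only once every input is known to have reached that row's rowtime. The class, its method names, and the source names "trades" and "news" are invented for the example.

```python
import heapq
from typing import Dict, Iterable, List, Tuple

Row = Tuple[int, str, str]  # (rowtime, source name, payload)


class RowtimeMerger:
    """Merges several inputs into one output ordered by rowtime."""

    def __init__(self, sources: Iterable[str]) -> None:
        # Latest rowtime each source is known to have reached.
        self._bounds: Dict[str, float] = {name: float("-inf") for name in sources}
        self._pending: List[Row] = []  # min-heap ordered by rowtime
        self.output: List[Row] = []

    def on_row(self, source: str, rowtime: int, payload: str) -> None:
        # An arriving row also advances its source's stream clock.
        self._advance(source, rowtime)
        heapq.heappush(self._pending, (rowtime, source, payload))
        self._release()

    def on_bound(self, source: str, rowtime: int) -> None:
        # A bound promises that no later row from this source is earlier than rowtime.
        self._advance(source, rowtime)
        self._release()

    def _advance(self, source: str, rowtime: int) -> None:
        self._bounds[source] = max(self._bounds[source], rowtime)

    def _release(self) -> None:
        # Rows at or before the slowest source's clock can no longer be preceded.
        safe = min(self._bounds.values())
        while self._pending and self._pending[0][0] <= safe:
            self.output.append(heapq.heappop(self._pending))


merger = RowtimeMerger(["trades", "news"])
merger.on_row("trades", 10, "t1")
merger.on_row("trades", 11, "t2")
held_back = list(merger.output)  # nothing yet: "news" might still send an earlier row

merger.on_bound("news", 11)  # "news" promises nothing earlier than rowtime 11
released = list(merger.output)
print(held_back, released)
```

Without the bound, the two rows from the fast source would wait until the slow source next produced a row. With it, they are released immediately, and the output still never goes backward in time.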

RAM Management System (RAMMS)

A RAMMS provides the next generation in Messaging Middleware, as a high-level declarative messaging management system. Instead of low-level point-to-point or topic-based Publish-Subscribe operations, SQLstream's RAMMS supports high-level subscriptions to relational queries that are auto-optimized. Instead of operating message-by-message or record-by-record, SQLstream RAMMS optimizes operations over potentially thousands of messages at a time.

Stream data processing with SQL

RAM technology leverages existing SQL skills, enabling SQL developers and users to become experts in continuous data processing and distributed systems. This SQL familiarity lowers the cost and reduces the risks of adopting this new technology, because stream data are processed using familiar relational operators augmented to handle time windows. Time windows vary depending on the application, from milliseconds to many hours, or even days. An application or user can use SQL to create a relational view over various message streams, transforming the data by applying relational operations such as aggregation, correlation, and filtering.
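To make the idea of aggregating over a time window concrete, here is a minimal Python sketch of a rolling sum keyed on rowtime. It illustrates the concept only and makes no claim about how SQLstream evaluates windows. The function name rolling_sum, the use of integer seconds for rowtimes, and the choice to keep rows that are exactly window_seconds old are all assumptions of the sketch.

```python
from collections import deque
from typing import Deque, Iterable, Iterator, Tuple

Row = Tuple[int, float]  # (rowtime in seconds, amount)


def rolling_sum(rows: Iterable[Row], window_seconds: int) -> Iterator[Row]:
    """For each arriving row, yield (rowtime, sum of amounts in the trailing window)."""
    window: Deque[Row] = deque()
    total = 0.0
    for rowtime, amount in rows:
        window.append((rowtime, amount))
        total += amount
        # Evict rows that have fallen out of the trailing window.
        while window[0][0] < rowtime - window_seconds:
            _, expired_amount = window.popleft()
            total -= expired_amount
        yield rowtime, total


feed = [(0, 10.0), (30, 5.0), (60, 1.0), (100, 2.0)]
sums = list(rolling_sum(feed, window_seconds=60))
print(sums)  # [(0, 10.0), (30, 15.0), (60, 16.0), (100, 3.0)]
```

One result is produced per input row, as soon as that row arrives, and the state kept is limited to the rows still inside the window.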

Transparency

In a system relying on streaming data, the SQLstream system handles the streams and all corresponding background processing. By building on existing familiarity with SQL, the SQLstream system facilitates creation of high-throughput, efficient processing for real-time flows of data. Existing applications can participate in a SQLstream-based system that handles streaming data, where SQLstream provides data to those applications or processes data from them. SQLstream streamlines the processing of multiple input and output streams of data, for multiple publishers and subscribers, in both new and legacy systems.

SQLstream and RDBMSs

A RAMMS is the complement to an RDBMS. Both share a common data model centered on processing relational rows, queries, and views. They share common data manipulation and definition languages standardized as SQL. They share a common security model and APIs, such as JDBC, and a common representation of metadata. A RAMMS is used for predetermined queries over arriving data; these queries run continuously and remain easy to maintain even during execution. An RDBMS is used for ad hoc queries over historical data, processing each query until it terminates.

The two work well together. A RAMMS can use static predetermined queries to preprocess data for an RDBMS and also respond to incoming messages by triggering dynamic queries on the data stored in an RDBMS. Queries in a RAMMS typically are scoped over explicit time windows based on business rules. The business rules typically specify time windows measured in minutes or hours, but time windows over any duration from milliseconds to months are possible. Both a RAMMS and an RDBMS can be used for transaction processing.
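The second pattern, responding to an incoming message by querying stored data, can be sketched with Python's built-in sqlite3 module standing in for the RDBMS. This is an illustrative assumption, not a description of how SQLstream connects to a database. The customers table, the tier column, and the enrich function are hypothetical.

```python
import sqlite3
from typing import Dict, Iterable, Iterator

# Stored, historical side: an in-memory database stands in for the RDBMS.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, tier TEXT)")
db.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "gold"), (2, "basic")])


def enrich(orders: Iterable[Dict[str, object]],
           connection: sqlite3.Connection) -> Iterator[Dict[str, object]]:
    """For each arriving order, look up the customer's stored tier and attach it."""
    for order in orders:
        row = connection.execute(
            "SELECT tier FROM customers WHERE id = ?", (order["customer_id"],)
        ).fetchone()
        # Unknown customers get a tier of None rather than being dropped.
        yield {**order, "tier": row[0] if row else None}


incoming = [
    {"rowtime": 1, "customer_id": 1, "amount": 250},
    {"rowtime": 2, "customer_id": 2, "amount": 40},
    {"rowtime": 3, "customer_id": 9, "amount": 15},
]
enriched = list(enrich(incoming, db))
tiers = [order["tier"] for order in enriched]
print(tiers)  # ['gold', 'basic', None]
```

Each message triggers one parameterized lookup, so the streaming side stays continuous while the stored side is consulted only when a message needs it.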

SQLstream and the Enterprise Service Bus (ESB)

The Enterprise Service Bus is an integration pattern initially limited to industrial process control and trading systems, but now in widespread use. The message bus provides a mechanism for de-coupling application components. Source components can 'fire and forget' messages, which can be consumed either by exactly one subscriber (under the queuing paradigm) or by all subscribers (under the publish/subscribe paradigm).

The SQLstream relational messaging engine supports a similar decoupling of source and target systems, but adds the ability to transform, correlate, join, filter, and aggregate the messages while they are in transit.

Summary

Industry is seeking to harness the power of the real-time business model but is restricted by the limitations of the historical, static data model central to the traditional approach to business intelligence. SQLstream s-Server's unique architecture enables real-time business intelligence to be introduced seamlessly and in parallel with existing solutions.

Applications include:

Financial trading: Determining the optimum price given conditions across multiple markets
Behavioral analysis of Internet shoppers: Targeting content and delivering advertisements in real-time
Fraud detection: Identifying and preventing fraudulent activity at the point of transaction
Social networking: Providing dynamic context-driven content across communities
Real-time pricing and promotions: Based on stock level, rate of sales, and consumer spending behavior
Application integration: Change data detection, mediation and propagation to support data replication and synchronization

By building on existing familiarity with SQL, the SQLstream system enables creation of high-throughput, efficient processing for real-time flows of data. Query The Future™ offers a compelling solution to a very broad range of data analysis and application integration problems, spanning many industries with the following benefits:

Reduce business latency. Move from batch-based integration to continuous data integration with minimal latency, making information available faster and providing true real-time decision support.
Improve time to market, reduce deployment costs, and minimize risk. SQLstream's support for SQL and the relational paradigm lets you use existing IT skills and provides an intuitive, flexible platform for integrating applications and data. It avoids building real-time data integration and analysis on traditional database and data warehousing technology, an approach that results in complex architectures, inadequate performance, and systems that are expensive and difficult to change.
Follow proven industry standards. Become productive immediately using SQLstream's SQL:2008 standards-based language. SQLstream also supports a host of other relevant industry standards, including JDBC, JMS, XMI, and the Eclipse user interface, among others.
Ensure future-proof scalability for mission critical applications. SQLstream addresses the limitations of traditional ETL and EAI approaches, providing a solution that will continue to meet your real-time data integration requirements both today and in the future.