Distributed stream processing with SQLstream and Kafka

A distributed data management architecture is an essential requirement for real-time Big Data applications such as managing IoT sensor and machine data payloads. Smart services for IoT applications require low-latency responses, distributed processing across multiple servers for scalability, and built-in redundancy for resilient, 24×7 operations.

It’s also important to offer a range of options for resilience and for implementing the distributed pipework between servers. Customers often have preferred products and, where possible, want to reuse existing expertise and manage their distributed operations on platforms in which they’ve gained confidence. Different infrastructure tools also have different strengths and weaknesses. That’s why we’ve added Kafka to our list of deployment options for distributed, resilient stream processing (alternatives include our own native distributed capability plus integration with other messaging and processing technologies, including ZeroMQ, AMQP and Storm).

Kafka is an open-source, real-time publish-subscribe messaging framework with the capability to support durable and resilient distributed stream processing applications. Kafka’s support for data stream partitioning over multiple servers offers SQLstream customers an additional resilient infrastructure option for their multi-server, distributed Guavus SQLstream deployments.
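
To make the idea of partitioning a data stream over multiple servers more concrete, here is a minimal sketch of key-based partitioning in Python. It is an illustration only, not SQLstream's or Kafka's actual implementation: the partition count, the function names `partition_for` and `route`, and the sensor keys are all hypothetical, and CRC32 stands in for the hash function that a real Kafka client would use. The point it shows is that records sharing a key always land in the same partition, so their relative order is preserved while different partitions can be handled by different servers.

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 4  # hypothetical partition count for one topic


def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a record key to a partition using a stable hash.

    CRC32 is used purely for illustration; it is an assumption of this
    sketch, not the hash a Kafka client applies.
    """
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def route(records, num_partitions: int = NUM_PARTITIONS):
    """Group (key, value) records by partition.

    Arrival order is preserved within each partition, so all records
    for a given key stay in order relative to one another.
    """
    partitions = defaultdict(list)
    for key, value in records:
        partitions[partition_for(key, num_partitions)].append((key, value))
    return dict(partitions)


if __name__ == "__main__":
    # Hypothetical sensor readings keyed by sensor id.
    readings = [("sensor-a", 21.5), ("sensor-b", 19.0), ("sensor-a", 21.7)]
    for partition, recs in sorted(route(readings).items()):
        print(partition, recs)
```

Because the mapping depends only on the key and the partition count, any producer computes the same partition for the same sensor, which is what allows the downstream processing to be spread across servers without coordination on every record.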

Together, SQLstream and Kafka provide:
  • A complete high-performance, resilient and SQL standards-compliant distributed stream processing platform.
  • High-performance scale-out architecture for larger data volumes at rates of many millions of records per second.
  • High-availability, fault-tolerant operation for mission-critical applications.
  • Integration with existing machine data collection and enterprise systems.

More information about SQLstream’s architecture, performance and distributed stream processing options can be found here.