You know all those generations of computer scientists and data engineers who’ve been focusing on relational databases and, crazily enough, using SQL as their query language?
Well, if you own a cell phone, chances are your favorite applications run on SQL, so those engineers were not only pioneers but visionaries as well. So much so that today SQL is all around us: it runs everything from banks to governments and underlies businesses large and small, with a continuous (but quiet) outpouring of love from its users.
In fact, a developer survey by Stack Overflow found that SQL is one of the most popular (and lucrative) languages today, despite the NoSQL hiatus in the marriage between Web 2.0 and the relational model. Like an Aston Martin (or, if you want, a Chanel suit), SQL is never out of style; at the same time, Hadoop is here to stay, IoT is putting stream processing through its paces, and pairings with NoSQL are now on the table.
So, why isn’t SQL everyone’s choice cut? Well, let’s just say the feature menu is as colorful as an Iron Chef rundown: some preparations need the hand of a true specialist, some come pre-packaged (hence, offending the more discerning crowds), and some are traditionally paired with older recipes.
That being said, here’s what everyone should remember is available on the SQL menu.
- Throughput. Perhaps SQL has a strong association with space, and most data engineers care as much about cents as they do about bits—so they see volumes, storage, cost. But the truth is SQL performs beautifully for both scale-up and scale-out implementations.
For scale-up, it supports the optimization of continuous queries over a large set of cores on a multicore processor, without requiring the developer to write multithreaded code, reason about locks and concurrency, or use semaphores.
For scale-out, SQL supports distributing queries over any number of servers, optimized for low latency and high throughput.
Some streaming SQL implementations have benchmarked at 1M events/s/core, which translates directly into dramatic savings; having to pay for, say, 2 servers instead of 120 is lighter on the budget than a diet salad with dressing on the side.
- Streaming analytics. Analytics? Sure. But let’s face it, streaming, continuous, and real-time are not exactly the words that come to mind when we think of SQL.
However, SQL is becoming the analytics language of choice for systems dealing with both structured and unstructured data, and the practice of continuously executing queries over moving data streams in a unified data-in-motion / data-at-rest architecture is on the rise.
Most real-time systems utilize existing stored data for data augmentation and enhancement, and a mix of real-time data, longer term trends, and predictive analytics for better data accuracy. On top of that, stored event and time-series data can be replayed (in fast forward mode), enabling ad-hoc, time-based analysis and scenario analysis over much larger datasets.
No language can do all this seamlessly, except for SQL. The real-time results with millisecond latency make it most attractive for aggregation/continuous ETL, analytics, and alert functions; add partition and join capabilities, stir, and voilà: a perfect dish.
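To make the aggregation, partition, and alert idea a little more concrete, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for a streaming SQL engine, which is an assumption: real streaming engines run such queries continuously and have their own windowing syntax. The `readings` table, the sensor data, the 10-second window, and the threshold are all made up for illustration.

```python
import sqlite3

# Hypothetical sensor readings: (epoch second, sensor id, temperature).
EVENTS = [
    (0, "a", 20.0), (3, "a", 22.0), (7, "b", 30.0),
    (11, "a", 24.0), (14, "b", 90.0), (19, "b", 94.0),
]

WINDOW_SECONDS = 10      # assumed tumbling-window size
ALERT_THRESHOLD = 80.0   # assumed alert limit

# One statement covers the three ingredients named above:
#   - partition: GROUP BY sensor
#   - windowed aggregation: integer division buckets ts into 10 s windows
#   - alert: HAVING keeps only windows whose average crosses the limit
ALERT_QUERY = """
    SELECT (ts / :w) * :w AS window_start, sensor, AVG(temp) AS avg_temp
    FROM readings
    GROUP BY window_start, sensor
    HAVING AVG(temp) > :limit
    ORDER BY window_start, sensor
"""


def run_alerts(events):
    """Load the events into an in-memory table and return alerting windows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE readings (ts INTEGER, sensor TEXT, temp REAL)")
    conn.executemany("INSERT INTO readings VALUES (?, ?, ?)", events)
    rows = conn.execute(
        ALERT_QUERY, {"w": WINDOW_SECONDS, "limit": ALERT_THRESHOLD}
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for window_start, sensor, avg_temp in run_alerts(EVENTS):
        print(f"ALERT window={window_start}s sensor={sensor} avg={avg_temp}")
```

In this toy version the query runs once over a finished batch; a streaming engine would evaluate the same kind of statement continuously as rows arrive.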
- NoSQL. With tsunamis of structured and unstructured data being generated by all kinds of transactional applications, and with architectures becoming increasingly complex, some enterprises forget about SQL as a unifying option. The visionary ones, though, understand that SQL streaming technologies, DBMSs, and Hadoop are friends, not foes.
Think about it: in a unified system, the data-at-rest is updated in real time and instantly joined with streaming data. That means developers can:
- execute analytics over stored or streaming data using the same SQL queries,
- keep perfect accuracy of stored data using streaming aggregation/continuous ETL, and
- build powerful predictive analytics by joining trend and predictive analytics from the DBMS with streaming analytics.
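As a rough sketch of the first and third points, the example below runs one SQL query against stored data, then applies a newly arrived event and runs the very same query again, joining the incoming rows with a stored reference table. Python's built-in sqlite3 module stands in for the unified system here, which is an assumption; the `customers` and `orders` tables and the `on_event` helper are hypothetical names invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, tier TEXT);  -- data at rest
    CREATE TABLE orders (customer_id INTEGER, amount REAL);      -- fed by the stream
    INSERT INTO customers VALUES (1, 'gold'), (2, 'basic');
    INSERT INTO orders VALUES (1, 100.0), (2, 20.0);             -- historical rows
""")

# The same query serves both the stored history and freshly streamed rows.
REVENUE_BY_TIER = """
    SELECT c.tier, SUM(o.amount) AS revenue
    FROM orders AS o
    JOIN customers AS c ON c.id = o.customer_id
    GROUP BY c.tier
    ORDER BY c.tier
"""


def on_event(customer_id, amount):
    """Apply one incoming order event, then re-run the unchanged query."""
    conn.execute("INSERT INTO orders VALUES (?, ?)", (customer_id, amount))
    return conn.execute(REVENUE_BY_TIER).fetchall()


before = conn.execute(REVENUE_BY_TIER).fetchall()  # stored data only
after = on_event(1, 50.0)                          # stored + one streamed event

print(before)
print(after)
```

The design point is simply that nothing about the query text changes between the two runs; only the data underneath it does.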
With this come the business benefits: improved decision making through real-time updates of the data driving the business reporting, better operational efficiency through automating actions, and new revenue streams (think about extending existing applications for real-time access/performance).
What does this all mean? Simply that, while technology operators might get excited about the special of the day here and there (fun proprietary tools, new open-source projects, interesting architecture models), they should not forget to check the SQL menu. What they’re looking for might already be served à la carte.