The Mayan vision didn’t pan out, so can this still be the Year of Big Data Analytics?

I was listening to the BBC News on the Internet, and the technology section had two experts explaining Big Data to the layman. After some mutterings about it being ‘very large’ and ‘useful in genomics’, most listeners would have been left none the wiser. The BBC’s Science Editor, Susan Watts, had a rather better crack at it. Our industry obviously has a way to go in explaining itself to the wider world.

Tackling the idea of ‘very large’ first: how big is Big Data? It turns out that it isn’t that big, at least not yet. EMA Research recently published a report that debunked the myth of Big Data being petabytes and upwards. Authored by John Myers and Shawn Rogers, the report suggests that Big Data starts at 110GB, with typical use cases between 10 and 30TB.

But if the data is only that size, why is the Big Data market still forecast to grow dramatically, with IDC predicting it will reach $24B by 2016? Surely this can’t all be put down to market hype and engineers’ excitement. Well, the answer lies in the ‘useful in genomics’ part. Not everyone has data of that scale or complexity, but most organizations want to extract value from their data much more quickly, and often in real time. Big Data technology aims to make that easier by removing many of the constraints of traditional data management and processing. For example, it does away with fixed schemas, processes all of the data, and generates answers on the fly, all on low-cost commodity hardware.

So is Big Data Analytics the killer app? Absolutely. Unfortunately, with today’s first-generation Big Data platforms such as Hadoop, HDFS and even HBase, many are still struggling to analyze their data at scale and in real time. That’s where technologies such as streaming and the (re-)emergence of SQL as a Big Data processing language are key. For example, Cloudera’s Impala is the first full frontal attack on the mature Big Data market for larger enterprises. Real-time processing and SQL will help us not only to bridge the technology gap, but also to explain in layman’s terms how Big Data really benefits the person in the street.