Mainstream Hadoop for you, no excuses anymore! Time to get on with Big Data

Today the world changed, and it keeps changing more rapidly. The face of Hadoop has changed forever as EMC Greenplum released Pivotal HD, a highly parallelised SQL frontend for Hadoop. Sounds geeky, so what does it mean? The skills shortage in Big Data is solved, and you can use data to improve efficiency, drive profitability and become a predictive organisation. No more barriers, wow!

This is a significant announcement because it addresses the major obstacle to enterprises adopting ‘big data’: skills. In the past you needed new skills, such as Hive, R, etc., to use Hadoop. In contrast, the ‘old world’ has a unifying skill called SQL. SQL skills are rife within organisations, even outside of IT, as many people have played with some sort of data manipulation tool. Lots of people have an Access database to track something or other, or have used utilities in programs like Excel which are very SQL-like. More importantly, a plethora of tools, including almost all BI tools, use SQL to access data at the back end. Now all of this accumulated knowledge, all the tools, all the reports and all the analytics can simply be applied to a Hadoop dataset.

(If you are not familiar with Hadoop, you can think of it as an unstructured database. So a Hadoop dataset is any data that exists out there. No relationships required here!)

This announcement is even more significant because Pivotal HD produces a cost-optimised parallel query. In layman’s terms, it’s lightning fast with no human effort applied. In the past, being able to use Hadoop was not sufficient: a good Data Scientist also needed to work out how to divide up the work so that all the available computing resources were humming, which minimised the time it took to obtain results.
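To make that difference concrete, here is a minimal sketch of the two approaches on a toy dataset. Everything in it is an assumption for illustration: the `SALES` rows and the function names are made up, the hand-rolled version uses a small thread pool, and Python's built-in `sqlite3` stands in for a SQL engine. It is not Pivotal HD and it does not run the query in parallel; the point is only that the SQL version states what result is wanted and leaves the planning to the engine.

```python
import sqlite3
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical sample data: (region, sale amount) pairs.
SALES = [
    ("north", 120), ("south", 80), ("north", 40),
    ("east", 200), ("south", 60), ("east", 10),
]


def partial_totals(chunk):
    """Total the sales per region within one chunk of rows."""
    totals = Counter()
    for region, amount in chunk:
        totals[region] += amount
    return totals


def manual_totals(rows, workers=3):
    """Hand-rolled approach: split the rows, process chunks concurrently, merge."""
    chunks = [rows[i::workers] for i in range(workers)]
    merged = Counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for part in pool.map(partial_totals, chunks):
            merged.update(part)  # Counter.update adds the partial sums together
    return dict(merged)


def sql_totals(rows):
    """SQL approach: describe the result and let the engine decide how to get it."""
    conn = sqlite3.connect(":memory:")  # stand-in engine, not Pivotal HD
    conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    rows_out = conn.execute(
        "SELECT region, SUM(amount) FROM sales GROUP BY region"
    ).fetchall()
    conn.close()
    return dict(rows_out)
```

Both functions return the same per-region totals. The first needs someone to decide how to chunk the data and how to merge the pieces; the second needs one `GROUP BY`.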

[Graph: HAWQ1]

Speed matters for two reasons. First, remember that this is an iterative discovery cycle, and the more times you can go around the cycle, the better the results should be. Second, if we are going to operationalise analytics by enhancing people’s workflows, response time is critical! Embedding real-time analytics into transaction systems means that complex multi-dimensional analytics need to return results in seconds, and that is what Pivotal HD does, as shown in the graph above.

Now, before you complain that it is not a valid comparison to match a public domain piece of code against a commercial offering: obviously EMC adds value by making the product more robust and faster. So the graph below compares it to another commercial product, and from 9x to 69x faster is massively impressive!

[Graph: HAWQ2]

I am excited (as you can tell from all the punctuation), because this is a big deal. This is literally what the big data world has been waiting for. Just consider that you can plug in Pivotal HD and immediately make all your existing reports run many times faster. You can then expand the data that those reports run against and get fine-grained analysis done, and then expand into new datasets, and the world opens up! The quest for a single customer view becomes an understanding of a single customer journey… now we are talking!

The new world is here and it’s a lot more exciting than making a purchase order execute faster!

One response to “Mainstream Hadoop for you, no excuses anymore! Time to get on with Big Data”

  1. This is a great announcement! Nearly 75% of all my recent inquiry calls with IT leaders have highlighted lack of skills and understanding as a major issue. Using traditional tools like SQL not only increases familiarity and hence reduces the effort needed to use Hadoop, but also allows a bigger base of users to use the traditional tools to run queries. This not only ensures better ROI, but also negates issues around training and change management.

    Great work! Look forward to reading your work and keeping in touch!

    Cheers!
