Unlike all the others, this time it’s not a spelling mistake: I am attempting to be punny. Let me explain.
Last week I was facilitating a discussion on how to build ‘consumer grade’ IT. One of the participants said that it was too expensive to even contemplate building analytics into the day-to-day operation of their organisation. Well, I challenged, how do all these start-up internet organisations build their businesses on real-time analytical processing? No one has ever accused a start-up of having deep pockets.
For the past 30+ years the realm of data management and data processing has been centred on relational databases. The organisation once named ‘Relational Software, Inc.’ reigned supreme in the enterprise database market as it perfected the architecture to allow a ‘single streamed’ approach to managing data.
Don’t shout: by ‘single streamed’ I mean to convey the idea of data which is guaranteed consistent at all points in time. This is a mandatory requirement when dealing with the traditional structured data of business applications like accounts. Think of the stereotypical bank account, where there must be no possibility of timing or sequence delays which would result in ‘incorrect’ balances.
Now why is this approach limiting the applicability of this technology to the ‘big data’ world? Well, firstly, the architecture does not scale in either size or performance, because of its tight-locking design: every change has to wait its turn so that the data stays consistent. Hence my pun: attempting to use this model for very large, unstructured datasets is going to rein in your results.
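To make the tight-locking idea a little more concrete, here is a minimal Python sketch. It is only an illustration: the `Account` class, the `worker` function and the numbers are all made up for this post and are not taken from any real database. Every deposit has to acquire the same lock, so the balance is correct at every point in time, but the eight threads effectively queue up behind one another.

```python
import threading


class Account:
    """Hypothetical bank account guarded by a single lock (tight locking)."""

    def __init__(self, balance=0):
        self.balance = balance
        self.lock = threading.Lock()

    def deposit(self, amount):
        # Only one thread at a time may read and update the balance,
        # so no reader ever sees a half-applied change.
        with self.lock:
            self.balance += amount


def worker(account, deposits):
    # Each worker makes a series of one-unit deposits.
    for _ in range(deposits):
        account.deposit(1)


account = Account()
threads = [threading.Thread(target=worker, args=(account, 1000)) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

final_balance = account.balance
print(final_balance)
```

The balance is always right, but adding more threads does not make the work go faster, because they all contend for the one lock. That is the scaling limit in miniature.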
So what’s the alternative? Well, it’s the idea of multi-threaded processing, where jobs are split up and executed at the same time, and to make it scalable a concept of loose locking is used. Here you can’t guarantee consistency at any single point in time; however, over time you can. That is a hard concept to get your head around at first. Think about 100 people in a room: you tell one person some news and ask them to pass it on to other people, and ask those people to pass it on as well. Shortly after you start this process there are a lot of people who haven’t heard the news, but a little while later everyone has. Same thing here!
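The room-of-100-people analogy can be played out in a few lines of Python. This is a toy simulation under my own assumptions, not how any particular Big Data product works: in each round, everyone who has already heard the news tells one randomly chosen person, and the function name `spread_news` and its parameters are invented for the example.

```python
import random


def spread_news(people=100, seed=0):
    """Return how many people have heard the news after each round."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    informed = {0}             # person 0 is told the news first
    history = [len(informed)]
    while len(informed) < people:
        # Everyone who already knows tells one person picked at random
        # (who may already have heard it, which wastes that telling).
        for _ in range(len(informed)):
            informed.add(rng.randrange(people))
        history.append(len(informed))
    return history


history = spread_news()
for round_number, count in enumerate(history):
    print(f"round {round_number}: {count} of 100 have heard the news")
```

In the early rounds most of the room is still ‘out of date’, yet nobody had to stop and wait for anyone else, and after a handful of rounds everybody agrees. That is the trade being made: consistency eventually, in exchange for everyone working at once.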
So this is one area where Big Data technologies differ dramatically from ‘traditional’ ones. Remember, no judgement here: it’s a matter of horses for courses! But in Big Data you have to release the reins completely!