Does size matter when it comes to “Big” Data?

by Glenn Gore, Chief Technology Officer, Melbourne IT

As Information Technology continues to underpin just about everything we do on a daily basis, information and data are growing at ever-increasing rates. Individuals have data spread far and wide as they interact with multiple service providers, from banking to travel to entertainment services. With Enterprise organisations managing 80% of the world's data, the ability to process and extract valuable information from these large data sets is becoming a new area of investment and innovation, creating new insights, value and opportunities for customers and business owners. Dealing with these large data sets (measured in hundreds of terabytes per set) is referred to as "Big Data".

A common misconception around Big Data is that only the biggest companies and online sites can work with Big Data concepts. The truth, though, is that every company, from small to large, can take advantage of the technologies being developed by the "big guys" to gain greater insight into customer behaviour, trend analysis and business intelligence. A simpler definition of Big Data that covers a broader range of use cases is any scenario where you are trying to run analytics on a data set that doesn't fit onto a single computer (whether a laptop or a server).

The primary concepts behind Big Data are:

–  The ability to break a data set up into smaller chunks that can be processed individually

–  The ability to run analysis processes in parallel against the many chunks of data

–  The ability to work with both structured and unstructured data sets

–  The ability to run multiple passes over the same data set, which analysis often requires to generate higher-level models
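The first two concepts above — chunking a data set and analysing the chunks in parallel, then combining the partial results — can be sketched in a few lines of Python. This is a hypothetical, simplified illustration (the record shape, event names and chunk size are all invented for the example), not a description of any particular Big Data product:

```python
from multiprocessing import Pool

def analyse_chunk(chunk):
    """Map step: count occurrences of each event type within one chunk."""
    counts = {}
    for record in chunk:
        event = record["event"]  # assumed record shape for this sketch
        counts[event] = counts.get(event, 0) + 1
    return counts

def merge_counts(partials):
    """Reduce step: combine the per-chunk counts into one result."""
    total = {}
    for counts in partials:
        for event, n in counts.items():
            total[event] = total.get(event, 0) + n
    return total

def chunked(records, size):
    """Break the data set up into fixed-size chunks."""
    for i in range(0, len(records), size):
        yield records[i:i + size]

if __name__ == "__main__":
    # Toy stand-in for a large transaction log.
    records = [{"event": "login"}, {"event": "purchase"}, {"event": "login"}]
    with Pool() as pool:
        partials = pool.map(analyse_chunk, list(chunked(records, 2)))
    print(merge_counts(partials))  # → {'login': 2, 'purchase': 1}
```

Because each chunk is analysed independently, the work can be spread across many processes, or many machines, which is exactly what makes data sets too large for a single computer tractable.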

Melbourne IT is working closely with EMC on how Big Data learnings and the latest in storage technology can be used to help analyse customer behaviour across a platform that generates more than 150 million transactions per day for more than 60,000 customers. Processing a month's worth of activity requires pattern analysis across more than 4 billion customer interactions comprising more than 60 billion individual data points. Using traditional techniques, it would take longer than 24 hours to load and report on just 24 hours of activity.

EMC Inform 2011 will provide insights into how EMC is
helping customers deal with their own Big Data challenge.
