LinkedIn (Fairfax and BBC) on Big Data

Michael Harte, CIO of Commonwealth Bank, got the opening quote at the Big Data summit late last week, saying something like the bank is moving to real-time analytics as its competitive edge. Not to be outdone, NAB said it was making ‘significant investments in customer analytics’. Big Data has matured in a big way in the last 18 months in Australia!

A number of interesting statements were made, such as:-

Vincent from the BBC saying they were trying to create an ‘always on’ relationship with their customers by providing value to their fans, having a two-way conversation and saying ‘thank you’. (Don’t wish to gripe, but perhaps service in general in Australia could learn a little about engagement and thanks! Side story: I had my car serviced last week and, as I drove away, found a fault that they had introduced. As they requested, I returned the car the next day, only to receive a phone call asking if I was happy with the service the previous day! Is that customer care, or follow-the-script-and-procedure service?)

Back to the main attraction. Fascinating graphs and data from Manu Sharma, ex Principal Data Scientist for LinkedIn, some of which I recall:
–          The average CEO and salesperson has 4 letters in their name, engineers 6 and restaurateurs 7!
–          Peter/Bob or Debora/Sally are the top CEO names.
–          Since 2007, newspapers, restaurants, warehousing and capital markets have been shrinking the most, while Internet and online publishing have been growing.

The most interesting visualization was a ‘skills’ cloud which clustered together the different skills that people ‘give’ to others. The drill-downs were great. For example, Manu showed that clustered around Python/MySQL and PHP are Hadoop, EC2, Google App Engine, HBase, MapReduce and AWS, as expected. The punch line: clustered around “Business Development” are Hard worker, Attentive to detail and Revenue improvement!

The final word from Manu was:-

  • More data is better
  • Raw data is better
  • Standardisation and quality of data is key
  • Don’t compromise accuracy for efficiency
  • KISS – simple models are simpler to troubleshoot over time
  • Fail Fast and iterate
  • Correlation is not causation
  • Intuition trumps everything

(He is a data scientist, so you didn’t expect a small number of words, did you?)

There were a few other items, but the Intuition one stuck in my head. I keep getting the feeling that people believe that Big Data is the oracle. It is NOT! You may find it strange that I say that, being the Big Data bigot that I am, but that is the whole point. Big Data provides the data in digestible form for the human mind; it feeds the human brain with information that it can process. Then we do what humans do well: pattern match, create ideas, innovate, etc! These things don’t happen in computers, no matter how good your machine learning algorithm is!

