Tag Archives: Isilon

EMC Strategy Update 2014 – Gateway to the Future.

ViPR signals the future of computing and now I understand this!

First a confession: sometimes it takes me a while to fully understand the impact of some technologies. I remember seeing the first iPod adverts and pondering why anyone would want to carry a hard disk drive around in their pocket! Likewise, when I first encountered ViPR, I thought: neat way to manage storage… but it’s not going to change the world. As with the iPod, I have come to understand that this is industry changing. Big statement, so let me explain.

ViPR has two major components: a controller and data services. The controller has had a lot of focus, as it was the most built out at release time. Fundamentally it provides virtualised storage and automated management across your whole environment. This gives you visibility into all your storage and a consistent way to manage it, resulting in lower costs and higher reliability. If you were sceptical you would say this is just the next generation of storage virtualisation, and it would be hard to argue with that.

Now, before highlighting the revolutionary power of the ViPR data services, let’s make sure we are on the same page with respect to the shift in IT technology that is currently underway. Analyst group IDC puts it succinctly as the movement from the 2nd platform to the 3rd platform (depicted below).

[Image: IDC’s movement from the 2nd platform to the 3rd platform]

An infrastructure that is capable of servicing billions of users with millions of apps (driven by social, mobile and big data computing) will look very different to current infrastructures. Enter the Software Defined Datacentre, where we use software to manage and control these elements (the ViPR controller). More importantly, gaining the scale and elasticity required calls for a new hardware construct!

One example is illustrated by EMC’s acquisition of ScaleIO. ScaleIO presents a virtual storage array that is built from the storage in the servers that participate. Surely this competes directly with EMC’s core storage business today? Yes, maybe, but if I need 1000 engines driving a massively parallel workload, I can’t achieve that simply with the hardware-resilient architecture of ‘traditional’ storage arrays. While scale-out architectures like Isilon scale to hundreds of nodes, ScaleIO grows to thousands to support the 3rd Platform requirements.

So re-think ViPR in this context. Today I am firmly in the 2nd Platform and I implement ViPR to gain control, lower cost and improve availability. Then I get a request to support a 3rd Platform application, let’s say Hadoop. Do I rush out and purchase dozens of servers, or do I plug the HDFS Data Service into ViPR and support it immediately out of my existing hardware infrastructure?

Here was my ah-ha moment… as I grow my 3rd platform services, I deliver these as data services against existing hardware today and move into specialised or commoditised hardware infrastructures later, depending on other factors, but without disruption! Now ViPR becomes a mechanism for me to co-exist in these worlds and move between them as need be. (After all, there are still a lot of mainframes/1st Platforms in use today!)

So if I’m right, what would you expect to see from EMC? Expect more ‘Data Services’, which will look like virtual versions of the ‘hardware’ products that exist today!


What’s the value in Big Data? The Push Shopping example (Big Data, Part 3)

By Clive Gold – Marketing CTO, EMC Corporation, Australia and New Zealand

Talking about a Telco and its data might not resonate with you, so let me expand my example before we discuss how to solve the problem. Go back to Part 1 of this series, here, and consider how a retailer could turn this near shopaphobic into a future shopaholic. As discussed, online is changing the shopping experience; it is more than just a substitute for physical.

Building on the online shopping experience, I described a substitute for physical shopping; now think beyond that. We know about online auctions, but I also use Catchoftheday and Scoopon (yes, I have a frustrated wife who thinks we get boxes of useless gadgets, and yes… I’m in the process of re-doing my open water diver’s licence – it’s been years). These are interesting as examples of push selling: not new, but much faster and with group purchasing. Now think beyond that: the offers I get from these systems are not tailored to me, so I ignore almost all of them! However, if they were in line with what I was interested in today, well, that’s another story.

Firstly a health warning: don’t panic! As you live you leave a digital trail. Ignore for now the privacy and security issues. What if all this data was in one place (Isilon – read Part 2, here)? Then a system could analyse it and tell me stuff I didn’t already know. Here is an interesting example: I purchase an iPad and I comment on Facebook that I hate the soft-keyboard. Imagine if you sold iPad cases and you had these two bits of information and you could see I purchase a lot of electronic gadgets; you send me an offer of an iPad case with a Bluetooth keyboard built in… and you have me.

The problem is that we are talking about two big problems: lots of data, and looking for connections I don’t know I’m looking for. Traditional databases just can’t hack this, not in size and not in structure, so enter Greenplum. Designed to do this work, it not only scales (performance) but also does not require you to pre-define the structures; pre-defined structures limit you to finding the things that you already know you are looking for.

The example I like to use is a case study where a major Telco in the USA was analysing its mobile call data records. They found that when a person cancelled their service, within a short period of time up to six of the top 10 people that person used to call would also cancel (not surprising, these are the people they talk to).
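Once you know the pattern exists, checking it is simple. Here is a minimal sketch in plain Python (not Greenplum, and not the Telco’s actual method). The call records, the customer names and the two helper functions `top_contacts` and `churned_contacts` are all made up for illustration.

```python
from collections import Counter

# Hypothetical call data records as (caller, callee) pairs; invented sample data.
calls = [
    ("alice", "bob"), ("alice", "bob"), ("alice", "carol"),
    ("alice", "dave"), ("bob", "alice"), ("carol", "erin"),
]
# Hypothetical set of customers who have already cancelled their service.
cancelled = {"alice", "bob"}


def top_contacts(records, customer, n=10):
    """Return the n people a customer called most often, most frequent first."""
    counts = Counter(callee for caller, callee in records if caller == customer)
    return [person for person, _ in counts.most_common(n)]


def churned_contacts(records, customer, cancelled_customers, n=10):
    """Return which of a customer's top-n contacts have also cancelled."""
    return [p for p in top_contacts(records, customer, n) if p in cancelled_customers]


print(churned_contacts(calls, "alice", cancelled))
```

The hard part, of course, is not this lookup but running it across every customer’s records at once, and spotting the pattern in the first place.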

So, finding the unknown unknowns is an interesting topic for another day. For now, being able to store and analyse BIG DATA has so much potential.

Perhaps now having this digital trail can enhance our lives!

What’s the problem with storing BIG DATA? (Big Data, Part 2)

In part one, here, I teased that the telco could not get the data they needed and they could not analyse it.

Let’s drill down into the first issue: if the telco really wants to ‘know’ me as a customer, they would need to pull together all my call data records (perhaps also the call data records of the people I called) and my internet usage (URLs visited)! Now doing this for me, and then for the rest of their customers individually, would not be practical. They would need this data for all their customers. That is a lot of data, hundreds of terabytes or maybe even petabytes, and keeping this in one place is hard.

Data storage is not like a big box you just throw stuff into; you need a structure to make it work. The structure is created through software that creates RAID groups, file systems and volume managers, which result in limitations at scale. The size is limited by software that is created with a fixed data word size. You do the math and the word size limits the amount of space that can be addressed. So in a 32 bit world the biggest single box that can be created is in the TB range, not big enough for our Telco.
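To make “do the math” concrete, here is a rough sketch in Python. It assumes the simplest possible model, where each address points at one fixed-size block; the block sizes (512 bytes and 4 KB) and the function name are my illustrative assumptions, not the details of any particular product.

```python
# Illustrative arithmetic only: block sizes are assumptions, not product specs.
TIB = 2 ** 40  # bytes in one tebibyte


def max_addressable_bytes(address_bits: int, block_size_bytes: int) -> int:
    """Largest capacity reachable when each address selects one block."""
    return (2 ** address_bits) * block_size_bytes


for bits, block in [(32, 512), (32, 4096), (64, 512)]:
    capacity = max_addressable_bytes(bits, block)
    print(f"{bits}-bit addresses, {block}-byte blocks: {capacity / TIB:,.0f} TiB")
```

Under those assumptions, 32-bit addressing tops out at 2 TiB with 512-byte blocks and 16 TiB with 4 KB blocks: the TB range, and nowhere near a petabyte.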

One answer is to just re-compile onto a 64-bit system; after all, Windows 7 has a 64-bit version! Yes, you can do this and publish a new competitive spec sheet, but it won’t perform at scale! The layers of software (described above) cause an architectural bottleneck (queues and messaging buses). Also, there are structures that you need to keep in memory, which creates a cost problem.

Perhaps you throw masses of hardware at it and cluster a number of systems together, using a coordinator or manager system to make it run. Let’s not go into the messaging and management issues with this approach!

So what is needed is essentially a petabyte-scale USB memory stick! That is what Isilon, the company EMC just bought, delivers. Technically Isilon is elegant in its simplicity, in that it does the job of the RAID, filesystem and volume manager in one piece of software which was designed from scratch for scale! This system, branded OneFS, is the magic. It is fast, as it’s one layer, and it’s scalable, as it is duplicated in each node, giving a peer network of nodes which scale linearly.

Now with this giant USB drive, the telco can put all this data in one place and, if it could analyse it in almost real time, it could have an incredible competitive weapon…

To be continued in Part 3…!

Consider ‘Push’ shopping! (Big Data, Part 1)

By Clive Gold – Marketing CTO, EMC Corporation, Australia and New Zealand

I hate physical shopping, yet I kind of like shopping online (I’ll even confess to enjoying it, if the wife is not around). Why? There are a couple of reasons – transparency and control.

Transparency because within a few minutes I can compare prices and features from all over the globe and become an instant expert. Perhaps not expert, but at least well informed compared to the last time I asked a shop assistant a question. They simply read the box, which I had already done, and then told me they didn’t know.

Everyone loves giving their opinion (you’re reading this!). So having decided on my most likely purchase, I look for reviews, problems, issues, etc. Now I get to hear hundreds of users’ opinions about the product that either reinforce my decision or change it for me. The other day a media-player with all the features and at a reasonable price looked good. However, on the chat forums it had more complaints than a plane load of Poms… so I went to my second choice!

In the end it’s probably that I’m totally in control: I decide on my terms and in my time. There are no extraneous factors affecting my decision. A perfect situation… well, there is no such thing.

The downside of shopping online is you instantly become a target for the retailer, and essentially they begin spamming you. The problem is I don’t know if I should take offence or not… I’m interested in their stuff, I bought some of it, but I’m not interested in all their stuff!

The last piece of this puzzle is that I don’t like getting ripped off! For example, last week I re-jigged my home telecommunications setup. Combining two teenagers with a 12GB broadband limit was causing a great deal of pain. So I compared my $65/month setup with what was available in the market. Oops, for about $10 more a month we could get 10x the data and, by the way, a landline with calls included that will save us about $100 per month. Very difficult decision to make!

So why would this major Telco provider not keep me as a customer and suggest a new plan? The reason is not profiteering as it will now cost them far more to win me back, but data! Today they are literally clueless, they don’t have all the data they need in one place, and they can’t analyse it! But that is changing…

To be continued in part 2…

So what is ‘Big Data’?

By EMC Marketing CTO – Clive Gold

I know most of you are familiar with Cloud, so with the theme for EMC Inform being “Where Big Data meets Cloud”, let me expand on the rapidly growing, ‘Big Data’ subject!  (No apologies for the pun/dad joke)

I recently read a whitepaper from the Kimball Group entitled “The Evolving Role of the Enterprise Data Warehouse in the Era of Big Data Analytics”. I found it fascinating as well as an insightful analysis of how Big Data is an issue that needs addressing and the key shifts that have resulted in this new opportunity.

Interestingly, it also shows the expanded use cases for businesses looking to capitalise on their Big Data. Most importantly to me, it reveals that not a single organisation will be untouched by its effect; this matters because so often these discussions centre around ‘genomic research’ and ‘geophysical analysis’.

It said “the use cases come in all shapes and sizes and formats, and require many specialised approaches to analyze. Up until very recently all these use cases existed as separate endeavors, often involving special purpose built systems. But the industry awareness of the “big data analytics challenge” is motivating everyone to look for the architectural similarities and differences across all these use cases. Any given enterprise is increasingly likely to encounter one or more of these use cases. That realization is driving the interest in system architectures that addresses the big data analytics problem in a general way.”

If you want to read the entire paper – I highly recommend it! You can download it here.

So what has this to do with EMC? Well, the paper also talks to the sheer magnitude of new opportunities in this space and makes it clear that systems to support big data analytics have to look very different than the classic relational database systems from the 1980s and 1990s. Not only this, but the task of storing massive amounts of data is also a challenge for traditional scale-up technologies. EMC has been investing in solving these issues and providing these new generation capabilities. Developments such as the ‘cloud optimised storage’ ATMOS, and acquisitions like Isilon and Greenplum, result in EMC offering an impressive and interesting range of technology to help you realise your ‘Big Data’ potential.

Hopefully I have sparked your interest. If so, read on about what Chuck Hollis has been saying about the subject; here are just a few quick links:
–  Big Data: http://emc.im/i0aV49
–  Are you ready for big data? http://bit.ly/hVH1B1
–  More Isilon Big Data Magic: http://bit.ly/hVW35B