Tom Petrocelli's take on technology. Tom is the author of the book "Data Protection and Information Lifecycle Management" and a natural technology curmudgeon. This blog represents only my own views and not those of my employer, Enterprise Strategy Group. Frankly, mine are more amusing.

Tuesday, March 02, 2010

Tiers of a Clown

I've been following the debate about automated storage tiering with amused interest. The various marketing operatives of data storage companies (and a few C-Level folks to boot) are all lining up into one of two camps – tiering is necessary or tiering is unnecessary. There have been dueling animations (very clever) from The Storage Anarchist and 3Par's Marc Farley as well as commentary from a host of industry bigwigs. I love the animations but then again, I always loved cartoons.


Automated storage tiering or automated tiered storage (or data lifecycle management, or whatever else it used to be described as) means using different types of physical storage for different classes of data, mostly to save money and maintain performance. The promise of storage tiering is that you can move less important, unchanging, or less frequently accessed data to cheaper, slower storage. You can keep the most important, frequently changing, and most accessed data in a really expensive array that combines high performance with heavy duty data protection features. Data that you don't need quite so often and that doesn't change can move to something slower and not as rigorous. And so on until you finally move it to an archive system or delete it. This has been the bread and butter of folks like Compellent and has been picked up by most of the bigger storage companies since. The ultimate goal is high levels of efficiency in your data storage systems. The more important the data is, the more resources it can consume. Less important data consumes fewer resources and balance in the universe is maintained.

A great example of where one might use tiered storage is with a check image. For a short while a check image has to be available to online customers and tellers immediately. Then it has to be stored for seven years and only moderately available. Then it is deleted. Chances are good that after 90 days you won't care to see the actual image so moving it to slower storage is not much of a burden but it saves money.
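To make the check example concrete, here is a minimal sketch in Python of an age-only tiering rule. Everything specific in it is my assumption for illustration: the tier names, the function name, treating 90 days as a hard cutoff, and approximating seven years as 7 × 365 days. A real product would use its own policy language and far more inputs than age.

```python
from datetime import date

# Hypothetical thresholds based on the check image example above.
FAST_TIER_DAYS = 90        # assumed cutoff for "immediately available"
RETENTION_DAYS = 7 * 365   # seven years, ignoring leap days for simplicity


def tier_for(created: date, today: date) -> str:
    """Pick a tier for a check image using nothing but its age in days."""
    age_days = (today - created).days
    if age_days < FAST_TIER_DAYS:
        return "fast"      # online customers and tellers need it right away
    if age_days < RETENTION_DAYS:
        return "slow"      # kept for retention, only moderately available
    return "delete"        # past the retention period
```

Calling `tier_for(date(2010, 1, 1), date(2010, 6, 1))` puts a five month old image on the slow tier. The rule itself is the easy part; agreeing on the thresholds is the hard part.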

Three things about tiered storage are important to consider. These considerations are what fuel the debate. First, automating it is tough. You have to get the software right or you lose data and have diminished efficiency. The second consideration is the ever dropping cost of storage. As data storage continues to become even more stupid cheap, it raises the question of whether you need to be all that efficient in the first place. If a high performance array is inexpensive then everything can have high performance storage without moving data around to a bunch of arrays. Finally, it's hard to decide what data belongs on what resource. Do I base it on age? Class of data? How do I decide what data is what class? These are not technical problems. They are business problems, which are much harder to overcome. Wrangling with your organization is hard work. You have to put a lot of effort into deciding what goes where and hope that your vendor supports your criteria.

To me, the problem with storage tiering is that it is a good idea that can be tough to execute. It's like the old joke about teenage sex – everyone talks about it, no one really does it, those that do it don't do it well. I'm sure that lots of folks will say that they have products that allow folks to do this well. However, technology doesn't solve the organizational problem, which makes it hard for folks to want to implement it. That doesn't affect the bread and butter customers of top tier storage companies (sorry – couldn't resist), who tend to be huge companies. They have the business process resources to pull it off. It might explain why automated storage tiering is not generating a huge following in mid-sized and smaller companies. They have other things to do with their limited resources than try to squeeze a bit more efficiency out of their storage system. The ROI for them is simply not big enough. Heck, many are still struggling with the blocking and tackling of doing backups and security.

So, where do I weigh in on this debate? I agree with both sides. If that sounds a bit weasel-like then sorry. For some companies there are mission critical applications that would benefit from an automated tiered storage system. For others, it's hard to see how there would be benefit enough to warrant the time and effort. For me, the debate is a non-debate. It's not about whether automated storage tiering is beneficial or not. What matters is whether it's beneficial to you. If you think in terms of customers, instead of products and technology, it becomes clear. What applications do you have that need this approach? Does your organization need it at all? Can you decelerate the pace of your storage buying enough to justify the costs and time involved in implementing this? Will you be able to decide what data should go where and when?

In the end, it's a feature like all other features. If it has value for you then it's a winner. If it doesn't then find something that does. But watch the debate. It's quite entertaining.
