Sunday, November 29, 2009

Flat Tiering - it doesn't suck, it's just your configuration that sucks.

I hear so much anti-flat-tiering noise lately, it's past time I spit out my thoughts on it.
First, what is flat tiering? That's any storage system where you have one or two types of disk, and that's it. Some examples would be the Compellent, Pillar Data Systems, DataDirect Networks (DDN), and 3par T and F series. All of these systems use one or two disk types in the system, and that's it. Why is this a good thing? One, it reduces servicing complexity - you don't have to identify which type of disk or controller needs to be replaced, because they're all the same. Point two goes back to the subject above - configured right, there's no reason to mechanically tier everything.
Flat tiering is not one-size-fits-all. It is not right for every configuration. It is not right for every application mix. But it fits most setups well - when it's configured correctly.

Certainly, this is far from specific to flat tiering. Do not ask how many badly configured mechanically tiered setups I've seen. Bad configurations are more common than people want to admit. Usually it's holdover configurations from initial deployments done in a rush, where there's too much loading or not enough spare resources to perform a reconfiguration on. Either way, bad configurations are bad configurations wherever they are.

Here's the thing with flat tiering though - when your configuration sucks, you REALLY know it. It's not like a DS5k where your bad configuration only pushes your response times up to around 20ms, or an SVC where you reduce theoretical peak throughput by 50% (which still gives you about 4.5GB/sec.) A bad configuration on a flat tiered system can push your response times over 50ms and drop your IOPS through the floor. And it's much easier to badly configure a flat tiered system than a mechanically tiered system.

The first thing you have to do, and I really do mean have to do, when you work with a flat tiering system is throw away everything you know about configuring mechanically tiered systems. Forget all of it. It's only going to wreck your configuration. The second thing you have to do is document and test more first. I don't mean figure out what needs high priority and how many terabytes it needs - I mean really document and test. What's your typical IOPS for this application? What's your maximum acceptable response time? Is the access pattern random disk seeks or sequential loading?
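As a rough illustration of what "really document" can look like, here is a minimal Python sketch that records those three answers per application and estimates how random an I/O trace is. The `WorkloadProfile` record, the `estimate_random_fraction` helper, and every number in it are hypothetical placeholders, not figures from any real system.

```python
from dataclasses import dataclass


@dataclass
class WorkloadProfile:
    """Hypothetical record of what one application needs from storage."""
    name: str
    typical_iops: int        # sustained IOPS during normal operation
    max_response_ms: float   # worst response time the application tolerates
    random_fraction: float   # 0.0 = purely sequential, 1.0 = purely random


def estimate_random_fraction(requests):
    """Estimate how random a trace is.

    `requests` is a list of (offset, length) pairs in issue order. A request
    counts as sequential when it starts exactly where the previous one ended.
    """
    if len(requests) < 2:
        return 0.0
    non_sequential = 0
    for (prev_off, prev_len), (off, _) in zip(requests, requests[1:]):
        if off != prev_off + prev_len:
            non_sequential += 1
    return non_sequential / (len(requests) - 1)


# Illustrative trace: three back-to-back 8 KiB reads, then two jumps.
trace = [(0, 8192), (8192, 8192), (16384, 8192), (1048576, 8192), (4096, 8192)]

profile = WorkloadProfile(
    name="example-db",       # placeholder name
    typical_iops=2500,       # placeholder figure
    max_response_ms=10.0,    # placeholder figure
    random_fraction=estimate_random_fraction(trace),
)
print(profile)
```

In practice the trace would come from whatever your hosts or array can actually capture. The point is only that each application gets numbers written down before anyone draws a layout.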
Let's look at mechanical tiering. Basically, it goes something like this: if it has to be fast, put it on fast disk. If it can be slow, put it on slower disk. If it can be really slow and it's mostly sequential, put it on SATA. Flat tiering does away with all of that. Every last bit. There is no fast and slow disk, there is just disk.
This is where the biggest mistakes happen. People assume that this means they simply shift their entire design philosophy to capacity and nothing else. Capacity becomes the commodity and all performance is equal.

Can I get a show of hands from 3par, Compellent, DDN, and Xiotech who agree that flat tiering means capacity is the sole concern of their product, and all performance for all LUNs presented to hosts will always be 100% equal at all times?
Huh. I didn't see any hands there. Maybe I didn't ask loud enough? Or maybe that's because it's just not true. While performance across all LUNs on most of these systems will tend towards equal, that does not mean it is equal, or should be equal, on all LUNs.

Think about it. Do you want your PeopleSoft database to have the same access priority as your customer service file shares? Of course not. That's just silly. You need that PeopleSoft database to be on the fastest possible disk you can get. But you're on a flat tiered system, so you don't have faster disk. And there's only one product where 'flat disk' means 'flat performance' - and I won't even mention which, because it's just a bad storage system period. Ask any of the vendors I've mentioned above if flat disk means flat performance, and they'll tell you bluntly "no." It does tend toward flatter performance, but that isn't the

This comes back to the point of better documentation and understanding of storage requirements. If you attempt to treat all storage as equal, you are much more likely to get burned. A performance hit to your Windows shares becomes a performance hit to your ERP systems. Obviously this is the last thing you want to have happen in your production environments. That's why there needs to be greater focus on the storage, and greater attention to detail than you're used to. Storage complexity doesn't reduce, it just moves.
Nor can you just apply traditional best practices. They don't apply here. With different RAID modes, different technologies, and different methodologies of achieving storage performance, simply going with what you're used to isn't an option. You can't just throw together 4+1 RAID5's and call it done. You need to look at how each system does RAID, and how it applies to performance from the host perspective. You can't just throw more spindles at a slow database - it may even further hurt performance. You may need to increase controller priority, or adjust how the arrays are caching.

The other thing I see is a stiff resistance to SATA in traditionally FC spaces. This is right and wrong. Whenever any vendor is trying to sell you SATA instead of FC, you should be extra critical. That doesn't mean throw it out. That means you test, test, and test again. You test on real hardware with loads as close to real as you can get. You don't buy a thing until they prove every single claim. The fact is that SATA sucks for random, period. Every single vendor I've named knows and acknowledges this - that's why they all offer FC or SAS drives and SATA drives. If the salesperson is trying to push you to go ALL SATA, chances are they don't know what the hell they're talking about, or they see something in your workloads that you missed. Understanding your workloads is something you need to do before you even start talking to sales.
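To put a rough number on why SATA struggles with random I/O, here is a back-of-the-envelope Python sketch. It uses the common approximation that one random I/O costs an average seek plus half a revolution. The seek times are assumed, illustrative values and not taken from any vendor data sheet, and the model ignores transfer time, queuing, and cache entirely.

```python
def random_iops_per_disk(rpm, avg_seek_ms):
    """Rough random IOPS ceiling for one spinning disk.

    Service time per random I/O is approximated as average seek time plus
    average rotational latency (half a revolution). Transfer time and
    caching are ignored.
    """
    rotational_latency_ms = 0.5 * 60_000.0 / rpm
    service_time_ms = avg_seek_ms + rotational_latency_ms
    return 1000.0 / service_time_ms


# Assumed, illustrative seek times; check the data sheet for real drives.
sata_7200 = random_iops_per_disk(rpm=7200, avg_seek_ms=8.5)
fc_15k = random_iops_per_disk(rpm=15000, avg_seek_ms=3.5)

print(f"7.2K SATA: ~{sata_7200:.0f} random IOPS per disk")
print(f"15K FC:    ~{fc_15k:.0f} random IOPS per disk")
```

With these assumed inputs the 15K drive comes out at more than twice the random IOPS of the 7.2K drive. That is only an estimate from spindle mechanics and no substitute for testing on real hardware with real loads.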

Flat doesn't necessarily mean really flat. Sometimes it means replacing 6 controllers and 32 arrays of FC disks with one or two controllers and far fewer arrays, while achieving equal or greater performance. It does not need to mean limiting yourself to a single disk type or speed.
And let's go back to making them prove their claims. 3par can brag about some of the best SPC-1 results around, the F400 pulled off 93K IOPS with a respectable response time. That's a great demonstration of what their technology can do, using 146GB 15K FC disks. It is not a demonstration of what it will do in the configuration they're selling you. Your workloads might push their controllers harder than they expect. Your workloads may be poorly suited to the disks they're suggesting. Test, test, and test again. Make them back up their promises in writing. That's true of all storage, sure, but doubly so in this space.
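One way to make "prove it" concrete is to write the targets down and check every test run against them. The Python sketch below is a minimal example under stated assumptions: the `meets_target` helper, the 95th-percentile cutoff, and all the numbers are hypothetical placeholders, not anything a vendor or SPC defines.

```python
def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list; pct is an integer 1-100."""
    ordered = sorted(samples)
    # Integer ceiling of pct * n / 100, clamped so the rank is at least 1.
    rank = max(1, (pct * len(ordered) + 99) // 100)
    return ordered[rank - 1]


def meets_target(measured_iops, latencies_ms, target_iops, max_response_ms, pct=95):
    """True when the test run met both the IOPS and response-time targets."""
    return (measured_iops >= target_iops
            and percentile(latencies_ms, pct) <= max_response_ms)


# Placeholder numbers standing in for a real test run.
latencies = [4.1, 5.0, 4.8, 6.2, 5.5, 7.9, 4.4, 5.1, 6.0, 22.5]

print(meets_target(measured_iops=12000, latencies_ms=latencies,
                   target_iops=10000, max_response_ms=10.0))
```

Here the run clears the IOPS target but fails the check, because a single 22.5ms outlier in ten samples pushes the 95th percentile over the 10ms limit. A headline IOPS number alone would have hidden that.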

The thing is, I know for an absolute fact that at least two vendors I've named in here (no, I can't say who) have backed up their claims of offering better performance and reliability in writing, in full. I know this because one, I was involved directly or peripherally in it, and two, they made good on those promises. I seriously can't say who, because it is NDA'd and proprietary information regarding customer agreements.
But I will tell you right here, right now, that if you ask any one of the vendors listed above to back up their performance and reliability claims in writing, with a guarantee to cover the costs of removing them or going back to your existing storage, there are two who can and will look you right in the eye, say they can do it, and put it down in writing. And the requirement that you test, and work with support to achieve it? If that isn't standard operating procedure for you on traditional arrays, you need to reexamine your SOPs. (Again, test test test! You cannot test enough!) Anybody who won't back up their claims in writing either shouldn't be making claims, or should be axed from your shopping list fast.

Oh, and that flat array I won't mention? IBM XIV. My recommendation? Don't even touch that thing. It is dangerous, and not ONE of the claims I have heard from sales in the past is true. IBM's own presentations show that XIV can barely beat the DS3400 in random IOPS and can't even match the DS3400's response times. XIV is complete crap for anything that isn't purely sequential read with <40% write and <20% random. If that isn't true, why hasn't IBM backed their claims on XIV with SPC results? DS3400 has 'em, DS5300 has 'em, XIV still doesn't over two years after its introduction.


  1. Phil, Marc Farley here from 3PAR. I just saw this post from last year. Sorry I missed it previously.

    You asked for a show of hands regarding equal performance across all LUNs. The answer is hell no. With a 3PAR array there are many ways to configure the parameters for each and every volume you create that impact performance, such as:

    RAID levels: choose 5, 1, 6 or 0. They all have different performance profiles. BTW, RAID 5 and 6 both use hardware acceleration. You can mix all these RAID types together on the same disks.

    You can use different classes of disk drives: type and speed.

    It's definitely not all about flat, uniform capacity.

    Check out Nate's posts on his 3PAR T400 with SATA drives here:
