Wednesday, August 19, 2009

SSD in Storage Arrays

The latest buzzword in the storage world is SSD. What is an SSD, and how can it help storage systems? Solid-state storage is built from flash memory (NAND flash, to be precise). The beauty of this is that there are no moving parts as in disk drives, so it performs better than regular disks: access times are faster, application performance goes through the roof, and storage is no longer the bottleneck. Hold on to that thought while we discuss the use of SSDs in storage arrays.

At the heart of any storage device is data integrity, which should never be compromised for a performance gain. Let us take a deep dive into this new technology and see what it can do, where it helps, and where it does not.

As mentioned earlier, SSDs have no moving parts, and that is a great thing. The downside is that, compared to disk drives, they have a shorter life span, a higher bit error rate, and lower capacity (and they are expensive, too). SSDs can also suffer data retention problems at higher temperatures. These shortcomings can be addressed with a controller that adds reliability to the SSD.

SSDs come in a disk-drive package with either a SATA or a Fibre Channel interface; there are also PCI-card SSD solutions that can be used as high-performing local storage. Media performance (IOPS) is great compared to HDDs, and the lower power consumption per IOP makes a strong ROI point. When vendors advertise their SSD solutions, look for both read and write IOPS numbers. Reads are always fast on SSD, but writes carry a heavy penalty, so the write IOPS numbers will always be lower. Unlike on an HDD, a write is not simply a write to a block; the controller has to erase, transfer, and program, hence the imbalance between read and write IOPS. Also be wary of who made the SSD, as some cannot do the erase-modify-write cycle, i.e., they are write-once only.
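To get a feel for why write IOPS trail read IOPS, here is a rough back-of-the-envelope sketch in Python. The latency figures, pages-per-block count, and write-amplification factor are assumptions for illustration only, not numbers from any specific vendor or drive.

```python
# Rough sketch: why SSD write IOPS lag read IOPS.
# All figures below are illustrative assumptions, not vendor specs.

READ_LATENCY_US = 50       # assumed NAND page read + transfer time
PROGRAM_LATENCY_US = 250   # assumed page program (write) time
ERASE_LATENCY_US = 1500    # assumed block erase time
PAGES_PER_BLOCK = 64       # assumed pages per erase block

def read_iops():
    # A read is just a page read.
    return 1_000_000 / READ_LATENCY_US

def write_iops(write_amplification=2.0):
    # A host write can trigger erase plus re-programming of whole blocks.
    # Amortize the erase cost over the pages in a block and scale by
    # write amplification (extra internal copying done by the controller).
    erase_per_page = ERASE_LATENCY_US / PAGES_PER_BLOCK
    cost_us = (PROGRAM_LATENCY_US + erase_per_page) * write_amplification
    return 1_000_000 / cost_us

print(f"read IOPS  ~ {read_iops():,.0f}")
print(f"write IOPS ~ {write_iops():,.0f}")
```

With these made-up numbers the drive reads at about 20,000 IOPS but writes at under 2,000, which is the kind of asymmetry you should expect to see on vendor spec sheets.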

Performance drivers for SSDs include the number of NAND flash chips (also called dies), the number of buses, data protection (ECC, Data Integrity Field (DIF), RAID, etc.), whether the flash is single-level cell (SLC) or multi-level cell (MLC), the effective block size, and many other factors.

Of these, the factor with the biggest impact is whether the flash is SLC or MLC, so let us take a quick look at what these mean. SLC and MLC flash memory are designed in a similar way, except that MLC devices cost less and can offer more storage capacity. SLC fares better on write (erase) performance and reliability, even at higher temperatures. Because of the nature of MLC, high-capacity flash memory cards are available to consumers at low prices (the ones in your cell phone, PDA, camera, etc.). Where high performance is required, SLC is used (and is therefore expensive); SLC is also a good fit for embedded systems.

To build high-capacity flash (SSDs) for storage systems, MLC is used. Since MLC is less reliable, a RAID mechanism has to be implemented internally to handle the bit error rate (BER), and roughly half the SSD's capacity is dedicated to failed or dying cells in order to preserve data integrity. The average life of an SSD in a storage array with a 75/25 read/write ratio is about five years, and since the drives will almost certainly need replacing at that point, the replacement cost is factored into the selling price as well.
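To see where a figure like a five-year service life might come from, here is a hypothetical endurance calculation. Every constant in it (usable capacity, over-provisioning ratio, MLC erase-cycle endurance, write amplification, I/O size) is an assumption chosen only to illustrate the arithmetic, not a spec for any real drive.

```python
# Hypothetical SSD lifetime estimate under a 75/25 read/write mix.
# Every constant here is an assumption for illustration only.

USABLE_CAPACITY_GB = 128     # capacity exposed to the host
OVERPROVISION = 0.5          # half the raw flash reserved for worn/failed cells
ERASE_CYCLES = 5_000         # assumed MLC program/erase endurance per cell
WRITE_AMPLIFICATION = 2.0    # assumed controller overhead

TOTAL_IOPS = 2_000           # sustained load hitting this drive
WRITE_FRACTION = 0.25        # 75/25 read/write ratio
IO_SIZE_KB = 8               # assumed average I/O size

raw_capacity_gb = USABLE_CAPACITY_GB / (1 - OVERPROVISION)
total_write_budget_gb = raw_capacity_gb * ERASE_CYCLES / WRITE_AMPLIFICATION

write_rate_gb_per_s = TOTAL_IOPS * WRITE_FRACTION * IO_SIZE_KB / (1024 * 1024)
lifetime_seconds = total_write_budget_gb / write_rate_gb_per_s
lifetime_years = lifetime_seconds / (365 * 24 * 3600)

print(f"estimated service life: {lifetime_years:.1f} years")
```

With these assumed numbers the estimate lands a little over five years; real drives will differ, but the shape of the calculation is the same: flash endurance divided by sustained write rate.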

Finally, let us see if we really need SSDs in our storage arrays. Storage requirements come in two forms: IOPS and capacity. Most applications are driven by capacity, some by high I/O (OLTP), and a few need both high I/O and large capacity (again, OLTP, email, etc.). Naturally, when you have a huge capacity requirement you will also have a large spindle count, which can take care of your I/O requirement. If proper capacity planning (for both I/O and size) is not done, and you keep adding new applications to your existing pool of disks, you will certainly run into performance issues.

Let us take a use case with a high I/O requirement: 2,000 IOPS for a 1 TB database. Done the HDD way, we need roughly 8 disks for capacity (300 GB drives, including RAID parity disks in a 6+2 or 7+1 layout), but that count will not satisfy the I/O requirement, so you will need to double or triple it (~150 IOPS per 15K drive). If the requirement is 2,000 sustained random-read IOPS, the number of HDDs needed is higher still, maybe five times as many. So for 1 TB of data you are now using 40 disks (about 12 TB raw). It will be hard to explain to management why you cannot use the rest of that capacity for other projects (good luck with that).
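Here is the spindle math behind that example, using the same figures as above (300 GB drives, ~150 IOPS per 15K spindle, 6+2 RAID groups). The derating factor for sustained random reads is my own assumption, chosen to mirror the "maybe five times" figure; adjust it to your own workload.

```python
import math

# Spindle sizing for the example above: a 1 TB database needing
# 2,000 sustained random-read IOPS, built from 6+2 RAID groups of
# 300 GB 15K drives rated at roughly 150 IOPS each.

DB_SIZE_GB = 1024
REQUIRED_IOPS = 2000
DRIVE_SIZE_GB = 300
DRIVE_IOPS = 150
DATA_DRIVES, PARITY_DRIVES = 6, 2   # 6+2 RAID group
SUSTAINED_RANDOM_DERATE = 0.4       # assumed fraction of rated IOPS delivered
                                    # under sustained random load

group_size = DATA_DRIVES + PARITY_DRIVES
group_capacity_gb = DATA_DRIVES * DRIVE_SIZE_GB
group_iops = group_size * DRIVE_IOPS * SUSTAINED_RANDOM_DERATE

groups_for_capacity = math.ceil(DB_SIZE_GB / group_capacity_gb)
groups_for_iops = math.ceil(REQUIRED_IOPS / group_iops)
groups = max(groups_for_capacity, groups_for_iops)

drives = groups * group_size
print(f"capacity alone: {groups_for_capacity * group_size} drives; "
      f"sustained random IOPS: {groups_for_iops * group_size} drives; "
      f"provision {drives}")
print(f"raw capacity deployed: {drives * DRIVE_SIZE_GB / 1024:.1f} TB")
```

Run it and you get 8 drives for capacity but 40 drives once the sustained random-read load is derated, with close to 12 TB of raw capacity deployed to serve a 1 TB database.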

Done the SSD way, you need roughly 6 SSDs (RAID 5, 4+2P / 5+1P, etc.) from a capacity standpoint, and you are way over the IOPS requirement. This is economical only if those 6 SSDs cost less than the 40 HDDs. In today's storage arrays, where most licensing is charged on installed raw TB, 40 x 300 GB means paying more for licenses on that array than you would with SSDs. It all depends on the price per requirement.
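A comparison like that is easy to script once you have real quotes. In the sketch below every price (drive cost, per-raw-TB license fee) and the SSD capacity are placeholder assumptions; the point is only the shape of the calculation, drive cost plus licensing on raw TB.

```python
# Hypothetical price comparison of the two builds above.
# All prices and the SSD capacity are placeholder assumptions.

HDD_COUNT, HDD_SIZE_GB, HDD_UNIT_PRICE = 40, 300, 400
SSD_COUNT, SSD_SIZE_GB, SSD_UNIT_PRICE = 6, 200, 3000
LICENSE_PER_RAW_TB = 1500   # assumed array licensing charged on raw TB

def solution_cost(count, size_gb, unit_price):
    raw_tb = count * size_gb / 1024
    total = count * unit_price + raw_tb * LICENSE_PER_RAW_TB
    return total, raw_tb

hdd_cost, hdd_tb = solution_cost(HDD_COUNT, HDD_SIZE_GB, HDD_UNIT_PRICE)
ssd_cost, ssd_tb = solution_cost(SSD_COUNT, SSD_SIZE_GB, SSD_UNIT_PRICE)

print(f"HDD build: {hdd_tb:.1f} TB raw, total ${hdd_cost:,.0f}")
print(f"SSD build: {ssd_tb:.1f} TB raw, total ${ssd_cost:,.0f}")
```

With these made-up numbers the SSD build comes out cheaper once licensing on 12 TB of raw HDD capacity is counted; with different quotes it can easily go the other way, which is exactly the price-per-requirement point.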

Not many applications generate the kind of load that truly needs SSDs. If there were a tiered storage array that could assign Tier-0 (SSD) blocks to highly active data and then move them off to Tier-1 (15K HDD) or Tier-2 (10K HDD), that would be an ideal world for storage engineers: the ability to dynamically build solutions with a very low price per I/O or price per GB.
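As a toy illustration of that idea, here is the simplest possible block-placement policy: bucket blocks by how often they were touched in the last sampling window. The thresholds, tier names, and block IDs are invented for illustration; a real array would use far more sophisticated heat tracking and migration scheduling.

```python
# Toy sketch of block-level tiering: hot blocks to Tier-0 (SSD),
# warm blocks to Tier-1 (15K HDD), cold blocks to Tier-2 (10K HDD).
# Thresholds and sample data are invented for illustration.

HOT_THRESHOLD = 1000    # accesses per sampling window
WARM_THRESHOLD = 100

def choose_tier(accesses_in_window: int) -> str:
    if accesses_in_window >= HOT_THRESHOLD:
        return "Tier-0 (SSD)"
    if accesses_in_window >= WARM_THRESHOLD:
        return "Tier-1 (15K HDD)"
    return "Tier-2 (10K HDD)"

# Example: block id -> access count collected over the last window.
access_counts = {"blk-001": 4200, "blk-002": 350, "blk-003": 12}
for block, count in access_counts.items():
    print(f"{block}: {count:>5} accesses -> {choose_tier(count)}")
```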
