Scaling databases with cheap SSD hard drives

Potential Issues

I have a couple of concerns about using SSDs for production databases at the present time:

  • The majority of database transactions on the majority of websites are reads, not writes. As Dave Markle said, you maximize this performance with RAM first.
  • SSDs are new to the mainstream and enterprise markets, and no admin worth his salt is going to move a production database that currently requires 15K RPM U320 disks in RAID 5 communicating via Fibre Channel to an unproven technology.
  • The cost of researching and testing a move to this new technology, vetting it in their environment, updating the operating procedures, and so forth is a larger up-front cost, both in terms of time and money, than most shops have to spare.

Proposed Benefits

That said, there are a number of items, at least on paper, in favor of SSDs in the future:

  • Lower power consumption compared to an HDD
  • Much lower heat generation
  • Higher performance per watt compared to an HDD
  • Much higher throughput
  • Much lower latency
  • Most current-generation SSDs have on the order of millions of cycles of write endurance, so write endurance is not the issue it once was. See a somewhat dated article here.

So for a given performance benchmark, when you factor in total cost of ownership, including direct power and indirect cooling costs, SSDs could become very attractive. Additionally, depending on the particulars of your environment, needing fewer devices for a given level of performance could also reduce staffing requirements and therefore labor costs.

Cost and Performance

You've added that you have a cost constraint of under $50K USD and you really want to keep it under $10K. You've also stated in a comment that you can get some "cheap" SSDs, implying that the SSDs will be cheaper than the DBAs or consultants. This may be true depending on the number of hours you would need a DBA and whether it is a recurring cost or not. I can't do the cost analysis for you.

However, one thing you must be very careful about is the kind of SSD you get. Not all SSDs are created equal. By and large, the "cheap" SSDs you see for sale in the $200-400 range (as of 2008/11/20) are intended for low power/heat environments like laptops. These drives actually have lower performance levels compared to a 10K or 15K RPM HDD - especially for writes. The enterprise-level drives that have the killer performance you speak of - like the Mtron Pro series - are quite expensive. Currently they are around:

  • 400 USD for 16GB
  • 900 USD for 32GB
  • 1400 USD for 64GB
  • 3200 USD for 128GB

Depending on your space, performance, and redundancy requirements, you could easily blow your budget.

For example, if your requirements necessitated a total of 128GB of available storage, then either RAID 0+1/10 or RAID 5 with 1 hot spare would be ~$5600.

If you needed a TB of available storage, however, then RAID 0+1/10 would be ~$51K and RAID 5 with 2 hot spares would be ~$32K.
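
The drive counts behind these figures can be sketched as below. This is a back-of-the-envelope calculation, not a sizing tool. It assumes 64GB drives for the 128GB case and 128GB drives for the 1TB case (the combinations that come closest to the quoted totals), 1TB = 1024GB, and one drive's worth of parity for RAID 5. The function names are hypothetical.

```python
import math

# Per-drive prices quoted above for enterprise SSDs (USD, late 2008).
PRICE_USD = {16: 400, 32: 900, 64: 1400, 128: 3200}


def data_drives(usable_gb, drive_gb):
    """Number of drives needed to hold usable_gb with no redundancy."""
    return math.ceil(usable_gb / drive_gb)


def raid10_drives(usable_gb, drive_gb):
    """RAID 10 mirrors every data drive, so the drive count doubles."""
    return 2 * data_drives(usable_gb, drive_gb)


def raid5_drives(usable_gb, drive_gb, hot_spares):
    """RAID 5 adds one drive's worth of parity, plus any hot spares."""
    return data_drives(usable_gb, drive_gb) + 1 + hot_spares


def cost_usd(drive_count, drive_gb):
    """Total hardware cost for drive_count drives of the given size."""
    return drive_count * PRICE_USD[drive_gb]


if __name__ == "__main__":
    # 128GB usable, assumed to be built from 64GB drives.
    print(cost_usd(raid10_drives(128, 64), 64))
    print(cost_usd(raid5_drives(128, 64, hot_spares=1), 64))
    # 1TB usable, assumed to be built from 128GB drives.
    print(cost_usd(raid10_drives(1024, 128), 128))
    print(cost_usd(raid5_drives(1024, 128, hot_spares=2), 128))
```

Under these assumptions both 128GB configurations come to 4 drives ($5,600), and 1TB in RAID 10 comes to 16 drives ($51,200). The 1TB RAID 5 layout comes to 11 drives ($35,200), a little above the ~$32K quoted, so that figure presumably counts drives slightly differently.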

Big Picture

That said, the installation, configuration, and maintenance of a large production database requires a highly skilled individual. The data within the DB and the services provided from that data are of extremely high value to companies with this level of performance requirements. Additionally, there are many things that just cannot be solved by throwing hardware at the problem. An improperly configured DBMS, a poor database schema, or a poor indexing strategy can /wreck/ a DB's performance. Just look at the issues Stack Overflow experienced in their migration to SQL Server 2008 here and here. The fact of the matter is, a database is a strenuous application not only on disk but on RAM and CPU as well. Balancing these competing performance factors along with data integrity, security, redundancy, and backup is tricky.

In summary, while I do think any and all improvements to both the hardware and software technology are welcomed by the community, large scale database administration - like software development - is a hard problem and will continue to require skilled workers. A given improvement may not yield the labor cost reductions you or a company might hope for.

A good jumping-off point for some research might be Brent Ozar's website/blog here. You might recognize his name - he's the one who has assisted the Stack Overflow crew with their MS SQL Server 2008 performance issues. His blog and the resources he links to offer quite a bit of breadth and depth.

Update

Stack Overflow themselves are going the consumer SSD-based route for their storage. Read about it here: http://blog.serverfault.com/post/our-storage-decision/

References

  • http://www.storagesearch.com/ssdmyths-endurance.html
  • http://www.brentozar.com/archive/2008/10/sql-2008-upgrade-tuning-for-stackoverflowcom/
  • http://blog.stackoverflow.com/2008/11/sql-2008-full-text-search-problems/

If you have a really, really high-traffic site which can benefit from an SSD for increased write performance, you will probably have an issue with the lifetime of the SSD, so I'm not sold on them yet for that.

With that in mind, what should you do with databases which have high levels of reads? The answer is simple: jam the server with as much RAM as you can stomach. You'll find that the hottest tables are almost always kept in RAM cache anyway, and any large hit to disk will probably be due to a big table or index scan, which can often be optimized away with proper indexing.
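
As a rough illustration of why RAM comes first for read-heavy workloads, the sketch below computes an average read latency from a cache hit ratio. The function name and every latency figure are illustrative assumptions (order-of-magnitude guesses), not measurements from any particular system.

```python
def effective_read_latency_ms(hit_ratio, ram_ms, disk_ms):
    """Average read latency when hit_ratio of reads are served from RAM cache."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return hit_ratio * ram_ms + (1.0 - hit_ratio) * disk_ms


# Illustrative assumptions only, not benchmarks.
RAM_MS = 0.0001  # assumed cost of a read served from memory
HDD_MS = 5.0     # assumed random read on a fast spinning disk
SSD_MS = 0.1     # assumed random read on a fast SSD

if __name__ == "__main__":
    for hit_ratio in (0.90, 0.99, 0.999):
        hdd = effective_read_latency_ms(hit_ratio, RAM_MS, HDD_MS)
        ssd = effective_read_latency_ms(hit_ratio, RAM_MS, SSD_MS)
        print(f"hit ratio {hit_ratio:.1%}: HDD {hdd:.4f} ms, SSD {ssd:.4f} ms")
```

With these made-up numbers, the gap between the two disk types shrinks as the hit ratio rises, because fewer reads ever reach the disk at all.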


I've worked as a DBA for 5+ years, and thinking about ways to improve DB performance is always at the back of my mind. I've been watching the SSD space and I think that they are definitely becoming more and more of a viable option.

Check this out:

http://i.gizmodo.com/5166798/24-solid-state-drives-open-all-of-microsoft-office-in-5-seconds

There is also a new product from Acard called the ANS-9010, an improved version of the GC-RAMDISK, which lets you build a SATA drive (up to 64GB) out of DDR2 sticks, with a 400MB/s theoretical maximum.

http://techreport.com/articles.x/16255/3

The other useful thing about that article is that it compares the ANS-9010 against all the players on the SSD market, and it turns out that Intel has a 64GB X25-E SSD that's pretty much comparable to having a hardware RAM disk.

The thing that would worry me about SSDs is wearing them out with all the stress that a large DB would put them through, so you'd have to use RAID to mirror the drives, which means that you're paying twice as much.

And the downside of the hardware RAM disk is that, in the case of a power cut, the battery only powers it for so long, so you'd have to work out some fancy way to back it up. I believe that you can also purchase a mains plug for them, but then that still relies on your UPS.

I suggest that you use the hardware RAM disk for the temp DB and the Windows swap file - and put the database on the Intel X25-E Extreme (approx 600 USD for 64GB).

Anyway it would scream and make all the rest of us very jealous.

(Also consider using another ANS-9010 for hosting the website)

Cheers, Dave


We just put together a Windows Server 2003 R2 64-bit SQL Server 2008 box on a mirrored pair of 2.5in Seagate Momentus XT hybrid drives - 1/4 stroke for the OS, and 1/4 stroke for the DB. So we're using 125GB for the OS and 125GB for the DB. We're getting 1500MB/s to 1900MB/s sequential reads, on an Intel i7 2600K 3.4GHz with 8GB of RAM.