Best practice disk usage for SQL Server

I've been reading that it is best practice to place the following on separate disks for performance and safety:

  • tempdb
  • database file
  • log file
  • backup files

When you consider RAID using multiple disks this becomes an enormous amount of physical disks being used.

I understand log files benefiting from being on their own physical disk due to their sequential write pattern but can the other files be separated onto "logical" disks instead of physical ones?


Solution 1:

When choosing a solution first you should know the maximum workload that your DB should be ready for.

If you have a product that you are installing, that product will have a recommendation for the setup for different workloads.

If the product is designed internally, then benchmark it, create performance limits and cost. Deploy your application gradually. Remember the GMail invites? They were created to test their application in the wild, but with a controlled number of users.

Separating the data on partitions is almost useless. The only benefit is that the fragmentation will be a little bit more localized, but the DB requests are mostly random access.

Do NOT use RADI5. Use RAID1 or RAID10. May be RAIDZ (ZFS on a iSCSI SAN) if the data integrity is critical.

Solution 2:

...can the other files be separated onto "logical" disks instead of physical ones?

You can put them on different logical disks which are on the same physical disks. That won't give you all the advantages of using separate physical disks, but if you do later decide that you need more physical disks, you can more easily move the logical disks to the new physical disks.

Solution 3:

You're looking at a three-way tradeoff.

  • Speed
  • Risk
  • Cost

To get the fastest speed and the lowest risk will usually cost you the most. In my experience, most CEOs and CFOs want to maximize speed and minimize cost. It takes a while to get them to appreciate the cost of accepting the risks that go along with maximizing speed and minimizing cost.

On the other hand, DBAs tend to want to minimize risk.

If you store the database and its backups on the same server, then whatever might destroy the server might also destroy your backups. That's usually not an acceptable risk, but it's application-dependent. It takes a fairly big event to destroy a server; I don't think I've ever seen one destroyed.

If you store a database and its backups on the same physical disk, then whatever might destroy the disk might destroy both the database and your backups. I've seen disks destroyed quite a few times. This is almost always an unacceptable risk.

Some speed issues can be mitigated with solid-state disks, but these come with different costs and with some new risks.

Personally, I wouldn't like to keep backups on the same server as the database, but I might be willing to write the backups to the same server (anticipating high speed and low cost), and copy or move them somewhere else later.

Solution 4:

There are two main reasons to distribute database files across multiple disks: performance and availability.

Availability boils down to this: you just don't want to lose data files, log files and backup files at the same time; losing one of them is acceptable, losing two of them is a big mess, losing all of them is a complete disaster. Thus, you put them on different disks in order to minimize this risk (and you really should have backups on another server or external storage; sometimes a server can and will be completely destroyed). For this to make sense at all, the key is to use different physical disks: having multiple logical volumes on the same physical storage will not help you at all when that physical storage dies.

Performance is a bit more tricky, and for big databases will also require more than two or three places where to store files; but this depends strongly on the database design and workload. However, in all possible scenarios, this still makes sense only if different physical disks are involved: what use is dividing two disk operations... if they end up being performed by the same disks? The disks will still have to handle concurrent requests with different access patterns and acting on different sectors, just like they would have to do if the files were in the same volume.