SQL Server Transaction Log RAID

We have three SQL Server servers, and each server has a about five or six databases on it. We are in the process of moving these servers to a new SAN and I am working on the best RAID configuration. Currently all of the log files for all of the databases share a RAID array, there is nothing else on this RAID array except for the log files, but all of the databases use this same array for their log files.

I have read that it is best to have log files on separate disks. But in our case I am not sure whether it would be best to have one big array with about 8 drives that all the log files are on. Or would it be better to create four two disk arrays and give some of the larger databases their own dedicated disks for their log files?


Logs

Logs are a mostly sequential access structure. Simplistically, you can view them as a ring buffer of entries that say 'Write this data to this block'. When the DB engine issues a write, it actually writes one of these entries to the log. A log reader process then follows up asynchronously and writes the blocks out to the disk.

Because there is relatively little disk head activity, the logs are relatively quick. However, if the log writing activity is in contention with random access activity on the same disk the contention can significantly affect log writing performance, which affects overall DB performance.

Additionally, writing the logs to a separate disk gives you a measure of redundancy. If you back up the data, log entries since the backup can be rolled forward on a restored database. This means that a disaster has to take out both the log and data volumes to cause data loss.

For these reasons, you should have logs on a separate array (different physical disks) to the data volumes. A single mirrored pair can handle quite a large amount of log traffic if there is no contention, so you probably don't need anything more than that unless you have very large transaction volumes. However, make sure nothing else is generating disk activity on the log volumes.

TempDB

If you have processes that make heavy use of TempDB, you will get a performance benefit from having this on a RAID-10 volume, as this provides better write performance than a RAID-5. (see the notes on write-back caching below). If you are putting your data and inedexes on RAID-5 volumes and have processes that make heavy use of TempDB you may get a benefit from putting TempDB on a separate RAID-10 volume. If you are using RAID-10 on the data volumes as well this won't matter.

Data and Indexes

If you have really large data volumes you may put data on a RAID 5, 6, 50 (Striped RAID-5 blocks) or 60. If your data volues are smaller you might go with a RAID-10. If you have RAID-5 (or similar) on your data volumes you will probably want a separate TempDB volume, otherwise the system will probably work OK with data, indexes and TempDB sharing a single RAID-10 volume.

Write-Back Caching

If the RAID controllers on your SAN support battery back up, you can set up write-back caching on them. This means that the controllers cache writes in RAM and optimise writes back to the disk. Write-back caching can be quite a performance win on RAID-5 volumes, but adds extra failure modes to the system.

If power fails, the battery back-up will retain the cached data for a few days. When you reactivate the SAN, it will write out the cached entries. In theory this is fairly reliable and probably will work. However, there are plenty of anecdotal stories about write-back caches failing.

If you can activate write-back caching on a per-volume basis, consider having it inactive on the logs. That way, a cache failure can be recovered without data corruption by restoring the database and rolling forward the logs, as described above. Benchmark your system to make sure the performance is satisfactory.

Conclusion

Separate log and data volumes. If you are using RAID 4, 5, 6, 40, 50 or 60 and have heavy activity in TempDB, consider putting TempDB on a separate RAID-10 volume. Unless you have extremely large data volumes, you can share data, indexes and TempDB on a RAID-10 volume. Log volumes should not share disks with sources of heavy disk activity.


The transaction Log array must be at least RAID 1 (mirrored), and 1+0 is recommended(mirrored + striped) which requires a minimum of 4 physical drives.

RAID 5 should NEVER be used for Transaction Logs, only for the data files due to the nature of the sequential write pattern for Transaction Logs, and for performance you should always seperate the Log and Data files onto different physical arrays.


I would only split it into 2 arrays, otherwise you are wasting to many disks. If you have one array with 8 drives, you should have 7 drives capacity. However if you have 4 arrays, you have only 4 drives capacity.

Another thing to think about is that the Read and Write performance of RAID 1 over RAID 5. In theory RAID 1 as 1x write, 2x read and 2x seek, while RAID 5 has (N-1)x write, Nx read and 1x seek. That's in theory, in practice your read/write/cache patterns and the controller make a difference.