How to size SQL Server resources (RAM, CPU, and others)

It is the standard story: there is a fight between developers and administrators. One side claims that the database design and queries are bad, while the other says it is a lack of hardware for the amount of data.
So I am asking you: is my IBM x3400 with two 2 GHz Xeons, a SCSI RAID 5 array, and 4 GB of RAM adequately sized for an MSSQL database of 53 GB? The main tables hold around 6.5 million records in the document details and 2 million in the document headers, while the others (such as items) are around 100K.

We are constantly suffering from a lack of performance when getting data from SQL. This server is dedicated solely to being the SQL server and acts as a subscriber of replicated data from another SQL server.

The other question is: how do database administrators plan the hardware size for database servers? Is there some standard approach, or is it just experience and sense?


  • 4 GB of RAM is a joke these days. So, no - sorry. I think you are done on that end. It can work, but that requires specific usage patterns. I would not run a db server on 4 GB of hardware out of principle - 16 GB of RAM costs pretty much nothing to start with.
  • SCSI RAID 5 is not optimal. Depending on usage patterns, you should have a minimum of TWO groups - one fast for writes (log), one fast for reads (data). I have had good success using a RAID 10 of 4+ discs for OS and LOG and another one for data. Mind you, though, that db was a lot larger. In your case, throwing out the RAID 5 and just putting in TWO mirrored SSDs will make sense, given that your data is only 53 GB. A mirror of two SSDs will probably boost your IO performance by a factor of 100. You are likely IO bound, not helped by your RAM being - by today's standards - pathetic. Sorry if that sounds rude, but a db server should have MORE RAM than a developer workstation, and depending on what company you are in, you are on par with or WAY below that. (A sketch of relocating the files once you have separate volumes follows this list.)
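If you do end up with separate volumes for log and data, relocating the existing files is straightforward in SQL Server. A minimal sketch, assuming a database called YourDb whose logical file names are YourDb_Data and YourDb_Log - all three names and the drive letters are placeholders, so check yours with sp_helpfile first:

    -- Take the database offline, move the physical files, then point SQL Server at the new paths.
    ALTER DATABASE YourDb SET OFFLINE WITH ROLLBACK IMMEDIATE;
    -- (copy YourDb.mdf to D:\Data\ and YourDb_log.ldf to L:\Log\ at the OS level here)
    ALTER DATABASE YourDb MODIFY FILE (NAME = YourDb_Data, FILENAME = 'D:\Data\YourDb.mdf');
    ALTER DATABASE YourDb MODIFY FILE (NAME = YourDb_Log,  FILENAME = 'L:\Log\YourDb_log.ldf');
    ALTER DATABASE YourDb SET ONLINE;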

Is there some standard approach, or is it just experience and sense?

Experience and sense. You also think forward and check what makes sense over a couple of years. For example, SuperMicro has NICE servers with room for 24-72 discs in a SAS configuration. So you possibly get one of those to avoid using a SAN (more expensive) and fill in discs as you need them. Others get a small server and then run out of options. You can also get some ideas from testing on a normal workstation.

It is the standard story: there is a fight between developers and administrators.

No. It is not.

One side claims that the database design and queries are bad, while the other says it is a lack of hardware for the amount of data.

No again. Db design can be measured pretty objectively. As in: there are certain documented and known approaches (that a LOT of developers are basically totally ignorant of). Ever heard of 5th normal form?

Same with queries. I can actually SEE whether a query is efficiently executed. There is no real grey area here. That said, there may be tradeoffs, but if that goes into a blaming game, then I can be pretty sure there IS something wrong.
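For what it's worth, "seeing" this is not magic - SQL Server will tell you the cost of a statement directly. A minimal sketch (the table and filter are placeholders; substitute one of your own suspect queries and look at the logical reads, the timings, and the actual execution plan):

    -- Report logical reads and CPU/elapsed time for the statements that follow.
    SET STATISTICS IO ON;
    SET STATISTICS TIME ON;

    -- Placeholder query: swap in one of your own suspect statements.
    SELECT TOP (100) *
    FROM dbo.DocumentDetail
    WHERE DocumentHeaderId = 12345;

    SET STATISTICS IO OFF;
    SET STATISTICS TIME OFF;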

Pretty often developers don't know anything beyond "this is a simple select", have no clue how to deal with a database, and then try to throw hardware at the problem. Been there, seen that. Not always, but it is a likely guess.


With RAM sizing it is important to know what your likely "working set" is, and make sure you have enough RAM for that multiplied a couple of times as a minimum. Your normal working set is all the index and data pages that are commonly used and being able to hold it all in RAM with sufficient space to spare for "more exceptional but still not uncommon" queries will reduce disk I/O needed for read operations considerably.
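If you want to see what your working set actually looks like on the running server, the buffer pool contents are a reasonable proxy. A rough sketch using a standard DMV (requires VIEW SERVER STATE permission; figures are MB of 8 KB pages):

    -- Approximate buffer pool usage per database, largest first.
    SELECT DB_NAME(database_id) AS database_name,
           COUNT(*) * 8 / 1024  AS buffered_mb
    FROM sys.dm_os_buffer_descriptors
    GROUP BY database_id
    ORDER BY buffered_mb DESC;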

For instance, a 10 GB database (smaller than yours by a considerable amount, but the theory holds whatever your data size) for one of our client's applications contains about 1 GB of active data pages: indexes and table rows that are likely to be accessed routinely as users go about their normal business. Another ~4 GB is older data that isn't read in most normal sessions, as most views default to showing at most a month or three of data, and the data, indexes and queries are well planned enough that older data pages aren't going to be read unless the user asks to see something further back in time (which is relatively rare in day-to-day use). The final ~5 GB is blobs - documents that the users have attached to records and that are rarely accessed more than a short while after being added, except during an audit.

Even with this size of data, I doubt 2 GB of RAM would be enough to maintain speedy access (the DB I'm talking about lives on a 4 GB server dedicated to running MSSQL; another machine acts as web server and hosts the other services related to the app), so you might need to rethink your RAM size for your data size. RAM is relatively cheap these days, so assuming your server can manage the extra, bumping it up to 8, 12, or even 16 GB may give you a good return on investment performance-wise.
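To do a similar breakdown of your own 53 GB, per-table space usage is easy to pull from the catalog. A sketch (reserved space includes both data and indexes; run it in the database in question):

    -- Reserved space per table (data + indexes), largest first, in MB.
    SELECT OBJECT_SCHEMA_NAME(object_id) + '.' + OBJECT_NAME(object_id) AS table_name,
           SUM(reserved_page_count) * 8 / 1024                          AS reserved_mb
    FROM sys.dm_db_partition_stats
    GROUP BY object_id
    ORDER BY reserved_mb DESC;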

For when disk I/O cannot be avoided, I would move away from RAID5 and RAID6 if your data sees a lot of write activity. Standard RAID10 (or slightly less standard RAID10 arrangements like those offered by the Linux RAID10 driver and some hardware solutions) will perform as well for most read loads, noticeably better for most write loads, and offers similar redundancy (any one drive can fail). If you don't want to jump to four drives, you can try three-drive RAID10 (called RAID1E by most IBM controllers) if supported by your environment.

Also, it is very much worth considering splitting your array into two, as TomTom suggests. For write operations you are likely to find that having the transaction log on one RAID1 array of two drives and the data files on another will perform significantly better than using a single RAID10 for everything - the near-RAID0 bulk-read performance of RAID10 can quickly be killed by the random access nature of database writes (updating data and index pages that may be spread over the filespace, updating the transaction log before committing data to the data files and marking it as committed in the log when done, and so on). Separating log and data file activity over different spindles can significantly boost database performance in many cases.

If you have enough room for the required drives, keeping tempdb (or whatever your RDBMS's equivalent of MS's tempdb is) and such on a third array can make a significant difference too if your queries and stored procedures make heavy use of temporary tables (it can sometimes be surprising in how many circumstances the query planner will consider using tempdb behind your back!). Of course, using SSDs can be an answer to random access performance too - whether you'd get a better price/performance ratio with those or other array arrangements (or those and other array arrangements combined) depends significantly on your DB(s) and typical access patterns.
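To check how hard tempdb is actually being worked before giving it its own spindles, SQL Server exposes its current space usage through a DMV. A rough sketch (figures in MB; snapshot of the current moment only):

    -- What is currently occupying tempdb.
    SELECT SUM(user_object_reserved_page_count)     * 8 / 1024 AS user_objects_mb,     -- temp tables, table variables
           SUM(internal_object_reserved_page_count) * 8 / 1024 AS internal_objects_mb, -- sorts, hash joins, spools
           SUM(version_store_reserved_page_count)   * 8 / 1024 AS version_store_mb
    FROM tempdb.sys.dm_db_file_space_usage;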

One other thing: before investing time, effort and money in rearranging your I/O subsystem, make sure you run some performance metrics at busy times to confirm you don't have any bad procedures that are heavily CPU bound. Sometimes complex procedures (particularly those that use cursors in less than optimal ways) can peg the CPU and memory subsystem for lengthy periods (adding more RAM and better I/O capability will make little difference here) and can often be optimised significantly by rethinking the cursors/loops or managing to remove them altogether. A mix of custom SQL trace logging and Windows Performance Monitor logging (or equivalent monitoring tools for people using a different OS+DB combination) can be a great help in finding where your key bottlenecks really are (memory starvation, I/O performance, less than optimal code, ...) and you shouldn't try to fix a problem until you are relatively certain you are fixing the right problem.
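Pulling the heaviest statements out of the plan cache is a quick way to spot those CPU-pegging procedures before spending money on hardware. A sketch using the query-stats DMVs (the numbers are cumulative since each plan was cached, so treat them as indicative rather than exact):

    -- Top 10 cached statements by total CPU time.
    SELECT TOP (10)
           qs.total_worker_time / 1000 AS total_cpu_ms,
           qs.execution_count,
           qs.total_logical_reads,
           SUBSTRING(st.text, qs.statement_start_offset / 2 + 1,
                     (CASE qs.statement_end_offset
                          WHEN -1 THEN DATALENGTH(st.text)
                          ELSE qs.statement_end_offset
                      END - qs.statement_start_offset) / 2 + 1) AS statement_text
    FROM sys.dm_exec_query_stats AS qs
    CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
    ORDER BY qs.total_worker_time DESC;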


I would imagine that hardware and code are both at fault to at least some degree.

First, you can use SQL Profiler and dynamic management views to prove which queries are slow, and why they are slow. Using these tools is fairly easy at a high level: you can easily see if you're CPU-bound, disk-bound, memory-bound, etc. But understanding complex query plans and optimization isn't something you're going to learn without putting in some time. There is a nifty missing-index and slow-query report in the SQL Performance Dashboard that can help you find some low-hanging fruit.
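If you don't have the dashboard reports installed, the same missing-index information is available straight from the DMVs. A rough sketch (the "estimated benefit" figure is only SQL Server's own guess, so review each suggestion rather than creating indexes blindly):

    -- Missing index suggestions, roughly ranked by estimated benefit.
    SELECT TOP (20)
           mid.statement AS table_name,
           mid.equality_columns,
           mid.inequality_columns,
           mid.included_columns,
           migs.user_seeks,
           migs.avg_total_user_cost * migs.avg_user_impact * migs.user_seeks AS estimated_benefit
    FROM sys.dm_db_missing_index_details AS mid
    JOIN sys.dm_db_missing_index_groups AS mig
      ON mig.index_handle = mid.index_handle
    JOIN sys.dm_db_missing_index_group_stats AS migs
      ON migs.group_handle = mig.index_group_handle
    ORDER BY estimated_benefit DESC;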

On the other hand, that server hardware is positively wimpy by modern standards. My personal workstation at the office is more powerful in every way (including SSD instead of mechanical disks). We were deploying 64-bit database servers with >16 GB of RAM five years ago. Hardware is the cheapest part of any IT operation - certainly cheaper than the dozens to hundreds of man hours you are spending on this issue.

Suggestions:

  1. Use Performance Monitor to figure out where your bottlenecks truly are. Hard numbers, not guesses. There are lots of good MSDN and other articles on which counters to monitor. Usually it's a disk or CPU bottleneck. If you're seeing high disk queue length, you usually need more memory, not faster disks! Buy as much RAM as you can afford. This means you need a 64-bit OS and the 64-bit version of SQL Server. You can essentially keep the whole database in RAM on a 64 GB server. Your Buffer cache hit ratio performance counter should stay above 99% during peak loads (see the counter query after this list).
  2. Only add CPU when you're sure you've indexed the database properly for your most frequent queries. A properly designed and indexed query can save a factor of 1000 (not a typo) in execution time over a poorly written one. Avoid nesting views, which developers love because of code re-use, but which suck for performance because indexes often become useless. Any developer who uses cursors claiming they're faster or "the only way to do it" must be fired immediately.
  3. Consider SSDs instead of mechanical disks if your working set is larger than the largest memory available in a 2-socket server. That's currently 192 GB I think, so your data sizes are nowhere near that yet.
  4. Essentially, do this.
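Regarding point 1, the SQL Server memory counters can also be read from a query window without setting up Performance Monitor. A minimal sketch - the Buffer cache hit ratio has to be divided by its "base" counter to get the percentage, and Page life expectancy is worth watching alongside it:

    -- Raw buffer-manager counters; hit ratio % = ('Buffer cache hit ratio' / 'Buffer cache hit ratio base') * 100.
    SELECT RTRIM(counter_name) AS counter_name,
           cntr_value
    FROM sys.dm_os_performance_counters
    WHERE object_name LIKE '%Buffer Manager%'
      AND counter_name IN ('Buffer cache hit ratio',
                           'Buffer cache hit ratio base',
                           'Page life expectancy');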