Current wisdom on SQL Server and Hyperthreading?
Lots of articles out there (see Slava Oks's original SQL 2000 article and Kevin Kline's SQL 2005 update) recommend disabling hyperthreading on SQL servers, or at least testing your specific workload before enabling it on your servers.
This issue is gradually becoming less relevant as true multi-core processors replace hyperthreaded ones, but what's the current wisdom on this issue? Does this advice change any with SQL 2005 64-bit, or SQL 2008, or Windows Server 2008?
Ideally, this should be tested in advance in a staging environment, but what about for servers that have already made it into production with HT enabled? How can I tell if performance issues we're experiencing might be related to HT? Is there some specific combination of perfmon counters that might point me in that direction, as opposed to all the other things I normally pursue when working on improving SQL performance?
Edit: This is especially attractive because of the potential for an across the board improvement for some of my high-cpu servers, but the client is going to want to see something concrete that helps me identify which servers really could benefit from disabling hyperthreading. Of course, conventional performance troubleshooting is ongoing, but sometimes any little bit helps.
Solution 1:
In the SQLOS a scheduler is created for each logical processor that SQL Server sees. With hyperthreading enabled this equates to double the schedulers. One of the purposes for the SQLOS is to minimize and prevent context switching from occuring which is why only one scheduler is created for each logical processor. Once the SQLOS creates the schedulers, the total number of workers is divided amongst the schedulers. The SQLOS implements a form of cooperative scheduling in that the workers yield the scheduler as it requires unavailable resources, or reaches its execution quantum allowing other workers to execute on the scheduler. This keeps context switching to a minimum since the scheduler is doing the work and they are bound one for one.
Understanding that background, hyperthreading somewhat works opposite of how SQLOS is specifically designed to function. Specifically, parallelism can be problematic with hyperthreading and can result in high CXPACKET waits since SQLOS may try to run a query at DOP 8 on what is reality a DOP 4 system. If your CPU utilization is low, you might not notice, but the higher your CPU utilization goes the more problematic it could become. I recently had a discussion on twitter regarding this, and the consensus was "It Depends" as to whether or not it would help or hurt.
If you have a lot of signal waits on your server, but you have low CPU utilization, you may see benefit enabling hyperthreading which will double your internal schedulers and spread the workers out more which means they won't wait to execute in the runnable queue as long. However if your workload utilized parallelism heavily you get heavy CXPACKET waits in sys.dm_os_wait_stats, you might look at disabling hyperthreading to see if it reduces your waits.
Solution 2:
Actually hyperthreading hasn't gone away, Intel's new Nehalem quad-core chips have hyperthreading. As for whether it's recommended or not, "it depends" as always, each situation is likely different.
It's unlikely HT is the root cause of any performance problem, but without more details, it's difficult to say what is. Do some perfmon sessions, with fairly long sample rates, like once every 30 - 60 seconds, and getting cpu, disk queue length on the data and log disks, page life expectancy is a good measure of whether you have enough memory. If you're consistenly over 80% CPU, or you disk queues are up in the triple digits, you found your problem.
Solution 3:
SQL Server 2000 on an HP DL360 G3:
I ran a report six times and I threw the initial run away and recorded the last five. I then rebooted the database again and turned Hyperthreading back on. I ran the report six more times, retaining the data from the last five runs.
Observations:
- Turning off Hyperthreading does not cost 2x in apparent CPU usage, only about 1.5x.
- Turning off Hyperthreading makes a random long running report run 5.2% faster (on average).
Repeated runs of the report:
With Hyperthreading: 20.95%, 21.1%, 21.9%, 20.8%, 20.5% Run times in ms: 20282, 21141, 20188, 22297, 25282. Average 21838 Without Hyperthreading: 29.4%, 28.2%, 29.1%, 28.2%, 27.1% Run times in ms: 20125, 20156, 19937, 21656, 21656. Average 20706