Will disabling hyperthreading improve performance on our SQL Server install?

I still feel that testing your specific workload, as per the original answer, is the only way to be sure. It's not an ideal answer when you're trying to tune a production system (so I'd ask whether it's possible to get an identical testbed for systems where both performance and availability really matter), but it's the only one I'm really comfortable with.

We can talk about the theory of whether Hyperthreading should hurt or help in general (on servers I find it more likely to hurt than help, so for a "generic" deployment I'd probably disable it), but there is only one way to see for sure whether it will make a difference in your specific case, and that is to try it and see.


I agree that

  • at best the recommendation is "try HyperThreading on your workload and see what happens". We are doing this right now as I type, and.. it ain't good!
  • you should probably always start with HyperThreading disabled, as that is safest

It looks like we should be tuning two things:

  1. MAXDOP (Maximum Degree of Parallelism). Everything I read indicates that leaving this unbounded is probably a bad idea, and the Microsoft documentation says:

    Setting this option [MAXDOP] to a larger value [than 8] often causes unwanted resource consumption and performance degradation.

    Anything higher than 8 is generally not recommended, so I set it to 4 for now; it was zero (unbounded) initially.

  2. Cost Threshold for Parallelism. The default of 5 is considered quite low according to a few SQL MVP posts I've found -- we can raise it to reduce how often the scheduler even attempts parallelism.
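Both settings live in sp_configure, if it saves anyone a lookup. This is only a sketch: the 4 is the value we set above, and the 25 for cost threshold is an arbitrary starting point, not a recommendation:

```sql
-- Both knobs are advanced options, so expose them first
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;

-- Cap parallelism at 4 workers per query (it was 0 = unbounded)
EXEC sp_configure 'max degree of parallelism', 4;

-- Raise the cost threshold from the default of 5; 25 is just a
-- starting guess to tune against your own workload
EXEC sp_configure 'cost threshold for parallelism', 25;
RECONFIGURE;
```

Both take effect without a restart.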

But honestly these feel like workarounds; I think the true solution for our workload (full-text index heavy) is to disable HT.


Anandtech found that with a pure read load it hurt a little, and with a write-heavy load it was a bit of a win. I've not seen anything to make me think it's going to get you a hit much worse than -5%, or a win much better than 15%. Note that on an Atom it is a huge win, but that is a very odd CPU.

All you changed was the CPU? You went from 12MB of cache and 4 threads, so 3MB of cache per thread, to 8MB of cache and 8 threads, so 1MB per thread. That is oversimplifying, but I bet that is what is killing you: you used to run queries in cache, and now you run them from RAM because they need more than 1MB but less than 3MB. Turning off HT will probably help, but I'd go back to the old CPU. Turn off HT and you get 2MB per thread, but if your workload thrashes with that much, it will not help. It may well be that the old 12MB-cache CPU is hugely faster for your workload.

I would try turning HT off and see if that is an improvement, but I suspect that cache is king for your workload, and you may well need to go back to the 12MB chip.


Hyperthreading is, at best, just a way of abstracting task switching away from the operating system and placing it on-die, with direct access to the L1 and L2 cache, which makes task switching a crapload faster.

Testing with VMware has indicated that disabling HT made no discernible difference under standard load, and gave a 5% increase under heavy load, because ESXi is smart enough to know the difference between the "real" thread and the "fake" thread (there's a lot more to it than that, but that's it in layman's terms). SQL Server 2005 isn't quite that smart, but combined with an up-to-date operating system there should be little advantage to disabling HT.

All that said, I agree with Ronald that it's most likely going to be your L2 cache. A 33% drop in cache size is substantial; when we spec our SQL Servers, we go for cache over raw clock speed every time.


Based on my experience, HT was making I/O operations take forever on one of my active nodes on a Windows 2008 R2 cluster (running SQL Server 2008 R2). An interesting fact was that it was reflected neither in the wait stats nor in the pssdiag I ran for Microsoft support.

The way I noticed the low I/O was simply by watching the OS counters for physical disk. As Sam pointed out, I wrote about it here and here.
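If you'd rather watch it from inside SQL Server than from PerfMon, here is a sketch of the same check against sys.dm_io_virtual_file_stats (the counters are cumulative since instance start, so diff two snapshots taken under load rather than trusting the raw totals):

```sql
-- Average I/O latency per database file since instance startup.
-- NULLIF guards the division when a file has had no reads/writes.
SELECT DB_NAME(vfs.database_id) AS database_name,
       vfs.file_id,
       vfs.num_of_reads,
       vfs.io_stall_read_ms  / NULLIF(vfs.num_of_reads, 0)  AS avg_read_ms,
       vfs.num_of_writes,
       vfs.io_stall_write_ms / NULLIF(vfs.num_of_writes, 0) AS avg_write_ms
FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
ORDER BY avg_read_ms DESC;
```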

If you do NOT experience I/O problems and are CPU bound, I suggest you start this way:

Pinpoint which processes and T-SQL blocks are causing the most CPU utilization. In our experience, after we fixed the problem with I/O (by turning HT off) we identified code that was performing horribly in 2008 R2 and doing fine in 2005. I wrote about it here.

While under high load, run Adam Machanic's sp_whoisactive. You can download it from here. We were experiencing very high CPU utilization because of an excessive number of logical reads (20 million per query) caused by a really bad plan. Our processes were performing anti semi joins against partitioned tables.
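Once sp_whoisactive is installed, a minimal call looks like this; @get_plans is one of its documented parameters and pulls the execution plan for each running statement, which is how you can spot a bad plan like ours:

```sql
-- Snapshot everything currently running, including query plans
EXEC sp_WhoIsActive @get_plans = 1;
```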

My next recommendation is to run Profiler to identify the T-SQL statements that are high in both CPU and logical reads.
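If Profiler is too heavy to leave running, the plan cache gives a rough first cut. This is only a sketch against sys.dm_exec_query_stats, and it misses anything that has been evicted from cache or was never cached:

```sql
-- Top 20 cached statements by total CPU, with logical reads alongside
SELECT TOP (20)
       qs.total_worker_time / 1000 AS total_cpu_ms,
       qs.total_logical_reads,
       qs.execution_count,
       SUBSTRING(st.text, (qs.statement_start_offset / 2) + 1,
                 ((CASE qs.statement_end_offset
                     WHEN -1 THEN DATALENGTH(st.text)
                     ELSE qs.statement_end_offset
                   END - qs.statement_start_offset) / 2) + 1) AS statement_text
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
ORDER BY qs.total_worker_time DESC;
```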

With the steps above we were able to tune the offending processes and go from 85% sustained CPU utilization to almost nil.

Good Luck and please feel free to drop me a line if you find a fix as I would like to add the case to my blog.

Thanks

Oscar