Performance-impact of Hyper-Threading
I just read an article on Heise Online (look at the table; the rest is in German) which claimed that Hyper-Threading slows down single-threaded programs even though they don't use the second thread of a core. That is, if you disable HT in the BIOS, a single-threaded app runs slightly faster.
Is this true, or is it a measurement error? Does anyone have sources for benchmarks that show the same thing?
Solution 1:
It is likely not a measurement error. In fact, this is a long-running debate about game performance, since games are usually designed to get the most out of single-core performance. According to this article from Intel, Hyper-Threading is described as follows:
Hyper-Threading Technology from Intel allows one physical processor package to be perceived as two separate logical processors within the operating system. Processor resources enabled for Hyper-Threading Technology duplicate, tag, or share the majority of resources. Sharing resources allows a more efficient use of the processor for a significant performance increase, at less than 5% die size and power consumption increase compared to a single processor package. However, Hyper-Threading Technology cannot have performance expectations equivalent to that of multiprocessing where all the processor resources are replicated.
In the table that you have shown, Cinebench tests a single core of the processor. In short, HT (Hyper-Threading) exposes two virtual (logical) cores for each physical core, including the one being evaluated in the test. If the test launches a single process that cannot be split up, sharing the core's resources between two logical cores degrades the result. With HT disabled, that sharing does not take place, because Windows and Cinebench see only a single processor for that core.
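The split between logical and physical cores described above can be inspected on Linux. The following is a minimal sketch, assuming the sysfs files `/sys/devices/system/cpu/cpuN/topology/thread_siblings_list` are available; the helper names (`parse_cpu_list`, `physical_core_groups`, `read_sibling_lists`) are made up for illustration. On other operating systems the sketch simply reports the physical count as unknown.

```python
import glob
import os


def parse_cpu_list(text):
    """Parse a Linux CPU list such as "0,4" or "0-1,8-9" into a set of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus


def physical_core_groups(sibling_lists):
    """Group logical CPUs that share one physical core.

    sibling_lists holds one string per logical CPU, in the format used by
    thread_siblings_list. Logical CPUs on the same core report the same list,
    so the number of distinct lists is the number of physical cores.
    """
    return {frozenset(parse_cpu_list(s)) for s in sibling_lists}


def read_sibling_lists():
    """Read the sibling lists from sysfs (returns [] if they do not exist)."""
    pattern = "/sys/devices/system/cpu/cpu[0-9]*/topology/thread_siblings_list"
    lists = []
    for path in glob.glob(pattern):
        with open(path) as f:
            lists.append(f.read())
    return lists


if __name__ == "__main__":
    groups = physical_core_groups(read_sibling_lists())
    print("logical CPUs:", os.cpu_count())
    print("physical cores (from sysfs):", len(groups) or "unknown")
```

With HT enabled, one would expect each group to contain two logical CPUs; with HT disabled, each group should contain one.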
If we add another test from Tom's Hardware to compare it with the table you have shown (Cinebench R11.5):
And multi-threaded:
The single-thread results are not very different from the ones in the table on your page. It is important to note that the two logical processors have separate execution states but share resources such as the system bus and the cache, so they cannot always run tasks in parallel. Sometimes thread stalling occurs, as mentioned in this article: in a single-thread stress test, resource sharing can leave some threads queued, which gives a slightly worse result.
You can also see different scenarios in different games in the overclock.net article, whose results claim that performance is hurt in some cases. I do not believe this should be read as "disabling HT improves single-thread performance" but rather as "the game is optimized for a maximum of 4 cores" or "the game is not taking advantage of HT". The first interpretation can be checked against articles like this one, which shows that an i3 performs better with HT enabled, whereas an i7 does not.
To sum up, there are a few cases where disabling Hyper-Threading gives a minimal improvement in single-thread performance, but the overall cost-benefit ratio is not enough to recommend disabling it. As long as the OS and the software are designed for the HT architecture, it is not worth disabling.
Solution 2:
Yes, and it should be obvious. When you enable HT, you advertise twice as many cores as there actually are.
This is designed to let more parallelization happen, on the basis that most programs are not sufficiently multi-threaded. However, if you fully multi-thread a program, you overcommit resources and there is a performance drop simply because of the extra overhead per thread. However small this may be, with an application that managed to use 100% CPU over any number of cores and processors, enabling HT resulted in a roughly 2-3% drop in performance.
Now, in the case of an isolated single-threaded program, it sounds like it should not matter, since the program itself cannot overuse resources. But remember that the OS also thinks there are extra cores and can overcommit resources. Even if there are still unused cores, one can measure overhead caused by the scheduler, which does not optimally place the thread and lock it to a single real core.
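One way to take the scheduler's placement out of the picture is to pin the process to a single logical CPU yourself and time the workload. The following is a rough sketch, not a rigorous benchmark: `busy_work` and `time_pinned` are hypothetical names, the loop is only a stand-in for a real single-threaded program, and `os.sched_setaffinity` is available only on some platforms (notably Linux), so the sketch falls back to running unpinned elsewhere.

```python
import os
import time


def busy_work(n=200_000):
    """A small CPU-bound loop used as a stand-in single-threaded workload."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def time_pinned(cpu=None, n=200_000):
    """Time busy_work, optionally pinned to one logical CPU.

    If cpu is None, or the platform has no os.sched_setaffinity, the
    workload runs wherever the scheduler puts it.
    """
    can_pin = cpu is not None and hasattr(os, "sched_setaffinity")
    original = os.sched_getaffinity(0) if can_pin else None
    try:
        if can_pin:
            os.sched_setaffinity(0, {cpu})  # 0 means the calling process
        start = time.perf_counter()
        busy_work(n)
        return time.perf_counter() - start
    finally:
        if can_pin:
            os.sched_setaffinity(0, original)  # restore the previous mask


if __name__ == "__main__":
    if hasattr(os, "sched_getaffinity"):
        cpu = min(os.sched_getaffinity(0))  # any CPU we are allowed to use
    else:
        cpu = None
    print("unpinned:", time_pinned(None))
    print("pinned:  ", time_pinned(cpu))
```

Comparing pinned and unpinned timings over many repetitions, with HT enabled and then disabled, is one way to see how much of the difference comes from thread placement rather than from resource sharing inside the core.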
These observations are based on over a decade of real-time software development and benchmarks. There is clearly an observable difference, although a very small one, when one tries to maximize the performance of a system.
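Since the difference is small, it helps to check that it is larger than the run-to-run noise before drawing conclusions. Below is a minimal sketch of such a check. The function names are hypothetical, the "gap larger than the standard deviation" rule is a crude heuristic rather than a proper statistical test, and the timings in the example are invented purely to show the arithmetic; they are not measurements.

```python
import statistics


def relative_difference(baseline, candidate):
    """Return (candidate - baseline) / baseline, using the median of each sample."""
    b = statistics.median(baseline)
    c = statistics.median(candidate)
    return (c - b) / b


def exceeds_noise(baseline, candidate):
    """Crude check: is the gap between medians larger than the run-to-run spread?"""
    spread = max(statistics.stdev(baseline), statistics.stdev(candidate))
    gap = abs(statistics.median(candidate) - statistics.median(baseline))
    return gap > spread


if __name__ == "__main__":
    # Invented run times in seconds, for illustration only.
    ht_off = [10.00, 10.02, 9.98, 10.01, 9.99]
    ht_on = [10.25, 10.27, 10.23, 10.26, 10.24]
    print("relative slowdown:", relative_difference(ht_off, ht_on))
    print("larger than noise:", exceeds_noise(ht_off, ht_on))
```

With real data, each list would hold the run times of the same benchmark repeated several times under one BIOS setting.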