Best practice on Linux servers and CPU/power throttling?
I am running a couple of Debian 6 (2.6.32) and 7 (3.2) Linux servers, and all of them have energy-saving settings enabled in their BIOS. Furthermore, Linux reports that the CPUs are throttled while the servers are idle.
I wonder if this could cause any harm. Could there be performance impacts, for example because Linux is not able to handle throttling correctly?
Is there a best practice for Linux servers and power/CPU throttling? Do you guys switch your energy profiles to "performance" or do you leave both the BIOS and the OS with their default settings?
The reason I am asking is that I encountered several performance issues on physical Dell servers although all values (CPU/load, memory, I/O, network etc.) seemed to be normal. After changing the BIOS power settings to "performance" in those specific cases, I was able to get rid of the performance issues.
Interesting question...
In general, I base the system performance profile on the application and intended use of the server. I typically work with:
- Low-latency transaction-heavy systems.
- Virtualization hosts (VMware).
- Linux-based ERP servers.
The systems that require deterministic performance and low latency are typically set to a high-performance profile, disabling all C-States/P-States and any power throttling.
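If you want to see which C-States the kernel actually exposes before disabling anything, a minimal sketch along these lines can help. It assumes the kernel provides the cpuidle sysfs interface under `/sys/devices/system/cpu/cpu0/cpuidle`; the function name `read_cstates` is just an illustration, not part of any tool.

```python
from pathlib import Path


def read_cstates(cpu_dir="/sys/devices/system/cpu/cpu0"):
    """Return the idle states (C-States) the kernel exposes for one CPU.

    Assumes the cpuidle sysfs layout (stateN/name, latency, usage).
    Returns an empty list if the interface is not available.
    """
    state_dirs = sorted(
        Path(cpu_dir).glob("cpuidle/state[0-9]*"),
        key=lambda p: int(p.name[len("state"):]),  # numeric order: state2 before state10
    )
    states = []
    for state_dir in state_dirs:
        try:
            states.append({
                "name": (state_dir / "name").read_text().strip(),
                # exit latency in microseconds, as reported by the kernel
                "latency_us": int((state_dir / "latency").read_text()),
                # number of times this state has been entered
                "usage": int((state_dir / "usage").read_text()),
            })
        except (OSError, ValueError):
            continue  # skip states with missing or unreadable files
    return states


if __name__ == "__main__":
    for state in read_cstates():
        print(f"{state['name']}: exit latency {state['latency_us']} us, "
              f"entered {state['usage']} times")
```

Deeper states generally report higher exit latencies, which is why latency-sensitive systems tend to have them disabled.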
The virtualization hosts can follow the same model, but if I'm power-constrained (like in a co-location facility) or the workload is minimal across the hosts/cluster, I will leave the default balanced power/performance profile enabled. That's typically because I'm charged for power and cooling in a data center, and may need to consolidate more physical servers into a given footprint.
The ERP servers are typically standalone. Lighter workloads get the default balanced profile. Systems that require more specific tuning and have a heavier workload (24x7 operation) may see high-performance power profiles applied.
-- edit --
Again, performance tends to be more deterministic under the high-performance power profiles. It really depends on your specific application and what your users are experiencing (we can't tell you what to do). You state yourself that disabling the BIOS power-saving features corrected a performance problem you were having.
For Linux, install the PowerTOP utility and experiment to understand what your CPUs are doing under realistic workloads.
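As a rough complement to that, the current governor and clock speed can also be read straight from sysfs. The following is a minimal sketch, assuming the kernel exposes the cpufreq interface under `/sys/devices/system/cpu/cpuN/cpufreq` (it may be absent in VMs or when the BIOS hides frequency control from the OS); `read_cpufreq` is a made-up helper name.

```python
from pathlib import Path


def read_cpufreq(sysfs_root="/sys/devices/system/cpu"):
    """Return {cpu_name: {"governor", "cur_khz", "max_khz"}} for each CPU.

    Assumes the cpufreq sysfs files scaling_governor, scaling_cur_freq and
    cpuinfo_max_freq. CPUs without a readable cpufreq directory are skipped.
    """
    result = {}
    for freq_dir in sorted(Path(sysfs_root).glob("cpu[0-9]*/cpufreq")):
        try:
            result[freq_dir.parent.name] = {
                "governor": (freq_dir / "scaling_governor").read_text().strip(),
                "cur_khz": int((freq_dir / "scaling_cur_freq").read_text()),
                "max_khz": int((freq_dir / "cpuinfo_max_freq").read_text()),
            }
        except (OSError, ValueError):
            continue  # missing or unreadable files: skip this CPU
    return result


if __name__ == "__main__":
    for cpu, info in read_cpufreq().items():
        line = f"{cpu}: governor={info['governor']} {info['cur_khz'] // 1000} MHz"
        if info["max_khz"]:
            line += f" ({100 * info['cur_khz'] / info['max_khz']:.0f}% of max)"
        print(line)
```

Running this once while idle and once under a realistic load gives a quick indication of whether the CPUs clock back up when they should.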
I think it might benefit others stumbling across this question if I post what I have learned since as a reply.
I talked to Dell and Intel because, in my specific situation, Linux is not able to scale throttled CPUs back up in certain situations. Dell replied that this issue is known and occurs both with VMware hypervisors and with many Linux variants, so it is not specific to Debian or to a particular Dell model. As far as I can tell, all Dell systems using Intel CPUs can be affected, and of course it is also possible that other hardware vendors share this problem.
In an email (originally written in German), Dell claims:
- Linux fails to negotiate power settings with the hardware in my specific cases
- Updating both the OS and firmware might help
- Using the "Performance" profile is a known workaround
Judging by Dell's email, there seems to be no way to fix this, only a workaround. Therefore the reply to my own question would be:
To prevent possible performance or CPU scaling issues with your servers, I highly recommend putting all your servers in the datacenter into "max. power" mode.