My CPU slows down after a while and does not recover
Every so often my CPU slows down below its prescribed minimum frequency. I can't reproduce it on demand, but it happens often enough (at least a few times a week). This is example cpufreq-info output from a minute ago:
cpufrequtils 008: cpufreq-info (C) Dominik Brodowski 2004-2009
Report errors and bugs to cpufreq@vger.kernel.org, please.
analyzing CPU 0:
driver: intel_pstate
CPUs which run at the same hardware frequency: 0
CPUs which need to have their frequency coordinated by software: 0
maximum transition latency: 0.97 ms.
hardware limits: 800 MHz - 3.30 GHz
available cpufreq governors: performance, powersave
current policy: frequency should be within 800 MHz and 3.30 GHz.
The governor "powersave" may decide which speed to use
within this range.
current CPU frequency is 610 MHz.
analyzing CPU 1:
driver: intel_pstate
CPUs which run at the same hardware frequency: 1
CPUs which need to have their frequency coordinated by software: 1
maximum transition latency: 0.97 ms.
hardware limits: 800 MHz - 3.30 GHz
available cpufreq governors: performance, powersave
current policy: frequency should be within 800 MHz and 3.30 GHz.
The governor "powersave" may decide which speed to use
within this range.
current CPU frequency is 615 MHz.
analyzing CPU 2:
driver: intel_pstate
CPUs which run at the same hardware frequency: 2
CPUs which need to have their frequency coordinated by software: 2
maximum transition latency: 0.97 ms.
hardware limits: 800 MHz - 3.30 GHz
available cpufreq governors: performance, powersave
current policy: frequency should be within 800 MHz and 3.30 GHz.
The governor "powersave" may decide which speed to use
within this range.
current CPU frequency is 590 MHz.
analyzing CPU 3:
driver: intel_pstate
CPUs which run at the same hardware frequency: 3
CPUs which need to have their frequency coordinated by software: 3
maximum transition latency: 0.97 ms.
hardware limits: 800 MHz - 3.30 GHz
available cpufreq governors: performance, powersave
current policy: frequency should be within 800 MHz and 3.30 GHz.
The governor "powersave" may decide which speed to use
within this range.
current CPU frequency is 589 MHz.
The problem is that it really slows everything down. Firefox becomes slower, vim's startup time grows from 150-250 ms to above 700 ms, g++ compilations become three times slower, etc.
A restart fixes everything.
Some error lines from the past couple of hours of my syslog:
May 17 16:10:53 lati kernel: [ 1421.872755] ACPI Error: Index value 0x0000000000000083 overflows field width 0x7 (20140424/exfldio-343)
May 17 16:10:53 lati kernel: [ 1421.872758] ACPI Error: Method parse/execution failed [\NEVT] (Node ffff88040e047258), AE_AML_REGISTER_LIMIT (20140424/psparse-536)
May 17 16:10:53 lati kernel: [ 1421.872761] ACPI Error: Method parse/execution failed [\_SB_.PCI0.LPCB.ECDV._Q66] (Node ffff88040e044b90), AE_AML_REGISTER_LIMIT (20140424/psparse-536)
May 17 16:10:56 lati kernel: [ 1425.907749] ACPI Error: Index value 0x0000000000000083 overflows field width 0x7 (20140424/exfldio-343)
May 17 16:10:56 lati kernel: [ 1425.907765] ACPI Error: Method parse/execution failed [\NEVT] (Node ffff88040e047258), AE_AML_REGISTER_LIMIT (20140424/psparse-536)
May 17 16:10:56 lati kernel: [ 1425.907794] ACPI Error: Method parse/execution failed [\_SB_.PCI0.LPCB.ECDV._Q66] (Node ffff88040e044b90), AE_AML_REGISTER_LIMIT (20140424/psparse-536)
May 17 16:12:09 lati kernel: [ 1.925333] EXT4-fs (sda5): re-mounted. Opts: errors=remount-ro
May 17 16:12:09 lati kernel: [ 2.421037] systemd-udevd[331]: Error calling EVIOCSKEYCODE: Invalid argument
May 17 16:12:21 lati gnome-session[2251]: WARNING: Could not parse desktop file tracker-store.desktop or it references a not found TryExec binary
May 17 16:12:21 lati gnome-session[2251]: WARNING: Could not parse desktop file tracker-miner-fs.desktop or it references a not found TryExec binary
May 17 16:12:51 lati gnome-session[2251]: GLib-CRITICAL: g_environ_setenv: assertion 'value != NULL' failed
May 17 17:48:19 lati kernel: [ 5769.576717] systemd-hostnamed[6983]: Warning: nss-myhostname is not installed. Changing the local hostname might make it unresolveable. Please install nss-myhostname!
I am using Ubuntu 14.04.2 (fresh install, 64-bit) on a Dell E7440 with BIOS version A14.
By the way, even running lsb_release takes about 400 ms when the machine is in this state.
Extra info
- My processor model name: Intel(R) Core(TM) i7-4600U CPU @ 2.10GHz
- My processor model number: 69
- It probably occurs only after a suspend, but most suspends do not trigger it (for example, it has happened only once since I asked this question).
Extra info (2)
Output from grep -r . * in /sys/class/thermal:
cooling_device0/type:Processor
cooling_device0/power/control:auto
cooling_device0/power/async:disabled
cooling_device0/power/runtime_enabled:disabled
cooling_device0/power/runtime_active_kids:0
cooling_device0/power/runtime_active_time:0
cooling_device0/power/runtime_status:unsupported
cooling_device0/power/runtime_usage:0
cooling_device0/power/runtime_suspended_time:0
cooling_device0/cur_state:0
cooling_device0/max_state:3
cooling_device1/type:Processor
cooling_device1/power/control:auto
cooling_device1/power/async:disabled
cooling_device1/power/runtime_enabled:disabled
cooling_device1/power/runtime_active_kids:0
cooling_device1/power/runtime_active_time:0
cooling_device1/power/runtime_status:unsupported
cooling_device1/power/runtime_usage:0
cooling_device1/power/runtime_suspended_time:0
cooling_device1/cur_state:0
cooling_device1/max_state:3
cooling_device2/type:Processor
cooling_device2/power/control:auto
cooling_device2/power/async:disabled
cooling_device2/power/runtime_enabled:disabled
cooling_device2/power/runtime_active_kids:0
cooling_device2/power/runtime_active_time:0
cooling_device2/power/runtime_status:unsupported
cooling_device2/power/runtime_usage:0
cooling_device2/power/runtime_suspended_time:0
cooling_device2/cur_state:0
cooling_device2/max_state:3
cooling_device3/type:Processor
cooling_device3/power/control:auto
cooling_device3/power/async:disabled
cooling_device3/power/runtime_enabled:disabled
cooling_device3/power/runtime_active_kids:0
cooling_device3/power/runtime_active_time:0
cooling_device3/power/runtime_status:unsupported
cooling_device3/power/runtime_usage:0
cooling_device3/power/runtime_suspended_time:0
cooling_device3/cur_state:0
cooling_device3/max_state:3
cooling_device4/type:intel_powerclamp
cooling_device4/power/control:auto
cooling_device4/power/async:disabled
cooling_device4/power/runtime_enabled:disabled
cooling_device4/power/runtime_active_kids:0
cooling_device4/power/runtime_active_time:0
cooling_device4/power/runtime_status:unsupported
cooling_device4/power/runtime_usage:0
cooling_device4/power/runtime_suspended_time:0
cooling_device4/cur_state:-1
cooling_device4/max_state:50
thermal_zone0/mode:enabled
thermal_zone0/temp:25000
thermal_zone0/type:acpitz
thermal_zone0/power/control:auto
thermal_zone0/power/async:disabled
thermal_zone0/power/runtime_enabled:disabled
thermal_zone0/power/runtime_active_kids:0
thermal_zone0/power/runtime_active_time:0
thermal_zone0/power/runtime_status:unsupported
thermal_zone0/power/runtime_usage:0
thermal_zone0/power/runtime_suspended_time:0
thermal_zone0/trip_point_0_temp:107000
thermal_zone0/trip_point_0_type:critical
thermal_zone0/policy:step_wise
thermal_zone0/passive:0
thermal_zone1/temp:47000
thermal_zone1/type:x86_pkg_temp
thermal_zone1/power/control:auto
thermal_zone1/power/async:disabled
thermal_zone1/power/runtime_enabled:disabled
thermal_zone1/power/runtime_active_kids:0
thermal_zone1/power/runtime_active_time:0
thermal_zone1/power/runtime_status:unsupported
thermal_zone1/power/runtime_usage:0
thermal_zone1/power/runtime_suspended_time:0
thermal_zone1/trip_point_0_temp:0
thermal_zone1/trip_point_0_type:passive
thermal_zone1/trip_point_1_temp:0
thermal_zone1/trip_point_1_type:passive
thermal_zone1/policy:step_wise
Solution 1:
Does the problem still occur?
I am eagerly looking for confirmation or denial of what I think is happening.
The theory is that somehow (a BIOS issue is suspected) Clock Modulation becomes enabled after a suspend. The current version of the intel_pstate driver is incompatible with any use of Clock Modulation: it always ends up driving the target pstate to the minimum, regardless of load. The result is an apparent CPU frequency stuck at minimum * modulation percent. The acpi-cpufreq driver works fine with Clock Modulation, resulting in desired frequency * modulation percent (i.e. the issue is less obvious with the acpi-cpufreq driver).
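As a rough worked example of that arithmetic: the question reports a hardware minimum of 800 MHz and observed frequencies of roughly 590 to 615 MHz. The 75% duty cycle used below is purely my assumption, picked because it would land on 600 MHz; the real value has to be read from the MSR. The helper name apparent_mhz is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: apparent frequency = requested frequency * modulation percent.
# Arguments: requested frequency in MHz, duty cycle in whole percent.
apparent_mhz() {
    echo $(( $1 * $2 / 100 ))
}

# intel_pstate case: target pstate driven to the minimum (800 MHz per the question).
# The 75% duty cycle is an assumed value, not a measured one.
apparent_mhz 800 75

# acpi-cpufreq case: the requested frequency still follows load, e.g. 3300 MHz,
# so the slowdown is proportionally the same but much less noticeable.
apparent_mhz 3300 75
```

If the duty cycle really were 75%, the first call would explain the ~600 MHz readings, while the second shows why the same modulation is harder to spot under acpi-cpufreq.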
Please do the following tests:
1.) (needed once per boot)
sudo modprobe msr
2.) before any suspend:
sudo rdmsr -a 0x19a
3.) after a suspend that results in the low CPU frequencies:
sudo rdmsr -a 0x19a
4.) If the result from step 3 is not 0, then:
sudo wrmsr -a 0x19a 0x0
and check it:
sudo rdmsr -a 0x19a
5.) Are the CPU frequencies O.K. now?
Post back here all the outputs.
Note: rdmsr and wrmsr are provided by the msr-tools package; I do not recall whether it is installed by default.
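To make the values from steps 2 to 4 easier to read, here is a minimal sketch that interprets one rdmsr value. It rests on my reading of the Intel SDM layout for IA32_CLOCK_MODULATION (0x19a): bit 4 is the enable bit and the low bits hold the duty cycle. Treating bits 3:0 as 6.25% steps assumes the extended duty-cycle bit is supported; where it is not, bit 0 should read 0 and the result equals the plain 12.5% steps. The function name decode_clock_mod is hypothetical, and the input is assumed to be hex without a 0x prefix.

```shell
#!/bin/sh
# Hypothetical helper: interpret one value as read from MSR 0x19a.
# Argument: the value in hex WITHOUT a 0x prefix (assumed rdmsr output format).
decode_clock_mod() {
    val=$(( 0x$1 ))
    # Bit 4: on-demand clock modulation enable (assumed layout, see lead-in).
    if [ $(( val & 0x10 )) -eq 0 ]; then
        echo "clock modulation disabled"
        return
    fi
    # Bits 3:0 as duty cycle in 6.25% steps, kept in hundredths of a percent
    # so that only integer arithmetic is needed.
    hundredths=$(( (val & 0xF) * 625 ))
    printf 'clock modulation enabled, duty cycle %d.%02d%%\n' \
        $(( hundredths / 100 )) $(( hundredths % 100 ))
}

# Possible use on the affected machine (needs root and the msr module):
#   sudo rdmsr -a 0x19a | while read -r v; do decode_clock_mod "$v"; done
```

Please still post the raw rdmsr outputs; this decoding is only a convenience and could be wrong for this particular processor.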
EDIT:
If you can, the Intel subject matter expert on thermal interactions and pstates also wants the output from:
cd /sys/class/thermal
grep -r . *