Bash script to sleep at given CPU temperature ~ update for 16.04

This is a radically updated version of the question. I had to update it because another question is flagged as a duplicate of this one, while this one no longer has a valid answer.


  • Initial version of this question, which had the 11.04 tag:

I am new to scripting and Linux. My computer sometimes gets too hot, and I want a script that reads temp1 and puts the machine to sleep if it is over 65 °C. I am having difficulty comparing values in the script and couldn't get the numbers defined correctly. Would anybody fix it, please? Here is my stab at it so far:

#!/bin/bash

max=65

val=$   sensors | grep '^temp1:' | sed -e 's/.*: \+\([+-][0-9.]\+\)°C.*$/0\1/'

while true; do
    if [[ "$val" > "$max" ]]; then
        sudo /etc/acpi/sleep.sh force
        sleep 1
    else
        sleep 10
    fi
    clear
    sensors
done

The above got an answer with a script that according to the comments was at some point updated to work in 14.04:

 #!/bin/bash

 while true; do
    val=$(sensors | awk '/temp1/ {print $2}')
    max="+75.0"
    if [[ "$val" > "$max" ]]; then
        dbus-send --system --print-reply --dest="org.freedesktop.UPower" /org/freedesktop/UPower org.freedesktop.UPower.Suspend
    fi
    sleep 10
    clear
    sensors
 done
 exit 0

As indicated in the linked question, the above script doesn't work in 16.04.

That question got an answer with a simplistically modified version of the script:

 #!/bin/bash

 while true; do
    val=$(sensors | awk '/temp1/ {print $2}')
    max="+75.0"
    if [[ "$val" > "$max" ]]; then
        systemctl suspend
    fi
    clear
    sensors
 done
 exit 0

But while it does the job (the system goes to sleep when going above 75), it uses more CPU power than expected and pushes the temperature up by as much as 10 degrees Celsius while running. That defeats the purpose: the machine stays cooler when the script is not running!

I don't know if the problem is with the initial 11.04 script or with the last change but this needs a fresh answer for 16.04.


I managed to make my own script.

#!/bin/bash
while true; do
   val=$(sensors | awk '/temp1/ {print $2}')
   max="+75.0"
   if [[ "$val" > "$max" ]]; then
       dbus-send --system --print-reply --dest="org.freedesktop.UPower" /org/freedesktop/UPower org.freedesktop.UPower.Suspend
   fi
   sleep 10
   clear
   sensors
done
exit 0

For 16.04 (also here):

#!/bin/bash
while true; do
   val=$(sensors | awk '/temp1/ {print $2}')
   max="+75.0"
   if [[ "$val" > "$max" ]]; then
       systemctl suspend
   fi
   sleep 10
   clear
   sensors
done
exit 0
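
One caveat with both scripts above: inside [[ ]] the > operator compares strings, not numbers, so a reading of +100.0°C sorts before +75.0 and would not trigger the suspend. Below is a minimal sketch of a numeric comparison. temp_exceeds is a hypothetical helper name, not part of the scripts above, and it assumes readings look like the +75.5°C values that sensors prints.

```shell
#!/usr/bin/env bash

# Hypothetical helper: succeed (exit status 0) when a sensors-style reading
# such as "+75.5°C" is strictly above the threshold (degrees Celsius).
temp_exceeds() {
    local reading=$1 threshold=$2
    # Keep only digits, the decimal point and a minus sign: "+75.5°C" -> "75.5"
    local value=${reading//[^0-9.-]/}
    # awk does the floating-point comparison; "+ 0" forces numeric context
    awk -v v="$value" -v max="$threshold" 'BEGIN { exit !(v + 0 > max + 0) }'
}

# Small demonstration with fixed sample readings
for reading in "+65.0°C" "+75.5°C" "+100.0°C"; do
    if temp_exceeds "$reading" 75; then
        echo "$reading: above threshold"
    else
        echo "$reading: ok"
    fi
done
```

In the loop of either script, the test would then become something like: if temp_exceeds "$val" 75; then systemctl suspend; fi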

Introduction

To support users both before and after the switch to systemd, I've written a script that picks the appropriate suspend method based on your OS version number. Essentially it does the same thing as OP's updated script, except that the -i flag forces the suspend despite inhibitors. There are multiple considerations and improvements that could be made to the script, but for now this version does 90% of the job.

I tested it briefly on 16.04 LTS, and it works perfectly fine. In the future I might rewrite this in Python, just because I can or upon user request.

Script Source

#!/usr/bin/env bash

suspend_system(){

    os_version=$(awk -F'["=]' '/VERSION_ID/{print substr($3,1,2)}' /etc/os-release)
    if [ $os_version -ge 15   ]
    then
        systemctl suspend -i
        # Alternative way is to call login manager method via dbus
        # qdbus --system org.freedesktop.login1 /org/freedesktop/login1 \
        #       org.freedesktop.login1.Manager.Suspend True 

    else
        dbus-send --system --print-reply --dest="org.freedesktop.UPower"\
                  /org/freedesktop/UPower org.freedesktop.UPower.Suspend
    fi

}

is_critical_temp(){
    local temp=$(sensors | awk '/temp1/ {print substr($2,2,2)}')
    if [ $temp -gt 75  ]
    then
        return 0
    else
        return 1
    fi
}

main(){

    while true
    do
        if is_critical_temp
        then
            # optional dialog if running from GUI, not necessary if running from /etc/rc.local
            #zenity --info --text "Reached critical temperature. Suspending in 10 seconds" &
            sleep 10
            suspend_system
        fi
        sleep 3
    done
}
main "$@"
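
Instead of parsing the version number, the choice between the two suspend methods could also be keyed on whether systemd is actually running. Below is a minimal sketch, assuming the usual convention that the directory /run/systemd/system exists only on systems booted with systemd. suspend_method is a hypothetical name, and the optional directory argument exists only to make the function easy to try out.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print "systemctl" when the probe directory exists
# (assumed to mean the system was booted with systemd), otherwise "upower".
suspend_method() {
    local systemd_dir=${1:-/run/systemd/system}
    if [ -d "$systemd_dir" ]; then
        echo systemctl
    else
        echo upower
    fi
}

# Possible use inside suspend_system (commented out so nothing suspends here):
# case "$(suspend_method)" in
#     systemctl) systemctl suspend -i ;;
#     upower)    dbus-send --system --print-reply --dest="org.freedesktop.UPower" \
#                          /org/freedesktop/UPower org.freedesktop.UPower.Suspend ;;
# esac
```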

Alternative approach

A while back, I found the pm-suspend utility from the pm-utils package. This program works regardless of the OS version. The small disadvantage is that it requires root access, but that inconvenience is easy to get around.

What I personally would do is the following:

  1. sudo apt-get install pm-utils
  2. sudo visudo and add yourusername ALL = NOPASSWD: /usr/sbin/pm-suspend at the end of the file.
  3. edit your script to call sudo pm-suspend instead of the dbus command.

Suspending an Ubuntu machine in general

You've requested that answers draw upon credible sources. In fact, Ask Ubuntu does have a canonical post about suspending from the command line: How can I suspend/hibernate from command line? Depending on the approach and the level of permissions you want your script to have, there are multiple ways to skin a cat there. Some work better than others. In my answer I provided the dbus and systemctl approaches, since those work with screen locking as well. If you write to /sys/power/state, it won't lock the screen, although it is possible to get around that with some scripting magic. For now, I think the better approach is simply to determine the OS version and select the appropriate method as in my script, or to use the pm-suspend alternative.


On my system, video streaming stutters every time sensors is run. Having this happen every 10 seconds, or however often the proposed script runs, would drive me bat crazy. A better solution than suspending is to use Intel's thermald and Powerclamp to slow down the CPU and reduce heat. I wrote this answer for another question (Stop cpu from overheating) and am copying it here for convenience.

Additionally, the above script relies on temp1, which is often corrupted on my Ubuntu 16.04 system. Only temp3 is 100% reliable, and it doesn't show up in sensors, i.e.:

$ cat /sys/class/thermal/thermal_zone*/temp
27800
29800
58000

and from sensors:

acpitz-virtual-0
Adapter: Virtual device
temp1:        +27.8°C  (crit = +106.0°C)
temp2:        +29.8°C  (crit = +106.0°C)

This happens after suspend/resume. The REAL temperature is +58.0°C but is falsely reported as +27.8°C after resuming. So the heat protection would only work once to suspend and never work again until a reboot. So the system would hit critical (+106.0°C) at which point a hard power off is performed and data can be corrupted.
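
Since the readings are available under /sys/class/thermal, a script could poll those files directly instead of running sensors, and take the hottest zone rather than trusting temp1 alone. Below is a minimal sketch under those assumptions. max_zone_temp is a hypothetical helper, and its optional base-directory argument is only there for testing. The files hold thousandths of a degree Celsius, as the listing above shows (27800 for +27.8°C).

```shell
#!/usr/bin/env bash

# Hypothetical helper: print the highest thermal-zone temperature in whole
# degrees Celsius, or return 1 if no zone could be read.
max_zone_temp() {
    local base=${1:-/sys/class/thermal} file milli max=
    for file in "$base"/thermal_zone*/temp; do
        [ -r "$file" ] || continue              # also skips an unmatched glob
        read -r milli < "$file" || continue     # value is in millidegrees
        if [ -z "$max" ] || [ "$milli" -gt "$max" ]; then
            max=$milli
        fi
    done
    [ -n "$max" ] || return 1
    echo $(( max / 1000 ))
}

# Possible use in a monitoring loop (commented out):
# temp=$(max_zone_temp) && [ "$temp" -gt 75 ] && systemctl suspend
```

With the three values shown above, this would report 58 rather than the stale 27.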

So here's my recommended solution: prevent overheating by slowing down the CPU rather than suspending the system outright.

Slow down CPU to reduce heat

This works for Ubuntu 16.04+ with Intel Sandy Bridge and newer processors.

Debian's write-up on thermald (wiki.debian.org - thermald), which also applies to Ubuntu, describes it as a Linux daemon for cooling tablets and laptops:

Linux thermal daemon (thermald) monitors and controls temperature in laptops, tablets PC with the latest Intel sandy bridge and latest Intel CPU releases. Once the system temperature reaches a certain threshold, the Linux daemon activates various cooling methods to try to cool the system.

It operates in two modes:

Zero Configuration Mode

  • For most users, this should be enough to bring the CPU temperature of the system under control. This uses DTS temperature sensor and uses Intel P state driver, Power clamp driver, Running Average Power Limit control and cpufreq as cooling methods.

User defined configuration mode

  • This allows ACPI style configuration in a thermal XML configuration file. This can be used to fix the buggy ACPI configuration or fine tune by adding more sensors and cooling devices. This is a first step in implementing a close loop thermal control in user mode and can be enhanced based on community feedback and suggestions.

How to install

sudo apt-get install thermald

Intel Powerclamp

Intel's Powerclamp driver is documented here (kernel.org - Intel Power Clamp.txt) and is one of the cooling methods used by thermald, described above. A direct quote about Powerclamp from the link:

Consider the situation where a system’s power consumption must be reduced at runtime, due to power budget, thermal constraint, or noise level, and where active cooling is not preferred. Software managed passive power reduction must be performed to prevent the hardware actions that are designed for catastrophic scenarios.

Currently, P-states, T-states (clock modulation), and CPU offlining are used for CPU throttling.

On Intel CPUs, C-states provide effective power reduction, but so far they’re only used opportunistically, based on workload. With the development of intel_powerclamp driver, the method of synchronizing idle injection across all online CPU threads was introduced. The goal is to achieve forced and controllable C-state residency.

Test/Analysis has been made in the areas of power, performance, scalability, and user experience. In many cases, clear advantage is shown over taking the CPU offline or modulating the CPU clock.


How do you know Powerclamp is running?

Powerclamp might only show itself once a year when your fan vents get too much dust & lint. So how do you know it's actually running in the background? Use:

lsmod | grep intel

And you should see a list similar to this:

btintel                16384  1 btusb
bluetooth             520192  29 bnep,btbcm,btrtl,btusb,rfcomm,btintel
intel_rapl             20480  0
intel_powerclamp       16384  0
   (.... more intel drivers ....)
snd                    81920  18 snd_hwdep,snd_timer,snd_hda_codec_hdmi,snd_hda_codec_idt,snd_pcm,snd_seq,snd_rawmidi,snd_hda_codec_generic,snd_hda_codec,snd_hda_intel,snd_seq_device

If you see intel_rapl and intel_powerclamp, you know it's working and simply waiting for temps to exceed 85C.
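
The same check can be scripted. Below is a minimal sketch, assuming a listing whose first column is the module name, as in lsmod output and /proc/modules. module_loaded is a hypothetical helper name, and the optional second argument is only there so the function can be tried against a saved listing.

```shell
#!/usr/bin/env bash

# Hypothetical helper: succeed if the exact module name appears in the first
# column of a module listing (default: /proc/modules).
module_loaded() {
    local name=$1 listing=${2:-/proc/modules}
    awk -v m="$name" '$1 == m { found = 1 } END { exit !found }' "$listing"
}

# Possible use (commented out because the result depends on the machine):
# for mod in intel_rapl intel_powerclamp; do
#     module_loaded "$mod" && echo "$mod is loaded" || echo "$mod is NOT loaded"
# done
```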


Powerclamp in action displayed by Conky

Here is a screen shot when Powerclamp injects sleep cycles:

[Screenshot: Kidle Injection]

Normally on this system the CPU clock speed is 2400 MHz to 3400 MHz when watching HTML5 video with 10 Chrome tabs open, and CPU utilization is about 9% to 12% across 8 CPUs. When things get too hot (86C), Powerclamp kicks in and this happens:

  • CPU speed is reduced to 1200 MHz.
  • CPU utilization spikes up to 80%. This is misleading because the extra 70% is sleeping time.
  • The top 9 CPU processes are usually 5 or 6 Chrome processes plus Xorg, Conky, Pulse Audio and an occasional kworker. Now, however, 8 of the top 10 are kidle_inject/x processes, where x runs from 0 to 7, one for each of the first 8 CPUs.

The Powerclamp driver runs until temps drop below 85C again. While the driver is running you might have split second pausing in your videos and possibly split second keyboard and mouse lag.
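
To check from a terminal whether idle injection is active, one could count the kidle_inject threads. Below is a minimal sketch, assuming the threads are named kidle_inject/N as described above (other kernel versions may name them differently). count_idle_injectors is a hypothetical helper that reads process names, one per line, on standard input.

```shell
#!/usr/bin/env bash

# Hypothetical helper: count the lines on stdin that name a kidle_inject thread.
# grep -c prints 0 but exits non-zero when nothing matches, hence "|| true".
count_idle_injectors() {
    grep -c '^kidle_inject/' || true
}

# Possible use (commented out because the result depends on the machine):
# ps -eo comm | count_idle_injectors
```

A count of 0 would mean Powerclamp is idle at that moment; on the system described above the count would be 8 while it is clamping.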


Disable Intel Turbo Boost

Back in the "cool old days" of Ubuntu 14.04, Intel Turbo Boost was broken, so my processor speed fluctuated between 1200 MHz and 2400 MHz. After upgrading to Ubuntu 16.04 it would go up to 3400 MHz (3.4 GHz) because Turbo Boost was finally working. But that also raised the heat.

To disable Intel Turbo Boost use:

echo "1" | sudo tee /sys/devices/system/cpu/intel_pstate/no_turbo
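
To see the current setting, the same file can be read back. Below is a minimal sketch, assuming the intel_pstate driver is in use so that the no_turbo file exists. turbo_status is a hypothetical helper name, and its optional file argument is only there for testing.

```shell
#!/usr/bin/env bash

# Hypothetical helper: report whether Turbo Boost is enabled, based on the
# no_turbo flag (1 = turbo disabled, 0 = turbo enabled).
turbo_status() {
    local file=${1:-/sys/devices/system/cpu/intel_pstate/no_turbo} flag
    [ -r "$file" ] || return 1
    read -r flag < "$file" || return 1
    if [ "$flag" = "1" ]; then
        echo disabled
    else
        echo enabled
    fi
}

# Possible use (commented out because the file may not exist on every machine):
# echo "Turbo Boost is $(turbo_status)"
```

Writing "0" with the same tee command turns Turbo Boost back on. Values written under /sys do not survive a reboot, so the command has to be repeated after each boot.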