CPU temperature spikes to 90°C+ only when plugged in

My Asus Vivobook K571GT, which dual-boots Ubuntu 20.04, has recently started shutting down due to high temperature (reaching 99°C+). These temperatures are reached only when the laptop is plugged in.

The BIOS is updated to the latest version, and Ubuntu is running the latest kernel. I've read that this might be caused by the NVIDIA driver not being installed properly, so I tried several different NVIDIA drivers (460, 470 and 495). I also tried disabling NVIDIA altogether and running only on the integrated GPU. They all gave the same result: when plugged in, the temperature spikes from a respectable 40-45°C to 95°C in a second, without much CPU load (for example, running apt update makes the CPU temperature rise to 90°C+). If I don't stop what I am doing, or a command is running and I can't stop it in time, the CPU hits the 100°C mark, which triggers the shutdown. Interestingly, if I unplug the laptop while I'm getting a high-temperature warning, the temperature drops back to 45-50°C in a second.

Has anyone experienced something similar? The only explanation I can think of for the CPU temperature spiking when plugged in but not on battery is that the CPU is somehow getting "overclocked". I'm not sure how to verify this or, if that is what's happening, how to prevent it. Could it be a hardware issue, like the AC adapter providing too much power?
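One way to check the "overclocking" theory is to sample the AC state, the highest core frequency and the highest thermal-zone temperature together, then plug and unplug the adapter while it runs. The following is only a minimal sketch: the function name snapshot and the SYSFS_ROOT override are made up for illustration (the override exists purely so the function can be pointed at a test directory), and it assumes the AC adapter shows up under /sys/class/power_supply with type "Mains".

```shell
#!/bin/sh
# Print AC state, highest current core frequency and highest zone temperature.
# SYSFS_ROOT is a hypothetical override for testing; it defaults to /sys.
snapshot() {
    root="${SYSFS_ROOT:-/sys}"

    # AC adapter: any power supply whose type is "Mains" (assumption).
    ac="unknown"
    for supply in "$root"/class/power_supply/*; do
        [ -r "$supply/type" ] || continue
        if [ "$(cat "$supply/type")" = "Mains" ]; then
            ac="$(cat "$supply/online")"
        fi
    done

    # scaling_cur_freq is reported in kHz.
    max_khz=0
    for f in "$root"/devices/system/cpu/cpu*/cpufreq/scaling_cur_freq; do
        [ -r "$f" ] || continue
        khz="$(cat "$f")"
        if [ "$khz" -gt "$max_khz" ]; then max_khz="$khz"; fi
    done

    # Thermal zone temperatures are reported in millidegrees Celsius.
    max_mc=0
    for t in "$root"/class/thermal/thermal_zone*/temp; do
        [ -r "$t" ] || continue
        mc="$(cat "$t")"
        if [ "$mc" -gt "$max_mc" ]; then max_mc="$mc"; fi
    done

    echo "ac_online=$ac max_freq_mhz=$((max_khz / 1000)) max_temp_c=$((max_mc / 1000))"
}
```

Running it once per second (for example: while true; do snapshot; sleep 1; done) during an apt update should show whether the frequency jumps well above the 2600 MHz base clock only when ac_online is 1.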

Edit

grep . /sys/devices/system/cpu/cpu*/cpufreq/scaling_driver

/sys/devices/system/cpu/cpu0/cpufreq/scaling_driver:intel_pstate
/sys/devices/system/cpu/cpu10/cpufreq/scaling_driver:intel_pstate
/sys/devices/system/cpu/cpu11/cpufreq/scaling_driver:intel_pstate
/sys/devices/system/cpu/cpu1/cpufreq/scaling_driver:intel_pstate
/sys/devices/system/cpu/cpu2/cpufreq/scaling_driver:intel_pstate
/sys/devices/system/cpu/cpu3/cpufreq/scaling_driver:intel_pstate
/sys/devices/system/cpu/cpu4/cpufreq/scaling_driver:intel_pstate
/sys/devices/system/cpu/cpu5/cpufreq/scaling_driver:intel_pstate
/sys/devices/system/cpu/cpu6/cpufreq/scaling_driver:intel_pstate
/sys/devices/system/cpu/cpu7/cpufreq/scaling_driver:intel_pstate
/sys/devices/system/cpu/cpu8/cpufreq/scaling_driver:intel_pstate
/sys/devices/system/cpu/cpu9/cpufreq/scaling_driver:intel_pstate

grep . /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor

/sys/devices/system/cpu/cpu0/cpufreq/scaling_governor:powersave
/sys/devices/system/cpu/cpu10/cpufreq/scaling_governor:powersave
/sys/devices/system/cpu/cpu11/cpufreq/scaling_governor:powersave
/sys/devices/system/cpu/cpu1/cpufreq/scaling_governor:powersave
/sys/devices/system/cpu/cpu2/cpufreq/scaling_governor:powersave
/sys/devices/system/cpu/cpu3/cpufreq/scaling_governor:powersave
/sys/devices/system/cpu/cpu4/cpufreq/scaling_governor:powersave
/sys/devices/system/cpu/cpu5/cpufreq/scaling_governor:powersave
/sys/devices/system/cpu/cpu6/cpufreq/scaling_governor:powersave
/sys/devices/system/cpu/cpu7/cpufreq/scaling_governor:powersave
/sys/devices/system/cpu/cpu8/cpufreq/scaling_governor:powersave
/sys/devices/system/cpu/cpu9/cpufreq/scaling_governor:powersave

grep "model name" /proc/cpuinfo

model name  : Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz

cat /sys/devices/system/cpu/intel_pstate/no_turbo

0
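A value of 0 here means intel_pstate is allowed to use turbo frequencies. As a rough sketch for ruling turbo in or out, the helper below just interprets that file; the name turbo_state and its optional path argument are assumptions made for illustration, not part of any existing tool.

```shell
#!/bin/sh
# Report whether intel_pstate currently allows turbo frequencies.
# The optional argument is a hypothetical path override used for testing.
turbo_state() {
    file="${1:-/sys/devices/system/cpu/intel_pstate/no_turbo}"
    if [ ! -r "$file" ]; then
        echo "unknown"
        return 1
    fi
    if [ "$(cat "$file")" = "0" ]; then
        echo "turbo enabled"
    else
        echo "turbo disabled"
    fi
}

# To disable turbo until the next reboot (as root), for a quick experiment:
#   echo 1 | sudo tee /sys/devices/system/cpu/intel_pstate/no_turbo
```

If the spikes disappear on AC power with turbo disabled, that would suggest the boost behaviour is involved, although it would not by itself explain why the cooling cannot keep up.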

Edit

ps auxc | grep -i therm

root         167  0.0  0.0      0     0 ?        I<   10:18   0:00 acpi_thermal_pm
root        1049  0.0  0.0 128808  9456 ?        Ssl  10:18   0:00 thermald

sudo dmidecode -s bios-version

X571GT.311

ls -al /etc/thermald

total 28
drwxr-xr-x   2 root root  4096 Sep  8 13:48 .
drwxr-xr-x 148 root root 12288 Nov  2 12:01 ..
-rw-r--r--   1 root root  4605 Jan 14  2019 thermal-conf.xml
-rw-r--r--   1 root root   508 Jan 14  2019 thermal-cpu-cdev-order.xml

The laptop is just a year or two old. The latest BIOS update was released just a couple of weeks ago.

cat /etc/thermald/thermal-conf.xml

<?xml version="1.0"?>

<!--
use "man thermal-conf.xml" for details
-->

<!-- BEGIN -->
<ThermalConfiguration>
<Platform>
    <Name>Generic X86 Laptop Device</Name>
    <ProductName>EXAMPLE_SYSTEM</ProductName>
    <Preference>QUIET</Preference>
    <ThermalSensors>
        <ThermalSensor>
            <Type>TSKN</Type>
            <AsyncCapable>1</AsyncCapable>
        </ThermalSensor>
    </ThermalSensors>
    <ThermalZones>
        <ThermalZone>
            <Type>SKIN</Type>
            <TripPoints>
                <TripPoint>
                    <SensorType>TSKN</SensorType>
                    <Temperature>55000</Temperature>
                    <type>passive</type>
                    <ControlType>SEQUENTIAL</ControlType>
                    <CoolingDevice>
                        <index>1</index>
                        <type>rapl_controller</type>
                        <influence> 100 </influence>
                        <SamplingPeriod> 16 </SamplingPeriod>
                    </CoolingDevice>
                    <CoolingDevice>
                        <index>2</index>
                        <type>intel_powerclamp</type>
                        <influence> 100 </influence>
                        <SamplingPeriod> 12 </SamplingPeriod>
                    </CoolingDevice>
                </TripPoint>
            </TripPoints>
        </ThermalZone>
    </ThermalZones>
</Platform>

<!-- Thermal configuration example only -->
<Platform>
    <Name>Example Platform Name</Name>
    <!--UUID is optional, if present this will be matched -->
    <!-- Both product name and UUID can contain
        wild card "*", which matches any platform
     -->
    <UUID>Example UUID</UUID>
    <ProductName>Example Product Name</ProductName>
    <Preference>QUIET</Preference>
    <ThermalSensors>
        <ThermalSensor>
            <!-- New Sensor with a type and path -->
            <Type>example_sensor_1</Type>
            <Path>/some_path</Path>
            <AsyncCapable>0</AsyncCapable>
        </ThermalSensor>
        <ThermalSensor>
            <!-- Already present in thermal sysfs,
                enable this or add/change config
                For example, here we are indicating that
                sensor can do async events to avoid polling
            -->
            <Type>example_thermal_sysfs_sensor</Type>
            <!-- If async capable, then we don't need to poll -->
            <AsyncCapable>1</AsyncCapable>
        </ThermalSensor>
        <ThermalSensor>
            <!-- Examle of a virtual sensor. This sensor
                depends on other real sensor or
                virtual sensor.
                E.g. here the temp will be
                 temp of example_sensor_1 * 0.5 + 10
            -->
            <Type>example_virtual_sensor</Type>
            <Virtual>1</Virtual>
            <SensorLink>
                <SensorType>example_sensor_1</SensorType>
                <Multiplier> 0.5 </Multiplier>
                <Offset> 10 </Offset>
            </SensorLink>
        </ThermalSensor>

    </ThermalSensors>
    <ThermalZones>
        <ThermalZone>
            <Type>Example Zone type</Type>
            <TripPoints>
                <TripPoint>
                    <SensorType>example_sensor_1</SensorType>
                    <!-- Temperature at which to take action -->
                    <Temperature> 75000 </Temperature>
                    <!-- max/passive/active
                        If a MAX type is specified, then
                        daemon will use PID control
                        to aggresively throttle to avoid
                        reaching this temp.
                     -->
                    <type>max</type>
                    <!-- SEQUENTIAL | PARALLEL
                    When a trip point temp is violated, then
                    number of cooling device can be activated.
                    If control type is SEQUENTIAL then
                    It will exhaust first cooling device before trying
                    next.
                    -->
                    <ControlType>SEQUENTIAL</ControlType>
                    <CoolingDevice>
                        <index>1</index>
                        <type>example_cooling_device</type>
                        <!-- Influence will be used order cooling devices.
                            First cooling device will be used, which has
                            highest influence.
                        -->
                        <influence> 100 </influence>
                        <!-- Delay in using this cdev, this takes some time
                        too actually cool a zone
                        -->
                        <SamplingPeriod> 12 </SamplingPeriod>
                    </CoolingDevice>
                </TripPoint>

            </TripPoints>
        </ThermalZone>
    </ThermalZones>
    <CoolingDevices>
        <CoolingDevice>
            <!--
                Cooling device can be specified
                by a type and optionally a sysfs path
                If the type already present in thermal sysfs
                no need of a path.
                Compensation can use min/max and step size
                to increasing cool the system.
                Debounce period can be used to force
                a waiting period for action
            -->
            <Type>example_cooling_device</Type>
            <MinState>0</MinState>
            <IncDecStep>10</IncDecStep>
            <ReadBack> 0 </ReadBack>
            <MaxState>50</MaxState>
            <DebouncePeriod>5000</DebouncePeriod>
            <!--
                If there are no PID parameter
                compensation increase step wise and exponentaially
                if single step is not able to change trend.
                Alternatively a PID parameters can be specified
                then next step will use PID calculation using
                provided PID constants.
            -->>
            <PidControl>
                <kp>0.001</kp>
                <kd>0.0001</kd>
                <ki>0.0001</ki>
            </PidControl>
        </CoolingDevice>
    </CoolingDevices>
</Platform>
</ThermalConfiguration>
<!-- END -->

top

top - 13:16:27 up  1:37,  1 user,  load average: 0.85, 1.32, 1.11
Tasks: 487 total,   2 running, 484 sleeping,   1 stopped,   0 zombie
%Cpu(s):  5.1 us,  2.0 sy,  1.5 ni, 90.6 id,  0.1 wa,  0.0 hi,  0.7 si,  0.0 st
GiB Mem :     15.5 total,      4.5 free,      5.0 used,      5.9 buff/cache
GiB Swap:      2.0 total,      2.0 free,      0.0 used.     10.1 avail Mem 

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND                       
  35883 root      39  19   84636  68132  12616 R  19.8   0.4   0:00.60 apt-check                     
   4842 haleks    20   0 4487900 483220 120988 S   2.6   3.0   1:49.49 gnome-shell                   
   7291 haleks    20   0  923372  60172  45804 S   2.3   0.4   1:34.25 psensor                       
  32705 haleks    20   0   24.5g 130676  77652 S   2.3   0.8   0:14.20 brave                         
    975 message+  20   0   40380  34872   4068 S   1.0   0.2   0:31.14 dbus-daemon                   
   1002 root      20   0 2332860  32620  16456 S   1.0   0.2   0:05.98 snapd                         
   4555 haleks    20   0   24.7g 147872  79744 S   1.0   0.9   1:10.25 Xorg                          
   5229 haleks    20   0 2258744 131912  45796 S   1.0   0.8   1:16.97 keybase                       
  35782 root      20   0  287276  16044  14104 S   1.0   0.1   0:00.03 packagekitd                   
    663 root     -51   0       0      0      0 S   0.7   0.0   0:38.09 irq/152-nvidia                
  21473 haleks    20   0  819496  53768  39012 S   0.7   0.3   0:07.86 gnome-terminal-               
  32564 haleks    20   0   16.6g 410380 190120 S   0.7   2.5   0:42.65 brave                         
  32596 haleks    20   0   16.6g 182632  87372 S   0.7   1.1   0:47.20 brave                         
  34076 root      20   0   25368  13280   7900 S   0.7   0.1   0:00.16 apt                           
    357 root      19  -1   68944  30764  29000 S   0.3   0.2   0:01.12 systemd-journal               
    387 root      20   0   24164   7796   4236 S   0.3   0.0   0:02.20 systemd-udevd                 
    517 root     -51   0       0      0      0 S   0.3   0.0   0:00.73 irq/148-iwlwifi               
    992 root      20   0  235188  10276   6928 S   0.3   0.1   0:02.17 polkitd                       
   1065 root      20   0  716580  12360   9072 S   0.3   0.1   0:01.60 canonical-livep               
   1349 gdm       20   0  317300   9004   7968 S   0.3   0.1   0:00.28 goa-identity-se               
   1864 root      20   0 2432052 150584  31964 S   0.3   0.9   0:07.40 lxd                           
   4545 haleks    20   0    8748   5860   4012 S   0.3   0.0   0:01.37 dbus-daemon                   
   5448 haleks    20   0 2370936 172572  33964 S   0.3   1.1   0:27.26 kbfsfuse                      
   7473 haleks    20   0  503408 143448  66476 S   0.3   0.9   0:35.84 Keybase                       
   7575 haleks    20   0  463344  40076  32528 S   0.3   0.2   0:00.39 update-notifier               
  10111 haleks    20   0  582224 166968  80480 S   0.3   1.0   0:37.21 gitkraken                     
  32662 haleks    20   0   24.4g 121680  81520 S   0.3   0.7   0:03.68 brave                         
  35783 root      20   0   24164   5228   1652 S   0.3   0.0   0:00.01 systemd-udevd                 
  35784 root      20   0   24164   5228   1652 S   0.3   0.0   0:00.01 systemd-udevd                 
  35786 root      20   0   24164   5228   1652 S   0.3   0.0   0:00.01 systemd-udevd                 
      1 root      20   0  168176  12092   8296 S   0.0   0.1   0:08.88 systemd                       
      2 root      20   0       0      0      0 S   0.0   0.0   0:00.02 kthreadd                      
      3 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 rcu_gp                        
      4 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 rcu_par_gp                    
      6 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 kworker/0:0H-kblockd          
      9 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 mm_percpu_wq                  
     10 root      20   0       0      0      0 S   0.0   0.0   0:00.11 ksoftirqd/0                   
     11 root      20   0       0      0      0 I   0.0   0.0   0:09.66 rcu_sched                     
     12 root      rt   0       0      0      0 S   0.0   0.0   0:00.02 migration/0                   
     13 root     -51   0       0      0      0 S   0.0   0.0   0:00.00 idle_inject/0                 
     14 root      20   0       0      0      0 S   0.0   0.0   0:00.00 cpuhp/0                       
     15 root      20   0       0      0      0 S   0.0   0.0   0:00.00 cpuhp/1                       
     16 root     -51   0       0      0      0 S   0.0   0.0   0:00.00 idle_inject/1                 
     17 root      rt   0       0      0      0 S   0.0   0.0   0:00.18 migration/1                   
     18 root      20   0       0      0      0 S   0.0   0.0   0:00.06 ksoftirqd/1                   

Solution 1:

Your /etc/thermald/thermal-conf.xml is incorrect. It's two example files tacked together.

Try replacing it with the somewhat generic .xml file shown at the end of this answer.

Note: You may end up customizing the following line...

<Temperature>60000</Temperature>

After saving the file, restart thermald with:

sudo systemctl restart thermald

<?xml version="1.0"?>
<ThermalConfiguration>
  <Platform>
    <Name>Override CPU default passive</Name>
    <ProductName>*</ProductName>
    <Preference>QUIET</Preference>
    <ThermalZones>
      <ThermalZone>
        <Type>cpu</Type>
        <TripPoints>
          <TripPoint>
            <Temperature>60000</Temperature>
            <type>passive</type>
          </TripPoint>
        </TripPoints>
      </ThermalZone>
    </ThermalZones>
  </Platform>
</ThermalConfiguration>
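
The Temperature values in this file are in millidegrees Celsius, so 60000 means 60°C. As a quick sanity check before restarting thermald, a small sketch like the one below lists the trip temperatures a config file contains. The function name trip_temps_c is hypothetical, and it assumes each Temperature element sits on a single line, as in the files above.

```shell
#!/bin/sh
# Print every <Temperature> value found in a thermald config, in degrees Celsius.
# Assumes each <Temperature>...</Temperature> element is on one line.
trip_temps_c() {
    sed -n 's:.*<Temperature>[[:space:]]*\([0-9][0-9]*\)[[:space:]]*</Temperature>.*:\1:p' "$1" |
    while read -r millideg; do
        # thermald stores trip points in millidegrees Celsius.
        echo "$((millideg / 1000))"
    done
}

# Example (path from the question):
#   trip_temps_c /etc/thermald/thermal-conf.xml
```

It may also be worth keeping a copy of the original file (for example with sudo cp /etc/thermald/thermal-conf.xml /etc/thermald/thermal-conf.xml.bak) so the change is easy to undo.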