Black screen, nvidia-modeset : ERROR: GPU:0 Idling display engine time out

How do I troubleshoot/fix this issue?

I am using an NVIDIA GPU. The system shows the motherboard firmware logo and the GRUB screen. After I select an Ubuntu version (either 18.04 or 20.04), the Ubuntu login screen never appears. Instead, a black screen shows the message nvidia-modeset : ERROR: GPU:0 Idling display engine time out three times (see attached photo), followed by a pure black screen while the GPU fan revs up to full speed continuously and the card becomes very hot. I have to press the power button to shut the system down.

[Photo: error message]

This GPU had worked well prior to this incident. Ubuntu boots properly when the GPU is removed and the monitor is plugged into the Intel CPU's integrated graphics output. With the NVIDIA GPU installed, the iGPU is disabled.

Installed nvidia packages:

$ dpkg -l | grep nvidia
ii  libnvidia-cfg1-470:amd64                   470.57.02-0ubuntu0.18.04.1                       amd64        NVIDIA binary OpenGL/GLX configuration library
ii  libnvidia-common-470                       470.57.02-0ubuntu0.18.04.1                       all          Shared files used by the NVIDIA libraries
ii  libnvidia-compute-470:amd64                470.57.02-0ubuntu0.18.04.1                       amd64        NVIDIA libcompute package
ii  libnvidia-compute-470:i386                 470.57.02-0ubuntu0.18.04.1                       i386         NVIDIA libcompute package
ii  libnvidia-decode-470:amd64                 470.57.02-0ubuntu0.18.04.1                       amd64        NVIDIA Video Decoding runtime libraries
ii  libnvidia-decode-470:i386                  470.57.02-0ubuntu0.18.04.1                       i386         NVIDIA Video Decoding runtime libraries
ii  libnvidia-encode-470:amd64                 470.57.02-0ubuntu0.18.04.1                       amd64        NVENC Video Encoding runtime library
ii  libnvidia-encode-470:i386                  470.57.02-0ubuntu0.18.04.1                       i386         NVENC Video Encoding runtime library
ii  libnvidia-extra-470:amd64                  470.57.02-0ubuntu0.18.04.1                       amd64        Extra libraries for the NVIDIA driver
ii  libnvidia-fbc1-470:amd64                   470.57.02-0ubuntu0.18.04.1                       amd64        NVIDIA OpenGL-based Framebuffer Capture runtime library
ii  libnvidia-fbc1-470:i386                    470.57.02-0ubuntu0.18.04.1                       i386         NVIDIA OpenGL-based Framebuffer Capture runtime library
ii  libnvidia-gl-470:amd64                     470.57.02-0ubuntu0.18.04.1                       amd64        NVIDIA OpenGL/GLX/EGL/GLES GLVND libraries and Vulkan ICD
ii  libnvidia-gl-470:i386                      470.57.02-0ubuntu0.18.04.1                       i386         NVIDIA OpenGL/GLX/EGL/GLES GLVND libraries and Vulkan ICD
ii  libnvidia-ifr1-470:amd64                   470.57.02-0ubuntu0.18.04.1                       amd64        NVIDIA OpenGL-based Inband Frame Readback runtime library
ii  libnvidia-ifr1-470:i386                    470.57.02-0ubuntu0.18.04.1                       i386         NVIDIA OpenGL-based Inband Frame Readback runtime library
ii  nvidia-compute-utils-470                   470.57.02-0ubuntu0.18.04.1                       amd64        NVIDIA compute utilities
ii  nvidia-dkms-470                            470.57.02-0ubuntu0.18.04.1                       amd64        NVIDIA DKMS package
ii  nvidia-driver-470                          470.57.02-0ubuntu0.18.04.1                       amd64        NVIDIA driver metapackage
ii  nvidia-kernel-common-470                   470.57.02-0ubuntu0.18.04.1                       amd64        Shared files used with the kernel module
ii  nvidia-kernel-source-470                   470.57.02-0ubuntu0.18.04.1                       amd64        NVIDIA kernel source package
ii  nvidia-prime                               0.8.16~0.18.04.1                                 all          Tools to enable NVIDIA's Prime
ii  nvidia-settings                            470.57.01-0ubuntu0.18.04.1                       amd64        Tool for configuring the NVIDIA graphics driver
ii  nvidia-utils-470                           470.57.02-0ubuntu0.18.04.1                       amd64        NVIDIA driver support binaries
ii  xserver-xorg-video-nvidia-470              470.57.02-0ubuntu0.18.04.1                       amd64        NVIDIA binary Xorg driver
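
Since the system still boots on the integrated graphics, the kernel log from the failed boot can be inspected afterwards. The following is only a minimal sketch: the function name modeset_errors and the file path are hypothetical, and it assumes journald keeps logs from previous boots (persistent storage), which may not be the case on every install. The pattern is deliberately loose so it matches the message with or without spaces around the colon.

```shell
#!/bin/sh
# Print lines from a saved kernel log that mention nvidia-modeset errors.
# Usage: modeset_errors FILE
# (modeset_errors is an illustrative helper, not part of any NVIDIA tool.)
modeset_errors() {
    # -i: ignore case; -E: extended regex; allow optional spaces around ':'
    grep -iE 'nvidia-modeset *: *ERROR' "$1"
}

# Example use, assuming the previous boot's log is still available:
#   journalctl -k -b -1 > /tmp/prev-boot-kernel.log
#   modeset_errors /tmp/prev-boot-kernel.log
```

If the failed boot left no log behind, the function simply prints nothing; in that case the on-screen message in the photo is the only record.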

 

Solution 1:

I had this GPU tested on a Windows system, which was able to display the boot screen, login screen and desktop. However, display artifacts were present throughout. I suspect Windows was able to step down the resolution.

I came across this YouTube video showing the same display artifacts. Using NVIDIA MODS and MATS, its author traced the issue to one of the GPU's VRAM chips. Replacing that VRAM chip fixed the display issue.

As this GPU has been well maintained, I wondered whether the display fault was due to faulty interconnects. I came across this other YouTube video, which showed that reheating the GPU board with a heat gun for 6 to 8 mins had a 10% success rate of fixing the card. Its author recommended this treatment only as a last resort. I heated the GPU side of the card with a heat gun for around 4 mins, then flipped the card over and heated the other side for another 2 mins or so. After the card cooled down, I tested it and found that its functionality was restored. The reheating procedure fixed the GPU card. Earlier, the card had been cleaned but not heat-treated; cleaning alone did not fix it.