Hard Drive suddenly became unresponsive, recovered on reboot. What to do next?

So I have a desktop PC with an SSD and an HDD. Windows 10 is installed on the SSD, all my data is on the HDD. Today while I was playing a game, it suddenly became unresponsive. At the same time, the video I was watching on my second monitor froze as well. The game is installed on the hard drive, the video file was on that hard drive as well. Both programs became unresponsive, while all other programs that are installed on the SSD continued to run fine. After about 2 minutes, the game crashed and the window disappeared, along with VLC player (which was playing the video file). I checked the Explorer, and the E: and D: partitions on that hard drive didn't show up anymore.

So I rebooted and the drive shows up again. I ran CHKDSK on the drive, which ran fine and returned no errors or bad sectors (CMD output below). I also installed SeaTools and ran a couple of the simple tests on the drive, all of which ran fine and didn't show any error.

So now I'm not sure what to do. All my data is continuously backed up, so if the drive crashed and didn't come back, I wouldn't lose much. Since the SSD doesn't seem to be affected, I could probably just install a new hard drive without even having to reinstall Windows. But of course I don't want to throw away a perfectly good 2 TB hard drive, and since everything seems to be fine ...

What could have caused this problem? How likely is it that it was a software defect rather than a hardware problem? I noticed earlier that the video froze a couple of times, but I was copying some files and downloading some other stuff, so that's not that unusual. Are there any other diagnostics I can perform to see if there's a problem with the hard drive? Any suggestions are welcome. Thanks!

C:\WINDOWS\system32>chkdsk D: /F
The type of the file system is NTFS.

Chkdsk cannot run because the volume is in use by another
process.  Chkdsk may run if this volume is dismounted first.
ALL OPENED HANDLES TO THIS VOLUME WOULD THEN BE INVALID.
Would you like to force a dismount on this volume? (Y/N) Y
Volume dismounted.  All opened handles to this volume are now invalid.
Volume label is Data.

Stage 1: Examining basic file system structure ...
  220416 file records processed.
File verification completed.
  360 large file records processed.
  0 bad file records processed.

Stage 2: Examining file name linkage ...
  273218 index entries processed.
Index verification completed.
  0 unindexed files scanned.
  0 unindexed files recovered to lost and found.

Stage 3: Examining security descriptors ...
Security descriptor verification completed.
  26402 data files processed.
CHKDSK is verifying Usn Journal...
  37481040 USN bytes processed.
Usn Journal verification completed.

Windows has scanned the file system and found no problems.
No further action is required.

1400318975 KB total disk space.
1195790564 KB in 192027 files.
     63504 KB in 26403 indexes.
         0 KB in bad sectors.
    365935 KB in use by the system.
     65536 KB occupied by the log file.
 204098972 KB available on disk.

      4096 bytes in each allocation unit.
 350079743 total allocation units on disk.
  51024743 allocation units available on disk.

If the disk only crashed once and everything is OK now, this may have been a one-off event.

My own guess is that the hardware overheated because you were making heavy use of the GPU and/or the CPU.

What you can do to check for the problem:

  • Look for a useful error or warning in the Event Viewer
  • Install a product that can show the disk's S.M.A.R.T. data and tell you if there is a problem (for example, Speccy).
  • Install a temperature-monitoring product such as Speedfan which can be configured to display the current temperatures in the taskbar for easy monitoring.
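
For the Event Viewer step, it can help to narrow the System log down to storage-related entries. The sketch below is only an illustration in Python: it assumes you have already exported the events into (source, event ID, timestamp) tuples, and the list of suspect source/ID pairs is an assumption based on IDs commonly cited in disk troubleshooting, not an official or complete list. The sample events are hypothetical.

```python
from typing import Iterable, List, Tuple

# (source, event ID) pairs often associated with storage trouble on Windows.
# Assumption: taken from common troubleshooting practice, not exhaustive.
SUSPECT = {
    ("disk", 7),        # bad block reported
    ("disk", 11),       # controller error
    ("disk", 51),       # error during a paging operation
    ("disk", 153),      # an I/O operation was retried
    ("storahci", 129),  # reset issued to the device
}

Event = Tuple[str, int, str]  # (source, event_id, timestamp)


def storage_events(events: Iterable[Event]) -> List[Event]:
    """Keep only events whose (source, ID) pair is in the suspect set."""
    return [e for e in events if (e[0].lower(), e[1]) in SUSPECT]


# Hypothetical sample data, not taken from the asker's machine.
sample_events = [
    ("disk", 153, "hypothetical-time-1"),
    ("Service Control Manager", 7036, "hypothetical-time-2"),
    ("storahci", 129, "hypothetical-time-3"),
]

for event in storage_events(sample_events):
    print(event)
```

Entries of this kind clustered around the time the drive disappeared would point toward the drive, cable, or controller rather than the game or the player.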

If overheating is the problem, you can:

  • Verify that the game uses the GPU and not the CPU
  • Clean up all airways
  • If a laptop, place it at an angle, so the air can pass below it, or buy a cooling pad
  • Renew the CPU thermal paste and heat sink (better done by a professional)
  • Replace the CPU heat sink and fan with a better one (also better done by a professional)

The command you ran, chkdsk d: /f, only checks and repairs the file system; it doesn't check the drive surface.

The proper command would be chkdsk d: /r, which also performs a full surface scan. If bad blocks are discovered, the drive is failing.
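
If you save the output of the /r run to a text file, the bad-sector figure can be pulled out of the summary automatically. This is a minimal Python sketch that assumes the summary uses the same "KB in bad sectors" wording as the output posted in the question; the function name is made up for illustration. The number is only meaningful after a /r run, since /f never scans the surface.

```python
import re
from typing import Optional

# Matches the summary line CHKDSK prints, e.g. "         0 KB in bad sectors."
_BAD_SECTORS = re.compile(r"^\s*(\d+) KB in bad sectors", re.MULTILINE)


def bad_sector_kb(chkdsk_output: str) -> Optional[int]:
    """Return the KB reported in bad sectors, or None if the line is absent."""
    match = _BAD_SECTORS.search(chkdsk_output)
    return int(match.group(1)) if match else None


# Excerpt of the summary from the question's own run.
sample = """\
1400318975 KB total disk space.
1195790564 KB in 192027 files.
     63504 KB in 26403 indexes.
         0 KB in bad sectors.
"""

print(bad_sector_kb(sample))  # 0
```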

You can also use SeaTools and do a long test. Short tests do not check the drive surface.

Chances are the drive is failing. I’m assuming it’s a Seagate if you are using SeaTools. Seagates are not only notoriously low-quality drives, they also do exactly what you described when they start getting bad sectors. Instead of recovering, they shut down completely until power cycled.

Skirting the line of personal opinion here, it’s not really a loss to get rid of it now before it inevitably dies. Stay away from that brand.


Move on

In your question you seem to ask which things you 'can' do, but in the title you ask what you should do.

To list the key points:

  • You have a backup
  • The problem only occurred once
  • You did some checks, which came out clean

At this point it is simply not time-efficient to keep digging deeper (unless you enjoy it). If the problem persists, you can always decide to go deeper in your search, or replace the drive.

Of course some generic maintenance (e.g. cleaning the fans) can never hurt, but don't spend too much time on it.


If a partition "doesn't show up anymore", the first thing to check would be what diskmgmt.msc sees - is the partition itself gone (which would mean something damaged the partition table - which is a rare thing unless malice is involved), or is the filesystem in that partition damaged beyond being recognized (use filesystem specific tools), or has something simply unmapped the drive letter?

If the reliability of a drive is suspect, generic SMART utilities (e.g. CrystalDiskInfo) are your friend - try to understand as much as you can of the actual SMART values displayed, not just the conclusion a tool draws from them. "Pending sectors" that aren't going away fast mean the drive should be going away fast. Overtemperature or spinup failure events can also make a drive suspect.
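
As a rough illustration of reading the raw values yourself, here is a small Python sketch. It assumes you have copied the raw values out of a SMART tool into a dict keyed by attribute ID, and it uses the widely used IDs 5, 10, 197 and 198; vendors can deviate from these, so treat both the IDs and the simple "has not dropped" rule as assumptions rather than a definitive health check. The function names are hypothetical.

```python
from typing import Dict, List

# Widely used SMART attribute IDs (assumption: vendors may differ).
LABELS = {
    5: "reallocated sectors",
    10: "spin-up retries",
    197: "pending sectors",
    198: "offline uncorrectable sectors",
}


def smart_warnings(raw: Dict[int, int]) -> List[str]:
    """List the key attributes whose raw value is above zero."""
    return [f"{name}: {raw[attr]}" for attr, name in LABELS.items()
            if raw.get(attr, 0) > 0]


def pending_not_clearing(history: List[int]) -> bool:
    """True if the latest pending-sector count is nonzero and has not dropped
    below the first reading (a simple heuristic, not a vendor rule)."""
    return len(history) >= 2 and history[-1] > 0 and history[-1] >= history[0]


# Hypothetical readings, not taken from the asker's drive.
print(smart_warnings({5: 0, 197: 8, 198: 0}))
print(pending_not_clearing([8, 8, 9]))
```

Taking a reading every few days and comparing the pending-sector counts is usually more informative than a single snapshot.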