How do I troubleshoot hardware issues related to a computer freeze/crash?

What are some common guidelines and issues to consider when hardware is the cause of a computer crash?

What should I look for and how do I troubleshoot these problems?

What are some tools that are useful in diagnosing these hardware related crashes?

I am looking to be able to isolate the problematic device with specific tools and guidelines. For example, if device X is causing system failure, how do I go about diagnosing it?


Is hardware the problem?

Some problems are obviously hardware related. When your computer doesn't get past POST, it's usually a hardware issue (but it could be the software in the hardware, which for this answer we will continue to call a hardware issue). Intermittent crashes are difficult to diagnose, but for a wide array of problems the troubleshooting steps are the same.

Crashes, freezes, lockups, graphical/audio artifacts and poor performance are all symptoms of either a hardware or a software fault. General software troubleshooting involves removing new programs (or programs updated/installed around the time the issue started to appear). Updating drivers, installing older drivers, and reinstalling the operating system are all potential ways to determine whether it is in fact a hardware issue. For the purposes of this guide we will assume it is a hardware issue.

Troubleshooting 101

The process of elimination. You could guess wildly that it might be a particular component at fault and start there, but that is bound to fail. Something is wrong and you're sure it's hardware related, but where to start? At the bare minimum. See if the least amount you can operate a computer with will re-create the problem.

This is much easier on desktops than on laptops. Your options for elimination on laptops generally will consist of removing all but one stick of memory and potentially swapping out the hard drive. But laptops and desktops will be covered separately.

Notice: The following instructions are somewhat technically oriented. If you've never seen the inside of a computer before take precautions. Use an anti-static wrist strap if you are operating in an area with the potential to have electrostatic discharge. You may also wish to use a grounding adapter cable. As a general rule of thumb, grounding yourself on the metal chassis of your computer before touching any of the internal components is a very good idea.

Getting to bare bones:

  • Shut down your computer and unplug it from the wall (if you wish, set up your grounding safety equipment at this point)
  • Disconnect all of your storage drives and optical drives from your motherboard
  • Remove all of your add-in cards, including your video card.
  • Remove all of your RAM except for 1 module and make sure that's in the primary slot (labeled DIMM1, DDR1, or something similar silkscreened on the board).
  • Unplug everything, both internal and external, from your computer, including power, monitor cable, keyboard, power and reset switches, internal speaker... (really, everything else: mouse, USB things, audio; it makes life much easier).

At this point you should clean out the dust bunnies. Pick the larger dust piles out by hand, and use compressed air on any heatsinks that look particularly bad. Clean around fans with cotton swabs/cotton buds or a toothpick (not metal, nice soft plastic or wood).

You should now have a motherboard with a processor and a stick of RAM. That's it. At this point you should also unplug the power connections on the motherboard. Check to make sure your RAM is seated correctly: did the little side tabs lock into place?

Now plug in the 24-pin power connection (I know you've not yet removed it because it's in there correctly, but just humor me, pull it out and then plug it back in). The locking tab on the male end should match up with the latch on the female end. There should be no empty holes in that connector. Plug in the 4-pin CPU power if applicable and make sure it's the correct one; the locking mechanism for that plug should also engage. Attach any other auxiliary power to your motherboard; some have 4-pin Molex connectors.

Attach your keyboard to the appropriate port (USB or PS/2), and connect the monitor via the onboard video, if available. If your monitor won't work with the VGA port, just leave it disconnected. Now find your internal speaker and attach it to the appropriate pins (it should say PC SPKR or something similar silk-screened by a row of pins).

At this point there should only be power, monitor, keyboard and PC speaker attached to the motherboard (and the power is unplugged at the power supply).

Reset the CMOS. On some motherboards this is a jumper, on others it is a button; check your motherboard manual for details. While resetting it, let the jumper sit for a while. Now, remove your battery. Walk away, make some tea (remember, power is unplugged for this). While your tea is brewing, have some cheese toast. You don't need this much time, but it will allow you to relax. Take some deep soothing breaths.

Now, check all your power connections. Yup, I know you just put those there, but check them anyway.

Pop the battery back in and plug the power supply into the wall. Make sure the switch on the power supply has the side with the line depressed.

Now to turn it on. You can't just tap the power button on your case because you don't have the power switch on the case plugged in. If you're confident that is not the problem, go ahead and plug it in. Or you can take a trusty screwdriver, coin, Olympic gold medal, or other conductive material and briefly short the two power switch pins on that chassis header near where you plugged in your PC speaker.

Does it do anything? Does it beep? If you don't have a monitor connected and it beeps one short beep that's good. If you can see wonderful booting things, hooray!

If it does nothing, that is bad.

At this point we have a Choose Your Own Adventure story. If things are good, go to the section labeled "Huzzah!" below. If things are going badly, go to the "Le sad" section below. When adding or removing components, make sure that the power to your computer is off before proceeding (if you don't unplug the computer, at least switch the power supply to "off").

Huzzah!

You know your base system works at this point. Turn off the computer, plug in the rest of your chassis header things and reboot it, just to make sure one of them isn't fouling things up.

It still works, yes?

Remember, turn off the computer before adding components.

Good, let's start with, say, more memory, because if it works at bare bones, memory will probably break it (not because your memory is bad, just because sometimes things are wonky). Does it still work with your memory? All of it? If not, go back to one stick, enter the BIOS, and set the memory voltage and timings to the manufacturer's recommended settings.

Okay, so you're loaded up with your memory and CPU, now let's get that video card in there. Make sure to put it in the appropriate slot; a PCIe card goes in one of the large PCIe slots. If

You should make sure the card is pushed all the way down. Many people don't get their cards in all the way the first few times they build a computer. You should probably lock it in place, since those fan reverberations can wiggle things loose. Make sure it's screwed down as well.

Does it still boot? If so, that's good. Now attach the rest of the items one by one, powering off before adding each additional component. You'll either track down the problem or not. If it works with everything connected, congratulations, you fixed it!

If it doesn't boot after adding your video card, a couple of things could be happening.

  1. A bad video card
  2. A bad motherboard
  3. Insufficient power

Try using the card in a different system to rule out #1. You could/should also borrow someone's card to try it in your board.
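
The add-one-part-at-a-time routine in this section can be sketched in a few lines of Python. This is only an illustration of the logic, not a tool: `first_failing_component` and the `boots_with` callback are made-up names, and in real life the "callback" is you powering off, installing the part, powering on, and seeing whether it still boots.

```python
from typing import Callable, List, Optional


def first_failing_component(
    components: List[str],
    boots_with: Callable[[List[str]], bool],
) -> Optional[str]:
    """Add parts one at a time and return the first one that breaks the boot.

    `boots_with` is a hypothetical stand-in for the manual check:
    power off, install the part, power on, and report True if it boots.
    Returns None if the machine boots with everything installed.
    """
    installed: List[str] = []
    for part in components:
        installed.append(part)
        # Pass a copy so the check cannot modify our running list.
        if not boots_with(list(installed)):
            return part
    return None


if __name__ == "__main__":
    parts = ["second RAM stick", "video card", "hard drive", "optical drive"]
    # Simulated machine that stops booting once the video card is installed.
    culprit = first_failing_component(parts, lambda inst: "video card" not in inst)
    print(culprit)
```

Keep in mind that the part returned is only where the trouble first showed up. As the list above says, a failure after adding the video card can still turn out to be the motherboard or the power supply.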

Le sad

  • Swap that stick of memory for the other one (check to see if it's still a problem)
  • Now, at this point your fans should be clear, and your heatsinks look like new, right?

This is where you are left with few culprits. It could be your processor or motherboard. It could be a heat issue or an issue with your power supply, a short in the keyboard (I've seen it), short in the monitor cable (seen it), short in the power cable (this involves fires and melting, you should notice that). In order I would try the following:

  • swap out power cable, keyboard, monitor (one at a time, in that order, because you probably have a bunch of power cables, and probably fewer of the other two)
  • remove the heatsink from the processor, clean the processor top and heatsink until your heatsink looks new. apply thermal paste (remember, a very small amount). Cod liver cream or zinc oxide sunscreen works for a very temporary, risky fix (not recommended). Reseat everything. be careful here: if you've not done it before, read up and call a buddy. it's easy to snap things off of the heatsink, and forcing the CPU will ruin your day.
  • swap out the power-supply
  • find a buddy with a compatible motherboard and either try your chip in their board, or their chip in yours, and try again.
  • remove the motherboard and the power supply from the case, wipe the board down, and set it up on a hunk of plywood. shake the empty case out. run stuff with the motherboard sitting on the plywood.
  • request additional help on Super User in this question

For blue screens I use WinDbg and follow these steps. That has helped me diagnose a hard drive which caused my machine to crash before.

If you don't get a blue screen or don't want to use WinDbg, I find swapping components with another working machine to be effective. If the issue goes away in the questionable machine, or is recreated in the other machine, you've found the issue. So I might swap a hard drive out and see if the issue persists, then swap RAM, video cards, etc. Once, a computer would always crash while copying a certain file during XP installation. After I swapped the RAM it worked fine.
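
If the machine stays up long enough to run a script, a rough software check in the same spirit is to read one large file several times and compare checksums. This is only a sketch under my own assumptions: `file_digest` and `digests_agree` are names I made up, and because the operating system may serve repeated reads from its cache, this exercises RAM at least as much as the drive. It is not a replacement for swapping parts or for a proper memory test.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            h.update(chunk)
    return h.hexdigest()


def digests_agree(path: Path, passes: int = 5) -> bool:
    """Hash the same file several times; healthy hardware gives one digest."""
    digests = {file_digest(path) for _ in range(passes)}
    return len(digests) == 1
```

If `digests_agree` ever returns `False` for a file nobody is writing to, the same bytes came back differently on different reads, which points at RAM, the drive, or the cabling rather than at software.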


I find that the most valuable information you can gather is the answer to "What can I do that will make this box crash consistently?" If you can force the computer to crash, you're headed in the right direction for identifying the bad hardware.

For example, if you start up a graphic-intensive game and you get a BSOD, specifying an error with a driver file, that's a good indicator for what direction to go in. If it crashes whenever you access something on a USB hub, again, there's your direction. If you get crashes or hangs every time you access a particular file, you may have bad sectors on your hard drive, or a faulty drive in your USB device.
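
For the "particular file" case, one way to narrow things down is to read the file in small chunks and note how far you get. The sketch below is only an illustration; `first_unreadable_offset` is a name I made up. It can only report errors the operating system actually returns to the program. If the whole machine hangs or crashes mid-read, the script obviously won't get to tell you anything, though that in itself is a consistent way to reproduce the problem.

```python
from typing import Optional


def first_unreadable_offset(path: str, chunk_size: int = 64 * 1024) -> Optional[int]:
    """Read a file chunk by chunk and return the byte offset of the first
    chunk that raises an I/O error, or None if the whole file is readable."""
    offset = 0
    # buffering=0 gives raw, unbuffered reads so the offset stays accurate.
    with open(path, "rb", buffering=0) as f:
        while True:
            try:
                chunk = f.read(chunk_size)
            except OSError:
                return offset
            if not chunk:
                return None  # reached end of file without errors
            offset += len(chunk)
```

A result of `None` means the file read cleanly this time. A repeatable offset means the trouble is tied to a specific spot on the drive (or the USB device it lives on), which matches the bad-sector scenario above.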

Random crashes are the hardest to diagnose. If the crashes really are random and you can't find the source, try reinstalling your OS. In my work, we maintain software images for all the hardware in our organization, so flashing an OS onto a laptop is a 15-minute proposition as long as the user's data is backed up. It's usually our policy to do a reimage as a very first step, unless it's obviously a hardware issue (broken LCD, broken motherboard, etc.).