BSODs and Prime95 failures
My computer is notoriously unstable. It blue screens all the time. I'm running Windows 7. Here's what's in the box:
- Intel Core i7 920 (Stock cooler, not overclocked)
- Gigabyte EX58-UD3R motherboard
- 6GB (3x2GB) OCZ Gold memory (set to 1333MHz; it has problems booting if I leave it at 1066MHz)
- GeForce 9500 GT
- Antec 650W power supply
When idle, it seems to run between 40 and 50 degrees Celsius, according to SpeedFan. I've run many memory tests, and none of them have come up with any problems.
I've seen several different messages when it blue screens:
- IRQL_NOT_LESS_OR_EQUAL
- Page fault when not paging (or something like that)
- Random addresses/registers
Unfortunately, they go by too quickly for me to take much from them.
I just ran Microsoft's hotfix for the first one (though I'm not positive that my error is exactly the same as theirs; I don't know whether I'm getting the 0x0000000A part), so I don't know yet whether that will help. If Prime95 is any indication, it won't, for the following reason:
When I run Prime95, 8 threads start up, and they all stop very quickly. I get the following errors in the results.txt file:
[Tue Feb 16 15:44:35 2010]
FATAL ERROR: Rounding was 0.5, expected less than 0.4
Hardware failure detected, consult stress.txt file.
FATAL ERROR: Rounding was 0.5, expected less than 0.4
Hardware failure detected, consult stress.txt file.
FATAL ERROR: Rounding was 0.5, expected less than 0.4
Hardware failure detected, consult stress.txt file.
FATAL ERROR: Resulting sum was 4050964008042496, expected: 2785959515376393
Hardware failure detected, consult stress.txt file.
FATAL ERROR: Resulting sum was 4.042840052791945e+056, expected: 3.789462128888016e+016
Hardware failure detected, consult stress.txt file.
FATAL ERROR: Resulting sum was 5.593535921577141e+247, expected: 1.208964328863723e+017
Hardware failure detected, consult stress.txt file.
When I looked at the stress.txt file, it suggested memory might be my problem, but as I said, I've run multiple memory tests (MemTest86, I think? It was a while ago), and no problems have been detected.
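If the results.txt file gets long, a small script can tally the two kinds of failure seen above (rounding errors and sum mismatches). This is only a rough sketch: the function name summarize_prime95_errors and the regular expressions are my own, written against the message wording shown in this post, and other Prime95 versions may word things differently.

```python
import re

# Patterns based only on the error wording quoted in this post (an assumption).
ROUNDING_RE = re.compile(
    r"FATAL ERROR: Rounding was ([\d.]+), expected less than ([\d.]+)"
)
SUM_RE = re.compile(r"FATAL ERROR: Resulting sum was (\S+), expected: (\S+)")


def summarize_prime95_errors(text):
    """Count rounding errors and sum mismatches in results.txt content."""
    return {
        "rounding": len(ROUNDING_RE.findall(text)),
        "sum_mismatch": len(SUM_RE.findall(text)),
    }


if __name__ == "__main__":
    # Two lines copied from the log above, used as sample input.
    sample = (
        "FATAL ERROR: Rounding was 0.5, expected less than 0.4\n"
        "FATAL ERROR: Resulting sum was 4050964008042496, "
        "expected: 2785959515376393\n"
    )
    print(summarize_prime95_errors(sample))
```

To use it on a real log, read the file first, for example summarize_prime95_errors(open("results.txt").read()).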
After running the hotfix, the test threads managed to stay running a little longer, and while my temperatures definitely rose, they never really got above 60C.
So, basically I see three problems:
- I'm running pretty hot. With the stock cooler, I idle close to 50 on some cores with the side of my case off. Putting my hand in front of the CPU fan, I don't really feel much of a breeze. Is this normal for the 920 stock cooler?
- I blue screen all the time (like 1-4 times per day).
- I can't seem to run Prime95 for more than a few seconds.
Can anyone point me in the direction of what might be going wrong here, and perhaps what to do to confirm/fix the problem?
Thank you.
Solution 1:
First things first: go to Control Panel > System (Windows Key+Pause/Break), and under Advanced you should see "Startup and Recovery". Click Settings, and there you can disable "Automatically restart" under System failure.
Next time a BSOD occurs, the screen will stay up and you can read what the cause is.
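If you would rather script that setting than click through the dialog, here is a minimal sketch. It assumes the checkbox corresponds to the AutoReboot value under the CrashControl registry key, and it has to be run from an elevated (administrator) prompt. The helper name build_autoreboot_command is just something I made up for the example.

```python
import subprocess
import sys

# Assumption: the "Automatically restart" checkbox is stored in this key
# as the DWORD value AutoReboot (1 = restart, 0 = stay on the blue screen).
CRASH_CONTROL_KEY = r"HKLM\SYSTEM\CurrentControlSet\Control\CrashControl"


def build_autoreboot_command(enabled):
    """Return the reg.exe argument list that sets AutoReboot to 1 or 0."""
    return [
        "reg", "add", CRASH_CONTROL_KEY,
        "/v", "AutoReboot",
        "/t", "REG_DWORD",
        "/d", "1" if enabled else "0",
        "/f",  # overwrite the existing value without prompting
    ]


if __name__ == "__main__":
    command = build_autoreboot_command(enabled=False)
    if sys.platform == "win32":
        # Fails with an error unless the prompt is elevated.
        subprocess.run(command, check=True)
    else:
        print("Not on Windows; the command would be:", " ".join(command))
```

Calling build_autoreboot_command(enabled=True) gives the command to turn automatic restart back on once you are done debugging.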
Also, you may want to look at BlueScreenView, a very good tool for reviewing previous blue screen errors.
Now, as ~quack said, just because the memory passes some tests doesn't mean it is good. If you only ran the test for a few hours, swapping the modules around and re-running it may make it touch some places it didn't before. But really, unless you run Memtest86+ for around (or ideally over) 48 hours, you will not have a reliable result.
Next, the errors you listed are most commonly down to faulty memory, but they can be caused by almost anything. The other likely culprit is bad device drivers.
If you are getting this every time you run Prime95, I would highly recommend unplugging EVERYTHING from your machine other than power, video and keyboard (and mouse, unless you are confident using the machine without one). Then boot into safe mode and try running Prime95 again. This is the best way of testing whether it is a driver issue, short of actually reinstalling Windows from scratch and not installing any drivers!
If you are still seeing random problems and Memtest86+ really is not showing errors, it is most likely a problem with the motherboard or even the CPU. However, this can be very hard to diagnose.
As for temperature: the lower the temperature, the slower the fan spins, so a weak breeze is to be expected. Your CPU is running very cool and there is nothing to worry about.
Solution 2:
I had a similar problem recently: system had shown some instabilities, Prime95 returned hardware failure, etc. I ran memtests until the cows came home, all runs came up clear, really drove me crackers ... in the end it turned out the memory voltage was too low.
Solution 3:
Googling for BSOD with Gigabyte EX58-UD3R and OCZ Gold gives me several results. What they have in common is that most of the BSODs vanish after changing the memory timings and voltage settings.
Have a look:
- Tweak Town
- Overclockers
- Tom's Hardware