Debian error: EDAC MC0 internal error

I have a problem on a server running the latest Debian. This error is printed to my screen every second:

EDAC MC0: INTERNAL ERROR: csrow value is out of range (7 >= 4)

edac-utils reports this:

mc0: 0 Uncorrected Errors with no DIMM info
mc0: 44747 Corrected Errors with no DIMM info
mc0: csrow0: 15330 Uncorrected Errors
mc0: csrow0: mc#0csrow#0channel#0: 0 Corrected Errors
mc0: csrow0: mc#0csrow#2channel#0: 0 Corrected Errors
mc0: csrow2: 0 Uncorrected Errors
mc0: csrow2: mc#0csrow#1channel#0: 0 Corrected Errors
mc0: csrow2: mc#0csrow#3channel#0: 0 Corrected Errors
mc0: csrow3: 0 Uncorrected Errors
mc0: csrow3: mc#0csrow#1channel#1: 0 Corrected Errors
mc0: csrow3: mc#0csrow#3channel#1: 0 Corrected Errors

Memtest reports no errors.

What is the problem, and how can I solve it?

Thank you.


EDAC complaining about most (all?) memory banks while Memtest shows no errors at all most likely means that your ECC RAM is OK but was not initialized properly by the BIOS on boot.

To initialize the ECC bits, memory has to be written before it can be used. Usually the BIOS does this, but on some motherboards (the ASUS P5B, for example) this step is skipped if "Quick Boot" is enabled. As a result, every access to an uninitialized cell produces an EDAC error, even though the server otherwise works without problems.

Try disabling Quick Boot in the BIOS and see if it helps.

If you don't have physical access to the hardware, or your BIOS offers no option to disable Quick Boot, there is another way to initialize memory before the EDAC module is loaded. Add memtest=1 to your kernel command line in /etc/default/grub and run update-grub to update the configuration (I assume you are running Debian/Ubuntu). The kernel will run its built-in memory tester on boot, and as part of the tests all memory will be written, which initializes the ECC bits.
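
As a rough sketch, the relevant line in /etc/default/grub could look like the one below. The "quiet" option is only an assumption about what is already on that line; keep whatever options you currently have and append memtest=1 to them. This also assumes your kernel was built with CONFIG_MEMTEST, otherwise the parameter has no effect.

```shell
# /etc/default/grub (excerpt)
# "quiet" stands in for whatever options are already present (assumption);
# memtest=1 asks the kernel to run one pass of its built-in memory test at boot.
GRUB_CMDLINE_LINUX_DEFAULT="quiet memtest=1"
```

After saving the file, run update-grub as root and reboot. You can check that the parameter was picked up by looking for memtest=1 in /proc/cmdline.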


Memtest may not show the problem, but I can see mc0: csrow0: 15330 Uncorrected Errors in that log. It looks like you have bad RAM. Depending on the board, you should be able to identify the exact bad stick and replace it.
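
To narrow down which row is affected, you can also read the raw EDAC counters from sysfs. This is a minimal sketch that assumes the legacy per-csrow layout under /sys/devices/system/edac/mc (whether it exists depends on your kernel version and configuration). The function name edac_counts is made up for this example, and how a csrow maps to a physical DIMM slot depends on the board.

```shell
#!/bin/sh
# Print corrected (ce) and uncorrected (ue) error counters for each csrow.
# The optional first argument overrides the sysfs root (handy for testing).
edac_counts() {
    root="${1:-/sys/devices/system/edac/mc}"
    for row in "$root"/mc*/csrow*; do
        # Skip the unexpanded glob pattern when no csrow directories exist.
        [ -d "$row" ] || continue
        printf '%s ce=%s ue=%s\n' "${row#"$root"/}" \
            "$(cat "$row/ce_count")" "$(cat "$row/ue_count")"
    done
}

edac_counts "$@"
```

A row whose ue count keeps growing between runs is the one to look at first.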