Munin's smart plugin keeps reporting a past error because of smartctl's exit code

I eventually resorted to patching the smart plugin. Depending on your version, it contains code like this:

        if exit_status!=None :
            # smartctl exit code is a bitmask, check man page.
            num_exit_status=int(exit_status/256)

Replace it with this:

        if exit_status!=None :
            # smartctl exit code is a bitmask, check man page.
            num_exit_status=int(exit_status/256)
            # filter out bit 6
            num_exit_status &= 191
            if num_exit_status<=2 :
                exit_status=None

        if exit_status!=None :

The most interesting part is the bitwise AND with 191, which is 10111111 in binary: it sets bit 6 (value 64, which smartctl uses to signal that the device error log contains records of errors) to 0 and leaves all other bits untouched.

Therefore a value of 64 (which is what my disk returns) is reported as 0, while a value of 8 stays at 8. Just as importantly, a value of 72 (bit 6 set as always, plus bit 3 set because the disk is failing) is still reported as 8.
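
To check the arithmetic, here is a small standalone sketch of the same masking. It is not part of the plugin: the names ERROR_LOG_BIT, MASK and filter_exit_status are made up for illustration, and the input is assumed to be the already-decoded smartctl exit code (the num_exit_status value above).

```python
# Hypothetical helper names; only the "& 191" operation comes from the patch.
ERROR_LOG_BIT = 1 << 6          # 64: bit 6 of the smartctl exit code
MASK = 0xFF & ~ERROR_LOG_BIT    # 191, i.e. 0b10111111


def filter_exit_status(num_exit_status):
    """Clear bit 6 and leave every other bit of the exit code as it is."""
    return num_exit_status & MASK


if __name__ == "__main__":
    # The three cases discussed above: 64, 8 and 72.
    for raw in (64, 8, 72):
        print(raw, "->", filter_exit_status(raw))
```

This prints 64 -> 0, 8 -> 8 and 72 -> 8, so a failing disk is still reported even though the old error log entries are ignored.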


The only way I found to avoid this problem without modifying the munin sources was to stop using the -a option with smartctl, e.g. by putting something like this in /etc/munin/plugin-conf.d/munin-node:

    [smart_sda]
    env.smartargs -H -i -c -A -l selftest -l selective

(i.e. all options normally included in -a except for -l error).