Nexenta/OpenSolaris filer kernel panic/crash

Solution 1:

I know nothing about this setup, but the frame

ffffff003fefb820 zfs:zap_lockdir+6d ()

seems to indicate that the worker thread is locking the directory and that mutex_vector_enter then tries to lock it too.

This all seems to stem from a situation that begins with updating a quota. If it's possible, you might want to consider turning quotas off if they are unnecessary.

It's only a workaround rather than a fix, and I have no idea whether it'll work as expected, but it might be worth a try.

Solution 2:

The stack trace references "userquota", which is not typically used by our customers. Note that it is separate from the file system quotas that you can also set. I encourage you to turn off user quotas if you can, especially since you think they are unnecessary. I also encourage you to file a support ticket if you have a support contract. The ticket can be sent from the Web GUI, which then includes diagnostics from your system.
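
For anyone unsure which kind of quota they have set, here is a rough sketch of the commands involved. It only prints the zfs commands rather than running them, so nothing changes until you paste one in yourself. The dataset name tank/home and the user alice are made-up placeholders, and whether clearing user quotas actually avoids the panic is untested.

```shell
#!/bin/sh
# Dry-run helper: prints zfs commands, does not execute them.
# "tank/home" and "alice" are hypothetical placeholders.

# Print the commands that show both kinds of quota on a dataset.
show_quota_cmds() {
    dataset="$1"
    # File system (dataset) quotas:
    echo "zfs get quota,refquota $dataset"
    # Per-user space usage and any user quotas:
    echo "zfs userspace $dataset"
}

# Print the command that removes one user's quota from a dataset.
clear_userquota_cmd() {
    dataset="$1"
    user="$2"
    echo "zfs set userquota@$user=none $dataset"
}

show_quota_cmds tank/home
clear_userquota_cmd tank/home alice
```

The first printed command reports the file system quotas, and the second lists per-user usage together with any user quotas, so you can see which users need clearing before running the third.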

Solution 3:

This was resolved permanently by recreating all of the zpools under Nexenta. There was a lot of baggage carried along with the zpools as they were imported from an OpenSolaris installation. And while I imported and upgraded the pools and filesystems, the stability wasn't there until everything was rebuilt.