Understanding core dump (Solaris 11 crash)

I'm still trying to get to the bottom of the crashes on my Solaris 11 system. During the most recent crash, a core dump was written. Looking into it with mdb yields:

> $C
ffffff0021c09430 vpanic()
ffffff0021c09460 vcmn_err+0x2e(3, fffffffff7a8a830, ffffff0021c09520)
ffffff0021c09550 zfs_panic_recover+0xae()
ffffff0021c09610 dmu_buf_hold_array_by_dnode+0xbd(ffffff05207b5018, 400000, 20000, 0, fffffffff7a85ce0, 
ffffff0021c09654, ffffff0021c09658, 0)
ffffff0021c096b0 dmu_write_uio_dnode+0x50(ffffff05207b5018, ffffff0021c09a10, 20000, ffffff051efd0788)
ffffff0021c09700 dmu_write_uio_dbuf+0x58(ffffff05207b2320, ffffff0021c09a10, 20000, ffffff051efd0788)
ffffff0021c09960 zfs_write+0x843(ffffff051f120900, ffffff0021c09a10, 0, ffffff04e4b6adb0, 0)
ffffff0021c099d0 fop_write+0xa6(ffffff051f120900, ffffff0021c09a10, 0, ffffff04e4b6adb0, 0)
ffffff0021c09aa0 vn_rdwr+0x1bd(1, ffffff051f120900, ffffff051fb2e0c0, 20000, 400000, 1, 0, fffffffffffffffd, 
ffffff04e4b6adb0, ffffff0021c09ad8)
ffffff0021c09b20 zfs_replay_write+0xe3(ffffff04eb2b4200, ffffff051fb2e000, 0)
ffffff0021c09b60 zil_replay_wr_task+0x2d(ffffff04eb57ea00)
ffffff0021c09c00 taskq_thread+0x22e(ffffff051ede9810)
ffffff0021c09c10 thread_start+8()

How do I know what part of the stack caused the crash?


Solution 1:

pstack core

Although my Solaris 11 experience is light, it used to be that the second address was a useful place to start disassembling.

Then, in mdb:

> ::stack

> <address>::dis

Solution 2:

Do you have any system log messages that could help you further? I had a quick look at the OpenSolaris source code, and dmu_buf_hold_array_by_dnode can panic with the message "zfs: accessing past end of object". I then found a good posting on zfs-discuss on opensolaris.org that explains a bit further what to do next.
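
As for which part of the stack to look at: $C prints the innermost frame first, and the top few frames (vpanic, vcmn_err, zfs_panic_recover) only report the panic. The first frame below them is the one that detected the problem. Here is a rough Python sketch of that reading, applied to the trace from the question. The PANIC_FRAMES set and the helper names are my own assumptions for illustration and are not part of mdb.

```python
import re

# Frames that merely report a panic. This set is an assumption based on
# the trace in the question; extend it for other panic paths.
PANIC_FRAMES = {"vpanic", "panic", "vcmn_err", "cmn_err", "zfs_panic_recover"}

# A $C frame line: 16-hex-digit frame pointer, whitespace, function name,
# optional +offset, then "(". Wrapped argument lines do not match.
FRAME_RE = re.compile(r"^[0-9a-f]{16}\s+([A-Za-z_]\w*)(?:\+(?:0x)?[0-9a-f]+)?\(")

def frames(trace):
    """Return the function names in the trace, innermost frame first."""
    names = []
    for line in trace.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            names.append(match.group(1))
    return names

def first_real_frame(trace):
    """Return the first frame that is not part of the panic reporting."""
    for name in frames(trace):
        if name not in PANIC_FRAMES:
            return name
    return None

TRACE = """\
ffffff0021c09430 vpanic()
ffffff0021c09460 vcmn_err+0x2e(3, fffffffff7a8a830, ffffff0021c09520)
ffffff0021c09550 zfs_panic_recover+0xae()
ffffff0021c09610 dmu_buf_hold_array_by_dnode+0xbd(ffffff05207b5018, 400000, 20000, 0, fffffffff7a85ce0,
ffffff0021c09654, ffffff0021c09658, 0)
ffffff0021c096b0 dmu_write_uio_dnode+0x50(ffffff05207b5018, ffffff0021c09a10, 20000, ffffff051efd0788)
ffffff0021c09700 dmu_write_uio_dbuf+0x58(ffffff05207b2320, ffffff0021c09a10, 20000, ffffff051efd0788)
ffffff0021c09960 zfs_write+0x843(ffffff051f120900, ffffff0021c09a10, 0, ffffff04e4b6adb0, 0)
ffffff0021c099d0 fop_write+0xa6(ffffff051f120900, ffffff0021c09a10, 0, ffffff04e4b6adb0, 0)
ffffff0021c09aa0 vn_rdwr+0x1bd(1, ffffff051f120900, ffffff051fb2e0c0, 20000, 400000, 1, 0, fffffffffffffffd,
ffffff04e4b6adb0, ffffff0021c09ad8)
ffffff0021c09b20 zfs_replay_write+0xe3(ffffff04eb2b4200, ffffff051fb2e000, 0)
ffffff0021c09b60 zil_replay_wr_task+0x2d(ffffff04eb57ea00)
ffffff0021c09c00 taskq_thread+0x22e(ffffff051ede9810)
ffffff0021c09c10 thread_start+8()
"""

if __name__ == "__main__":
    print(first_real_frame(TRACE))  # dmu_buf_hold_array_by_dnode
```

For your trace this points at dmu_buf_hold_array_by_dnode, which matches the panic message above. Reading further down, the write was issued by zfs_replay_write and zil_replay_wr_task, so the panic happened while the ZFS intent log was being replayed.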