How does SSD meta-data corruption on power-loss happen? And can I minimize it?

Solution 1:

For an explanation of how metadata corruption can happen after an unexpected power failure, have a look at my other answer here.

Disabling the cache can significantly reduce the likelihood of in-flight data loss; however, depending on your SSDs, data at rest remains at risk of being corrupted. Moreover, it incurs a massive performance loss (I have seen 500+ MB/s SSDs write at a mere 5 MB/s after disabling the private DRAM cache).

If you can't trust your SSDs, the only "solution" (or, rather, workaround) is to use an end-to-end checksumming filesystem such as ZFS or BTRFS on a RAID1/mirror setup: this way, any single-device (meta)data corruption can be detected and repaired from the other side of the mirror by running a check/scrub.
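To make the scrub idea concrete, here is a toy model in Python. It is only a sketch of the principle, not how ZFS or BTRFS actually implement it: the block lists, the `scrub` function and the separately stored checksum list are all made up for illustration, and the checksums themselves are assumed to be intact.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Return a hex digest used to verify a block's contents."""
    return hashlib.sha256(data).hexdigest()


def scrub(mirror_a: list, mirror_b: list, checksums: list) -> int:
    """Verify every block on both mirror sides against its checksum.

    A bad copy is overwritten with the good copy from the other side.
    Returns the number of repaired blocks; raises if both copies are bad.
    """
    repaired = 0
    for i, expected in enumerate(checksums):
        a_ok = checksum(mirror_a[i]) == expected
        b_ok = checksum(mirror_b[i]) == expected
        if a_ok and not b_ok:
            mirror_b[i] = mirror_a[i]
            repaired += 1
        elif b_ok and not a_ok:
            mirror_a[i] = mirror_b[i]
            repaired += 1
        elif not a_ok and not b_ok:
            raise ValueError(f"block {i}: both copies are corrupt")
    return repaired


# Hypothetical contents of a two-way mirror.
blocks = [b"superblock", b"inode table", b"file data"]
checksums = [checksum(b) for b in blocks]
mirror_a = list(blocks)
mirror_b = list(blocks)

# Simulate one SSD corrupting a metadata block after a power loss.
mirror_a[1] = b"garbage after power loss"

repaired = scrub(mirror_a, mirror_b, checksums)
print(repaired, mirror_a == blocks)
```

Without the checksums, a plain RAID1 could see that the two copies differ but would have no way of knowing which one is correct; that is why the checksumming filesystem matters here. And if both copies of a block are damaged, the scrub can only report the error, not fix it.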

Solution 2:

Your best bet is to disable write caching at both levels: tell the disk itself not to cache writes (look at the hdparm and smartctl options, and hope the disk honors them), and stop the OS from buffering writes by using mount options like sync and dirsync.
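If mounting the whole filesystem with sync is too slow for you, an application can ask for similar behavior on just the files it cares about. The following Python sketch shows that per-file approach; it is an application-level alternative, not what the mount options do internally. `durable_write` and the file name are hypothetical, it assumes a POSIX system (fsync on a directory does not work on Windows), and like everything else here it only helps if the drive actually honors flush requests.

```python
import os
import tempfile


def durable_write(path: str, data: bytes) -> None:
    """Write data to path and flush both the file and its directory entry."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        while view:
            # os.write may write fewer bytes than requested, so loop.
            written = os.write(fd, view)
            view = view[written:]
        # Ask the kernel to push the file's data and metadata to the device.
        os.fsync(fd)
    finally:
        os.close(fd)

    # Flush the parent directory so the new directory entry is persisted too.
    dir_fd = os.open(os.path.dirname(os.path.abspath(path)), os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)


# Usage example against a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "journal.bin")
    durable_write(target, b"committed record")
    with open(target, "rb") as f:
        readback = f.read()
```

The trade-off is the same as with the mount options, just narrower in scope: every call waits for the device, so use it only for the writes that really must survive a power cut.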