What is the cause of most top-vendor SSD crashes?
I'm referring to total crashes where the SSD doesn't work anymore. Not IO errors.
SSDs have a limited lifespan, so if nothing else goes wrong, they eventually die of NAND wear. My question is: what do the statistics (as far as we know) say about what actually destroys SSDs? According to this answer on this site, "SSDs rarely die due to NAND wear." Therefore, the writer advises prioritizing "vendor track record" (and more).
So, if I buy a top-vendor SSD, is there much point in buying a larger-capacity one on the assumption that NAND wear is what will eventually total the drive, or will the drive likely be totaled by one of the other failures anyway? [With regard to crash protection only! The answer there mentions that "space is never enough", but that's not the point of my question.]
All other things being equal, higher-capacity drives will wear less than lower-capacity ones. Whether the added total NAND endurance is a concern depends exclusively on the expected use case and workload.
For example, if you plan to issue 100% random writes at maximum speed for extended periods of time (i.e., months), then NAND endurance can certainly be a valid concern, and you should buy a mixed-use or even a write-intensive SSD. For less extreme workloads, you have a non-trivial probability of experiencing a controller/firmware failure before exhausting NAND endurance. Source: many articles, forum threads, and mailing list posts in which relatively new SSDs bricked due to controller failure, versus many SSDs that lasted beyond their NAND endurance rating. Some examples can be found here, here and here
That said, the SSD controllers still on the market seem to be much more reliable than older controllers (especially the SandForce/JMicron ones). I would say that current SSDs are very reliable, especially when coming from a reputable brand (e.g., Intel, Samsung, Crucial, and the turn-key Phison and Silicon Motion products).
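To make the capacity-versus-workload point above more concrete, here is a rough back-of-the-envelope sketch. Everything in it is an assumption for illustration: the function name, the 1000 P/E cycle rating, the write amplification factors and the daily write volumes are hypothetical and not taken from any vendor datasheet. It also assumes ideal wear leveling, so treat the output as an order-of-magnitude estimate only.

```python
def years_until_nand_wearout(capacity_gb, pe_cycles,
                             host_writes_gb_per_day, write_amplification):
    """Rough estimate of how many years it takes to consume the rated
    program/erase cycles, assuming perfectly even wear leveling."""
    # Total data the NAND can absorb before reaching its rated P/E cycles.
    total_nand_writes_gb = capacity_gb * pe_cycles
    # Physical NAND writes per day = host writes times write amplification.
    nand_writes_gb_per_day = host_writes_gb_per_day * write_amplification
    return total_nand_writes_gb / nand_writes_gb_per_day / 365.0


# Hypothetical figures: 1000 P/E cycles, a light desktop-style workload
# (20 GB/day, amplification 2) and a heavy random-write workload
# (5000 GB/day, amplification 5).
for capacity_gb in (250, 500, 1000):
    light = years_until_nand_wearout(capacity_gb, 1000, 20, 2.0)
    heavy = years_until_nand_wearout(capacity_gb, 1000, 5000, 5.0)
    print(f"{capacity_gb} GB: light {light:.1f} years, heavy {heavy:.2f} years")
```

Under these assumed numbers, doubling the capacity doubles the estimated time to wear-out, but for the light workload even the smallest drive would outlast its useful life, so something other than NAND wear is more likely to end it. Only with the heavy workload does the estimate shrink to the point where extra capacity (or a higher-endurance drive class) makes a practical difference.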