SSD 2 million hour MTBF? How is this proven?
Solution 1:
MTBF is defined as the predicted elapsed time between inherent failures of a system during operation.
It literally stands for "Mean Time Between Failure". Additionally...
As you can see, MTBF refers to the failure rate of a drive over its expected lifetime. This doesn't mean a 1.2 million hour MTBF drive will last 1.2 million hours, and a 1.5 million hour MTBF drive will last 1.5 million hours (that’s 136 to 171 years by the way)
So What Does SSD MTBF Actually Mean for Me?
unfortunately, most manufacturers don’t share this information freely.
What does 2,000,000 hour MTBF Mean For Me?
To make the article's example specific to a drive with a 2,000,000 hour MTBF, the following math shows that a population of 1,000 such drives, each used 8 hours a day, would be expected to see one failure every 250 days:
2,000,000 hours / 8 hours a day = 250,000 days; 250,000 days / 1,000 drives = 250 days.
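The same arithmetic can be written as a small function. This is only a sketch: the function and parameter names are made up for illustration, and it assumes, as the article's example does, that failures are spread evenly over the operating hours accumulated by the whole population.

```python
def days_between_failures(mtbf_hours, hours_per_day, population):
    """Expected number of calendar days between failures across a population."""
    # Operating hours the whole population accumulates per calendar day.
    fleet_hours_per_day = hours_per_day * population
    # Assumption: one failure is expected per `mtbf_hours` of accumulated operation.
    return mtbf_hours / fleet_hours_per_day


print(days_between_failures(2_000_000, 8, 1000))  # 250.0
```

Changing `hours_per_day` or `population` shows how strongly the "days between failures" figure depends on the assumed usage rather than on the drive itself.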
The article's original example stated one failure every 150 days across the population, which corresponds to a 1.2 million hour MTBF (1,200,000 / 8 / 1,000 = 150):
if the drive is used at an average of 8 hours a day, a population of 1000 SSDs would be expected to have one failure every 150 days ...
The article goes on to say that MTBF is not a particularly good way to judge how reliable a drive will be.
A better way to get an idea of how long an SSD will actually last for you would be to consider the Total Bytes Written spec, or TBW. Although this is another ‘overall expectation’ figure and doesn’t directly tell you the lifespan of a drive, it will give you an idea of how one drive compares to another. Unfortunately, not all manufacturers give out this spec either.
The article also explains how MTBF is normally determined.
The JEDEC JESD218A standard defines the method for testing the read/write endurance of an SSD (free registration required to view) which is the leading cause of SSD failure, but manufacturers may choose to supplement this with some additional failure tests.
Another thing to consider is what workload is used to specify the MTBF. For instance, Intel qualifies their SSDs using a workload of 20 GB of writes per day for 5 years. With this workload, along with the supplemental failure tests, the Intel 335 has an MTBF of 1.2 million hours. However if the workload was reduced to 10 GB a day, the MTBF would be 2.5 million hours. At 5 GB per day, it becomes 4 million hours.
References
- Understanding MTBF in SSD – What Does an SSD’s MTBF Mean for You? - Hardcoreware.com, Carl Nelson, January 6, 2013
Solution 2:
Drives don't all fail at exactly the MTBF time: rather, the times at which they fail obey a particular statistical distribution with the given mean. You don't necessarily need to test for as long as the mean to get bounds on the mean, since testing for a shorter time can still give you a lot of information about the shape of the distribution.
For example, suppose you want to demonstrate that the MTBF is greater than one month. If the MTBF were only a month, you would expect a few drives out of a large batch to fail very quickly. So if you test a bunch of drives for a week and none of them fail in that time, you have reasonable grounds for believing that the MTBF is quite a lot more than one week. More generally, if you test enough drives for time T without seeing a failure, you can argue that the MTBF must be at least some value much larger than T.
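To put rough numbers on this argument, here is a minimal sketch. It assumes failure times follow an exponential distribution (a constant failure rate), which is my assumption and not something any manufacturer has confirmed; the function names are hypothetical. Under that assumption, a test with zero failures over a total of N × T drive-hours gives a one-sided lower confidence bound on the MTBF of N × T / (−ln(1 − confidence)).

```python
import math


def prob_no_failures(true_mtbf_hours, drives, test_hours):
    """Chance that no drive fails during the test, assuming exponential lifetimes."""
    return math.exp(-drives * test_hours / true_mtbf_hours)


def mtbf_lower_bound(drives, test_hours, confidence=0.95):
    """Lower confidence bound on MTBF after a test with zero failures.

    Assumes exponentially distributed failure times (constant failure rate).
    """
    total_drive_hours = drives * test_hours
    return total_drive_hours / -math.log(1.0 - confidence)


WEEK = 7 * 24    # hours
MONTH = 30 * 24  # hours

# If the true MTBF were one month, 100 drives surviving a week is very unlikely.
print(prob_no_failures(MONTH, drives=100, test_hours=WEEK))

# So after such a test, the MTBF is (with 95% confidence) at least this many hours.
print(mtbf_lower_bound(drives=100, test_hours=WEEK))
```

With 100 drives tested for one week and no failures, the bound comes out at about 5,600 hours, or a bit under eight months, even though no drive ran for longer than a week. Testing more drives raises the bound proportionally.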
Also, they may be using an argument along the lines of "We tested the drive by reading and writing 24/7 for a month. In reality, most users only access the drive for 1% of the time that the computer is running, so most users will experience one hundred times the MTBF we found in our tests."
Another technique that may be used is to test in harsher conditions than real use. I don't know if this is used for hardware, but it is used for shelf-life of foods. First, you do experiments that show, for example, that your canned whatevers degrade three times as fast when stored at 40 °C as they do at 20 °C. Then, if they're still good to eat after four months in storage at 40 °C, they should be good to eat after a year at 20 °C.
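The extrapolation in the food example is just a multiplication by the measured speed-up. A tiny sketch, with a hypothetical function name and assuming degradation under stress really is a constant multiple of degradation under normal conditions:

```python
def equivalent_normal_time(stress_time, acceleration_factor):
    """Time under normal conditions that matches `stress_time` under harsh ones.

    Assumes degradation is faster by a constant `acceleration_factor` under stress.
    """
    return stress_time * acceleration_factor


# Four months at 40 °C, where degradation is three times as fast as at 20 °C.
print(equivalent_normal_time(4, 3))  # 12 (months at 20 °C)
```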