Corrupted files fail checksum, but disk and FS checks pass with Disk Utility
Solution 1:
I treat all input/output (IO) errors as five-alarm situations. When I see IO errors in the console log, I save all work, quit all apps and then get a full backup. The filesystem is designed to keep itself intact, which means that when a file has a problem, the file gets truncated and deleted. Your data loses; the filesystem gets healed. Seeing an IO error bubble up to the application layer is either:
- no big deal - you have some corrupt files
- a huge deal - you have limited time to back up files that aren’t already backed up
Then, once I have a backup, I watch for a day or so for IO errors and delete the files that are affected. If I see the IO errors spread, I do an erase install and keep monitoring.
SSDs are a bit different from HDDs: I’ve only ever seen one SSD throw an actual IO error, since the controller almost always intercepts and corrects these with checksums. In my experience, 100% of issues are just bit rot, crashes and app failures, not signs that the SSD is starting to fail. I’ve never had warning of an SSD failing - they just go. Also, the SSDs Apple delivers are way, way, way more reliable than the HDDs Apple delivered. Erase install has basically been a cure-all, get-out-of-jail-free card for me over the last 10 years of managing Macs. Only when a system can’t install and run a blank OS do I think the hardware needs diagnosis and repair.
Back to you: if you don’t have a full backup you trust, please make one now, with haste. Next, read up on how to erase. All the signs you describe indicate your hardware is fine, and you might not even find any IO errors in the Console app (or using log stream). Since you know exactly how to summon that error, watch the log as you poke at these broken files, trying to read / open / checksum them.
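If it helps, here is a rough sketch of the "poke at the files" step in Python, so you can run it in one window while watching the log in another. It reads every file under a folder end to end, computes a SHA-256, and lists the files whose read fails. The script name probe_files.py and the function names are made up for illustration, and treating errno.EIO as the marker for a genuine IO error is my assumption. It does not read the system log itself.

```python
import errno
import hashlib
import sys
from pathlib import Path


def checksum_file(path, chunk_size=1024 * 1024):
    """Read the whole file in chunks and return its SHA-256 hex digest.

    Raises OSError if the file cannot be opened or a read fails part-way.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            digest.update(chunk)
    return digest.hexdigest()


def probe(paths):
    """Return (ok, failed): ok maps path -> digest, failed maps path -> reason."""
    ok, failed = {}, {}
    for path in paths:
        try:
            ok[str(path)] = checksum_file(path)
        except OSError as exc:
            # Assumption: errno.EIO ("Input/output error") marks a real IO
            # failure; anything else (permissions, missing file) is reported
            # separately so it is not mistaken for one.
            kind = "IO error" if exc.errno == errno.EIO else "other error"
            failed[str(path)] = f"{kind}: {exc}"
    return ok, failed


# Hypothetical usage: python3 probe_files.py /path/to/suspect/folder
if __name__ == "__main__" and len(sys.argv) > 1:
    root = Path(sys.argv[1])
    files = [p for p in root.rglob("*") if p.is_file()]
    ok, failed = probe(files)
    for path, reason in failed.items():
        print(f"FAILED  {path}  ({reason})")
    print(f"{len(ok)} files read cleanly, {len(failed)} failed")
```

Run it again a day later on the same folder: if the list of failed files has grown, that is the "spreading" case where I would go straight to an erase install.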
Your instincts to test are perfect. The disk and hardware are almost certainly OK; you may just need to wipe the filesystem and restore good files onto a clean OS when the system can’t heal itself. The SSD controller maps multiple chained storage cells with data, so TRIM and bad blocks are more about keeping a substantial portion of the space free so that “bad blocks” don’t get hard mapped out the way they had to be on hard drives. My understanding is that perhaps 10% of the drive can go bad and you won’t lose a block or capacity as far as the operating system is concerned.