Hard disk DRDY error: has the drive crashed?
I am using an IBM ThinkPad (1.7 GHz, 512 MB RAM) with Linux Mint 9 installed. I have two partitions in addition to root.
One of the partitions became read-only yesterday, so I rebooted the system. It is now extremely slow and reports DRDY errors. Has my hard disk crashed? Error log while booting:
Differences between boot sector and its backup.
failed command : READ DMA
BMDMA : stat 0X25
ata 1.00 : status : { DRDY ERR }
ata 1.00 : status :{ UNC }
Buffer I/O error on logical device, logical block 65467
smartctl output for the partition:
mint mint # smartctl -a /dev/sda1
smartctl version 5.38 [i686-pc-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/
=== START OF INFORMATION SECTION ===
Device Model: TOSHIBA MK4026GAX RoHS
Serial Number: X5LY1623T
Firmware Version: PA107E
User Capacity: 40,007,761,920 bytes
Device is: Not in smartctl database [for details use: -P showall]
ATA Version is: 6
ATA Standard is: Exact ATA specification draft version not indicated
Local Time is: Thu Feb 17 06:48:25 2011 UTC
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
General SMART Values:
Offline data collection status: (0x84) Offline data collection activity
was suspended by an interrupting command from host.
Auto Offline Data Collection: Enabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: ( 153) seconds.
Offline data collection
capabilities: (0x1b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
No Conveyance Self-test supported.
No Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
No General Purpose Logging support.
Short self-test routine
recommended polling time: ( 2) minutes.
Extended self-test routine
recommended polling time: ( 30) minutes.
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000b 100 100 050 Pre-fail Always - 0
2 Throughput_Performance 0x0005 100 100 050 Pre-fail Offline - 0
3 Spin_Up_Time 0x0027 100 100 001 Pre-fail Always - 310
4 Start_Stop_Count 0x0032 100 100 000 Old_age Always - 3968
5 Reallocated_Sector_Ct 0x0033 100 100 050 Pre-fail Always - 40
7 Seek_Error_Rate 0x000b 100 100 050 Pre-fail Always - 0
8 Seek_Time_Performance 0x0005 100 100 050 Pre-fail Offline - 0
9 Power_On_Hours 0x0032 082 082 000 Old_age Always - 7257
10 Spin_Retry_Count 0x0033 179 100 030 Pre-fail Always - 0
12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 3484
192 Power-Off_Retract_Count 0x0032 100 100 000 Old_age Always - 489
193 Load_Cycle_Count 0x0032 064 064 000 Old_age Always - 367150
194 Temperature_Celsius 0x0022 100 100 000 Old_age Always - 36 (Lifetime Min/Max 14/57)
196 Reallocated_Event_Count 0x0032 100 100 000 Old_age Always - 33
197 Current_Pending_Sector 0x0032 100 100 000 Old_age Always - 82
198 Offline_Uncorrectable 0x0030 100 100 000 Old_age Offline - 1
199 UDMA_CRC_Error_Count 0x0032 200 253 000 Old_age Always - 0
220 Disk_Shift 0x0002 100 100 000 Old_age Always - 101
222 Loaded_Hours 0x0032 085 085 000 Old_age Always - 6146
223 Load_Retry_Count 0x0032 100 100 000 Old_age Always - 0
224 Load_Friction 0x0022 100 100 000 Old_age Always - 0
226 Load-in_Time 0x0026 100 100 000 Old_age Always - 227
240 Head_Flying_Hours 0x0001 100 100 001 Pre-fail Offline - 0
SMART Error Log Version: 1
ATA Error Count: 2371 (device log contains only the most recent five errors)
CR = Command Register [HEX]
FR = Features Register [HEX]
SC = Sector Count Register [HEX]
SN = Sector Number Register [HEX]
CL = Cylinder Low Register [HEX]
CH = Cylinder High Register [HEX]
DH = Device/Head Register [HEX]
DC = Device Command Register [HEX]
ER = Error register [HEX]
ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.
Error 2371 occurred at disk power-on lifetime: 7256 hours (302 days + 8 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 05 1a 1b 00 e0 Error: UNC 5 sectors at LBA = 0x00001b1a = 6938
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
c8 00 05 1a 1b 00 e0 00 00:03:10.061 READ DMA
f8 00 00 00 00 00 e0 00 00:03:10.061 READ NATIVE MAX ADDRESS
ec 00 00 00 00 00 a0 02 00:03:10.053 IDENTIFY DEVICE
ef 03 45 00 00 00 a0 02 00:03:10.053 SET FEATURES [Set transfer mode]
f8 00 00 00 00 00 e0 00 00:03:10.053 READ NATIVE MAX ADDRESS
Error 2370 occurred at disk power-on lifetime: 7256 hours (302 days + 8 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 05 1a 1b 00 e0 Error: UNC 5 sectors at LBA = 0x00001b1a = 6938
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
c8 00 05 1a 1b 00 e0 00 00:03:03.328 READ DMA
f8 00 00 00 00 00 e0 00 00:03:03.327 READ NATIVE MAX ADDRESS
ec 00 00 00 00 00 a0 02 00:03:03.320 IDENTIFY DEVICE
ef 03 45 00 00 00 a0 02 00:03:03.319 SET FEATURES [Set transfer mode]
f8 00 00 00 00 00 e0 00 00:03:03.319 READ NATIVE MAX ADDRESS
Error 2369 occurred at disk power-on lifetime: 7256 hours (302 days + 8 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 05 1a 1b 00 e0 Error: UNC 5 sectors at LBA = 0x00001b1a = 6938
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
c8 00 05 1a 1b 00 e0 00 00:02:56.582 READ DMA
f8 00 00 00 00 00 e0 00 00:02:56.582 READ NATIVE MAX ADDRESS
ec 00 00 00 00 00 a0 02 00:02:56.574 IDENTIFY DEVICE
ef 03 45 00 00 00 a0 02 00:02:56.574 SET FEATURES [Set transfer mode]
f8 00 00 00 00 00 e0 00 00:02:56.574 READ NATIVE MAX ADDRESS
Error 2368 occurred at disk power-on lifetime: 7256 hours (302 days + 8 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 05 1a 1b 00 e0 Error: UNC 5 sectors at LBA = 0x00001b1a = 6938
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
c8 00 05 1a 1b 00 e0 00 00:02:49.809 READ DMA
f8 00 00 00 00 00 e0 00 00:02:49.809 READ NATIVE MAX ADDRESS
ec 00 00 00 00 00 a0 02 00:02:49.801 IDENTIFY DEVICE
ef 03 45 00 00 00 a0 02 00:02:49.801 SET FEATURES [Set transfer mode]
f8 00 00 00 00 00 e0 00 00:02:49.801 READ NATIVE MAX ADDRESS
Error 2367 occurred at disk power-on lifetime: 7256 hours (302 days + 8 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 05 1a 1b 00 e0 Error: UNC 5 sectors at LBA = 0x00001b1a = 6938
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
c8 00 05 1a 1b 00 e0 00 00:02:43.056 READ DMA
f8 00 00 00 00 00 e0 00 00:02:43.056 READ NATIVE MAX ADDRESS
ec 00 00 00 00 00 a0 02 00:02:43.048 IDENTIFY DEVICE
ef 03 45 00 00 00 a0 02 00:02:43.048 SET FEATURES [Set transfer mode]
f8 00 00 00 00 00 e0 00 00:02:43.047 READ NATIVE MAX ADDRESS
SMART Self-test log structure revision number 1
No self-tests have been logged. [To run self-tests, use: smartctl -t]
Device does not support Selective Self Tests/Logging
Do I need to get a new hard disk for my PC?
The SMART error log contains useful info:
Error: UNC 5 sectors at LBA = 0x00001b1a = 6938
UNC means an uncorrectable error. The command that failed was a READ DMA, so it is a read error: the 5-sector read starting at LBA 6938 (sectors 6938 to 6942) could not be completed.
In addition, the SMART attributes show 40 successfully reallocated sectors, 82 sectors pending reallocation, and 1 offline-uncorrectable sector (probably the one in the log):
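As a rough illustration of where the LBA in that log line comes from, here is a minimal Python sketch that rebuilds it from the register dump (ER ST SC SN CL CH DH = 40 51 05 1a 1b 00 e0). It assumes 28-bit LBA addressing, which matches the READ DMA command (c8) in the log; the function and constant names are made up for this example.

```python
# Bit masks for the ATA status (ST) and error (ER) registers used below.
ST_DRDY = 0x40  # device ready
ST_ERR = 0x01   # the previous command ended in an error
ER_UNC = 0x40   # uncorrectable data error


def lba28_from_registers(sn, cl, ch, dh):
    """Combine the task-file registers into a 28-bit LBA.

    SN holds bits 0-7, CL bits 8-15, CH bits 16-23, and the low
    nibble of DH holds bits 24-27.
    """
    return ((dh & 0x0F) << 24) | (ch << 16) | (cl << 8) | sn


# Register values copied from the error log entry.
er, st, sc, sn, cl, ch, dh = 0x40, 0x51, 0x05, 0x1A, 0x1B, 0x00, 0xE0

lba = lba28_from_registers(sn, cl, ch, dh)
print(f"LBA = 0x{lba:08x} = {lba}")   # LBA = 0x00001b1a = 6938
print("sector count:", sc)
print("UNC set:", bool(er & ER_UNC))
print("DRDY and ERR set:", bool(st & ST_DRDY) and bool(st & ST_ERR))
```

The DRDY and ERR bits in the status register are the same "DRDY ERR" flags the kernel prints at boot, and the UNC bit in the error register is what smartctl reports as "Error: UNC".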
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
5 Reallocated_Sector_Ct 0x0033 100 100 050 Pre-fail Always - 40
197 Current_Pending_Sector 0x0032 100 100 000 Old_age Always - 82
198 Offline_Uncorrectable 0x0030 100 100 000 Old_age Offline - 1
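Note that the overall health check still says PASSED, because the normalized VALUE of these attributes (100) is above THRESH; it is the RAW_VALUE column that gives the game away. If you want to watch these counters over time, something like the following Python sketch could pick them out of `smartctl -a` output. The function name and the choice of attribute IDs to watch (5, 197, 198) are my own for illustration, and the parsing assumes the ten-column attribute layout shown above.

```python
# Attribute IDs whose raw value counts bad or suspect sectors.
WATCHED_IDS = {5, 197, 198}

SAMPLE = """\
  5 Reallocated_Sector_Ct 0x0033 100 100 050 Pre-fail Always - 40
  9 Power_On_Hours 0x0032 082 082 000 Old_age Always - 7257
197 Current_Pending_Sector 0x0032 100 100 000 Old_age Always - 82
198 Offline_Uncorrectable 0x0030 100 100 000 Old_age Offline - 1
"""


def nonzero_sector_counters(smart_text):
    """Return {attribute_name: raw_value} for watched attributes above zero."""
    found = {}
    for line in smart_text.splitlines():
        fields = line.split()
        # Attribute rows have at least 10 columns and start with a numeric ID.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        if int(fields[0]) in WATCHED_IDS:
            raw = int(fields[9])  # RAW_VALUE is the tenth column
            if raw > 0:
                found[fields[1]] = raw
    return found


for name, raw in nonzero_sector_counters(SAMPLE).items():
    print(f"{name}: {raw}")
```

On a healthy drive all three raw values would normally be 0, so any non-empty result is worth investigating.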
Everything indicates that the drive is failing, so back up the data immediately. If you cannot copy the data because of the errors, use ddrescue to image the partition while skipping the bad blocks; this tutorial is very useful.
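For reference, a common way to run GNU ddrescue is in two passes: a quick pass that copies everything readable, then a second pass that retries the bad areas. The Python sketch below only builds and prints the two command lines so you can review them before running anything. The image and mapfile paths are hypothetical and must be on a different, healthy drive with enough free space; `/dev/sda1` is taken from the question, so substitute the partition you actually need.

```python
import shlex


def ddrescue_passes(source, image, mapfile):
    """Build the argument lists for a two-pass ddrescue run.

    Pass 1 (-n) skips the slow scraping phase and grabs the easy data first.
    Pass 2 (-d -r3) uses direct disk access and retries bad areas 3 times.
    Both passes share the same mapfile, so pass 2 resumes where pass 1 stopped.
    """
    first = ["ddrescue", "-n", source, image, mapfile]
    second = ["ddrescue", "-d", "-r3", source, image, mapfile]
    return [first, second]


# Hypothetical destination paths on a separate backup drive.
commands = ddrescue_passes(
    "/dev/sda1",
    "/mnt/backup/sda1.img",
    "/mnt/backup/sda1.map",
)

for cmd in commands:
    # Print only; run the commands by hand (as root) once you have checked them.
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

Keep the failing partition unmounted while imaging, and run any filesystem repair against the image rather than the original disk.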