Cannot shrink btrfs filesystem although there is still data and metadata space left : ERROR: unable to resize '/home': No space left on device
I cannot shrink a btrfs filesystem although there is still data and metadata space left :
$ sudo btrfs filesystem resize -11G /home;echo $?
Resize '/home' of '-11G'
ERROR: unable to resize '/home': No space left on device
1
Here is some btrfs filesystem info about /home :
$ sudo btrfs filesystem df /home | column -t
Data, single: total=92.01GiB, used=80.68GiB
System, DUP: total=8.00MiB, used=16.00KiB
System, single: total=4.00MiB, used=0.00B
Metadata, DUP: total=1.00GiB, used=631.41MiB
Metadata, single: total=8.00MiB, used=0.00B
GlobalReserve, single: total=224.00MiB, used=0.00B
$ sudo btrfs filesystem show /home
Label: none uuid: c7ee56a8-ef45-46c8-86d1-13879201a1e7
Total devices 1 FS bytes used 81.30GiB
devid 1 size 100.00GiB used 94.04GiB path /dev/mapper/home_VG-home
$ sudo btrfs filesystem usage -T /home
Overall:
Device size: 100.00GiB
Device allocated: 94.04GiB
Device unallocated: 5.96GiB
Device missing: 0.00B
Used: 81.91GiB
Free (estimated): 17.29GiB (min: 14.31GiB)
Data ratio: 1.00
Metadata ratio: 1.99
Global reserve: 224.00MiB (used: 0.00B)
             Data     Metadata Metadata  System  System
Id Path      single   single   DUP       single  DUP      Unallocated
-- --------- -------- -------- --------- ------- -------- -----------
 1 /dev/dm-0 92.01GiB  8.00MiB   2.00GiB 4.00MiB 16.00MiB     5.96GiB
-- --------- -------- -------- --------- ------- -------- -----------
   Total     92.01GiB  8.00MiB   1.00GiB 4.00MiB  8.00MiB     5.96GiB
   Used      80.68GiB    0.00B 631.41MiB   0.00B 16.00KiB
And here is the output of dmesg :
$ dmesg | tail -11
[44202.411949] BTRFS info (device dm-0): new size for /dev/dm-0 is 97706311680
[44202.412156] BTRFS info (device dm-0): relocating block group 120288444416 flags 1
[44208.119721] BTRFS info (device dm-0): relocating block group 119214702592 flags 1
[44211.611669] BTRFS info (device dm-0): relocating block group 118140960768 flags 1
[44212.495603] BTRFS info (device dm-0): relocating block group 117067218944 flags 1
[44213.006830] BTRFS info (device dm-0): relocating block group 95592382464 flags 1
[44216.613870] BTRFS info (device dm-0): relocating block group 120288444416 flags 1
[44222.780073] BTRFS info (device dm-0): relocating block group 119214702592 flags 1
[44225.843279] BTRFS info (device dm-0): relocating block group 118140960768 flags 1
[44226.575236] BTRFS info (device dm-0): relocating block group 117067218944 flags 1
[44226.930918] BTRFS info (device dm-0): relocating block group 95592382464 flags 1
EDIT1 : The btrfs balance failed :
$ sudo btrfs balance start /home
ERROR: error during balancing '/home': No space left on device
There may be more info in syslog - try dmesg | tail
There is nothing in dmesg | tail about it.
EDIT2 : I had to do the following to be able to start the btrfs balance :
$ sudo btrfs balance start -musage=0 -dusage=0 -v /home
Dumping filters: flags 0x7, state 0x0, force is off
METADATA (flags 0x2): balancing, usage=0
SYSTEM (flags 0x2): balancing, usage=0
DATA (flags 0x2): balancing, usage=0
Done, had to relocate 0 out of 95 chunks
EDIT3 : The btrfs balance ran for 68 minutes and then failed :
$ time sudo btrfs balance start -v /home
Dumping filters: flags 0x7, state 0x0, force is off
DATA (flags 0x0): balancing
METADATA (flags 0x0): balancing
SYSTEM (flags 0x0): balancing
ERROR: error during balancing '/home': Input/output error
There may be more info in syslog - try dmesg | tail
real 68m10.221s
user 0m0.008s
sys 4m20.236s
Here is what dmesg shows :
[74421.794756] ata2.00: exception Emask 0x0 SAct 0xc00 SErr 0x0 action 0x0
[74421.794766] ata2.00: irq_stat 0x40000001
[74421.794773] ata2.00: failed command: READ FPDMA QUEUED
[74421.794783] ata2.00: cmd 60/08:50:48:96:f8/00:00:25:00:00/40 tag 10 ncq 4096 in
[74421.794783] res 41/40:08:48:96:f8/00:00:25:00:00/40 Emask 0x409 (media error) <F>
[74421.794788] ata2.00: status: { DRDY ERR }
[74421.794791] ata2.00: error: { UNC }
[74421.794794] ata2.00: failed command: READ FPDMA QUEUED
[74421.794802] ata2.00: cmd 60/10:58:40:af:ed/00:00:20:00:00/40 tag 11 ncq 8192 in
[74421.794802] res 41/40:58:48:96:f8/00:00:25:00:00/40 Emask 0x9 (media error)
[74421.794806] ata2.00: status: { DRDY ERR }
[74421.794809] ata2.00: error: { UNC }
[74421.798253] ata2.00: configured for UDMA/100
[74421.798303] sd 1:0:0:0: [sdb] tag#10 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[74421.798315] sd 1:0:0:0: [sdb] tag#10 Sense Key : Medium Error [current] [descriptor]
[74421.798326] sd 1:0:0:0: [sdb] tag#10 Add. Sense: Unrecovered read error - auto reallocate failed
[74421.798337] sd 1:0:0:0: [sdb] tag#10 CDB: Read(10) 28 00 25 f8 96 48 00 00 08 00
[74421.798344] blk_update_request: I/O error, dev sdb, sector 637048392
[74421.798366] BTRFS error (device dm-0): bdev /dev/dm-0 errs: wr 38, rd 451, flush 0, corrupt 0, gen 0
[74421.798425] sd 1:0:0:0: [sdb] tag#11 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[74421.798435] sd 1:0:0:0: [sdb] tag#11 Sense Key : Medium Error [current] [descriptor]
[74421.798444] sd 1:0:0:0: [sdb] tag#11 Add. Sense: Unrecovered read error - auto reallocate failed
[74421.798453] sd 1:0:0:0: [sdb] tag#11 CDB: Read(10) 28 00 20 ed af 40 00 00 10 00
[74421.798459] blk_update_request: I/O error, dev sdb, sector 552447808
[74421.798523] ata2: EH complete
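As a cross-check on the log above, the four LBA bytes in each Read(10) CDB line should decode to the sector number that blk_update_request reports. A minimal sketch, assuming a POSIX shell; cdb_lba is a hypothetical helper name, not an existing tool:

```shell
# Hypothetical helper: decode the four LBA bytes of a Read(10) CDB
# (big-endian, bytes 2 to 5 of the CDB) into a decimal sector number.
cdb_lba() {
    # printf accepts a 0x-prefixed constant for %d and prints it in decimal.
    printf '%d\n' "0x$1$2$3$4"
}

# LBA bytes copied from the two "CDB: Read(10) 28 00 ..." lines in the log.
cdb_lba 25 f8 96 48
cdb_lba 20 ed af 40
```

The two values printed are 637048392 and 552447808, the same sectors named in the two I/O error lines.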
EDIT 4 : I'm actually using /dev/sdb :
$ sudo smartctl -a /dev/sdb
smartctl 6.2 2013-07-26 r3841 [x86_64-linux-4.4.0-143-generic] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Model Family: Toshiba 2.5" HDD MQ01ABD...
Device Model: TOSHIBA MQ01ABD100
Serial Number: 84EWT2U5T
LU WWN Device Id: 5 000039 5b1f852cb
Firmware Version: AX1P4M
User Capacity: 1 000 204 886 016 bytes [1,00 TB]
Sector Size: 512 bytes logical/physical
Rotation Rate: 5400 rpm
Device is: In smartctl database [for details use: -P show]
ATA Version is: ATA8-ACS (minor revision not indicated)
SATA Version is: SATA 2.6, 3.0 Gb/s (current: 3.0 Gb/s)
Local Time is: Mon Apr 1 23:34:41 2019 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
General SMART Values:
Offline data collection status: (0x82) Offline data collection activity
was completed without error.
Auto Offline Data Collection: Enabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: ( 120) seconds.
Offline data collection
capabilities: (0x5b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
No Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 2) minutes.
Extended self-test routine
recommended polling time: ( 243) minutes.
SCT capabilities: (0x003d) SCT Status supported.
SCT Error Recovery Control supported.
SCT Feature Control supported.
SCT Data Table supported.
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000b 100 100 050 Pre-fail Always - 0
2 Throughput_Performance 0x0005 100 100 050 Pre-fail Offline - 0
3 Spin_Up_Time 0x0027 100 100 001 Pre-fail Always - 1735
4 Start_Stop_Count 0x0032 100 100 000 Old_age Always - 5639
5 Reallocated_Sector_Ct 0x0033 100 100 050 Pre-fail Always - 0
7 Seek_Error_Rate 0x000b 100 100 050 Pre-fail Always - 0
8 Seek_Time_Performance 0x0005 100 100 050 Pre-fail Offline - 0
9 Power_On_Hours 0x0032 080 080 000 Old_age Always - 8259
10 Spin_Retry_Count 0x0033 212 100 030 Pre-fail Always - 0
12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 5623
191 G-Sense_Error_Rate 0x0032 100 100 000 Old_age Always - 563
192 Power-Off_Retract_Count 0x0032 100 100 000 Old_age Always - 203
193 Load_Cycle_Count 0x0032 099 099 000 Old_age Always - 17892
194 Temperature_Celsius 0x0022 100 100 000 Old_age Always - 23 (Min/Max 10/46)
196 Reallocated_Event_Count 0x0032 100 100 000 Old_age Always - 0
197 Current_Pending_Sector 0x0032 100 100 000 Old_age Always - 9200
198 Offline_Uncorrectable 0x0030 001 001 000 Old_age Offline - 255
199 UDMA_CRC_Error_Count 0x0032 200 200 000 Old_age Always - 0
220 Disk_Shift 0x0002 100 100 000 Old_age Always - 0
222 Loaded_Hours 0x0032 080 080 000 Old_age Always - 8117
223 Load_Retry_Count 0x0032 100 100 000 Old_age Always - 0
224 Load_Friction 0x0022 100 100 000 Old_age Always - 0
226 Load-in_Time 0x0026 100 100 000 Old_age Always - 177
240 Head_Flying_Hours 0x0001 100 100 001 Pre-fail Offline - 0
SMART Error Log Version: 1
ATA Error Count: 1029 (device log contains only the most recent five errors)
CR = Command Register [HEX]
FR = Features Register [HEX]
SC = Sector Count Register [HEX]
SN = Sector Number Register [HEX]
CL = Cylinder Low Register [HEX]
CH = Cylinder High Register [HEX]
DH = Device/Head Register [HEX]
DC = Device Command Register [HEX]
ER = Error register [HEX]
ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.
Error 1029 occurred at disk power-on lifetime: 8257 hours (344 days + 1 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 41 50 48 96 f8 40 Error: UNC at LBA = 0x00f89648 = 16291400
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
60 10 58 40 af ed 40 00 03:13:20.172 READ FPDMA QUEUED
60 08 50 48 96 f8 40 00 03:13:16.469 READ FPDMA QUEUED
60 08 48 40 96 f8 40 00 03:13:16.469 READ FPDMA QUEUED
60 08 40 38 96 f8 40 00 03:13:16.469 READ FPDMA QUEUED
60 08 38 30 96 f8 40 00 03:13:16.469 READ FPDMA QUEUED
Error 1028 occurred at disk power-on lifetime: 8257 hours (344 days + 1 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 41 70 48 96 f8 40 Error: UNC at LBA = 0x00f89648 = 16291400
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
60 10 70 78 90 f8 40 00 03:13:11.731 READ FPDMA QUEUED
60 d0 68 a8 89 f8 40 00 03:13:11.731 READ FPDMA QUEUED
61 e0 60 60 aa 0b 40 00 03:13:11.727 WRITE FPDMA QUEUED
61 00 58 60 a2 0b 40 00 03:13:11.723 WRITE FPDMA QUEUED
61 00 50 60 9a 0b 40 00 03:13:11.625 WRITE FPDMA QUEUED
Error 1027 occurred at disk power-on lifetime: 8133 hours (338 days + 21 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 41 c0 f8 bd 51 40 Error: UNC at LBA = 0x0051bdf8 = 5357048
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
60 58 e0 70 fa 40 40 00 00:18:59.971 READ FPDMA QUEUED
61 08 d8 d8 45 2b 40 00 00:18:59.971 WRITE FPDMA QUEUED
61 08 d0 d0 78 6b 40 00 00:18:59.971 WRITE FPDMA QUEUED
61 08 c8 18 42 2b 40 00 00:18:59.971 WRITE FPDMA QUEUED
60 08 c0 f8 bd 51 40 00 00:18:59.971 READ FPDMA QUEUED
Error 1026 occurred at disk power-on lifetime: 8133 hours (338 days + 21 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 41 00 f8 bd 51 40 Error: WP at LBA = 0x0051bdf8 = 5357048
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
61 38 10 28 5f 6b 40 00 00:18:55.963 WRITE FPDMA QUEUED
61 08 08 68 85 6f 40 00 00:18:55.963 WRITE FPDMA QUEUED
60 00 00 f0 bd 51 40 00 00:18:55.946 READ FPDMA QUEUED
60 00 f0 80 75 56 40 00 00:18:55.944 READ FPDMA QUEUED
60 00 e8 80 73 56 40 00 00:18:55.930 READ FPDMA QUEUED
Error 1025 occurred at disk power-on lifetime: 8119 hours (338 days + 7 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 41 b8 f8 7f 48 40 Error: UNC at LBA = 0x00487ff8 = 4751352
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
60 08 b8 f8 7f 48 40 00 01:10:35.049 READ FPDMA QUEUED
ea 00 00 00 00 00 a0 00 01:10:35.017 FLUSH CACHE EXT
61 08 98 88 4b cb 40 00 01:10:35.017 WRITE FPDMA QUEUED
61 08 70 98 c1 0c 40 00 01:10:35.017 WRITE FPDMA QUEUED
61 08 60 a0 45 cb 40 00 01:10:35.017 WRITE FPDMA QUEUED
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Short offline Completed without error 00% 6780 -
# 2 Short offline Completed without error 00% 1 -
SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
$ echo $?
64
and dmesg reports 2 bad sectors during the last btrfs balance operation :
$ dmesg | grep I/O.error.*sector
[74421.798344] blk_update_request: I/O error, dev sdb, sector 637048392
[74421.798459] blk_update_request: I/O error, dev sdb, sector 552447808
I remapped those bad sectors :
$ dmesg | grep I/O.error.*sector | awk '/sector/{print "sudo hdparm --yes-i-know-what-i-am-doing --repair-sector "$NF" /dev/sdb"}' | sh -x
+ sudo hdparm --yes-i-know-what-i-am-doing --repair-sector 637048392 /dev/sdb
/dev/sdb:
re-writing sector 637048392: succeeded
+ sudo hdparm --yes-i-know-what-i-am-doing --repair-sector 552447808 /dev/sdb
/dev/sdb:
re-writing sector 552447808: succeeded
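A more cautious variant of the pipeline above would only list the failing sectors first, so each one can be reviewed before anything is written. This is a sketch, assuming the kernel log lines look like the ones shown here; bad_sectors is a hypothetical helper name:

```shell
# Hypothetical helper: read kernel log text on stdin and print each
# distinct failing sector once, in ascending numeric order.
bad_sectors() {
    awk '/I\/O error/ {
        # Print the field that follows the word "sector" on the line.
        for (i = 1; i < NF; i++)
            if ($i == "sector") { print $(i + 1); break }
    }' | sort -un
}

# Usage on a live system:
#   dmesg | bad_sectors
```

Listing first matters because hdparm --repair-sector is an alias for --write-sector, which overwrites the sector with zeros, so whatever was stored there is lost. Running hdparm --read-sector on each listed sector beforehand shows whether it is still unreadable.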
EDIT 5 : It seems this command was enough to have more than 11G unallocated :
$ sudo btrfs balance start -musage=0 -dusage=0 -v /home
Dumping filters: flags 0x7, state 0x0, force is off
METADATA (flags 0x2): balancing, usage=0
SYSTEM (flags 0x2): balancing, usage=0
DATA (flags 0x2): balancing, usage=0
Done, had to relocate 0 out of 95 chunks
The btrfs filesystem resize succeeded. (I'm sorry, I've lost the output of the btrfs filesystem resize.)
Solution 1:
You're asking to shrink the volume by 11GiB, yet only 5.96GiB is unallocated (see Device unallocated in your btrfs filesystem usage output).
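That comparison can be scripted as a rough pre-check. The sketch below assumes, as this answer does, that the shrink only goes through when the unallocated space covers the requested amount; unallocated_bytes and shrink_fits are hypothetical helper names, and the parsing assumes the raw-bytes output of btrfs filesystem usage -b:

```shell
# Hypothetical helper: read `btrfs filesystem usage -b` text on stdin
# and print the "Device unallocated" value in bytes.
unallocated_bytes() {
    awk '/Device unallocated:/ { print $NF; exit }'
}

# Hypothetical helper: succeed only if the unallocated bytes ($1)
# cover the requested shrink in bytes ($2).
shrink_fits() {
    [ "$1" -ge "$2" ]
}

shrink=$((11 * 1024 * 1024 * 1024))   # 11GiB, as in `resize -11G`
# On a live system (assuming /home is the mount point):
#   free=$(sudo btrfs filesystem usage -b /home | unallocated_bytes)
free=6399501271                        # roughly 5.96GiB, from the question's output
if shrink_fits "$free" "$shrink"; then
    echo "enough unallocated space"
else
    echo "short by $(( (shrink - free) / 1024 / 1024 )) MiB"
fi
```

With the figures from the question the check fails, which is consistent with the ENOSPC error from the resize.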
You can use the allocated space more efficiently by rebalancing the volume. Running a command such as btrfs balance start /home starts that process; it may take some time to complete.
I don't know, though, whether that will free up enough space for such a large shrink.
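If a full balance is too slow or runs out of space, one common approach is to balance in steps with the usage filters, starting with nearly empty chunks and raising the threshold gradually. The sketch below only prints the commands instead of running them; balance_plan is a hypothetical helper name and the thresholds are arbitrary examples:

```shell
# Hypothetical helper: print (not run) one balance command per usage
# threshold, for the mount point given as the first argument.
balance_plan() {
    mnt=$1
    shift
    for pct in "$@"; do
        printf 'btrfs balance start -dusage=%s -musage=%s %s\n' "$pct" "$pct" "$mnt"
    done
}

# Example thresholds; a low value touches only nearly empty chunks.
balance_plan /home 0 10 25 50
```

Each printed command can then be run by hand with sudo, checking Device unallocated in btrfs filesystem usage after every step and stopping once it exceeds the amount you want to shrink by.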