Why does ls -l output a different size from ls -s?

I can't figure out why I'm getting the following results:

ls -l tells me the size of a given file (HISTORY) is "581944":

$ ls -l HISTORY 
-rw-rw-r-- 1 waldyrious waldyrious 581944 Feb 22 10:59 HISTORY

ls -s says it is "572":

$ ls -s HISTORY
572 HISTORY

I obviously need to make the values use a comparable scale. So first I confirm that using --block-size 1 in ls -l gives me the same result as before:

$ ls -l --block-size 1 HISTORY 
-rw-rw-r-- 1 waldyrious waldyrious 581944 Feb 22 10:59 HISTORY

Then I do the same to ls -s to get a value in the same scale:

$ ls -s --block-size 1 HISTORY 
585728 HISTORY

Different results! 581944 ≠ 585728.

I tried generating comparable values the other way around, using -k, but I get:

$ ls -lk HISTORY 
-rw-rw-r-- 1 waldyrious waldyrious 569 Feb 22 10:59 HISTORY
$ ls -sk HISTORY 
572 HISTORY

Again, different results, 569 ≠ 572.

I tried specifying --si to make sure both options were using the same scale, to no avail:

$ ls -lk --si HISTORY 
-rw-rw-r-- 1 waldyrious waldyrious 582k Feb 22 10:59 HISTORY
$ ls -sk --si HISTORY 
586k HISTORY

...again, different values: 582k ≠ 586k.

I tried searching the web but the only thing I could find that seemed relevant was this:

Some files have "holes" in them, so that the usage listed by ls -s (...) is less than the file size listed by ls -l.

(Note that in my results the opposite happens: ls -s returns sizes bigger than ls -l, not smaller.)

Meanwhile, this page says that

there is no elegant way to detect Unix file holes.

So, how can I deal with this discrepancy? Which of these values can be considered correct? Could this possibly be a bug in ls?


Solution 1:

Short answer:

  • ls -l gives the size of the file (= the amount of data it contains)
  • ls -s --block-size 1 gives the space the file occupies on the file system (= the blocks allocated to it)

Let's create two files:

A sparse file 128 bytes long (a sparse file contains unallocated "holes" that read back as zeroes; see Sparse File):

# truncate -s 128 f_zeroes.img
# hexdump -vC f_zeroes.img 
00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000010  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000020  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000030  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000040  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000050  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000060  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000070  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000080

Another file with random data, also of 128 bytes size:

# dd if=/dev/urandom of=f_random.img bs=1 count=128
# hexdump -vC f_random.img 
00000000  bc 82 9c 40 04 e3 0c 23  e6 76 79 2f 95 d4 0e 45  |...@...#.vy/...E|
00000010  19 c6 53 fc 65 83 f8 58  0a f7 0e 8f d6 d6 f8 b5  |..S.e..X........|
00000020  6c cf 1b 60 cb ef 06 c6  d0 99 c6 16 3f d3 95 02  |l..`........?...|
00000030  85 1e b7 80 27 93 27 92  d0 52 e8 72 54 25 4d 90  |....'.'..R.rT%M.|
00000040  11 59 a2 d9 0f 79 aa 23  2d 44 3d dd 8d 17 d9 36  |.Y...y.#-D=....6|
00000050  f5 ae 07 a8 c1 b4 cb e1  49 9e bc 62 1b 4f 17 53  |........I..b.O.S|
00000060  95 13 5a 1c 2a 7e 55 b9  69 a5 50 06 98 e7 71 83  |..Z.*~U.i.P...q.|
00000070  5a d0 82 ee 0b b3 91 82  ca 1d d0 ec 24 43 10 5d  |Z...........$C.]|
00000080

So, as you can see in the hex representation, both files have the same amount of data, although the content is quite different.

Now, let us look at the directory:

# ls -ls --block-size 1 f_*
1024 -rw-r--r-- 1 user user 128 Mar 18 15:34 f_random.img
   0 -rw-r--r-- 1 user user 128 Mar 18 15:32 f_zeroes.img
   ^                         ^
   |                         |
Space the file             Actual file size
takes on the fs

The first value is given by the -s --block-size 1 option. It is the amount of space used by the file on the file system.

As you can see, the sparse file takes up zero space: truncate only set the file's length and never wrote any data, so the file system (ext3 in this case) did not have to allocate any blocks for it. The file with random data, by contrast, takes up 1024 bytes on the disk, even though it holds only 128 bytes!

The value depends on how the underlying file system treats files (block size, sparse file capability, ...).

The sixth column shows the size you would get by reading the file: the amount of data it contains, which is 128 bytes for both files!
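The same two numbers can also be read directly with stat. The following is only a sketch, assuming GNU coreutils (stat -c with the %s, %b and %B format codes, plus truncate and mktemp); the sizes function and the scratch directory are made up for illustration, and the allocated values you get depend on your file system.

```shell
#!/bin/sh
# sizes FILE: print the apparent size and the allocated size of FILE.
# Hypothetical helper; assumes GNU stat (BSD/macOS stat uses other options).
sizes() {
    apparent=$(stat -c %s "$1")   # bytes of data, as shown by ls -l
    blocks=$(stat -c %b "$1")     # number of blocks allocated to the file
    unit=$(stat -c %B "$1")       # size in bytes of each block counted by %b
    allocated=$((blocks * unit))  # bytes on disk, as shown by ls -s --block-size 1
    echo "$1: apparent=$apparent allocated=$allocated"
}

dir=$(mktemp -d)                  # scratch directory, removed at the end
truncate -s 128 "$dir/f_zeroes.img"
dd if=/dev/urandom of="$dir/f_random.img" bs=1 count=128 2>/dev/null

sizes "$dir/f_zeroes.img"
sizes "$dir/f_random.img"
rm -rf "$dir"
```

Both lines should report apparent=128, while the allocated figure differs between the two files and between file systems.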

Solution 2:

ls -s tells you the allocated size of the file, which is always a multiple of the allocation unit. ls -l tells you the actual size. An easy way to test:

$ echo 1 > sizeTest
$ ls -l --block-size 1 sizeTest 
-rw-rw-r-- 1 g g 2 Mär 18 15:18 sizeTest
$ ls -s --block-size 1 sizeTest 
4096 sizeTest
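Rounding up to the allocation unit also accounts for the numbers in the question. The sketch below assumes a 4096-byte allocation unit, as in the test above; the asker's actual file system is not known, so treat this as a plausibility check rather than a diagnosis.

```shell
#!/bin/sh
size=581944   # bytes reported by ls -l for HISTORY in the question
unit=4096     # assumed allocation unit (not confirmed for the asker's system)

units=$(( (size + unit - 1) / unit ))   # round up to whole allocation units
allocated=$(( units * unit ))           # bytes actually reserved on disk

echo "$units units, $allocated bytes, $(( allocated / 1024 )) KiB allocated"
echo "$(( (size + 1023) / 1024 )) KiB of data, rounded up"
```

This prints 143 units, 585728 bytes and 572 KiB allocated, matching ls -s, and 569 KiB of data, matching ls -lk. Neither value is wrong: they answer different questions.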