What file size units do applications on Ubuntu use?
Introduction:
Data in electronic computers is stored and transmitted in various ways, but it is always interpreted as a sequence of binary values, either 0 or 1. One binary value is called a bit. Eight bits are called an octet, or a byte. On this there is consensus.
A bit is denoted as b, and a byte as B. On this there is consensus, and if you ever spot an application breaking this convention, it's definitely a bug or an error. People frequently confuse the two, but application developers and manufacturers on the whole do not.
Once you get to larger units, there are two schools of thought, which sadly means that there is no consensus. Different operating systems and different applications belong to one school of thought or another.
Ubuntu's unit policy:
Ubuntu has a published units policy, which defines units like this.
The first set of units are multiples of 1024. (Why 1024? Because 1024 is 2 to the power of 10, which can make life easier for programmers.) This set of units is called binary units or the IEC prefixes, after the IEC standard that defined them:
- One kibibyte: 1KiB = 1024 bytes (note the capital K)
- One mebibyte: 1MiB = 1024KiB = 1048576 bytes
- One gibibyte: 1GiB = 1024MiB = 1048576KiB = 1073741824 bytes
The second set of units are multiples of 1000. This aligns much more closely with commonly used units in the SI system, such as metres, litres and grams. A kilogram is 1000 grams; in the same way, a kilobyte is 1000 bytes. This set of units is called decimal units or the SI prefixes.
- One kilobyte: 1kB = 1000 bytes (note the lowercase k)
- One megabyte: 1MB = 1000kB = 1000000 bytes
- One gigabyte: 1GB = 1000MB = 1000000kB = 1000000000 bytes
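To see how far the two sets of units drift apart, here is a minimal Python sketch. The names `SI_UNITS`, `IEC_UNITS` and `to_bytes` are my own, chosen for illustration; they are not part of any Ubuntu tool.

```python
# Multipliers taken from the two lists above.
SI_UNITS = {"kB": 1000, "MB": 1000**2, "GB": 1000**3}
IEC_UNITS = {"KiB": 1024, "MiB": 1024**2, "GiB": 1024**3}


def to_bytes(value, unit):
    """Convert a value in an SI or IEC unit to a number of bytes."""
    factor = SI_UNITS.get(unit) or IEC_UNITS.get(unit)
    if factor is None:
        raise ValueError(f"unknown unit: {unit}")
    return value * factor


# Compare each binary unit with its decimal counterpart.
for si, iec in zip(SI_UNITS, IEC_UNITS):
    gap = IEC_UNITS[iec] / SI_UNITS[si] - 1
    print(f"1{iec} = {to_bytes(1, iec)} bytes, "
          f"1{si} = {to_bytes(1, si)} bytes, difference {gap:.1%}")
```

The gap grows with each prefix: roughly 2.4% at the kilo level, 4.9% at mega and 7.4% at giga, which is why the distinction matters more for large files and disks.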
The traditional units:
Traditionally, many applications, operating systems and developers used binary units but gave them SI names. Ubuntu, GNOME and OS X all attempt to follow the published standards as explained previously. However, Microsoft Windows and many UNIX utilities still use these traditional units, so you need to be aware of them.
- One kilobyte: 1KB = 1024 bytes (note the capital K)
- One megabyte: 1MB = 1024KB = 1048576 bytes
- One gigabyte: 1GB = 1024MB = 1048576KB = 1073741824 bytes
Traditionally, however, speeds are specified in bits per second, with SI prefixes! So 1Mbps is actually 1000000 bits per second, which is 125000 bytes per second, even on Microsoft Windows.
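The arithmetic for speeds can be sketched in a few lines of Python. The function name and the 100Mbps link are illustrative assumptions, and the transfer time ignores protocol overhead.

```python
MBPS = 1_000_000  # 1Mbps uses the SI prefix: one million bits per second


def bits_to_bytes_per_second(bits_per_second):
    """Convert a bit rate to a byte rate (8 bits per byte)."""
    return bits_per_second / 8


rate = bits_to_bytes_per_second(1 * MBPS)
print(rate)  # 125000.0 bytes per second

# Idealised time to move 1GiB (1073741824 bytes) over a 100Mbps link.
seconds = 1024**3 / bits_to_bytes_per_second(100 * MBPS)
print(f"{seconds:.1f} seconds")
```

Note that the file size here is in binary units while the link speed is in decimal units, so mixing them up gives an estimate that is off by several percent.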
How to avoid ambiguity:
As you can see, these conflicting definitions lead to a lot of confusion. When I say 1MB, do I mean a million bytes, or do I mean 1048576 bytes?
To avoid ambiguity, you should use one of these strategies:
- Exclusively use IEC prefixes. 1MiB is always unambiguous.
- Include a conversion to the number of bytes, e.g. 1MB or 1000000 bytes.
- Use both IEC and SI prefixes, e.g. 1MiB or 1.049MB approx. I prefer this solution, as it makes it clear what you mean, and the reader doesn't have to perform any mental calculations.
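If you are writing your own tool, the third strategy is easy to automate. This is only a sketch; `describe_size` is a hypothetical helper, and it sticks to the mega level to keep things short.

```python
def describe_size(num_bytes):
    """Describe a size in MiB, in MB, and as an exact byte count."""
    mib = num_bytes / 1024**2  # IEC binary unit
    mb = num_bytes / 1000**2   # SI decimal unit
    return f"{mib:.3f}MiB or {mb:.3f}MB approx. ({num_bytes} bytes)"


print(describe_size(1048576))
# 1.000MiB or 1.049MB approx. (1048576 bytes)
```

Printing the exact byte count alongside both units leaves nothing for the reader to guess.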
Where there is ambiguity, here's a good set of rules of thumb that has served me well:
- If you spot KB (with a capital K), then the traditional units are probably being used.
- If you spot kB (with a lowercase k), then the SI units are probably being used.
- If the number is describing a speed, then decimal units are probably being used.
- If the number is on OS X, on modern Ubuntu or GNOME applications, then decimal units are probably being used.
- If the number is on a hard drive or another piece of computing equipment, then decimal units are probably being used.
- If the number is from a command-line utility on Linux, then traditional binary units are probably being used.
- If the number is from a Microsoft Windows application, then traditional binary units are probably being used.
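The symbol-based rules of thumb can be expressed as a small Python sketch. `guess_unit_system` is a hypothetical helper of my own. It only looks at the unit symbol, so it cannot apply the rules that depend on where the number came from, and its answers are guesses rather than certainties.

```python
import re


def guess_unit_system(text):
    """Guess the unit system from a byte-size string such as '512KB'."""
    match = re.search(r"([kKMGT])(i?)B\b", text)
    if match is None:
        return "unknown"
    prefix, i = match.groups()
    if i:
        return "IEC"          # KiB, MiB, GiB: always binary
    if prefix == "k":
        return "SI"           # lowercase k: probably decimal
    if prefix == "K":
        return "traditional"  # capital K: probably binary with SI names
    return "ambiguous"        # MB, GB, TB: depends on where the number is from


for sample in ["512KB", "512kB", "1MiB", "1MB"]:
    print(sample, "->", guess_unit_system(sample))
```

For the ambiguous cases you still have to fall back on the remaining rules: check whether the number comes from a hard drive label, a command-line utility, Windows, and so on.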
When it comes to Ubuntu applications, have a look at this list specifying which applications use which system.
References:
- Ubuntu's units policy
- units man page