What are disk sectors for?

According to this article: http://www.tech-faq.com/how-data-is-stored-in-your-hard-disk.html hard disks write data in a linear path (as what I presume).

Unfortunately, the article you cite is not very good. The author uses the concept of a "linear path", but disks are random access devices (as opposed to sequential access devices such as mag tape). The alleged "second concept" that "data is stored on the first available space" is false: allocation is determined by the OS's filesystem, and is based on less obvious factors (cylinder boundaries?), as evidenced by the clumps of unused clusters in WinXP's defrag display. (And the Wikipedia article is not much better: it has inaccuracies and is PC-centric.)

The reasons for using disk sectors are:

  • It's the (overall) unit of magnetic recording.
  • It's a unit of data access & transfer.
  • It's a (base) unit of allocation.

Magnetic Recording

Reading and writing data on a magnetic medium requires that the medium be moving, and that the erase & write heads be turned on and off away from existing data. So disk data is always written and read in units of a sector (or more precisely, a data record) in order to preserve the layout (or format) of each track.

A more complete explanation is in my answer to: Is it possible to detect the previous byte position on a hard drive after it has been overwritten?

The gist is that writing data to disk must avoid glitching any existing data already on the drive when the erase & write heads are turned on or off. The data on the disk are grouped into records. The area between the records is called an inter-record gap, or simply gap. Within that gap is a special area called the write splice. The erase & write heads must turn on or turn off only within these write splice areas, so as to never damage any existing recorded data (including the gap data immediately before and after each record). Note: (physically) formatting a hard drive is the process of writing an address mark, ID record, (blank) data record and all necessary gaps for each sector on every track of the HDD. When a sector is "written", only the data record (and its leading & trailing gaps) of the sector is rewritten. The address mark and ID record are never rewritten after the format.
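The format-once, rewrite-data-only behaviour can be modelled loosely in Python. This is only a sketch: the class names (SectorID, Sector, Track) are hypothetical, the cylinder/head/sector contents of the ID record are an assumption, and the gaps, write splices and address marks are not modelled at all.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SectorID:
    """Stand-in for the ID record; frozen because it is never rewritten."""
    cylinder: int  # assumed ID contents, for illustration only
    head: int
    sector: int


@dataclass
class Sector:
    ident: SectorID  # laid down once, by the physical format
    data: bytes      # the data record, rewritten on every sector write


class Track:
    def __init__(self, cylinder, head, sectors_per_track, sector_size):
        # "Formatting": lay down an ID record and a blank data record per sector.
        self.sector_size = sector_size
        self.sectors = [
            Sector(SectorID(cylinder, head, n), bytes(sector_size))
            for n in range(sectors_per_track)
        ]

    def write(self, sector_no, data):
        # A write replaces one whole data record and nothing else.
        if len(data) != self.sector_size:
            raise ValueError("a data record is always written in full")
        self.sectors[sector_no].data = bytes(data)


track = Track(cylinder=0, head=0, sectors_per_track=4, sector_size=512)
ids_after_format = [s.ident for s in track.sectors]
track.write(2, b"\xaa" * 512)
```

After the write, the ID records still match ids_after_format and the neighbouring data records are untouched; a partial write is refused.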

Data Access & Transfer

Disk drives are "random access" devices. That is, each sector is addressable, and sectors can be read and written in any order. Note that while sectors can be accessed in any order, the bytes within a sector are ordered sequentially. In comparison, a sequential access device (such as mag tape) may have to process all preceding records from the beginning of the medium before accessing the requested record.
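As a rough illustration of sector addressing, here is a small Python sketch that treats an in-memory buffer as a disk. The 512-byte sector size and the helper names read_sector and locate_byte are assumptions made for the example, not any real driver API.

```python
import io

SECTOR_SIZE = 512  # assumed sector size for this illustration


def read_sector(disk, lba, sector_size=SECTOR_SIZE):
    """Return the whole sector with number `lba` from a seekable object."""
    disk.seek(lba * sector_size)
    data = disk.read(sector_size)
    if len(data) != sector_size:
        raise ValueError(f"sector {lba} is beyond the end of the device")
    return data


def locate_byte(byte_offset, sector_size=SECTOR_SIZE):
    """Map a byte offset to (sector number, offset within that sector)."""
    return divmod(byte_offset, sector_size)


# A tiny in-memory "disk" of 8 sectors; sector n is filled with byte value n.
disk = io.BytesIO(b"".join(bytes([n]) * SECTOR_SIZE for n in range(8)))

# Sectors can be fetched in any order, with no need to pass over earlier ones.
for lba in (5, 0, 3):
    assert read_sector(disk, lba) == bytes([lba]) * SECTOR_SIZE
```

Getting at a single byte means fetching the sector that contains it and then indexing into that sector, which is what locate_byte works out.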

Since a full "sector" always has to be read from or written to the disk, it stands to reason that the interface between the host and drive would also transfer the same number of data bytes. Buffers must exist on both sides of the drive interface to hold a sector's worth of data for a transfer. The amount of (host) main memory to set aside for disk buffers and the time to perform the I/O on those buffers are both adversely affected by a large sector size.
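One consequence of whole-sector transfers is that changing just a few bytes becomes a read-modify-write through a sector-sized buffer. A minimal Python sketch, assuming 512-byte sectors and a hypothetical helper named patch_bytes operating on an in-memory stand-in for the disk:

```python
import io

SECTOR_SIZE = 512  # assumed sector size for this illustration


def patch_bytes(disk, byte_offset, new_bytes, sector_size=SECTOR_SIZE):
    """Overwrite bytes using only whole-sector reads and writes.

    Returns the number of sectors that had to be transferred each way.
    """
    if not new_bytes:
        return 0
    first = byte_offset // sector_size
    last = (byte_offset + len(new_bytes) - 1) // sector_size
    count = last - first + 1

    # Read every affected sector, in full, into a host-side buffer.
    disk.seek(first * sector_size)
    buf = bytearray(disk.read(count * sector_size))
    if len(buf) != count * sector_size:
        raise ValueError("range extends beyond the end of the device")

    # Modify only the requested bytes inside the buffer.
    start = byte_offset - first * sector_size
    buf[start:start + len(new_bytes)] = new_bytes

    # Write the same sectors back, again in full.
    disk.seek(first * sector_size)
    disk.write(buf)
    return count


disk = io.BytesIO(bytes(4 * SECTOR_SIZE))      # 4 blank sectors
sectors_moved = patch_bytes(disk, 510, b"ABCD")  # 4 bytes straddling 2 sectors
```

Here a 4-byte change costs two full sectors of buffer and transfer in each direction, and that cost scales with the sector size.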

Allocation

The filesystem defines some unit of allocation to track which disk space is available (unused) and which is allocated to a file. This allocation unit is always a whole number of sectors, since the sector is the fundamental unit of access and physical I/O. A small allocation unit (such as just 1 sector) wastes less slack space, but tends to hurt filesystem (and disk) performance more than it helps, because it requires a larger allocation table and more bookkeeping. A small sector size may also constrain sector addressing and total disk capacity, hence the move to the larger 4KB sector.
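The trade-off can be put into numbers with a short Python sketch. The file sizes, the function names allocation_cost and max_capacity, and the 32-bit sector number are all assumptions chosen for illustration; real filesystems add their own metadata and limits.

```python
def allocation_cost(file_sizes, sectors_per_unit, sector_size=512):
    """Return (allocation units needed, slack bytes wasted) for the files."""
    unit_size = sectors_per_unit * sector_size
    # Each file occupies a whole number of units, rounded up.
    units = sum(-(-size // unit_size) for size in file_sizes)
    slack = units * unit_size - sum(file_sizes)
    return units, slack


def max_capacity(address_bits, sector_size):
    """Largest capacity addressable with sector numbers of the given width."""
    return (2 ** address_bits) * sector_size


files = [100, 5000, 70000]  # hypothetical file sizes in bytes

small = allocation_cost(files, sectors_per_unit=1)  # 512-byte units
large = allocation_cost(files, sectors_per_unit=8)  # 4096-byte units
print(small)  # (148, 676)
print(large)  # (21, 10916)

# Assuming 32-bit sector numbers:
print(max_capacity(32, 512) // 2**40, "TiB")   # 2 TiB
print(max_capacity(32, 4096) // 2**40, "TiB")  # 16 TiB
```

With 1-sector units these three files need 148 table entries and waste 676 bytes; with 8-sector units they need only 21 entries but waste 10916 bytes. The last two lines show how a fixed-width sector number caps total capacity, and how a larger sector raises that cap.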

Note that disk drives and disk controllers did not always impose fixed-size sectors. For instance, Storage Module Drives (SMDs), for which I wrote controller firmware, could have arbitrarily sized "sectors", including different sized "sectors" on each track. Of course a filesystem may have difficulty keeping track of what size is where, hence the extreme simplification of using just one sector size for the whole drive. IBM took it a step further for its PC, and only supported 512-byte sectors (until optical media came along, and again for 4KB sectors). Prior to the IBM PC, sector sizes of 128, 256, and 1024 bytes as well as 512 were in use (especially for floppies, which reused a lot of hard disk concepts including soft sectoring).

Because the data capacity of magnetic media depended on the track format (which included the sector size), and the track format in turn depended on the OS and filesystem, magnetic media (i.e. hard and floppy disks) used to (a long time ago) advertise the unformatted capacity (along with decimal-based "MB" and "GB"). Since PCs made the 512-byte sector the standard size, HDDs no longer support soft sectoring, and the "unformatted capacity" is a meaningless number.