What is the difference between `read` and `sysread`?

Solution 1:

About read:

  • read supports PerlIO layers.
  • read works with any Perl file handle[1].
  • read buffers.
  • read obtains data from the system in fixed-size blocks (8 KiB by default)[2].
  • read may block if less data than requested is available[3].
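
To illustrate the first two points, here is a minimal sketch in which read goes through a decoding layer on an in-memory handle. The variable names ($bytes, $fh, $buf, $got) are made up for the example. Since the layer decodes UTF-8, the length passed to read is counted in characters rather than bytes.

```perl
use strict;
use warnings;

# "caf\xC3\xA9\n" is the UTF-8 encoding of "café\n": 6 bytes, 5 characters.
my $bytes = "caf\xC3\xA9\n";

# An in-memory handle with a decoding layer. No system file descriptor
# is involved, and read works anyway.
open(my $fh, '<:encoding(UTF-8)', \$bytes) or die "open: $!";

# Ask for 4 characters. The layer decodes the two bytes C3 A9 into one character.
my $got = read($fh, my $buf, 4);
die "read: $!" if !defined $got;

printf "read returned %d, length(\$buf) is %d\n", $got, length $buf;
close $fh;
```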

About sysread:

  • sysread doesn't support PerlIO layers (meaning it requires a raw, a.k.a. binary, handle).
  • sysread only works with Perl file handles that map to a system file handle/descriptor[4].
  • sysread doesn't buffer.
  • sysread performs a single system call.
  • sysread returns immediately if data is available to be returned, even if the amount of data is less than the amount requested.
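
The last two points can be seen with a pipe. This is only a sketch, assuming a platform where pipe is available. The handles $r and $w and the 1024-byte request size are arbitrary choices for the example.

```perl
use strict;
use warnings;

pipe(my $r, my $w) or die "pipe: $!";

# Use syswrite so the 5 bytes go straight to the pipe instead of
# sitting in a PerlIO output buffer.
defined(syswrite($w, "hello")) or die "syswrite: $!";

# Request up to 1024 bytes. Only 5 are available, so sysread returns
# those 5 right away instead of waiting for more.
my $n = sysread($r, my $buf, 1024);
die "sysread: $!" if !defined $n;
print "sysread returned $n bytes: $buf\n";

# Once the writer is closed and the pipe is drained, sysread returns 0 (EOF).
close $w;
my $eof = sysread($r, my $rest, 1024);
die "sysread: $!" if !defined $eof;
print "at EOF sysread returned $eof\n";
```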

Summary and conclusions:

  • read works with any Perl file handle, while sysread is limited to Perl file handles mapped to a system file handle/descriptor.
  • read isn't compatible with select[5], while sysread is compatible with select.
  • read can perform decoding for you, while sysread requires that you do your own decoding.
  • read should be faster for very small reads, while sysread should be faster for very large reads.
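
The select point deserves an example. read can pull more data into Perl's buffer than it hands back, so select may report that nothing is ready while data is still waiting in the buffer. sysread has no buffer, so what select reports matches what sysread can return. The following is a minimal sketch using IO::Select on a pipe. The 5-second timeout and 4096-byte request size are arbitrary values picked for the example.

```perl
use strict;
use warnings;
use IO::Select;

pipe(my $r, my $w) or die "pipe: $!";
defined(syswrite($w, "ab")) or die "syswrite: $!";
close $w;    # the reader will see the data, then EOF

my $sel  = IO::Select->new($r);
my $data = '';

while ($sel->count) {
    # Wait up to 5 seconds for a handle to become readable.
    my @ready = $sel->can_read(5);
    last if !@ready;    # timed out

    for my $fh (@ready) {
        # Append whatever is available to the end of $data.
        my $n = sysread($fh, $data, 4096, length $data);
        die "sysread: $!" if !defined $n;

        if ($n == 0) {
            # EOF: remove the handle before closing it.
            $sel->remove($fh);
            close $fh;
        }
    }
}

print "received: $data\n";
```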

Notes:

  1. These include, for example, tied file handles and those created using open(my $fh, '<', \$var).

  2. Before Perl 5.14, read fetched data in 4 KiB blocks. Since 5.14, the block size is configurable when you build perl, with a default of 8 KiB.

  3. In my experience, read will return exactly the amount requested (if possible) when reading from a plain file, but may return less when reading from a pipe. These results are by no means guaranteed.

  4. fileno returns a non-negative number for these. These include, for example, handles that read from plain files, from pipes and from sockets, but not those mentioned in [1].

  5. I'm referring to the 4-argument select, which is the one called by IO::Select.
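
The distinction drawn in notes 1 and 4 can be checked with fileno. This is a small sketch. The helper name has_system_descriptor is made up for this example and is not part of Perl.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the handle maps to a system file descriptor.
sub has_system_descriptor {
    my ($fh) = @_;
    my $fd = fileno($fh);
    return defined($fd) && $fd >= 0;
}

# An in-memory handle, as in note 1: fileno returns -1 for these.
my $var = "some data";
open(my $mem, '<', \$var) or die "open: $!";

# A pipe, as in note 4: both ends have a real descriptor.
pipe(my $r, my $w) or die "pipe: $!";

printf "in-memory handle: %s\n", has_system_descriptor($mem) ? "yes" : "no";
printf "pipe read end:    %s\n", has_system_descriptor($r)   ? "yes" : "no";
```

A check like this can be used to decide between sysread and read when the code is handed a file handle of unknown origin.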