File size limit exceeded in bash

I ran this shell script on a SUSE 10 server (kernel 2.6.16.60, ext3 filesystem).

The script fails on this pipeline:

cat file | awk '{print $1" "$2" "$3}' | sort -n > result

The file is about 3.2 GB, and I get this error message: File size limit exceeded

In this shell, ulimit -f reports unlimited.

After I changed the script to this:

cat file | awk '{print $1" "$2" "$3}' >tmp
sort -n tmp > result

The problem went away.

I don't know why. Can anyone explain this?


Solution 1:

The pipe version needs many more temporary files: reading from a pipe, sort cannot learn the input size up front, so it spills sorted runs to temporary files in /tmp much sooner and merges them at the end. You can verify this quickly with the strace utility.

The pipe version uses a rapidly growing number of temporary files:

for i in {1..200000} ; do echo $i ; done | strace sort -n |& grep -e 'open.*/tmp/'
open("/tmp/sortb9Mhqd", O_RDWR|O_CREAT|O_EXCL, 0600) = 3
open("/tmp/sortqKOVvG", O_RDWR|O_CREAT|O_EXCL, 0600) = 3
open("/tmp/sortb9Mhqd", O_RDONLY)       = 3
open("/tmp/sortqKOVvG", O_RDONLY)       = 4
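
To count the temporary files instead of eyeballing the lines, a small variant of the same experiment works (a sketch assuming GNU strace and bash's |& shorthand; on newer kernels the syscall may show up as openat rather than open):

for i in {1..200000} ; do echo $i ; done | strace -e trace=open sort -n |& grep -c 'open("/tmp/sort'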

The file version doesn't use any temporary files for the same data set. For bigger data sets it uses far fewer temporary files.

for i in {1..200000} ; do echo $i ; done >/tmp/TESTDATA ; strace sort -n /tmp/TESTDATA |& grep -e 'open.*/tmp/'
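
If you have to keep the pipe, the practical workaround is to control where and how sort spills. A hedged sketch, assuming GNU sort (-T and -S are GNU options; /var/tmp is just an example of a directory on a filesystem with more room):

awk '{print $1" "$2" "$3}' file | sort -n -T /var/tmp -S 1G > result

-T moves the temporary files out of /tmp, and -S enlarges the in-memory buffer so fewer and larger runs get written in the first place. Note that the cat in the original pipeline is unnecessary anyway; awk can read the file directly.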