File size limit exceeded in bash
I have tried this shell script on a SUSE 10 server (kernel 2.6.16.60, ext3 filesystem), and it fails:
cat file | awk '{print $1" "$2" "$3}' | sort -n > result
The file is about 3.2 GB, and I get this error message: File size limit exceeded.
In this shell, ulimit -f is unlimited.
After I change the script to this:
cat file | awk '{print $1" "$2" "$3}' >tmp
sort -n tmp > result
the problem is gone.
I don't know why. Can anyone explain this?
Solution 1:
The pipe version needs many more temporary files, which you can inspect quickly with the strace utility. The pipe version creates a rapidly growing number of temporary files:
for i in {1..200000} ; do echo $i ; done | strace sort -n |& grep -e 'open.*/tmp/'
open("/tmp/sortb9Mhqd", O_RDWR|O_CREAT|O_EXCL, 0600) = 3
open("/tmp/sortqKOVvG", O_RDWR|O_CREAT|O_EXCL, 0600) = 3
open("/tmp/sortb9Mhqd", O_RDONLY) = 3
open("/tmp/sortqKOVvG", O_RDONLY) = 4
The file version doesn't use temporary files for the same data set, and for bigger data sets it uses far fewer of them.
for i in {1..200000} ; do echo $i ; done > /tmp/TESTDATA ; strace sort -n /tmp/TESTDATA |& grep -e 'open.*/tmp/'
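If the temporary files are what actually hits the size limit, you can also steer sort around the problem. A minimal sketch, assuming /var/tmp sits on a filesystem with enough free space (GNU sort's -T option sets the temporary directory and -S the in-memory buffer size):

# A larger buffer means fewer, later spills to disk; -T moves the
# spill files off /tmp. awk can read the file directly, so the
# extra cat is unnecessary.
awk '{print $1" "$2" "$3}' file | sort -n -T /var/tmp -S 512M > result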