What are the pros and cons of fs.createReadStream vs fs.readFile in node.js?
I'm mucking about with node.js and have discovered two ways of reading a file and sending it down the wire, once I've established that it exists and have sent the proper MIME type with writeHead:
// read the entire file into memory and then spit it out
fs.readFile(filename, function (err, data) {
  if (err) throw err;
  response.write(data, 'utf8');
  response.end();
});
// read and pass the file as a stream of chunks
fs.createReadStream(filename, {
  'flags': 'r',
  'encoding': 'binary',
  'mode': 0666,
  'bufferSize': 4 * 1024
}).addListener("data", function (chunk) {
  response.write(chunk, 'binary');
}).addListener("close", function () {
  response.end();
});
Am I correct in assuming that fs.createReadStream might provide a better user experience if the file in question was something large, like a video? It feels like it might be less block-ish; is this true? Are there other pros, cons, caveats, or gotchas I need to know?
Solution 1:
A better approach, if you are just going to hook up "data" to "write()" and "close" to "end()":
// 0.3.x style
fs.createReadStream(filename, {
  'bufferSize': 4 * 1024
}).pipe(response);

// 0.2.x style
sys.pump(fs.createReadStream(filename, {
  'bufferSize': 4 * 1024
}), response);
The read.pipe(write) or sys.pump(read, write) approach has the benefit of also adding flow control. So, if the write stream cannot accept data as quickly, it'll tell the read stream to back off, so as to minimize the amount of data getting buffered in memory.
The flags: "r" and mode: 0666 are implied by the fact that it is a FileReadStream. The binary encoding is deprecated -- if an encoding is not specified, it'll just work with the raw data buffers.
Also, you could add some other goodies that will make your file serving a whole lot slicker:
- Sniff for req.headers.range and see if it matches a string like /bytes=([0-9]+)-([0-9]+)/. If so, you want to just stream from that start to end location. (A missing number means 0 or "the end".)
- Hash the inode and creation time from the stat() call into an ETag header. If you get a request header with "if-none-match" matching that header, send back a 304 Not Modified.
- Check the if-modified-since header against the mtime date on the stat object. Send a 304 if it wasn't modified since the date provided.
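The range-sniffing step can be sketched as a small parser. This is an illustration rather than code from the answer; note that it also accepts the open-ended forms (bytes=100- and bytes=-500) that the parenthetical above describes, which the answer's regex alone would not match:

```javascript
// Hypothetical helper: parse a "bytes=start-end" Range header into inclusive
// byte offsets. A missing end means "to the end of the file"; a missing
// start means "the last N bytes". Returns null for anything unparseable.
function parseRange(header, fileSize) {
  const m = /^bytes=([0-9]*)-([0-9]*)$/.exec(header || '');
  if (!m || (m[1] === '' && m[2] === '')) return null;
  const start = m[1] === '' ? fileSize - Number(m[2]) : Number(m[1]);
  const end = (m[1] === '' || m[2] === '') ? fileSize - 1 : Number(m[2]);
  return { start: Math.max(0, start), end: Math.min(end, fileSize - 1) };
}
```

The resulting offsets map directly onto fs.createReadStream's start and end options (end is inclusive): fs.createReadStream(filename, { start, end }).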
Also, in general, if you can, send a Content-Length header. (You're stat-ing the file, so you should have this.)
Solution 2:
fs.readFile will load the entire file into memory, as you pointed out, whereas fs.createReadStream will read the file in chunks of the size you specify.

The client will also start receiving data sooner with fs.createReadStream, since chunks are sent out as they are read, whereas fs.readFile reads the entire file and only then starts sending it to the client. This might be negligible, but can make a difference if the file is very big and the disks are slow.
Think about this, though: if you run these two functions on a 100 MB file, the first one will use 100 MB of memory to load up the file, while the latter would use at most 4 KB.
Edit: I really don't see any reason why you'd use fs.readFile, especially since you said you will be opening large files.
Solution 3:
If it's a big file, then readFile would hog memory, as it buffers all the file content in memory and may hang your system, while a read stream reads in chunks.

Run this code and observe the memory usage in the Performance tab of Task Manager.
var fs = require('fs');

// write a very large file to disk (this loop produces many gigabytes of text)
const file = fs.createWriteStream('./big_file');
for (let i = 0; i <= 1000000000; i++) {
  file.write('Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.\n');
}
file.end();

//..............

// now try to read the whole file into memory at once
fs.readFile('./big_file', (err, data) => {
  if (err) throw err;
  console.log("done !!");
});
In fact, you won't see the "done !!" message: readFile cannot read the file because a single Buffer is not big enough to hold the entire file content.

Now, instead of readFile, use a read stream and monitor the memory usage.
Note: the code is taken from Samer Buna's Node course on Pluralsight.