Node.js Spawn vs. Execute
In an online training video I am watching to learn Node, the narrator says that "spawn is better for longer processes involving large amounts of data, whereas execute is better for short bits of data."
Why is this? What is the difference between the child_process spawn and execute functions in Node.js, and when do I know which one to use?
The main difference is that spawn is more suitable for long-running processes with large output. spawn streams input/output with the child process, while exec buffers the output in a small buffer (200 KB by default, 1 MB since Node.js 12). Also, as far as I know, exec first spawns a subshell and then tries to execute your command in it. To cut a long story short: use spawn when you need a lot of data streamed from the child process, and exec when you need shell features such as pipes or redirects, or when you need to run more than one program in a single command.
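To make the shell point concrete, here is a minimal sketch, assuming a POSIX system where echo and wc are available. The function names are mine, purely for illustration. With exec the pipe is interpreted by the shell; with spawn (no shell by default) the '|' is just another argument handed to echo.

```javascript
const { exec, spawn } = require('child_process');

// exec: the whole string goes to a shell, so the pipe works.
function countWordsWithShell() {
  return new Promise((resolve, reject) => {
    exec('echo one two three | wc -w', (err, stdout) => {
      if (err) return reject(err);
      resolve(Number(stdout.trim())); // wc may pad its output with spaces
    });
  });
}

// spawn: no shell by default, so '|', 'wc' and '-w' are literal arguments to echo.
function echoWithoutShell() {
  return new Promise((resolve, reject) => {
    const child = spawn('echo', ['one', '|', 'wc', '-w']);
    let out = '';
    child.stdout.on('data', (chunk) => { out += chunk; });
    child.on('error', reject);
    child.on('close', () => resolve(out.trim()));
  });
}

countWordsWithShell().then((n) => console.log('exec:', n));  // exec: 3
echoWithoutShell().then((s) => console.log('spawn:', s));    // spawn: one | wc -w
```

The two result lines may print in either order, since both children run concurrently.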
Some useful links: DZone, Hacksparrow
- child process created by spawn()
  - does not spawn a shell
  - streams the data returned by the child process (data flow is constant)
  - has no data transfer size limit
- child process created by exec()
  - does spawn a shell in which the passed command is executed
  - buffers the data (waits until the process closes and transfers the data in one chunk)
  - has a maximum buffer size (the maxBuffer option): 200 KB by default before Node.js v12, increased to 1 MB by default since Node.js v12
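The buffer limit of exec() can be seen with a small sketch like the one below. It assumes a POSIX-style shell for the quoting, and the command (a Node one-liner that writes 5000 characters) and the helper name runWithLimit are made up for the demo. The limit is lowered through the maxBuffer option so the overflow is easy to trigger.

```javascript
const { exec } = require('child_process');

// Illustrative command: a Node one-liner that writes 5000 characters to stdout.
const cmd = `"${process.execPath}" -e "process.stdout.write('x'.repeat(5000))"`;

// Runs the command with the given maxBuffer and reports whether exec failed.
function runWithLimit(maxBuffer) {
  return new Promise((resolve) => {
    exec(cmd, { maxBuffer }, (err, stdout) => {
      resolve({
        failed: Boolean(err),
        code: err ? err.code : null,
        length: stdout.length,
      });
    });
  });
}

// 5000 characters do not fit in 1 KB, so exec reports an error.
runWithLimit(1024).then((r) => console.log('1 KB limit:', r.failed, r.code));
// With the 1 MB limit the full output arrives in the callback.
runWithLimit(1024 * 1024).then((r) => console.log('1 MB limit:', r.failed, r.length));
```

With spawn() there is no such limit, because the chunks are handed to your 'data' listener as they arrive instead of being accumulated.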
- main.js (file)
const { spawn, exec } = require('child_process');

// 'node' is an executable command (can be executed without a shell)
// uses streams to transfer data (spawned.stdout)
const spawned = spawn('node', ['module.js']);
spawned.stdout.on('data', function (msg) {
  console.log(msg.toString());
});

// 'node module.js' runs in the spawned shell
// the buffered data is handled in the callback function
exec('node module.js', function (err, stdout, stderr) {
  if (err) return console.error(err);
  console.log(stdout);
});
- module.js (basically prints a message every second for 5 seconds, then exits)
var count = 0;
var interval = setInterval(function () {
  console.log('module data');
  // stop after five messages; with no timer left, the process exits
  if (++count === 5) clearInterval(interval);
}, 1000);
- the spawn() child process returns the message 'module data' every 1 second for 5 seconds, because the data is 'streamed'
- the exec() child process returns one message only, 'module data module data module data module data module data', after 5 seconds (when the process is closed), because the data is 'buffered'
NOTE that neither the spawn() nor the exec() child processes are designed for running node modules; this demo is just for showing the difference. If you want to run node modules as child processes, use the fork() method instead.
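For completeness, a rough sketch of what the fork() route might look like. The child module here is hypothetical and is written to a temp file only so the example is self-contained; in a real project it would be an ordinary file. The helper name askChild is also mine. fork() starts a new Node process and opens a message channel, so parent and child can exchange messages with send() and 'message' events.

```javascript
const { fork } = require('child_process');
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical child module, written to a temp directory for the demo only.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'fork-demo-'));
const childPath = path.join(dir, 'child.js');
fs.writeFileSync(childPath, `
process.on('message', (msg) => {
  process.send({ reply: msg.toUpperCase() });
});
`);

// Sends one message to a forked child and resolves with its reply.
function askChild(text) {
  return new Promise((resolve, reject) => {
    const child = fork(childPath);
    child.on('error', reject);
    child.on('message', (msg) => {
      child.disconnect(); // close the channel so the child can exit
      resolve(msg);
    });
    child.send(text);
  });
}

askChild('ping').then((msg) => console.log(msg)); // { reply: 'PING' }
```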
A good place to start is the NodeJS documentation.
For 'spawn' the documentation states:
The child_process.spawn() method spawns a new process using the given command, with command line arguments in args. If omitted, args defaults to an empty array.
While for 'exec':
Spawns a shell then executes the command within that shell, buffering any generated output. The command string passed to the exec function is processed directly by the shell and special characters (vary based on shell) need to be dealt with accordingly.
The main thing appears to be whether you need to handle the output of the command or not, which I imagine could be the factor impacting performance (I haven't compared). If you care only about process completion, then 'exec' would be your choice. spawn opens streams for stdout and stderr that emit 'data' events; exec just buffers stdout and stderr and passes them to its callback as strings.
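A small sketch of that last difference, assuming a POSIX-style shell for the exec quoting. The inline child script and the names viaSpawn and viaExec are illustrative only. The spawn version receives Buffer chunks through 'data' events while the child is still running; the exec version gets stdout and stderr as complete strings once the child has finished.

```javascript
const { spawn, exec } = require('child_process');

// Illustrative inline child script: prints "tick" three times, 100 ms apart.
const script =
  "let n=0;const t=setInterval(()=>{console.log('tick');if(++n===3)clearInterval(t)},100)";

// spawn: collect the chunks that arrive on the stdout stream.
function viaSpawn() {
  return new Promise((resolve, reject) => {
    const child = spawn(process.execPath, ['-e', script]);
    const chunks = [];
    child.stdout.on('data', (chunk) => chunks.push(chunk)); // each chunk is a Buffer
    child.on('error', reject);
    child.on('close', (code) => resolve({ code, chunks }));
  });
}

// exec: the callback fires once, with the buffered output as strings.
function viaExec() {
  return new Promise((resolve, reject) => {
    exec(`"${process.execPath}" -e "${script}"`, (err, stdout, stderr) => {
      if (err) return reject(err);
      resolve({ stdout, stderr });
    });
  });
}

viaSpawn().then(({ chunks }) => console.log('spawn chunks:', chunks.length));
viaExec().then(({ stdout }) => console.log('exec stdout type:', typeof stdout));
```

How many chunks spawn delivers depends on timing, so the count is printed rather than assumed.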