Node.js Spawn vs. Execute

In an online training video I am watching to learn Node, the narrator says that "spawn is better for longer processes involving large amounts of data, whereas execute is better for short bits of data."

Why is this? What is the difference between the child_process spawn and execute functions in Node.js, and when do I know which one to use?

The main difference is that spawn is better suited to long-running processes with large output. spawn streams input and output between the parent and the child process, whereas exec buffers the output in a limited buffer (200 KB by default in older Node.js versions, 1 MB since v12). As far as I know, exec also starts a subshell first and then runs your command inside it. To cut a long story short: use spawn when you need a lot of data streamed from the child process, and exec when you need shell features such as pipes or redirects, or need to run more than one program in a single command.
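To illustrate the "shell features" point, here is a minimal sketch of exec running a pipeline. It assumes a POSIX-like shell where printf and wc are available; the helper name countLines is made up for this example.

```javascript
const { exec } = require('child_process');

// Assumption: a POSIX-like shell with printf and wc on the PATH.
// countLines is a hypothetical helper name used only for this sketch.
function countLines(callback) {
  // The whole string is handed to the shell, so the pipe (|) works here.
  exec("printf 'a\\nb\\nc\\n' | wc -l", (err, stdout) => {
    if (err) return callback(err);
    // stdout arrives as one buffered string once the command has finished.
    callback(null, Number(stdout.trim()));
  });
}

countLines((err, n) => {
  if (err) throw err;
  console.log('lines:', n);
});
```

With spawn you would normally have to start both programs yourself and connect their streams, unless you pass the shell option.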

Some useful links: DZone, Hacksparrow

A child process created by spawn()
- does not spawn a shell
- streams the data returned by the child process (data flow is constant)
- has no data transfer size limit
A child process created by exec()
- does spawn a shell in which the passed command is executed
- buffers the data (waits until the process closes and transfers the data in one chunk)
- has a default maximum buffer size (the maxBuffer option) of 200 KB before Node.js v12 and 1 MB since Node.js v12
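A rough sketch of the size limit in practice. To keep it quick, it lowers maxBuffer to 1 KB rather than relying on the default, and the child is just Node printing 4096 characters. The helper names runWithExec and runWithSpawn are made up for this example.

```javascript
const { exec, spawn } = require('child_process');

// Child script: writes 4096 bytes to stdout and exits.
const script = "process.stdout.write('x'.repeat(4096))";

// Hypothetical helper: exec with a deliberately small maxBuffer (1 KB).
function runWithExec(callback) {
  const command = `"${process.execPath}" -e "${script}"`;
  exec(command, { maxBuffer: 1024 }, (err, stdout) => {
    // err is set here because the output exceeds maxBuffer.
    callback(err, stdout);
  });
}

// Hypothetical helper: spawn streams the same output chunk by chunk.
function runWithSpawn(callback) {
  const child = spawn(process.execPath, ['-e', script]);
  let bytes = 0;
  child.stdout.on('data', (chunk) => { bytes += chunk.length; });
  child.on('close', () => callback(null, bytes));
}

runWithExec((err) => {
  console.log('exec:', err ? err.message : 'ok');
});
runWithSpawn((err, bytes) => {
  console.log('spawn received bytes:', bytes);
});
```

Raising maxBuffer in the exec options is the other way out, at the cost of holding the whole output in memory.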

-main.js (file)

const { spawn, exec } = require('child_process');

// 'node' is an executable, so spawn() can run it directly without a shell;
// output arrives in chunks on the child.stdout stream
const child = spawn('node', ['module.js']);
child.stdout.on('data', function (msg) {
    console.log(msg.toString());
});

// 'node module.js' runs inside a shell started by exec();
// the buffered output is passed to the callback once the process has closed
exec('node module.js', function (err, stdout, stderr) {
    if (err) return console.error(err);
    console.log(stdout);
});

-module.js (basically prints a message every second for 5 seconds, then exits)

// count the ticks instead of reading the internal, undocumented _idleStart property
var count = 0;
var interval = setInterval(function () {
    console.log('module data');
    if (++count >= 5) clearInterval(interval);
}, 1000);

The spawn() child process delivers the message module data once per second for 5 seconds, because the data is 'streamed'.
The exec() child process delivers a single message, module data module data module data module data module data, after 5 seconds (when the process closes), because the data is 'buffered'.

NOTE: neither spawn() nor exec() is designed specifically for running Node modules; this demo only illustrates the difference. If you want to run Node modules as child processes, use the fork() method instead.

A good place to start is the Node.js documentation.

For 'spawn' the documentation states:

The child_process.spawn() method spawns a new process using the given command, with command line arguments in args. If omitted, args defaults to an empty array.

While for 'exec':

Spawns a shell then executes the command within that shell, buffering any generated output. The command string passed to the exec function is processed directly by the shell and special characters (vary based on shell) need to be dealt with accordingly.

The main thing appears to be whether you need to handle the output of the command as it is produced, which I imagine could be the factor affecting performance (I haven't compared). If you only care about process completion, then 'exec' would be your choice. spawn opens streams for stdout and stderr that emit 'data' events; exec just passes stdout and stderr to its callback as buffered strings.
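To make the streams-versus-strings distinction concrete, here is a small sketch that collects spawn's 'data' events into strings, which is roughly the shape of result exec hands to its callback. The helper name runBuffered is made up, and this is not how Node implements exec internally.

```javascript
const { spawn } = require('child_process');

// Hypothetical helper: gather the streamed output of a spawned process.
function runBuffered(command, args) {
  return new Promise((resolve, reject) => {
    const child = spawn(command, args);
    let stdout = '';
    let stderr = '';
    child.stdout.setEncoding('utf8');
    child.stderr.setEncoding('utf8');
    // Each 'data' event carries one chunk; append it as it arrives.
    child.stdout.on('data', (chunk) => { stdout += chunk; });
    child.stderr.on('data', (chunk) => { stderr += chunk; });
    child.on('error', reject);
    // 'close' fires after the process has ended and its streams are closed.
    child.on('close', (code) => resolve({ code, stdout, stderr }));
  });
}

runBuffered(process.execPath, ['-e', "console.log('out'); console.error('err')"])
  .then((result) => console.log(result));
```

If you only need the final strings, exec saves you this bookkeeping; if you want to react to each chunk, work with the 'data' handlers directly.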
