Cluster and Fork mode difference in PM2

I've searched a lot for an answer to this question, but I couldn't find a clear explanation. Is the only difference that a clustered app can be scaled out and a forked app cannot?

PM2's official site explains what Cluster mode can do, but no one mentions the advantages of Fork mode (maybe it can get the NODE_APP_INSTANCE variable).

I feel like Cluster might be a special case of Fork, because Fork seems to be the general-purpose mode. So I guess that, from PM2's point of view, Fork just means a 'forked process' and Cluster means a 'forked process that can be scaled out'. Then why should I use Fork mode?


The main difference between fork_mode and cluster_mode is that it tells pm2 to use either the child_process.fork API or the cluster API.

What does this mean internally?

Fork mode

Think of fork mode as basic process spawning. It lets you change the exec_interpreter, so that you can run a PHP or Python server with pm2. The exec_interpreter is the "command" used to start the child process. By default, pm2 uses node, so pm2 start server.js does something like:

require('child_process').spawn('node', ['server.js'])

This mode is very useful because it enables a lot of possibilities. For example, you could launch multiple servers on pre-established ports which will then be load-balanced by HAProxy or Nginx.
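As a rough sketch of the "pre-established ports" idea: each fork-mode instance can derive its own port from a base port plus an instance index, and HAProxy or Nginx can then be pointed at those ports. The helper names portFor and createServer and the base port 3000 are made up for illustration, and the code assumes PM2 exposes the instance index through NODE_APP_INSTANCE, as the question suggests.

```javascript
const http = require('http');

// Hypothetical helper: base port plus the instance index.
// Assumes PM2 sets NODE_APP_INSTANCE (0, 1, 2, ...) for each instance;
// falls back to index 0 when the variable is missing or not a number.
function portFor(basePort, env = process.env) {
  const index = Number.parseInt(env.NODE_APP_INSTANCE ?? '0', 10);
  return basePort + (Number.isNaN(index) ? 0 : index);
}

// Hypothetical helper: a trivial server that reports which process answered.
function createServer() {
  return http.createServer((req, res) => {
    res.end(`handled by pid ${process.pid}\n`);
  });
}

// In server.js you would then call, for example:
//   createServer().listen(portFor(3000));
```

With three instances this would listen on 3000, 3001 and 3002, and the external load balancer would list those three ports as upstream servers.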

Cluster mode

Cluster mode works only with node as its exec_interpreter, because it relies on the Node.js cluster module (e.g. isMaster, the fork method, etc.). This is great for zero-configuration process management because the process is automatically forked into multiple instances. For example, pm2 start -i 4 server.js will launch 4 instances of server.js and let the cluster module handle load balancing.
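To make the cluster idea more concrete, here is a minimal sketch of the pattern the Node.js cluster module supports. This is not PM2's actual implementation: resolveInstanceCount and startCluster are hypothetical names, and the rule "0 means one instance per CPU" is a simplification of the -i flag (negative values are not modelled).

```javascript
const cluster = require('cluster');
const http = require('http');
const os = require('os');

// Hypothetical helper: a positive count is used as given,
// anything else (e.g. 0) means one instance per CPU core.
function resolveInstanceCount(requested, cpuCount = os.cpus().length) {
  return requested > 0 ? requested : cpuCount;
}

// Hypothetical helper: the primary process forks the workers,
// and every worker listens on the same port.
function startCluster(requested, port) {
  // isPrimary is the newer name; isMaster is the older one.
  const isPrimary = cluster.isPrimary ?? cluster.isMaster;
  if (isPrimary) {
    const count = resolveInstanceCount(requested);
    for (let i = 0; i < count; i++) {
      cluster.fork(); // starts another copy of this script as a worker
    }
  } else {
    http.createServer((req, res) => {
      res.end(`handled by worker ${process.pid}\n`);
    }).listen(port); // the cluster module distributes connections
  }
}

// Example: startCluster(4, 3000);
```

Because all workers share one listening port, no external load balancer is needed, which is the main practical difference from the fork-mode setup with separate ports.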


Node.js runs your JavaScript on a single thread.

That means a single Node.js process can use only one core of, say, a quad-core CPU.

Running the app as one such process is called fork_mode.

We use it for local dev.

pm2 start server.js -i 0 runs one Node.js process on each core of your CPU.

It automatically load-balances incoming requests across them (the app should be stateless).

All on the same port.

This is called cluster_mode.

It is used in production for the sake of performance.

You may also choose to do this on local dev if you want to stress test your PC :)