Should the "node_modules" folder be included in the git repository?

I'm wondering if we should be tracking node_modules in our repo or doing an npm install when checking out the code?


Solution 1:

The answer is not as easy as Alberto Zaccagni suggests. If you develop applications (especially enterprise applications), including node_modules in your git repo is a viable choice, and which alternative you choose depends on your project.

Because he argued very well against checking in node_modules, I will concentrate on the arguments for it.

Imagine that you have just finished an enterprise app and you will have to support it for 3-5 years. You definitely don't want to depend on someone's npm module which could disappear tomorrow, leaving you unable to update your app anymore.

Or you have private modules which are not accessible from the internet, so you can't build your app from what is on the internet alone. Or maybe you don't want your final build to depend on the npm service for some reason.

You can find pros and cons in this Addy Osmani article (although it is about Bower, the situation is almost the same). I will end with a quote from the Bower homepage and Addy's article:

“If you aren’t authoring a package that is intended to be consumed by others (e.g., you’re building a web app), you should always check installed packages into source control.”

Solution 2:

Module details are stored in package.json, and that is enough. There's no need to check in node_modules.

People used to store node_modules in version control to lock dependencies of modules, but with npm shrinkwrap that's not needed anymore.

Another justification for this point, as @ChrisCM wrote in the comment:

Also worth noting, any modules that involve native extensions will not work architecture to architecture, and need to be rebuilt. Providing concrete justification for NOT including them in the repo.

Solution 3:

I would recommend against checking in node_modules because of packages such as PhantomJS and node-sass, which install the appropriate binary for the current system.

This means that if one developer runs npm install on Linux and checks in node_modules, it won't work for another developer who clones the repo on Windows.
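To get a feeling for how much of an existing node_modules folder is tied to one platform, something like the following could be used. It is only a rough sketch: the function name `find_native_artifacts` and the list of file suffixes are my own assumptions for illustration, not a rule npm or any package follows.

```python
from pathlib import Path

# Suffixes that usually indicate compiled, platform-specific files.
# This set is an illustrative assumption, not an exhaustive rule.
NATIVE_SUFFIXES = {".node", ".so", ".dll", ".dylib", ".exe"}


def find_native_artifacts(node_modules):
    """Return sorted relative paths of files that look platform-specific."""
    root = Path(node_modules)
    if not root.is_dir():
        # Nothing installed (or wrong path): report no findings.
        return []
    return sorted(
        path.relative_to(root).as_posix()
        for path in root.rglob("*")
        if path.is_file() and path.suffix.lower() in NATIVE_SUFFIXES
    )
```

Calling `find_native_artifacts("node_modules")` in a project root lists the files that would most likely break for a colleague on another operating system if the folder were committed.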

It's better to check in the tarballs which npm install downloads and point npm-shrinkwrap.json at them. You can automate this process using shrinkpack.

Solution 4:

This topic is pretty old, I see, but the arguments given here need an update because the situation in npm's ecosystem has changed.

I'd always advise not to put node_modules under version control. Nearly all the benefits of doing so, as listed in the accepted answer, are outdated by now.

  1. Published packages can't be revoked from the npm registry that easily anymore. So you don't have to fear losing dependencies your project has relied on before.

  2. Putting the package-lock.json file in VCS helps with frequently updated dependencies, which would otherwise probably result in different setups despite relying on the same package.json file.
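To illustrate the second point, here is a minimal sketch that puts the version range declared in package.json next to the exact version pinned in the lock file. The helper name `locked_versions` is made up for this example, and it assumes the usual lock file layout: a "packages" map keyed by "node_modules/<name>" in lockfileVersion 2 and 3, and a "dependencies" map in lockfileVersion 1.

```python
import json
from pathlib import Path


def locked_versions(project_dir):
    """Map each direct dependency to (declared range, locked version)."""
    project = Path(project_dir)
    manifest = json.loads((project / "package.json").read_text(encoding="utf-8"))
    lock = json.loads((project / "package-lock.json").read_text(encoding="utf-8"))

    # Collect the ranges a developer wrote by hand.
    declared = {}
    declared.update(manifest.get("dependencies", {}))
    declared.update(manifest.get("devDependencies", {}))

    packages = lock.get("packages", {})    # lockfileVersion 2 and 3
    legacy = lock.get("dependencies", {})  # lockfileVersion 1

    result = {}
    for name, wanted in declared.items():
        entry = packages.get(f"node_modules/{name}") or legacy.get(name) or {}
        # version is None when the lock file has no entry for this package.
        result[name] = (wanted, entry.get("version"))
    return result
```

A range such as "^1.3.0" can resolve to different versions on different days, while the locked version stays the same for everyone who installs from the committed lock file.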

So, putting node_modules into VCS because your build tools have to work offline might be considered the only legitimate use case left. However, node_modules usually grows pretty fast. Any update will change a lot of files, and this affects repositories in different ways. If you really consider the long-term effects, that might be an impediment as well.

Centralized VCSs like svn require transferring committed and checked-out files over the network, which is going to be slow as hell when it comes to checking out or updating a node_modules folder.

When it comes to git, this high number of additional files will instantly pollute the repository. Keep in mind that git isn't tracking differences between versions of a file, but conceptually stores a full copy of each version as soon as a single character has changed (packing can compress this later, but every changed file is still a new object). Every update to any dependency will result in another large changeset. Your git repository will quickly grow huge because of this, which affects backups and remote synchronization. If you decide to remove node_modules from the git repository later, it is still part of the history. If you have distributed your git repository to some remote server (e.g. for backup), cleaning it up is another painful and error-prone task you'd be running into.
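The point about git can be illustrated with a small sketch that computes the id git assigns to a file's content, assuming the default SHA-1 object format. The function name `git_blob_id` and the sample file contents are made up for this example.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Compute the id git gives a blob: SHA-1 over a short header plus the content."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


# Two versions of a dependency file that differ by a single character.
before = b"module.exports = 1;\n"
after = b"module.exports = 2;\n"

# Different ids mean git records two separate objects for this one file.
print(git_blob_id(before) == git_blob_id(after))  # False
```

Multiply this by the thousands of files a dependency update may touch, and it becomes clear why a repository containing node_modules keeps collecting new objects with every update.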

Thus, if you care about efficient processes and like to keep things "small", I'd rather use a separate artifact repository such as Nexus Repository (or just some HTTP server with ZIP archives) that provides a previously fetched set of dependencies for download.