Should I check in folder "node_modules" to Git when creating a Node.js app on Heroku?

I followed the basic getting started instructions for Node.js on Heroku here:

https://devcenter.heroku.com/categories/nodejs

These instructions don't tell you to add node_modules to .gitignore, and therefore imply that folder node_modules should be checked in to Git. When I included node_modules in the Git repository, my getting-started application ran correctly.

When I followed the more advanced example at:

Building a Real-time, Polyglot Application with Node.js, Ruby, MongoDB and Socket.IO

https://github.com/mongolab/tractorpush-server (source)

It instructed me to add folder node_modules to file .gitignore. So I removed folder node_modules from Git, added it to file .gitignore, and then redeployed. This time the deploy failed like so:

-----> Heroku receiving push
-----> Node.js app detected
-----> Resolving engine versions
       Using Node.js version: 0.8.2
       Using npm version: 1.0.106
-----> Fetching Node.js binaries
-----> Vendoring node into slug
-----> Installing dependencies with npm
       Error: npm doesn't work with node v0.8.2
       Required: node@0.4 || 0.5 || 0.6
           at /tmp/node-npm-5iGk/bin/npm-cli.js:57:23
           at Object.<anonymous> (/tmp/node-npm-5iGk/bin/npm-cli.js:77:3)
           at Module._compile (module.js:449:26)
           at Object.Module._extensions..js (module.js:467:10)
           at Module.load (module.js:356:32)
           at Function.Module._load (module.js:312:12)
           at Module.require (module.js:362:17)
           at require (module.js:378:17)
           at Object.<anonymous> (/tmp/node-npm-5iGk/cli.js:2:1)
           at Module._compile (module.js:449:26)
       Error: npm doesn't work with node v0.8.2
       Required: node@0.4 || 0.5 || 0.6
           at /tmp/node-npm-5iGk/bin/npm-cli.js:57:23
           at Object.<anonymous> (/tmp/node-npm-5iGk/bin/npm-cli.js:77:3)
           at Module._compile (module.js:449:26)
           at Object.Module._extensions..js (module.js:467:10)
           at Module.load (module.js:356:32)
           at Function.Module._load (module.js:312:12)
           at Module.require (module.js:362:17)
           at require (module.js:378:17)
           at Object.<anonymous> (/tmp/node-npm-5iGk/cli.js:2:1)
           at Module._compile (module.js:449:26)
       Dependencies installed
-----> Discovering process types
       Procfile declares types -> mongod, redis, web
-----> Compiled slug size is 5.0MB
-----> Launching... done, v9

Running "heroku ps" confirms the crash. OK, no problem, so I rolled back the change, added folder node_modules back to the Git repository and removed it from file .gitignore. However, even after reverting, I still get the same error message on deploy, but now the application is running correctly again. Running "heroku ps" tells me the application is running.

What's the right way to do this? Include folder node_modules or not? And why would I still be getting the error message after I roll back? My guess is the Git repository is in a bad state on the Heroku side.

I am the Node language owner at Heroku and the answer is simple: No. Do not check node_modules in to Heroku apps.
@hunterloftis 'Do not check node_modules in to' or 'Do not check node_modules into' ? To clarify, as the Node language owner at Heroku, do you want us to upload our entire node_modules via our git push or not? I prefer not to due to bandwidth waste and the fact that Heroku will get them on the backend of my git push; however, I have had to edit files in my node_modules manually to get Heroku to load my app. I have therefore had to ignore node_modules minus the whole module that included my edited file to get it to work.

Josh Correia

Second Update

The FAQ is not available anymore.

From the documentation of shrinkwrap:

If you wish to lock down the specific bytes included in a package, for example to have 100% confidence in being able to reproduce a deployment or build, then you ought to check your dependencies into source control, or pursue some other mechanism that can verify contents rather than versions.

Shannon and Steven mentioned this before but I think it should be part of the accepted answer.

Update

The source listed for the below recommendation has been updated. They are no longer recommending the node_modules folder be committed.

Usually, no. Allow npm to resolve dependencies for your packages. For packages you deploy, such as websites and apps, you should use npm shrinkwrap to lock down your full dependency tree: https://docs.npmjs.com/cli/shrinkwrap

Original Post

For reference, npm FAQ answers your question clearly:

Check node_modules into git for things you deploy, such as websites and apps. Do not check node_modules into git for libraries and modules intended to be reused. Use npm to manage dependencies in your dev environment, but not in your deployment scripts.

and for some good rationale for this, read Mikeal Rogers' post on this.

Source: https://docs.npmjs.com/misc/faq#should-i-check-my-node-modules-folder-into-git


This is not correct - in fact it is very bad idea. If you are developing on Windows then deploying on Linux, you will need to rebuild node_modules when you deploy. Which means - chaos. Lots of modified files, and no idea what to do.
That's not possible - some of our devs develop targeting Windows, others targeting Linux, but the same code base. The best approach would be to not commit node modules - oops.
@user3690202 Sounds like you have a very unconventional case, rather than the norm, so saying "this is not correct" is probably an overstatement. Having said that, not sure what your exact use case is, but I can't think of any reason for using both Windows and Linux for development. Stick to one, and run tests or QA on all platforms you support.
@Kostia Our use case is a pretty common one. We are volunteers and using our own machines, not company ones. Seems like a pretty common situation for open source.
@Adam tangentially, could you add the files being compiled to .gitignore? That way, the source is in git, and any compiled components are not, similarly to how dist or output folders are gitignored in grunt and gulp projects.
Peter Mortensen

My biggest concern with not checking folder node_modules into Git is that 10 years down the road, when your production application is still in use, npm may not be around. Or npm might become corrupted; or the maintainers might decide to remove the library that you rely on from their repository; or the version you use might be trimmed out.

This can be mitigated with repository managers like Maven, because you can always use your own local Nexus (Sonatype) or Artifactory to maintain a mirror with the packages that you use. As far as I understand, such a system doesn't exist for npm. The same goes for client-side library managers like Bower and Jam.js.

If you've committed the files to your own Git repository, then you can update them when you like, and you have the comfort of repeatable builds and the knowledge that your application won't break because of some third-party action.


Plenty of options today: Nexus (issues.sonatype.org/browse/NEXUS-5852), Artifactory (jfrog.com/jira/browse/RTFACT-5143), npm_lazy (github.com/mixu/npm_lazy), npm-lazy-mirror (npmjs.org/package/npm-lazy-mirror), etc.
Quote from npmjs FAQ: "If you are paranoid about depending on the npm ecosystem, you should run a private npm mirror or a private cache.". I think this points to the issue you are referring, right?
Npm isn't going to disappear over night, so the benefit doesn't really pair up well with the loss of clarity in your commit history and your huge bundle size. If someone is building an application that they think will still be active in 10 years, it's reasonable to expect that it will receive a lot of maintenance along the way. The point about NPM outages is a much better argument though, although there are probably better ways to mitigate that risk than committing to source.
Even one month down the road is dangerous if you don't commit your dependencies (preferably in a separate repository though). As I found one morning when I cloned one of my projects and found a package version had been removed from npm. I spent half a day changing all my versions of cascading dependencies to get npm update to work and build again.
Peter Mortensen

You should not include folder node_modules in your .gitignore file (or rather you should include folder node_modules in your source deployed to Heroku).

If folder node_modules:

exists then npm install will use those vendored libraries and will rebuild any binary dependencies with npm rebuild.

doesn't exist then npm install will have to fetch all dependencies itself which adds time to the slug compile step.

See the Node.js buildpack source for these exact steps.

However, the original error looks to be an incompatibility between the versions of npm and Node.js. It is a good idea to always explicitly set the engines section of your package.json file according to this guide to avoid these types of situations:

{
  "name": "myapp",
  "version": "0.0.1",
  "engines": {
    "node": "0.8.x",
    "npm":  "1.1.x"
  }
}

This will ensure development/production parity and reduce the likelihood of such situations in the future.
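
As a rough illustration of what an x-range such as 0.8.x allows, here is a minimal JavaScript sketch. The matchesXRange helper is hypothetical and only handles plain x-ranges; npm's real range parsing is far more general, so treat this as a reading aid rather than a replacement for it.

```javascript
// Sketch: does `version` match a simple x-range such as "0.8.x"?
// Assumption: plain dotted versions only, no pre-release tags or operators.
function matchesXRange(version, range) {
  const versionParts = version.replace(/^v/, '').split('.');
  const rangeParts = range.split('.');
  // Every range segment must be a wildcard or equal the same version segment.
  return rangeParts.every((part, i) =>
    part === 'x' || part === '*' || part === versionParts[i]);
}

const engines = { node: '0.8.x', npm: '1.1.x' };
console.log(matchesXRange('v0.8.2', engines.node)); // true
console.log(matchesXRange('1.0.106', engines.npm)); // false
```

The second call uses the npm version shown in the failed deploy log from the question (1.0.106), which falls outside 1.1.x.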


Thanks for the help Ryan. That got me past the npm version error but now it fails when compiling the redis package. The error message is "OSError: [Errno 2] No such file or directory: '/Users/Jason/tastemade/tastebase/node_modules/redis-url/node_modules/redis/node_modules/hiredis/build'". It looks like it's using a path from my local box on the heroku servers. Are there certain files in the node_modules I need to add to .gitignore?
I'm not sure what's going on with that particular library, but I'd try excluding node_modules from git in this case and seeing if that helps (forcing npm to fetch everything itself and ensuring a fresh build environment).
@RyanDaigle Best practice now (Nov 2013) recommended by both npm (npmjs.org/doc/…) and heroku (devcenter.heroku.com/articles/…) is to check in node_modules to git. Would you update your answer (as it has top billing)?
While pushing to heroku you'll get the output "-----> Caching node_modules directory for future builds". This is to shorten future slug compilation.
I have a problem: the node_modules file path is too long to commit. Git won't find the files.
Peter Mortensen

I was going to leave this after this comment: Should I check in folder "node_modules" to Git when creating a Node.js app on Heroku?

But Stack Overflow was formatting it weirdly.

If you don't have identical machines and are checking in node_modules, do a .gitignore on the native extensions. Our .gitignore looks like:

# Ignore native extensions in the node_modules folder (things changed by npm rebuild)
node_modules/**/*.node
node_modules/**/*.o
node_modules/**/*.a
node_modules/**/*.mk
node_modules/**/*.gypi
node_modules/**/*.target
node_modules/**/.deps/
node_modules/**/build/Makefile
node_modules/**/**/build/Makefile

Test this by first checking everything in, and then have another developer do the following:

rm -rf node_modules
git checkout -- node_modules
npm rebuild
git status

Ensure that no files changed.


Just added this. Solved my issue. The windows github kept crashing trying to go over 7000+ node_module files :/
Peter Mortensen

I believe that npm install should not run in a production environment. There are several things that can go wrong - npm outage, download of newer dependencies (shrinkwrap seems to have solved this) are two of them.

On the other hand, folder node_modules should not be committed to Git. Apart from their big size, commits including them can become distracting.

The best solution would be this: npm install should run in a CI environment that is similar to the production environment. All tests will run there, and a zipped release file will be created that includes all dependencies.


Why would you have a step that runs on CI that wouldn't run as part of your deployment? This means you don't have parity between the 2 systems! As the answer says above - commit the folder just ignore the native extensions, that way you are covered for things like npm outages
Thanks for your comment. I believe that the node_modules that run on your production server should be generated from an npm install, not from whatever the devs have committed. A dev's node_modules folder does not necessarily match the package.json contents.
Peter Mortensen

I have been using both committing the node_modules folder and shrink-wrapping. Both solutions did not make me happy.

In short: a committed node_modules folder adds too much noise to the repository. And npm-shrinkwrap.json is not easy to manage, and there isn't any guarantee that some shrink-wrapped project will build in a few years.

I found that Mozilla was using a separate repository for one of their projects: https://github.com/mozilla-b2g/gaia-node-modules

So it did not take me long to implement this idea in a Node.js CLI tool: https://github.com/bestander/npm-git-lock

Just before every build, add:

npm-git-lock --repo [git@bitbucket.org:your/dedicated/node_modules/git/repository.git]

It will calculate the hash of your package.json file and will either check out folder node_modules content from a remote repository, or, if it is a first build for this package.json file, will do a clean npm install and push the results to the remote repository.
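
The hash-keyed lookup described above can be sketched roughly as follows. This is not npm-git-lock's actual implementation: the manifestKey name and the choice of SHA-1 are assumptions made purely for illustration. The point is only that identical package.json contents map to the same key, while any change produces a different one.

```javascript
const crypto = require('crypto');

// Returns a hex SHA-1 digest of the given package.json text.
// Assumption: the raw file text is hashed as-is, so even whitespace changes alter the key.
function manifestKey(packageJsonText) {
  return crypto.createHash('sha1').update(packageJsonText, 'utf8').digest('hex');
}

const manifestA = JSON.stringify({ name: 'myapp', dependencies: { express: '2.5.x' } });
const manifestB = JSON.stringify({ name: 'myapp', dependencies: { express: '3.0.x' } });

console.log(manifestKey(manifestA) === manifestKey(manifestA)); // true
console.log(manifestKey(manifestA) === manifestKey(manifestB)); // false
```

A key like this could name a branch or tag in the dedicated node_modules repository, so a build first looks for an existing snapshot before falling back to a clean install.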


Peter Mortensen

Explicitly adding a npm version to file package.json ("npm": "1.1.x") and not checking in folder node_modules to Git worked for me.

It may be slower to deploy (since it downloads the packages each time), but I couldn't get the packages to compile when they were checked in. Heroku was looking for files that only existed on my local box.


In case this is still up for debate, I would take a look at this stackoverflow post which is almost a duplicate of your question above: stackoverflow.com/questions/11459733/… Basically, it seems the convention is to check in node_modules, and manage your versions of those modules locally. This seems pretty reasonable, and perhaps the most succinct explanation is this: mikealrogers.com/posts/nodemodules-in-git.html Good luck!
Peter Mortensen

Instead of checking in folder node_modules, make a package.json file for your application.

The package.json file specifies the dependencies of your application. Heroku can then tell npm to install all of those dependencies. The tutorial you linked to contains a section on package.json files.


I do have a package.json. It has the following: { "name": "node-example", "version": "0.0.1", "dependencies": { "express": "2.5.x", "redis-url": "0.1.0", "mongodb": ">=0.9.9" }, "engines": { "node": "0.8.x" } }
I did on my local box to create the node_modules directory. That's what I checked in, then removed, then added back.
After looking at the tutorial more, it seems as though they are committing node_modules. In that case, I'm not sure if there's a way to not commit node_modules. Sorry
Peter Mortensen

From "node_modules" in Git:

To recap. Only checkin node_modules for applications you deploy, not reusable packages you maintain. Any compiled dependencies should have their source checked in, not the compile targets, and should $ npm rebuild on deploy.

My favorite part:

All you people who added node_modules to your gitignore, remove that shit, today, it’s an artifact of an era we’re all too happy to leave behind. The era of global modules is dead.

(The original link was this one, but it is now dead. Thanks @Flavio for pointing it out.)


The site you linked seems to have been allowed to expire and is now full of scammy ads. I wish those ads were "artifacts of an era we’d all be too happy to leave behind".
@FlavioCopes Updated my answer with link from Wayback Machine.
Peter Mortensen

I am using this solution:

Create a separate repository that holds folder node_modules. If you have native modules that must be built for a specific platform, create a separate repository for each platform.

Attach these repositories to your project repository with git submodule:

git submodule add .../your_project_node_modules_windows.git node_modules_windows
git submodule add .../your_project_node_modules_linux_x86_64 node_modules_linux_x86_64

Create a link from the platform-specific node_modules directory to node_modules, and add node_modules to .gitignore.

Run npm install.

Commit the submodule repository changes.

Commit your project repository changes.

So you can easily switch between node_modules on different platforms (for example, if you are developing on OS X and deploying to Linux).


Peter Mortensen

Scenario 1:

One scenario:

You use a package that gets removed from npm. If you have all the modules in the folder node_modules, then it won't be a problem for you. If you do only have the package name in the package.json, you can't get it anymore.

If a package is less than 24 hours old, its author can easily remove it from npm. If it's more than 24 hours old, the author needs to contact npm support.

But:

If you contact support, they will check to see if removing that version of your package would break any other installs. If so, we will not remove it.

read more

So the chances for this are low, but there is scenario 2...

Scenario 2:

Another scenario where this is the case:

You develop an enterprise version of your software or a very important software and write in your package.json:

"dependencies": {
    "studpid-package": "~1.0.1"
}

You use the method function1(x) of that package.

Now the developers of studpid-package rename the method function1(x) to function2(x), and they make a mistake: instead of treating this as a breaking change, they only bump the version of their package from 1.0.1 to 1.0.2. That's a problem because the next time you run npm install, you will accept version 1.0.2, since the tilde in "studpid-package": "~1.0.1" allows any newer patch release of 1.0.

Calling function1(x) can cause errors and problems now.
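
To make the tilde rule concrete, here is a minimal JavaScript sketch of how a full ~major.minor.patch range behaves. The satisfiesTilde and parseVersion helpers are hypothetical and cover only plain numeric versions; npm's own semver handling also deals with pre-release tags and partial versions.

```javascript
// Parses "major.minor.patch" into three numbers.
// Assumption: plain numeric versions only (no pre-release or build metadata).
function parseVersion(version) {
  const parts = version.split('.').map(Number);
  if (parts.length !== 3 || parts.some(Number.isNaN)) {
    throw new Error('Expected major.minor.patch, got: ' + version);
  }
  return parts;
}

// True when `version` falls inside `~base`:
// same major and minor, and a patch number at least as high as the base.
function satisfiesTilde(version, base) {
  const [major, minor, patch] = parseVersion(version);
  const [baseMajor, baseMinor, basePatch] = parseVersion(base);
  return major === baseMajor && minor === baseMinor && patch >= basePatch;
}

console.log(satisfiesTilde('1.0.2', '1.0.1')); // true
console.log(satisfiesTilde('1.1.0', '1.0.1')); // false
console.log(satisfiesTilde('1.0.0', '1.0.1')); // false
```

So with "~1.0.1", a new patch release is picked up automatically on the next npm install, while a minor or major bump is not.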

Pushing the whole node_modules folder (often more than 100 MB) to your repository will cost you disk space. A few KB (package.json only) compared with hundreds of MB (package.json & node_modules)... Think about it.

You could do it / should think about it if:

the software is very important.

it costs you money when something fails.

you don't trust the npm registry. npm is centralized and could theoretically be shut down.

You don't need to publish the node_modules folder in 99.9% of the cases if:

you develop a software just for yourself.

you've programmed something and just want to publish the result on GitHub because someone else could maybe be interested in it.

If you don't want the node_modules to be in your repository, just create a .gitignore file and add the line node_modules.


laggingreflex

If you're rolling your own modules specific to your application, you can either:

Keep those (and only those) in your application's /node_modules folder and move all the other dependencies out to the parent ../node_modules folder. This works because Node.js's CommonJS module resolution looks for node_modules in the current directory first, then moves up to each parent directory in turn until it reaches the root of the file system. See: https://nodejs.org/api/modules.html

Or gitignore all /node_modules/* except your /node_modules/your-modules. See: Make .gitignore ignore everything except a few files

This use case is pretty awesome. It lets you keep modules you created specifically for your application nicely with it and doesn't clutter with dependencies which can be installed later.
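
For the first option, the parent-directory lookup can be sketched in JavaScript as below. The nodeModulesLookupPaths function and the example path are hypothetical, and the sketch is simplified: it uses POSIX-style paths and ignores global folders. Inside a real module, Node.js exposes its own list as module.paths.

```javascript
const path = require('path');

// Lists the node_modules directories checked for a bare require()
// issued from `startDir`, nearest directory first.
function nodeModulesLookupPaths(startDir) {
  const paths = [];
  let dir = path.posix.resolve('/', startDir);
  while (true) {
    // Node.js does not append node_modules to a directory already named node_modules.
    if (path.posix.basename(dir) !== 'node_modules') {
      paths.push(path.posix.join(dir, 'node_modules'));
    }
    const parent = path.posix.dirname(dir);
    if (parent === dir) break; // reached the file system root
    dir = parent;
  }
  return paths;
}

console.log(nodeModulesLookupPaths('/home/me/projects/app'));
// [
//   '/home/me/projects/app/node_modules',
//   '/home/me/projects/node_modules',
//   '/home/me/node_modules',
//   '/home/node_modules',
//   '/node_modules'
// ]
```

The application's own node_modules is searched first, and the parent folder's node_modules second, which is why dependencies moved one level up are still found.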