Cli: [FEATURE] Do not remove node_modules on npm ci

Created on 6 Dec 2019  ·  53 Comments  ·  Source: npm/cli

What / Why

I would really like to see a flag like npm ci --keep to do an incremental update on our build server, as this would speed up our deployments a lot. This has been suggested before on GitHub and in the community forum. The last update, on the 7th of October, said that this was being reviewed by the CLI team. Could someone post an update on this? :-)

Enhancement

All 53 comments

This is not what ci / cleaninstall is meant to do. The current behaviour is correct. What you want to use is npm shrinkwrap.

We shipped an update so that npm ci no longer deletes the node_modules _folder_ itself, only its _contents_ (as originally requested on that post). The purpose of the npm ci command is to delete everything and start from a clean slate. If you want to keep your old node_modules, what you need is npm i.

Thanks for your replies! Sorry for my late reply. I've looked at npm shrinkwrap but is this intended to run on our build server for continuous integration? When running this command it renames my package-lock.json to npm-shrinkwrap.json but what should I then run during CI? Just npm install to have an incremental update? Or should I run npm ci but that will delete all packages again :-( What I'm looking for is a command that does an incremental update but will install exactly what is in our package-lock.json

@claudiahdz; My understanding is that running npm install during CI will update the package-lock.json and that could mean that running the same build a couple of weeks later would install different packages. Is that incorrect?

P.s. I thought npm ci was short for Continuous Integration

As referenced here: https://github.com/npm/npm/issues/20104#issuecomment-403321557

The current behavior is problematic if you're using npm ci inside of a Docker container (which is quite common for Continuous Integration) and you have a bind mount on node_modules

It causes the following error:

webpack_1   | npm ERR! path /var/www/project/docker-config/webpack-dev-devmode/node_modules
webpack_1   | npm ERR! code EBUSY
webpack_1   | npm ERR! errno -16
webpack_1   | npm ERR! syscall rmdir
webpack_1   | npm ERR! EBUSY: resource busy or locked, rmdir '/var/www/project/docker-config/webpack-dev-devmode/node_modules'

which then results in aborting the Docker container.

It'd be lovely to have a --no-delete flag or if npm ci could delete the _contents_ of node_modules but not the directory itself.

ci = clean install

This is expected. Why don't you use the normal npm i with a lockfile?

It'd be lovely to have a --no-delete flag or if npm ci could delete the contents of node_modules but not the directory itself.

rm -rf node_modules/* && npm i
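
One caveat with the glob above: in most shells `*` does not match names that begin with a dot, so entries such as node_modules/.bin are left behind. Below is a minimal sketch of emptying a directory while keeping the directory itself, which is what matters when node_modules is a bind mount. It assumes a `find` that supports -mindepth and -delete (GNU and BSD find both do). The function name empty_dir is made up for illustration and is not part of npm.

```shell
#!/bin/sh
# empty_dir DIR: delete everything inside DIR, including dot entries,
# but leave DIR itself in place (so a bind mount or sync exclusion survives).
# "empty_dir" is a hypothetical helper name, not part of npm.
empty_dir() {
    dir=$1
    # Refuse to run on an empty argument or a non-directory.
    [ -n "$dir" ] && [ -d "$dir" ] || return 1
    # -mindepth 1 skips DIR itself; -delete removes children depth-first.
    find "$dir" -mindepth 1 -delete
}

# Usage (not executed here):
#   empty_dir node_modules && npm i
```

Because the directory inode is never removed, this should avoid the EBUSY error on rmdir shown earlier, though that has not been verified against every Docker setup.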

ci = clean install

This is expected. Why don't you use the normal npm i with a lockfile?

...because: https://docs.npmjs.com/cli/ci.html

This command is similar to npm-install, except it’s meant to be used in automated environments such as test platforms, continuous integration, and deployment – or any situation where you want to make sure you’re doing a clean install of your dependencies. It can be significantly faster than a regular npm install by skipping certain user-oriented features. It is also more strict than a regular install, which can help catch errors or inconsistencies caused by the incrementally-installed local environments of most npm users.

The faster installs and clean slate approach make this ideal for CI environments such as the one I mentioned above.

rm -rf node_modules/* && npm i

This is what I do now, but see above for the desire to use npm ci

It seems reasonable to me to file an RFC asking for a config flag that makes npm ci remove the contents of node_modules and not the dir itself. This is also an issue for me, in that i've set up Dropbox to selectively ignore a node_modules dir, but if i delete it, that selective setting goes away, and the next time node_modules is created, it syncs.

It seems reasonable to me to file an RFC asking for a config flag that makes npm ci remove the _contents_ of node_modules and not the dir itself. This is also an issue for me, in that i've set up Dropbox to selectively ignore a node_modules dir, but if i delete it, that selective setting goes away, and the next time node_modules is created, it syncs.

Isn't this also what another issue described, to allow npm to create files to ignore the dir (for OSX spotlight and others)? I think there were also others who need this feature.

ci = clean install

This is expected. Why don't you use the normal npm i with a lockfile?

npm i would be great, but only if it didn't change the lock file. I've seen package-lock.json being updated during an npm i; should that not happen?

I support this feature. As stated, npm i modifies package-lock.json. A flag would be the ideal solution.

same, a flag would be great

I support this feature. As stated, npm i modifies package-lock.json. A flag would be the ideal solution.

Why not add a flag for npm i then? Because this would not make much sense for ci = clean install, in my opinion.

What part of "clean install" is incompatible with keeping the parent node_modules/ directory intact (while doing a clean install of the actual contents)?

I realize that CI doesn't stand for Continuous Integration in this case; but a Clean Install is often quite useful in a Continuous Integration environment, as the documentation makes clear.

This command is similar to npm-install, except it’s meant to be used in automated environments such as test platforms, continuous integration, and deployment – or any situation where you want to make sure you’re doing a clean install of your dependencies. It can be significantly faster than a regular npm install by skipping certain user-oriented features. It is also more strict than a regular install, which can help catch errors or inconsistencies caused by the incrementally-installed local environments of most npm users.

npm ci is specifically meant to be used in automated environments; many times this means a Docker-based setup.

The behavior of deleting the node_modules/ directory is troublesome in a Docker-based setup, for the reasons mentioned in this thread.

So we're asking for an option that will make this command useful for its intended purpose and environment.

I support this feature. As stated, npm i modifies package-lock.json. A flag would be the ideal solution.

Why not add a flag for npm i then? Because this would not make much sense for ci = clean install, in my opinion.

I have to ask: are there any other differences between npm install and npm ci? If not, why aren't both options available in npm install? Maybe ci needs to become some alias like npm install --no-update-package-lock --clean-node-modules

The behavior of deleting the node_modules/ directory is troublesome in a Docker-based setup, for the reasons mentioned in this thread.

In my opinion this should only happen once when the image is built. After that npm i should be used during development.

maybe ci needs to become some alias like npm install --no-update-package-lock --clean-node-modules

Personally that makes more sense to me, additional flags for the normal npm i command.

I'm indifferent to which, and honestly too much of a n00b with js land to have a concrete argument that it must be ci, all I know is that it should not update the package-lock.json and should not remove node_modules

npm ci does not update the lockfile, it installs from the lockfile. This was introduced to do a clean install because prior to this people were advised to rm -rf node_modules and run npm i again. And afaik people wanted it to install from the lockfile without changing it.

So npm ci was born. And it also skips some things like the list of the installed packages and the tree and a few more things.

See https://blog.npmjs.org/post/171556855892/introducing-npm-ci-for-faster-more-reliable

It covers a specific use case.

For other use cases we should add new flags to npm i, with which we can also emulate npm ci. That is a more flexible and better solution than adding flags to npm ci, which should still cover only the current use case imho. What users request here is a bit similar to yarn install --frozen-lockfile or yarn --frozen-lockfile.

Otherwise flags are spread over npm ci, npm i and so on, which makes things a bit more difficult (documentation, code, ...). At least this is what I think. Let's put it into npm i to have more powerful and flexible ways to configure its behavior.

For other use cases we should add new flags to npm i, with which we can also emulate npm ci. That is a more flexible and better solution than adding flags to npm ci, which should still cover only the current use case imho. What users request here is a bit similar to yarn install --frozen-lockfile or yarn --frozen-lockfile.

I'd be very happy if the feature was added to npm i. Should I update the original post?

npm ci does not update the lockfile, it installs from the lockfile. This was introduced to do a clean install because prior to this people were advised to rm -rf node_modules and run npm i again. And afaik people wanted it to install from the lockfile without changing it.

So npm ci was born. And it also skips some things like the list of the installed packages and the tree and a few more things.

See https://blog.npmjs.org/post/171556855892/introducing-npm-ci-for-faster-more-reliable

It covers a specific use case.

For other use cases we should add new flags to npm i, with which we can also emulate npm ci. That is a more flexible and better solution than adding flags to npm ci, which should still cover only the current use case imho. What users request here is a bit similar to yarn install --frozen-lockfile or yarn --frozen-lockfile.

How is rm -rf node_modules/* not qualifying as "cleaning" ? The feature asked here is very similar to the one present in npm ci. In my opinion it makes more sense to add a flag to npm ci so it uses rm -rf node_modules/* instead of rm -rf node_modules instead of importing the entire behavior of npm ci into npm i.

BTW this issue should get more attention, and maintainers should voice their opinions and plans about it. Docker is used in almost every CI (continuous integration) setup, which is one of the main use cases of npm ci!

Please open an RFC for this change, rather than an issue in this repo.

To avoid confusion, I’d rename this issue as “empty instead of remove node_modules dir in npm CI”

My intention with this issue was never to delete the node_modules folder or only its contents. It was always to preserve the contents of node_modules but make sure they are up to date and in sync with package-lock.json. So: an incremental update which adheres to the package-lock.json.

Maybe I'm wrong but I feel there are two issues here. Maybe someone could start another issue or RFC about deleting only contents of node_modules instead of deleting the folder completely? Or am I missing something?

@Zenuka the entire reason npm CI is fast, and exists, is because it ignores the existing node_modules dir, so it’s pretty unlikely that will change.

In our use case, I think it would be faster just to check if the node_modules folder is up to date or not. And if it's not, only update the packages that should be updated (like npm i does). I have some dedicated VMs running as build agents, so running a build and keeping the node_modules folder and all its contents should be faster than deleting everything and re-installing it. We run our build and tests for code changes a lot more often than for changes to our package.json or package-lock.json.
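
A common approximation of this on a persistent build agent is to record a checksum of package-lock.json after each successful install, then skip the install entirely when the checksum is unchanged. The sketch below assumes nothing else modifies node_modules between builds. The function name, the stamp file name .lock-checksum, and the INSTALL_CMD variable are all invented for illustration and are not npm features.

```shell
#!/bin/sh
# install_if_lock_changed [LOCKFILE] [STAMPFILE]: run the install command
# only when the lockfile differs from the one used for the last install.
# Hypothetical helper; the stamp file name is an arbitrary choice.
install_if_lock_changed() {
    lock=${1:-package-lock.json}
    stamp=${2:-node_modules/.lock-checksum}
    # cksum is POSIX; swap in sha256sum if a stronger hash is wanted.
    current=$(cksum < "$lock") || return 1
    if [ -f "$stamp" ] && [ "$(cat "$stamp")" = "$current" ]; then
        echo "lockfile unchanged, skipping install"
        return 0
    fi
    # INSTALL_CMD defaults to npm ci; override it to experiment.
    ${INSTALL_CMD:-npm ci} || return 1
    # Write the stamp only after a successful install.
    mkdir -p "$(dirname "$stamp")"
    printf '%s\n' "$current" > "$stamp"
}
```

The stamp is written after the install so that npm ci, which empties node_modules, cannot wipe it mid-run. This trades the strictness of a clean install for speed, so it is probably better suited to pull-request builds than to release builds.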

In our use case, I think it would be faster just to check if the node_modules folder is up to date or not.

Well, this (the calculation of the package tree) is what takes the most time. This check would make npm ci really slow.

running a build and keeping the node_modules folder and all its contents should be faster than deleting everything and re-installing it.

Probably not, that's why npm ci was introduced which skips what npm i does (check the package tree).

@Zenuka npm install already is the fastest possible way to do what you want. npm ci has only one purpose: do it faster, by deleting node_modules so it doesn't have to compute a diff.

Probably not, that's why npm ci was introduced which skips what npm i does (check the package tree).

I've tested this only on my machine (which is of course not a good measure) but running npm install on an up-to-date node_modules folder finishes within 10 seconds. Running npm ci takes minutes. Would you expect different results?

I'm a fan of your suggestion to add a flag to freeze the lock file with npm install.

Verifying that what’s in package-lock.json is actually present is super fast, even on Windows. See https://github.com/fuzzykiller/verify-node-modules.

Verifying that nothing else is present in node_modules would certainly take a little longer but probably still less than a second.

On this basis, an incremental version of npm ci could easily be created. The tree is already calculated and saved to package-lock.json, after all.

Also, basically the only reason npm ci exists is to install what’s in package-lock.json. Without sneaking in surprise upgrades, like npm install does.

Just my 2 cents: I personally switched our infra over to npm ci, as I was also sick of npm i not adhering to the lock file when deploying an old tag... So if it's seriously that big of an issue to add the flag at the npm ci level (which I get, it's a clean install and it's doing what it's told), then npm i REALLY needs this flag. But I remember researching this, and there was also an issue thread on npm i that was over 2 years old (and still open) where the npm team suggested people use npm ci, lol... This is kind of why people have given up on npm in the past and just gone to yarn.

again just another devs perspective

I put my vote for adding the possibility to keep the modules (+1).

+1 here - as @phyzical and @fuzzykiller said, there's no "sweet spot" between npm install and npm ci that will KEEP node_modules, but still respect package-lock.json and run faster.
Just run as fast as possible - look for dependencies from package-lock that already exist in node_modules, and then install everything else missing.. no upgrades, no deleting.

Personally I don't care which one it is (install or ci) that would have this, but all of this sounds like npm install should just have flags for everything and npm ci doesn't need to be a separate command.

This is somewhat frustrating, given that npm ci was originally touted as the solution to the same problem this issue is raising.

The original behavior that a number of people wanted for npm install was to look at the package-lock.json instead of package.json. We wanted a flag on npm install to turn that behavior on. What we got instead was npm ci, because:

the package.json describes the required dependencies of your project. If the current lock file cannot satisfy those, the lock file has to yield. The purpose of the lockfile is to create a repeatable installation across different machines, not to obsolete the package.json.

So, fine. npm install isn't the right place for that option, npm ci is. Except npm ci adds additional behaviors (clearing out the node_modules folder) that keep it from being a useful solution to the original problem. And the reason there can't be a flag on npm ci now is because:

ci = clean install

This is expected. Why don't you use the normal npm i with a lockfile?

Which... fine. I don't really care where the flag gets added. I don't have any stake in the underlying philosophy behind the interface. But could the flag please be added somewhere?

Heck, I wouldn't raise objections even if people wanted an entirely separate 3rd command, I couldn't care less. The only thing I care about is that 3 years after this conversation about respecting package-lock.json for normal installs got started, there's still no way to get the behavior that we were originally asking for.

At my workplace we've seen bugs from minor and bugfix version updates for packages. We really only want to be looking for those bugs during purposeful package upgrades, we don't want our dev environments to be using different package versions than our production environments. Consistency there is very important. Whatever anybody wants to call it or wherever anybody wants to put it, we want a fast way to get packages from the lockfile that also won't require us to sit through node-gyp builds for already-installed modules every time we run the command.

This is how I would like it to work in a perfect world :

  • npm install - same behavior as today
  • npm install --from-lockfile - install from the lockfile (like ci does)
  • npm install --clean - same behavior as npm install but delete the node_modules content
  • npm ci - an alias to npm install --from-lockfile --clean

@jdussouillez This is exactly what should happen. Very well said! I'd love to see this solution put in place.

It is consistently frustrating to run into this issue where we have to decide between speed and consistency for a CI pipeline. I've run into it 3 or 4 times for different reasons in the last 2 months alone.

This feature would be great for Azure Pipelines and other cloud architectures.

https://docs.microsoft.com/en-us/azure/devops/pipelines/release/caching?view=azure-devops#tip

Because npm ci deletes the node_modules folder to ensure that a consistent, repeatable set of modules is used, you should avoid caching node_modules when calling npm ci.

Closing: As @claudiahdz mentioned, we shipped a fix to this behavior where npm ci does not remove the node_modules folder itself anymore but only its contents (ref. https://github.com/npm/libcipm/blob/latest/CHANGELOG.md#407-2019-10-09). This was shipped in npm@6.14.7 back on July 21st (ref. https://github.com/npm/cli/blob/v6/CHANGELOG.md#6147-2020-07-21) & we've maintained the same experience in npm@7.

If you have a separate issue with npm ci or any other command, please use one of our issue templates to file a bug: https://github.com/npm/cli/issues/new/choose


Side notes...

@jdussouillez appreciate the feedback; In terms of installing directly from a lockfile - you can do that today with the flag --package-lock-only (ex. npm install --package-lock-only). In terms of adding a --clean flag to install, I don't feel like this adds much value but I could be wrong. If you feel strongly about it, we'd love to have you submit an RFC over at https://github.com/npm/rfcs

The comment made by @claudiahdz almost a year ago seems to be about making sure that npm ci deletes the node_modules content instead of the folder itself. That is handy when mounting it into a Docker container (for example), but it still doesn't change the end result: npm ci will download all the dependencies from scratch.

Using npm install --package-lock-only seems to be doing the exact opposite of what the original issue is about (If I understand correctly) - it will only update the package-lock.json file, and will not download any dependencies.

What I understand from the original issue, is the need to have an option that gets a current state of the node_modules folder, and a package-lock.json file, and downloads only the required packages to get the node_modules versions to match the package-lock.json. So it will be much faster than downloading everything every time, with the same net result at the end.

Isn't that what npm install already always does?

Isn't that what npm install already always does?

AFAIK -
npm install will resolve all the dependencies according to the package.json file (ignoring the package-lock.json), compare them with what is currently in the node_modules folder, and download the dependencies that need to be downloaded to match the requirements. It will also update the package-lock.json accordingly.

It definitely does not ignore the lockfile - it just takes into account the existing tree, which npm ci does not.

You are correct, I am sorry.
I remembered incorrectly (maybe that was the behavior in the past?). I just did some testing with a simple dep tree, and when the package-lock.json file is present, npm i installs exactly the versions it specifies and does not change anything. This was just the behavior I was looking for, so I'm happy with it. 👍
I apologize for posting on a closed issue.

My original request was indeed what ATGardner describes:

What I understand from the original issue, is the need to have an option that gets a current state of the node_modules folder, and a package-lock.json file, and downloads only the required packages to get the node_modules versions to match the package-lock.json. So it will be much faster than downloading everything every time, with the same net result at the end.

My experience with npm install is that it sometimes updates the package-lock.json file. I tested this again this morning with a repository which I hadn't updated in a while and ran git pull and npm i. It didn't actually update any versions this time, just added some dependencies and extra packages.
Unfortunately this is a private repository, but maybe someone else has a reproducible public repository, where there are multiple commits and switching between them causes npm install to update the package-lock.json?

I realize there could be some user error involved when not committing the package-lock.json when updating the package.json, but my colleagues know that they should update the package-lock.json as well. I'll look into this.

I couldn't get my simple example to have npm i change the package-lock.json file. But I will try it out some more.

If npm i always ends up downloading the exact same versions specified in the package-lock.json, while keeping as much as it can from the current node_modules, why would I ever need to run npm ci? what would be the benefit of deleting everything before downloading again?

I apologize again for this not being the place for this discussion. Is there anywhere else more preferable?

I still don't understand. If the state of node_modules after running npm i exactly matches the package-lock.json, and the state of node_modules after running npm ci is the exact same end result, then in almost all scenarios, assuming the computer you are building on already has some or most dependencies in the folder, wouldn't npm i be faster? It will simply not download what is already present locally and matches the required version.

Why would I rather delete and download everything from scratch?

No, npm ci is still faster, as it does not check the dependency tree again and skips some console output.

Why would I rather delete and download everything from scratch?

To prevent issues and ci is for specific environments like deployments.
I think the docs already mention the differences.

It can be significantly faster than a regular npm install by skipping certain user-oriented features. It is also more strict than a regular install, which can help catch errors or inconsistencies caused by the incrementally-installed local environments of most npm users.

See also https://blog.npmjs.org/post/171556855892/introducing-npm-ci-for-faster-more-reliable

npm ci is still faster.

So when using npm i, the time it takes to read the current node_modules, and figure out which packages should be downloaded, is significantly larger than the time it takes to actually download all the packages from npm's servers? I'd love to see an actual experiment that measures it.

And I also don't understand this paragraph -

npm ci bypasses a package’s package.json to install modules from a package’s lockfile. This ensures reproducible builds—you are getting exactly what you expect on every install.

Haven't we just concluded right here that running npm i uses the exact versions in the package-lock.json file, and the state of node_modules after the run is identical to the state it would be after running npm ci? So the builds will be just as reproducible.

UPDATE:

I have made the following test -
I created a new create-react-app project. After it completed its initialization, it had a package.json with 7 direct dependencies, and a package-lock.json that contained 1982 packages.
At this state (node_modules contains all dependencies) - running npm i takes

real    0m2.548s
user    0m2.659s
sys     0m0.182s

When I deleted a single package folder (node_modules/babel-eslint), and then ran npm i again, it took

real    0m3.295s
user    0m3.543s
sys     0m0.434s

to re-download the missing dependency

When I deleted the entire node_modules folder, and ran npm i again, it took

real    0m16.701s
user    0m19.251s
sys     0m10.379s

When I ran npm ci, it took

real    0m20.997s
user    0m23.844s
sys     0m14.857s

This did not differ by much when I tried removing a single package, or even deleting the entire node_modules folder manually before the call. It wasn't surprising, since npm ci starts by deleting the content of node_modules anyway.

After every run, I ran diff -q -r node_modules_orig/ node_modules/ to make sure the result is identical to the original dependencies. It always was.

So to conclude - it seems that using npm ci takes ~21 seconds on my machine, regardless of the current state of node_modules. Using npm i on a recently cloned project (no node_modules) takes ~17 seconds, and running it on a project that has no changed dependencies (the current node_modules matches the required dependencies) takes ~3 seconds.

So when would using npm ci be preferable? It doesn't seem faster (though of course, this is just a single test), and the end result is identical to npm i, so the subsequent build would be just as reliable.

npm ci is preferable when you need _exactly_ what is in package-lock.json and nothing but. npm i does not guarantee that it will install exactly what is in package-lock.json. This is by design. While package-lock.json is an input to npm i, it is also an output.

I believe there are only a few cases left where npm i would install something different (and thus modify package-lock.json), like maybe package versions that were soft-deleted.

Back when npm ci was first introduced, npm i either ignored package-lock.json outright or at least was a lot more proactive at installing different versions.

Either way, it doesn’t really matter. npm ci is only OK when the node_modules folder doesn’t exist yet. Otherwise it is prohibitively slow, especially on Windows. So npm i simply needs a flag that _guarantees_ it will not modify package-lock.json and install exactly what is inside package-lock.json.

I don’t see any point in further discussing the why and how. Either we’ll get it or we won’t. As is, npm ci sucks.

/update:
Here’s a repo where running npm i will change package-lock.json: https://github.com/fuzzykiller/npm-install-demo

Though the changes are only technical in nature, they’re still not acceptable.
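
Until such a flag exists, a build can at least detect the modification and fail. Below is a minimal sketch, assuming a POSIX shell: it snapshots package-lock.json, runs whatever install command is passed in, and reports an error if the file changed. The function name run_with_frozen_lock is hypothetical. It only detects lockfile drift after the fact; it does not stop npm from installing different versions into node_modules.

```shell
#!/bin/sh
# run_with_frozen_lock CMD [ARGS...]: run CMD, then fail if it
# modified package-lock.json. Hypothetical helper, not an npm feature.
run_with_frozen_lock() {
    lock=package-lock.json
    backup=$(mktemp) || return 1
    cp "$lock" "$backup" || { rm -f "$backup"; return 1; }
    "$@"
    status=$?
    # cmp -s is silent and exits non-zero if the files differ.
    if ! cmp -s "$lock" "$backup"; then
        echo "error: $lock was modified by: $*" >&2
        # Restore the committed lockfile so later steps see it unchanged.
        cp "$backup" "$lock"
        status=1
    fi
    rm -f "$backup"
    return "$status"
}

# Usage (not executed here):
#   run_with_frozen_lock npm install
```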

Just to quickly reiterate:

  • npm ci always deletes the content of node_modules by design, which is undesirable for non-production builds because it's slow. However, it uses exact versions of packages found in package-lock.json, which is desirable for multiple situations.

  • npm install just updates the contents of node_modules, which is very performant, but by design it ignores the contents of package-lock.json if package.json version numbers differ, which is undesirable for multiple situations.

  • npm install --package-lock-only is described in the docs:

    The --package-lock-only argument will only update the package-lock.json, instead of checking node_modules and downloading dependencies.

    This does not seem useful for any of the scenarios described above.

What people have been asking for during the past 3 years:

  1. A command (anywhere) that will ignore package.json and _only_ respect package-lock.json as the definitive source of what packages will be installed.

  2. That will not delete the entire contents of node_modules and re-download everything from scratch.

As far as I can see from both the docs and local testing, npm install satisfies point 2, but not 1. npm ci satisfies point 1, but not 2. npm install --package-lock-only satisfies none of those requirements.

I'm not completely sure why this issue has been closed, there's still no way to get the desired behavior.


Edit: To extend @fuzzykiller's example, it's not just that package-lock.json gets updated. That would be annoying, but it wouldn't break any of my builds. But if package.json has fuzzy dependencies listed anywhere, and a bugfix version of one of those dependencies gets released, they'll get changed when I run npm install on a new machine. Suddenly I have install differences between two machines. We've run into bugs at my company from exactly this behavior; it's not just that the package-lock.json needs to be checked into Git again.

It is desirable in that situation to have a command that behaves like npm ci -- that makes a reproducible install based _only_ on the contents of package-lock.json. However, deleting the contents of the node_modules folder slows down builds too much for some environments and situations, even though it's appropriate behavior for a final production build.

There could be a flag anywhere to address this problem. It could be npm install --from-lockfile. It could be npm ci --preserve-existing. But right now it seems like we're in a circle where anyone who asks for a flag to get added to npm install gets pointed at npm ci as the solution, and anyone who asks for a flag on npm ci gets pointed at npm install as the solution. This issue was closed pointing at npm install --package-lock-only, but that flag is almost the opposite of what people are asking for. It doesn't respect package-lock.json as the authoritative source, and it also doesn't update or install any of the dependencies in the node_modules folder :)

This issue should be reopened.
