yarn probably shouldn't cache packages resolved with a file path

Created on 6 Dec 2016 · 75 Comments · Source: yarnpkg/yarn

Do you want to request a feature or report a bug?

I guess a _bug_.

What is the current behavior?
If the current behavior is a bug, please provide the steps to reproduce.

Say you have the following project structure:

component-foo/
└── package.json
└── index.js

yarn-test/
└── package.json

With the following files:

component-foo/package.json:

{
  "name": "component-foo",
  "version": "1.0.0",
  "private": true,
  "main": "index.js"
}

component-foo/index.js:

console.log('foo');

yarn-test/package.json:

{
  "name": "yarn-test",
  "version": "1.0.0",
  "private": true,
  "dependencies": {
    "component-foo": "file:../component-foo"
  }
}

Now you run $ yarn install inside yarn-test/ and yarn-test/node_modules/component-foo/index.js is:

console.log('foo');

Now remove yarn-test/node_modules/ and yarn-test/yarn.lock and change component-foo/index.js to this:

console.log('bar');

Now you run $ yarn install inside yarn-test/ again and yarn-test/node_modules/component-foo/index.js will be:

console.log('foo');

It uses the cached version of component-foo, but component-foo/index.js has changed.

What is the expected behavior?

I'd expect yarn-test/node_modules/component-foo/index.js to be this at the end:

console.log('bar');

Maybe packages installed with local paths like file:../ shouldn't be cached at all if we can't figure out whether they have changed.

(FYI: It looks like npm doesn't cache them.)

Please mention your node.js, yarn and operating system version.

$ node -v
v6.9.1

$ yarn -V
0.18.0

macOS 10.12.1

All 75 comments

This fooled me as well. There must be a way to get around this without clearing all the cache.

I think there could be three ways to make yarn usable for this case:

  1. @donaldpipowitch's suggestion of ignoring all local dependencies.
  2. Reinstalling local dependencies if a file in them was modified later than when they were cached. This would require us to keep that metadata. For local dependencies we could insert lines like this into the "Resolved" column: file://<path>@<cache_timestamp>
  3. Ignoring packages by package name with commands such as yarn cache rm <package> and yarn cache add <package>. For all dependencies.

I'd like to see the second suggestion implemented (a rough sketch follows below), unless the third option could be useful for other cases as well. For instance, yarn cache add <package> could be used to refresh the cache for already cached dependencies in case something went wrong while downloading a dependency.
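
For illustration, here's a rough Node sketch of what option 2 could look like (my own sketch, not yarn internals): a local dependency is stale if anything under its source folder was modified after the recorded cache time. The cachedAt value is assumed to come from whatever metadata yarn would store, e.g. the <cache_timestamp> suggested above.

// Sketch only (not yarn internals): a local dependency is considered stale
// if any file under its source folder is newer than the recorded cache time.
const fs = require('fs');
const path = require('path');

function newestMtime(dir) {
  let newest = 0;
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    if (entry.name === 'node_modules') continue; // skip nested installs
    const full = path.join(dir, entry.name);
    const mtime = entry.isDirectory() ? newestMtime(full) : fs.statSync(full).mtimeMs;
    if (mtime > newest) newest = mtime;
  }
  return newest;
}

// cachedAt would come from the hypothetical <cache_timestamp> stored with the resolution
function needsReinstall(localDepPath, cachedAt) {
  return newestMtime(localDepPath) > cachedAt;
}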

@hantuzun why cache local dependencies at all? they are local anyways, so it will be fast regardless cached or not.

@satya164 you're right. Although I'm still leaning towards the third approach, which would also help if a dependency from the network is modified intentionally.

Something like yarn cache ignore <package> would be useful. But I think they don't have to be mutually exclusive. Ignoring a package is useful, but it requires manual work. File dependencies could be made to work without any additional effort if they were ignored by default.

Can someone explain the internal logic to me?

My understanding:
When a dependency with file: is encountered, file-resolver.js kicks in. It says the dependency should be copied and has no hash. Doesn't having no hash mean it shouldn't be cached in the first place...? But copy-fetcher.js seems to set the hash to an empty string instead of keeping null... Is this the problem?

@bestander or @kittens Maybe you could explain this a little bit more...? Would love to get a little help to create a PR ♥

Hash means the md5 hash that is used by the tarball-fetcher, for example.
This hash is then added to the yarn.lock file for future verification and also appended to the folder name when saving the unzipped folder in the cache.
You are digging in the right direction, thanks for looking into this, a PR is very much welcome.
You can probably start with a PR that adds a broken unit test.

You can probably start with a PR that adds a broken unit test.

Okay, thank you for your response. Should I ping you as a reviewer in the PR or should I ping someone else (or no one)...?

Yeah, ping me

@bestander probably this issue shouldn't be closed since it's not fixed yet?

Yes, it should be re-opened. It was closed because I wrote "fixes #2165" in the title of my PR. In the beginning I thought it would be an ongoing PR, but to fix this bug we want two PRs: the first one added a unit test (the assertion that would fail is not really active, so CI doesn't blow up) and the second one will actually fix it.

Sorry, GitHub closes issues when a PR is merged.

So obviously this is still a problem? It's quite annoying to develop with, to be honest. It's causing a bit of a disruption for me at work, where I'm making use of file: to create a modular development environment. The sucky part is that every local package I edit (using a file: path in package.json) requires doing this in order to pull down the refreshed contents:

editing the contents of my eslint-config-base-eslint package

yarn cache clean && rm -rf node_modules/eslint-config-base-eslint && yarn install --force && yarn lint

Everyone is welcome to contribute to the project.
It could be anything - submitting a broken integration test for this case, a fix or convincing someone to work on the fix.
If you want to chat about the best way to approach that, you can find help on the discord channel.

I actually think the fix should be 10-15 lines of code; fixing it sooner will probably save a lot of time.

FYI: It could be that this issue is also relevant for slow linking dependencies steps. See my comment here: https://github.com/yarnpkg/yarn/issues/1496#issuecomment-282688818.

Sorry, I couldn't find the time to create another PR for this issue :( I was (and still am) very busy.

@bestander This is a pretty big blocker for me, working on a multi-module project. If I'm reading @donaldpipowitch 's code links and comments correctly, could we have the file-resolver create a new hash (instead of null) every time it tries to resolve so it forces a reinstall? Say a UUID or current timestamp? Forgive me if I'm missing something, I'm not familiar with the way the code works.

A new cache entry with a timestamp and uuid sounds like a reasonable hack.
Ideally we should copy the files directly without the cache, but that may be a more complex change.
Go ahead, send a PR.

@bestander Regarding your last comment: Wouldn't it make even more sense to symlink it, instead of copying the files? Or is there any reason not to do that?

@danrot Windows seems to require admin rights to create symlinks.
It also messes up recursing down to find node modules.

Symlink also ignores .npmignore and stuff like that. (Which currently seems to be ignored, too. Which could be problematic: https://github.com/yarnpkg/yarn/issues/1496#issuecomment-282688818)

Is there any workaround for this currently, barring clearing the entire cache? Unfortunately there seems to be no yarn cache rm <package> kind of command.

@rhys-e I use this small script:

#!/bin/sh
if [ $# -ne 1 ]
then
    echo "Usage: $0 package-name"
    exit 1
else
    echo "Reinstalling $1"
fi

dir="node_modules/$1"

if [ -d "$dir" ]
then
    rm -fr "$dir"
fi

# intentionally left unquoted so the npm-${1}* glob expands inside the cache dir
cache=`yarn cache dir`/npm-${1}*
rm -fr $cache && yarn install --force

Has anybody tried just running yarn link for all the local dependencies on postinstall? Looks like an appropriate solution until a proper fix lands.

I guess the idea is to not have to update the package's version number on every change to the local dependencies. A yarn link would force you to do that, right? (I did not try.)

On my side, what I finally did is to invoke a script in the preinstall phase that compares what is in the source folder and the node_modules folder for local dependencies. If it detects any difference, it removes the cached dependency and removes the integrity file (to force a reinstall). Thus when there's no change the yarn install is pretty fast (there aren't many local dependencies), and when there is a change the outdated cached version is not used.
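
For reference, here's a minimal sketch of that kind of preinstall check (my own approximation, not the actual script; the dependency name, the paths, and the reliance on yarn cache clean <name> plus removing node_modules/.yarn-integrity are assumptions):

// preinstall.js - rough sketch of the approach described above (POSIX shell assumed).
// If the local dependency's source differs from the installed copy, bust the
// cache entry and the integrity file so the next `yarn install` re-copies it.
const { execSync } = require('child_process');
const fs = require('fs');

const DEP_NAME = 'component-foo';       // hypothetical local dependency
const SRC_DIR = '../component-foo';     // its source folder
const INSTALLED_DIR = `node_modules/${DEP_NAME}`;

function hasChanges() {
  if (!fs.existsSync(INSTALLED_DIR)) return true;
  try {
    // `diff -r` exits with a non-zero status when the trees differ
    execSync(`diff -r -x node_modules ${SRC_DIR} ${INSTALLED_DIR}`, { stdio: 'ignore' });
    return false;
  } catch (e) {
    return true;
  }
}

if (hasChanges()) {
  execSync(`yarn cache clean ${DEP_NAME}`, { stdio: 'inherit' });
  execSync(`rm -rf ${INSTALLED_DIR} node_modules/.yarn-integrity`, { stdio: 'inherit' });
}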

@lucdew link will basically create a symlink under the hood, meaning that the latest local version will be used. But I think having a script that cleans the cache of specific deps before installation is the right solution right now.

I've tested different combinations for working around changes in a local package not being updated in the parent project's installed copy under ./node_modules/. I find that just this would do, without needing to delete ./node_modules/ first:

yarn cache clean; yarn upgrade file:../<package>

Needless to say, manually forcing a global cache clean should not be necessary in order to update/upgrade local packages.

@fungilation As a workaround, you can also use npm to install local dependencies and thus avoid losing the entire cache every time you want an update.

Per #2860 and the subsequent merge commit https://github.com/yarnpkg/yarn/commit/7241de13bb236526fa439a2528fbed319f60ef24, you can now "refresh" file: protocol dependencies with

yarn install --force

Edit Or upgrade the particular package (didn't realize this worked as well). If the dependency's version didn't change, this doesn't modify the lockfile, but it will still copy the newer version down.

yarn upgrade [file protocol package name]

The PR change will invalidate the dependency in the cache & reinstall it locally. yarn install will also work if the dependency's version changed, which invalidates the yarn.lock file. You no longer need to clear the cache, nor delete the module from the local install.

It was also made apparent to me that there is an active RFC for linking dependencies, possibly with a link: protocol. That can be followed here, https://github.com/yarnpkg/rfcs/pull/34.

@mtraynham Thank you for your pull request! 👌 This is awesome. Any reason why --force is needed? I currently don't even know what it does exactly without looking it up :) Just asking, because npm doesn't need a --force flag and it would have been nice to behave like npm.

Edit Looks like yarn upgrade [dependency] works as well. I want to point out that this does not always change the lockfile, which should only reflect package.json version changes. I've updated my original post, as an upgrade is probably more appropriate.


The short version is, Yarn won't do anything with cache-resolvers if the lockfile didn't change, so we need to skip the lockfile check and ask the cache if there's a newer version. That can be done using upgrade or install --force

Per the yarn install --force documentation:

"it refetches all packages, even ones that were previously installed".

This really means it will skip the lockfile integrity check. The lockfile integrity check will usually pass if you don't change the package.json file, and yarn bails out gracefully. Skipping it asks the cache to check for missing/mismatched dependencies against the lockfile, download them if missing, and copy any newer/missing dependencies back locally. Then it runs the npm install/postinstall scripts.

The PR change will now invalidate the file: dependency in the cache and copy the new version down locally. Prior to this, it would never invalidate the file: dependency. For other protocols, if you didn't change your package.json file, they will go unchanged (in the cache and locally).

So what does this mean for us? I have about 60 dependencies in a project (ranging from Angular to Webpack), with one file: dependency. On a second install --force, where I only want to refresh the local dependency, it takes about 5 seconds (up from ~1.5 seconds for a plain yarn install). To me this is pretty negligible, and I'm actually very impressed by how little work yarn tries to perform throughout the whole process.

If there were another CLI command that skips the lockfile check and checks the cache for only that particular file dependency, it would probably be even faster, but I haven't found one. Actually, yarn upgrade might work, I just haven't tried it.


All that being said, I'd still call this a band-aid. This could be replaced by a better solution such as link:; I don't think anyone really cares to cache local dependencies. At the very least, the added overhead of using install --force or upgrade is mostly negligible and you no longer have to yarn cache clean; mv node_modules /tmp/.

Great answer. 👏 Thank you for writing this down.

Does yarn overwrite local symlinked project files with files from yarn's cache? (because it looks like that's what's happening)

The PR change will now invalidate the file: dependency in the cache and copy the new version down locally.

Does this mean that whenever I call $ yarn inside a package which has "foo": "file:../" as a dependency, a new copy of "file:../" will be created?

E.g. I have a package with multiple examples which are also packages:

foo/
foo/examples/
foo/examples/example-1/
foo/examples/example-2/
foo/examples/example-3/
...
foo/examples/example-10/

And it looks like I have foo 10 times in the yarn cache now. I also test my examples on every version change of foo (and I have multiple modules configured this way, not just foo), so my yarn cache currently grows _really_ quickly.

Is this the correct behaviour?

I think it is a better alternative than having a stale version in your cache.
With yarn 0.26 you can use link: to create symlinks instead of copying files.
Also we are working on workspaces that will automate the symlink creation, https://github.com/yarnpkg/yarn/issues/3294

Yeah, looking forward to workspaces 👍

link: does not yet work with npm, right? (Because https://github.com/npm/npm/pull/15900 is still open.)

From the npm 5 patch notes, local directories are now automatically symlinked with the file: syntax:

npm install ./packages/subdir will now create a symlink instead of a regular installation. file://path/to/tarball.tgz will not change – only directories are symlinked. (#15900)

So yeah, there is no link with npm.

npm install ./packages/subdir will now create a symlink instead of a regular installation

Kinda unfortunate. File deps never behaved the same due to copying everything (including node_modules) and not respecting .npmignore or the files field. Now it's worse with symlinks.

I think file: and link: could be improved further; there are different strategies that have their own pros and cons, and Yarn should allow people to choose one.
For example, the knit RFC could be implemented as one of the strategies: https://github.com/yarnpkg/rfcs/blob/master/accepted/0000-yarn-knit.md

@bestander

I think it is a better alternative than having a stale version in your cache.

Or so you'd believe until you run out of disk space and find that your Yarn cache is using tens of gigabytes for millions of utterly useless files.
IMO that makes the whole thing more seriously broken than it was before, because you only find out once you've temporarily broken your development system.

Hi guys, any updates on this issue? It's really a huge problem for us, especially when we're working on multiple products at once that depend on the same linked dependencies. Gigabytes of cache in a day of development, etc. Could you maybe at least make it optional and disable caching for such packages?

@nikdojo Use the yarn link: protocol for dependencies instead of file:, it uses symlinks. It was introduced in 0.26.

Alternatively, start using workspaces if you have many inter-project dependencies.

@mtraynham Thanks for the hint, I tried to find any info about the link: protocol in the official docs, but it seems it's not there. Experimenting with workspaces now.

@bestander btw, as you know, react-native doesn't support symlinks, so if we're working with RN libs it's still a big issue.

This was never solved, was it? You have to use linklocal (an npm package) to link local packages (which should be the standard way yarn operates when consuming packages from the local filesystem: using junctions or symlinks on Windows instead of caching). A new yarn install then overwrites everything with old stuff from the cache and you have to start linking all over again.

Can we be less architecture-astronaut about this and simply not cache local packages? This issue is now 1.5 years old, and I'm tired of running yarn add ../<another-local-package> every time I change something in another-local-package.

Hi @fungilation
I've opened another related issue: #6037

This doesn't work for me.
I put console.log('here we are') in App.js, and it didn't show up in the output.
Then I removed all the files and it still produces the output from the cache.
How do I avoid this?

Yarn has really failed me on this issue. It's been great for everything else, except for this.

I've been trying to install a new version of a local package for testing purposes, and I keep ending up with the old package no matter what I do.

I've tried:

  1. Cleaning the yarn cache (yarn cache clean package-name)
  2. yarn add'ing with --force
  3. Removing node_modules/package-name and yarn adding again
  4. Updating the version number and filename of the local package (yarn still installs the contents of the old version, even though it reports installing the newer one)
  5. Combinations of all of the above

This is ridiculous, and I've spent almost 4 hours of my day on this.

We need the ability to develop and reinstall local packages. I am relying on yarn to install a file in the .bin folder, so yarn link is out of the question.

@ Yarn Developers: Are you able to install a local package, make changes to that local package, and then get the changes to be reflected by installing again?

@gregjacobs I had success with yarn install --force

@jonathantorley I just tried it again with --force and it still does not work with yarn 1.12.3

For others facing this issue: one workaround that I used to actually get this to work was to increase the version number of the dependent package. A bit annoying, and you have to remember to do it on each change (unless you can script it; a rough sketch follows below).

yarn should definitely not require this workaround when installing local packages
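
If you do end up scripting it, a tiny Node helper along these lines could bump the local package's patch version before reinstalling (a sketch with hypothetical paths, not an endorsed fix):

// bump-local.js - hypothetical helper: bump the local package's patch version
// so yarn treats it as a new resolution instead of reusing the cached copy.
const fs = require('fs');

const pkgPath = '../component-foo/package.json'; // hypothetical local dependency
const pkg = JSON.parse(fs.readFileSync(pkgPath, 'utf8'));

const [major, minor, patch] = pkg.version.split('.').map(Number);
pkg.version = `${major}.${minor}.${patch + 1}`;

fs.writeFileSync(pkgPath, JSON.stringify(pkg, null, 2) + '\n');
console.log(`Bumped ${pkg.name} to ${pkg.version}; now run: yarn upgrade ${pkg.name}`);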

I added yarn upgrade MY_PACKAGE_NAME to my preinstall script, and it upgrades to the latest npm version fine. (Still, I need to bump the npm version manually.)

It seems yarn add file:PATH now always picks up the new content in yarn 1.13.0.
yarn install still does not.

@leavesster Still doesn't for me.

I have to rename the tgz in order for it to update

Try using the link command: https://yarnpkg.com/lang/en/docs/cli/link/

yarn add file:PATH did not pick up the new content for me. Moreover, the files field in package.json and .npmignore are not respected.

It works if you change the version of your package.

If you want yarn add file:PATH to respect the files field in package.json and .npmignore, you must run yarn pack in your local package dependency and then, where you want to install it, run yarn add file:path-to-local-package.tgz

Try using the link command: https://yarnpkg.com/lang/en/docs/cli/link/

yarn link is not intended to solve the same problem. It is not a good fit if someone wants the npm package installed as if it had been published, respecting the files field in package.json and .npmignore.

@leavesster Still doesn't for me.

I have to rename the tgz in order for it to update

I don't have any tgz in my package; I use yarn add file:PACKAGE_PATH to add it to the current project. I just have JS code in the package. Maybe the tgz case is still broken?

not working for me either

@bestander why is this issue closed? Yarn is still caching local files and tgz packages. Workspaces are not a solution in some cases, for example if one package has deps with specific versions and another package has the same packages as devDeps but at different versions; if you link one package into the other, your project breaks.

I have the same issue as @gregjacobs. yarn cache clean package didn't help. But I figured out that if you install with yarn add path/to/package.tgz and then replace the archive with another one, you can install the new version simply by changing the path, e.g. yarn add path/to/../to/package.tgz; the absolute path is the same, but for yarn it's different.

I don't understand where else yarn stores cached packages by resolution path; even yarn cache list --pattern package is empty.

I think the problem is somewhere here https://github.com/yarnpkg/yarn/blob/eb2b565bb9b948e87b11119482ebc184a9d66141/src/resolvers/exotics/tarball-resolver.js#L58-L63

What's happening:

  • From the URL path/to/package.tgz a hash is generated (that's why path/to/package.tgz and path/to/../to/package.tgz result in different hashes; see the sketch below)
  • Using that hash, a temp directory is created for the package (like this one: /Users/kich/Library/Caches/Yarn/v4/.tmp/c019816ee7d10ed5e1fef4072e8cc617)
  • On the first run isValidModuleDest returns false and everything goes right
  • The tarball package is extracted into this directory
  • Files from that directory are installed into the project
  • You modify the package sources during development and build/pack a new tarball with the same name and location as the previous one
  • You try to install the new tarball, but the generated hash is the same as before and the temp directory was not cleared after the first run
  • So this time isValidModuleDest returns true
  • And yarn installs the old version of your package
  • You run yarn cache clean package, but the temp directory remains untouched
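
To illustrate the collision (a sketch, not actual yarn code): the temp-directory key is derived from the URL string alone, so the same path always maps to the same directory even after the tarball contents change, while a superficially different path maps to a new one.

// Sketch: keying the temp/cache directory on the URL string means identical
// paths collide even when the tarball bytes differ.
const crypto = require('crypto');
const md5 = (s) => crypto.createHash('md5').update(s).digest('hex');

console.log(md5('path/to/package.tgz'));        // same key every time,
console.log(md5('path/to/package.tgz'));        // regardless of tarball contents
console.log(md5('path/to/../to/package.tgz'));  // different string => different key,
                                                // even though it's the same file on disk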

@bestander can we simply remove the temp directory here: https://github.com/yarnpkg/yarn/blob/master/src/config.js#L431 ?

Facing the same problem; the cache under the temp folder takes 10 GB after I work on my local package for only one day!

Can we reopen this issue? We are facing the same problem: two projects referencing another package via file:. After some builds on our CI/CD server, the cache folder starts to occupy a lot of space.

I think the link: protocol is the best fix for this; file:// doesn't work since it still requires manually cleaning caches & force installing.

https://github.com/yarnpkg/yarn/issues/2165#issuecomment-345825904

just make your dependency something like this

"<package>": "link:./libs/<package>"

@alxtz does the link: protocol work with tgz packages?

I think the link: protocol is the best fix for this; file:// doesn't work since it still requires manually cleaning caches & force installing.

#2165 (comment)

just make your dependency something like this

"<package>": "link:./libs/<package>"

Thanks for this; it replicates the behaviour of npm, which will symlink file:.. references in node_modules. It does not appear to be documented, at least not here where I would expect to find it: https://yarnpkg.com/lang/en/docs/package-json/

Sadly, link: can't be used in all cases.

For example, I need a shared/peer dependency from my SDK package, which, when I make changes, I link for local dev work.
With link:, yarn does not realise that the dependency is a shared/peer dependency and uses the local package where the symlink is, which is incorrect.

I've been using yarn pack alongside yarn add file:<path_to_packed_tgz> to get around this.
Whilst I can keep renaming the package every time I pack it and paste it into my repo, so as not to generate the same hash (as per @wKich's fantastic find), it's damn annoying.

I forked the repo and put an extra clause in the if statement to stop yarn from ever loading a local .tgz package from its cache if it's specified using yarn add file:<path>.

const dest = this.config.getTemp(crypto.hash(url));
// If specified using file: never load from cache
if (!url.match(/file:/) && (await this.config.isValidModuleDest(dest))) {
  // load from local cache
} else {
  // continue as if it's a new package
}

I can make a PR if people want it, though I've never done that before and it's a pretty hacky fix.
Could someone confirm if this approach would continue to add the local packages to yarn's cache or not, please?

@ojboj for now, yalc might help until this gets a proper fix. It has been working really well for me to test packages locally before publishing.

@souporserious That is exactly what I needed/wanted, thank you so much!

Insane that this is still an issue after so many years.

Is it fixed in [email protected]?

@wKich I believe so! I haven't personally tested it, but it looks like they solve local development through the new portal protocol.

I still get an "unpack in same destination" error using the link: protocol, and it's referencing my cache dir, so it seems to me yarn is still caching link: dependencies. I'm referencing 2 copies of the same local package, copied into the .deps/ folder of each app that uses it -- front-end and renderer in the error below. (I can't link to the original for unrelated reasons.)

warning Pattern ["@horizon/common@link:packages/front-end/.deps/@horizon/common"] is trying to unpack in the same destination "/home/garyo/.cache/yarn/v6/[email protected]/node_modules/@horizon/common" as pattern ["@horizon/common@link:packages/renderer/.deps/@horizon/common"]. This could result in non-deterministic behavior, skipping.