Yarn: Workspaces: lock file per workspace

Created on 28 Feb 2018 · 54 comments · Source: yarnpkg/yarn

Do you want to request a feature or report a bug?
feature

What is the current behavior?
Using yarn workspaces for a monorepo which includes a top-level node module creates only a single yarn.lock at the root of the monorepo, with no yarn.lock specific to that top-level node module.

What is the expected behavior?
I want to use yarn workspaces to manage a monorepo that includes both apps (top-level node modules) and libraries. But having only a single yarn.lock file at the root of the monorepo prevents me from packaging my app into a docker image, for example. I would love a way to get a yarn.lock file for chosen workspaces that need to have one of their own, because such a workspace may later be used outside of the monorepo.

An example:
If I have a monorepo with 2 workspaces: workspace-a and workspace-b. workspace-a uses some of the exported modules from workspace-b. If I want to package workspace-a into a docker image (or package that workspace by itself in any other way, without the whole monorepo), I don't have a yarn.lock for it. That means that when I move the files of workspace-a into a different environment apart from the monorepo, I'll be missing a yarn.lock file, and when installing dependencies I'll lose all the advantages of a lock file (knowing I'm installing the same dependencies used in development).
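For reference, the root package.json of a monorepo like this would look something like the following minimal sketch (names taken from the example above):

{
  "private": true,
  "name": "monorepo",
  "workspaces": ["workspace-a", "workspace-b"]
}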

I'm quite surprised I couldn't find an issue about this. Am I the only one who wants to work with monorepos that way? Maybe I'm missing something? My current workaround is using lerna without hoisting at all, so I'll have a lock file per package.
The recently released nohoist feature also doesn't seem to help (though I hoped), as it doesn't create a different yarn.lock per workspace.
This issue is somewhat related to this comment on another issue. Thought that it may deserve an issue of its own.

Please mention your node.js, yarn and operating system version.
node 8.9.3, yarn 1.5.1, OSX 10.13.3



All 54 comments

based on yarn blog:

When you publish a package that contains a yarn.lock, any user of that library will not be affected by it. When you install dependencies in your application or library, only your own yarn.lock file is respected. Lockfiles within your dependencies will be ignored.

it doesn't seem necessary to bundle yarn.lock when publishing the individual package... it is more a development artifact for the whole repo; therefore, there's no need to put it in each package.

@connectdotz it may not be needed for a library or published package, but for building a docker container that you're going to deploy somewhere, it definitely would be.

sure... but wouldn't the development docker container just have the whole repo and therefore the yarn.lock anyway? I can see us using docker containers to test our monorepo project for different OSes or platforms, in which case we just deploy the whole repo and its yarn.lock. Can you give me an example use case where you need to deploy individual packages from the monorepo project into docker containers during the development cycle, so I can get a more concrete understanding?

So for us, we don't want to package the whole monorepo into the resulting docker container. We are using docker in production and those images should be as light as possible. Our mono repo is quite big and contains multiple microservices that share code between them using library packages (and some of the libraries are relevant for some of the microservices, but not all). So when we package a microservice, we want the image to contain the files of that microservice and any other dependencies as proper dependencies - downloaded from our private registry, and built for the arch of the docker image.

So I think the main consideration here is to keep our docker images as light as possible, and packaging the whole monorepo doesn't fit our needs. Also, when we run "yarn" inside the image of the microservice, we don't want to have symlinks there, just a normal dependency.

The solution here doesn't have to be creating a yarn.lock file per workspace, it could also be a yarn command, that helps in the process of packaging a given workspace, generating a yarn.lock file for a workspace on demand, etc..

Hope it helped clarify the use case. 🍻

@netanelgilad thanks for details, it does help to clarify that your use case is more about publishing individual packages, for production or development, to docker containers. Please join the discussion in #4521 so we can start to consolidate them.

While I can see the use of individual lock files, they are not necessary. If you run docker from the root of the repo with the -f flag pointing to the individual Dockerfiles, you'll have the whole repo as context and can copy in the package.json and yarn.lock from the root.

You only need the package.json files for the packages you will build in the image, and yarn will only install packages for those package.json files you have copied in, even though the yarn.lock includes much more.

EDIT: With that said, it causes the Docker cache to not be used for package changes in any package, even when the changed package is not included in the build.
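Concretely, that approach looks something like this when run from the monorepo root (the paths are illustrative):

docker build -f packages/workspace-a/Dockerfile -t workspace-a .

Because the build context is the repo root, that Dockerfile can COPY the root package.json and yarn.lock while only adding the sources of the one package it builds.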

#4206 is related/duplicate, and the use-case described there is exactly the problem we're facing:

Let's say we have ten different packages. We want all of them to live in their own repository so that people can work on them independently if they want, but we also want to be able to link them together if we need to. To do this, we have a mega repository with a submodule for each package, and a package.json that references each of these submodule as a workspace.

I have similar issues with workspaces. My project is a web-app which depends on many local packages:

web-app/
|--node_modules/
|--packages/
|  |--controls/
|  |  |--src/
|  |  |--package.json
|  |--utils/
|     |--src/
|     |--package.json
|--src/
|--package.json
|--yarn.lock

Workspace packages controls and utils aren't published and are used by paths. The issue is that I need to release the controls package (yarn pack) and I want to build/test it on its own. That means I want to do yarn install inside web-app/packages/controls/. With workspaces it will use the top-level web-app/yarn.lock file together with the top-level web-app/node_modules/. So it installs all packages instead of the subset specified in web-app/packages/controls/package.json. But I need to check that my package has all required dependencies in its own package.json and doesn't just work by luck, with missing deps filled in from other workspaces.

There are 2 possible solutions:

  1. If it isn't root, use root's yarn.lock, but install only packages specified in local package.json.
  2. Do not search for top-level configs at all, except .yarnrc/.npmrc.

Also struggling with this. We've got an Angular CLI project alongside our API so they're in the same repository and trying to push the frontend to Heroku.

We're using a buildpack which tells Heroku to jump up to the frontend repository first: https://github.com/lstoll/heroku-buildpack-monorepo

Problem is, there's no yarn.lock inside that nohoist package, so Heroku just installs with npm and we end up with all-new packages rather than the locked ones.

You can just use the global yarn.lock file with the individual packages. I've recently approached my Dockerfile like this:

WORKDIR /app
ENV NODE_ENV=production

ADD yarn.lock /app/
ADD package.json /app/

# Only copy the packages that I need
ADD packages/my-first-package /app/packages/my-first-package
ADD packages/my-second-package /app/packages/my-second-package

RUN cd /app && yarn install --frozen-lockfile

This will install only dependencies that are actually in use by the two packages I copied and not by anyone else.

I have a build process where first I'd like to create a release artifact from one package and then not have any of its dependencies installed. This is fairly easy with Docker's multistage build

  1. Add only yarn.lock, package.json and the UI package to docker
  2. run yarn install --frozen-lockfile
  3. run the build process
  4. start a new stage and add yarn.lock, package.json and the necessary runtime packages/workspace folders
  5. Do a COPY --from=<stage> for the built artifact
  6. run yarn install --frozen-lockfile and expose a RUN command.

And you'll end up with a small container that only contains the dependencies specified in your yarn.lock file and needed in production.
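A sketch of those steps as a Dockerfile, assuming a ui workspace whose build script emits a dist/ folder (the workspace name, base images, entry point, and paths are all illustrative):

# ---- build stage ----
FROM node:lts AS build
WORKDIR /app
COPY package.json yarn.lock ./
COPY packages/ui packages/ui
RUN yarn install --frozen-lockfile
RUN yarn workspace ui build

# ---- runtime stage ----
FROM node:lts-slim
WORKDIR /app
ENV NODE_ENV=production
COPY package.json yarn.lock ./
COPY packages/ui/package.json packages/ui/package.json
RUN yarn install --frozen-lockfile --production
COPY --from=build /app/packages/ui/dist packages/ui/dist
CMD ["node", "packages/ui/dist/server.js"]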

@johannes-scharlach
I've ended up using an almost identical method to what you are describing. multistage build is a good tip :).

@connectdotz I think that from my end, this issue could be closed and we can keep working on issue #4521. Since the main yarn.lock file can work with a subset of the packages, it seems like a yarn.lock per workspace is not necessary (even though I still think this could be a better dev workflow 😉). But the issue in #4521 is still important, because in the solution we got to here, we need to mention every dependency workspace in the Dockerfile, even though yarn should know the interdependencies and how to "vendor" a given workspace.

I spent the last couple of days trying to convert our monorepo to first Lerna and then Yarn workspaces. Yarn worked generally more reliably and it's really close to what we need, especially with the recent introduction of yarn workspaces run <script> and other nice things like wsrun.

However, the single yarn.lock is a pain point:

  • I'm not sure how to correctly migrate our existing lockfiles to a single one, see https://github.com/yarnpkg/yarn/issues/6563. We have tens of thousands of lines there and adding existing packages as workspaces introduced many subtle versioning issues.
  • Installing dependencies in a specific package only ("de-hoisting" / "vendoring") for a Dockerized build is not well supported, see above (https://github.com/yarnpkg/yarn/issues/5428#issuecomment-403722271) or https://github.com/yarnpkg/yarn/issues/4521.

What would you think about Yarn workspaces being just a tiny core – a declaration of packages without any specific functionality. For example:

  • If you wanted to run some script across your workspaces, you'd do yarn workspaces run <script>.
  • If you wanted a single lock file and hoisting (are the two necessarily tied together?), this would be your root package.json:
    json "workspaces": { "packages": ["packages/*"], "hoistDependencies": true }
  • If you wanted to migrate your current lockfiles to a hoisted structure, you'd run yarn workspaces hoist-dependencies.

Etc. These are just examples and in practice, some features would probably be opt-out instead of opt-in (for example, people expect a single yarn.lock and hoisting by now) but the general idea is that workspaces would be a lightweight foundation for repo-wide tasks.

What do you think?

I believe the problem this feature request is addressing is the same as in #4521 . A command to do essentially what @johannes-scharlach describes would certainly be more feasible than a lockfile per workspace.

There is also an RFC open right now for nested workspaces, which sounds similar to this feature request though I believe it's solving a different problem.

What would you think about Yarn workspaces being just a tiny core – a declaration of packages without any specific functionality

Workspaces won't drastically change, I think we're satisfied with their current interface.

If you wanted to run some script across your workspaces, you'd do yarn workspaces run <script>

That's already possible (v1.10, #6244).

If you wanted to migrate your current lockfiles to a hoisted structure, you'd run yarn workspaces hoist-dependencies.

Since we won't change the workspace interface it would be the opposite (dehoistDependencies).

What I don't like about this is that it takes a technical behavior (hoisting) and tries to turn it into a semantical behavior. You should focus on the user story and then figure out the implementation rather than the opposite.

In this case, I think your use case ("Installing dependencies in specific package only") would be better solved by extending yarn --focus.
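For context, focused installs shipped in Yarn 1.7; a sketch of the current usage, run from inside the workspace you care about (packages/my-app is an illustrative path):

cd packages/my-app
yarn install --focus

This performs a "shallow" install that copies my-app's sibling workspace dependencies into its own node_modules from the registry, instead of relying on the root symlinks.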

I guess the core question is whether hoisting and a single yarn.lock file are strictly necessary for workspaces. I mean, is it what truly defines them, or is it "just" the first feature they historically got?

For example, in our use case, the best hypothetical behavior of workspaces would be:

  • Hoist node_modules at development time for efficiency.
  • Keep local yarn.lock files for build (we build specific packages in Docker, something that other people mentioned in this thread as well) and also so that packages can lock their specific versions. See also #6563.
  • Run scripts via yarn workspaces run <script> even if you don't need (or must avoid) hoisting.

Hoisting can be disabled with nohoist, and run can be "disabled" by just not using the command, but it's not possible to "disable" the single yarn.lock file. I'm not sure if it's such a core feature that it cannot be disabled, or if it just hasn't been requested enough yet :)

I think the best way to solve this would be to have yarn install --app-mode package@version

That way, you could simply copy the workspace lockfiles when publishing your app at a certain version, and install in app-mode will respect the bundled lockfile.

Yarn doesn't have to install the entire lockfile; it should easily be able to extract only the part of the dependency graph relevant to that package.

In fact this might be fairly easy to do manually even now:

  • download the package zip from the registry directly (yarn has no equivalent, npm does: npm pack package@version)
  • extract the gzip into node_modules/package
  • cd into node_modules/package
  • run yarn install --production from there (it will respect the bundled lockfile)

edit: unfortunately this is all wrong, as the workspace lockfile does not include the versions of packages within the workspace, which might be dependencies of the app package. There would need to be something more involved than copying when creating app lockfiles from workspace lockfiles.

I'm not exactly sure if separate lockfiles is the answer, but I have a similar problem. I have a monorepo set up with a CLI and a backend. The CLI requires a few packages that are platform-specific and only work on desktop machines with a particular setup. On the other hand I need to be able to build my api into a docker image, which is fundamentally impossible in the current implementation of workspaces.

Very similar use case to @samuela's here! This one would be massively helpful!

My use-case might seem laughable compared to the other, "real" ones. But I have a monorepo for some utils - in this case React hooks - inside packages/*.

I have a second workspace next to packages/*, and that is local/*. This is actually on gitignore, and the idea is that developers in the company may do whatever they like in there, for example put create-react-app apps in there and test the hooks during development.

Now, although the local/* packages are on gitignore, the root yarn.lock is simply bloated and polluted - and checked into git - because of the local workspaces.

What I would wish for is a way to specify that some workspaces shall use some specific lockfiles, e.g. some mapping like so:

  "workspaces": {
    "packages": [
      "packages/*",
      "local/*"
    ],
    "lockfiles": {
      "local/*": "./local.yarn.lock"
    }
  }

Or even a way to specify "do not put anything from _this_ workspace into the lockfile at all".

But yeah, mine is not a serious use-case in the first place :)

I'm not exactly sure if separate lockfiles is the answer, but I have a similar problem. I have a monorepo set up with a CLI and a backend. The CLI requires a few packages that are platform-specific and only work on desktop machines with a particular setup. On the other hand I need to be able to build my api into a docker image, which is fundamentally impossible in the current implementation of workspaces.

You nailed it – as I see it, one of the very core benefits of the yarn.lock file is creating frozen-dep production builds! Did the creators of Yarn forget that?

Another argument for solving the single lockfile problem is code ownership. If you have a monorepo that's using something like the GitHub CODEOWNERS feature, it's not possible to give complete ownership over a package to a group of developers. That's because if they install something in their own workspace, they will invariably change the root level lockfile. This change will need to be approved by the code owners of the lockfile, which, given a monorepo of sufficient scale, will be different than the owners of the original workspace.

Yet another reason to have an option to generate per workspace lockfiles: Google App Engine refuses to launch a Node service without a lock file (NPM/Yarn). This is excellent devops on their part, but a pain for us. So far the options we have are:

  • Deploy all with env vars indicating which service we mean and modify our yarn start (the only supported entry point) to branch based on env vars
  • Have the build script copy the main lockfile to each workspace and deploy just the service we are interested in. (thanks @johannes-scharlach)

Ultimately I think a yarn install --workspace-lockfile command that generated per workspace lockfiles would be the best solution.

Having an option for package level lock files would help us too. Our use case is a little different, we're trialing a new way to manage local dependencies.

So we have some mono repos already, and we have some repos that only contain a single package. These are all published and so can be used together, however there are a lot of times when having them locally and symlinked is super useful.

But some devs have a hard time managing symlinks etc., so we're trying out a standard empty yarn workspace mono repo that we all clone to our machines, and then we clone our package repos into that local mono repo. Some of us might just have one package cloned, some might have 5. This is super convenient and makes local, cross-repo, cross-dependency development an absolute breeze.

But we've come across one problem we can't solve: editing dependencies doesn't update the local yarn lock file, it always updates the root one for the empty mono repo, whose dependencies we never update (it has everything under /packages gitignored).

Having an option to not hoist lock file writes to the mono repo root would be great and have them write out at the package level.

As a note I've also come across the deployment and build issues around Docker that others have mentioned and this would solve that too!

This feature would be very valuable to me too. In my case I have a monorepo with some packages deployed to different platforms. One is a Next.js app deployed with Now.sh and the other is a bunch of cloud-functions deployed to Firebase.

In both of these deployments the source code is bundled and uploaded for install & build in the cloud. Not having a yarn.lock file to go along with the source means that the dependencies are installed using the versions in package.json and no versions are locked.

I would also love to be able to enable yarn.lock files in each workspace.

I realize yarn workspaces are mostly intended for monorepos, but our use case is pretty similar to @LeeCheneler's.

Basically, we have created a React component library that we use as a dependency in different projects (that all have their own repos). By using yarn workspaces we can easily reference the local version of the component library and have changes propagate quickly to the local versions of our other projects. We also don't need to modify the package.json when we push to production, because the dependency library: "*" works without any changes. Our only issue is that, without yarn lock files, the production versions of each project could wind up using different package versions.

I have to imagine that this issue would be common among any package developers that use yarn workspaces.

Another critical issue with a top-level lockfile is that it breaks Docker's layer caching. One would usually optimize Docker caching by first copying only the package.json and yarn.lock; if Docker sees no changes in those files, it reuses the previous layer. If that lockfile is a single one for the entire monorepo, though, any change in any package invalidates the cache. For us this results in ridiculously slow CI/CD pipelines where every package is built without the cache. There are other tools, like Lerna, that check for package changes to run certain scripts; this also breaks, as a dependency change in the lockfile might not be attributed to the right package when it sits at the top level.
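For reference, the cache-friendly pattern in question looks like this sketch; with a single monorepo-wide yarn.lock, the first COPY layer changes whenever any workspace touches any dependency, so the expensive install step below it reruns for every image:

# copy only the manifests first...
COPY package.json yarn.lock ./
# ...so this layer is reused as long as those two files are unchanged
RUN yarn install --frozen-lockfile
COPY . .
RUN yarn build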

Sorry to bring up this (slightly old) issue, but I also have a use case in which this would be helpful. I have 10 or so microservices which are hosted and developed independently, but it would be nice to have a central workspace repo where you can type yarn install to install dependencies in all of the folders, and then run yarn start to run a script which will start up all of the microservices. This is something that could be done with a relatively simple script, and it also seemed doable with yarn workspaces, but I couldn't get it to work while respecting the yarn.locks in each microservice.

@nahtnam I think the idea of a Yarn 1.x monorepo is a little bit different. It isn't about independent projects under one roof, it is more about a singular big project having some of its components exposed (called workspaces). These components are not entirely independent, but may be used like so: the Babel compiler as one greater entity and preset-env as a sub-module. Also, they are homogeneous in the sense that their dependencies are unified: if some packages depend on core-js, it should be the same core-js version in every single one of them, because you can't lock different versions with a single root lock file, nor does it make sense for a project's parts to depend on different versions. And because it is one project, all its components are automatically linked into the root node_modules, which would be strange for entirely independent projects.

So, if you're developing a pack of microservices (which are independent, and some of them won't be touched in years while others will be created/updated, maybe developed by different teams), then they should have personal lock files without a root lock (the Docker issue goes here as well). The only question is what tool will help with running scripts. Lerna might be an answer, as it isn't tied to Yarn.

The only question is what tool will help with running scripts. Lerna might be an answer, as it isn't tied to Yarn.

@the-spyke Not only that. yarn workspaces also solves linking modules for development the same way npm link does, which is the main reason we are using it. npm link does not work well in some cases.

@the-spyke Thank you! For some reason, I thought it was flipped (lerna vs yarn workspaces). I looked into lerna and it looks like it solves my issue

I ended up using workspaces for every single project I'm working on, because of how easy it is to have a separate utilities package that I can update when needed (even though it is published), and because I don't have to reinstall dependencies if I want to fork some package (essentially allowing me to work on two branches at the same time, which for me was unheard of). Whenever I want to use a package from my utilities (which includes most of the things I usually use), I can use it right away without adding it to the package.json of the second package (though that's obviously a good idea in case of separate installation, and is required for automatic IDE imports); everything just works. @the-spyke makes a good point: perhaps independent projects under one roof isn't the purpose of workspaces, and yet that's pretty much what I seem to be doing here. I have a single monorepo-base repository, which excludes the packages folder, while each folder under packages is a separate independent git repo.
Of course that brings me to the topic of this thread; since I don't commit all packages as one repo, the root-level yarn.lock is meaningless. I've been using --no-lockfile to install everything, and recently hit a problem with conflicting versions of class-validator. For now I will lock all deps to specific versions (honestly, that level of control makes more sense to me) and see how it works out. I'll read the whole thread again, maybe there are some tips I would be able to use for my use case.

PS.
For one, yarn why doesn't work without a lockfile; and I've noticed some people mentioning problems with App Engine. I suppose in the case of every package being a separate repo, the lockfile could be generated every time on install (without adding it to VCS)? Not sure about that specific case.

Unfortunately the solution suggested by @johannes-scharlach gets very messy if your built image requires runtime node modules that are not included in some build folder, because you will have to figure out exactly what modules are required to run, and painstakingly copy them across to the final build stage.

(slightly off topic) @GrayStrider you can also utilize the "resolutions" field in package.json - it is the only way of forcing a version on a nested dependency, e.g. if you want all the lodashes to be the exact version, no matter how deeply nested. However, that can introduce very subtle bugs that will be hard to spot..

Here is a solution that we arrived on - with minimum impact to existing Docker workflow.

  1. ln project/yarn.lock packages/package1/yarn.lock - create a hard link from the root yarn.lock into each package.
  2. Add COPY yarn.lock . to each packages/package1/Dockerfile
  3. yarn install inside Docker

Advantages:

  • Do not have to copy your entire monorepo into an image layer
  • Do not have to merge your package level Dockerfiles into a single Dockerfile at the root
  • Basically satisfies the requirement of workspace lockfiles

Disadvantages:

  • --frozen-lockfile does not work. Because the workspace packages themselves are not included in the yarn.lock, yarn sees a package that you have "added" to the package.json but that does not exist in the yarn.lock.

This is a minor disadvantage regardless, as you can get around it by performing a yarn --frozen-lockfile as the first step in your CI/CD pipeline

Edit:
As an aside, I really think the yarn docs on installing could be a little clearer with how the lockfile is used by the package resolution process.

Edit:
So it turns out git does not actually preserve hard links, it only supports soft symlinks, so this strategy will not work.

Another alternative is to simply use a precommit githook to copy the yarn.lock into each of your workspaces... it's not ideal, as it still allows for issues when deploying from your local machine.
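A minimal sketch of such a hook, assuming workspaces live under packages/ (it could sit in .git/hooks/pre-commit or be wired up via husky):

#!/bin/sh
# copy the root lockfile into every workspace and stage the copies
for ws in packages/*/; do
  cp yarn.lock "${ws}yarn.lock"
  git add "${ws}yarn.lock"
done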

@dan-cooke thanks a lot for your insights, much appreciated!

@dan-cooke, this breaks Docker's layer caching since a new dependency in any workspace would invalidate the install layer for all Dockerfiles.

@migueloller If the install layer must not change, then you shouldn't use one lock file for all packages. Flattening and hoisting dependencies into a single giant list is the whole purpose of Workspaces. It isn't like it "breaks" the Docker cache; the cache is invalidated because the real dependencies aren't specified in package.json, so the final code of your package depends on the content of yarn.lock. As a result, you need to rebuild all packages on every change in yarn.lock, and Docker does everything right. And it isn't as if every package is built without the cache, because all builds could reuse the layer with yarn install (you will probably need to set this up).

@migueloller
Correct. We fortunately don't add new dependencies often, so this will only be an issue once a sprint at most.

And even then, it's a small price to pay (for us) for reproducible dependencies in Docker.

@the-spyke, indeed. Which is why this issue is about having an individual lockfile per package. This way, the cached layer is only invalidated when the package's dependencies change and is independent of other packages.

Maybe it's also worth moving this discussion to npm itself, which supports workspaces too starting with v7.0.

I've been researching related topics and would like to add some clarification to my comment above (mostly aimed at less experienced developers I suppose, since the issues I was facing were to some extent due to my failure in understanding the importance of lockfile).

"locking all dependencies to specific version" is called pinning; unfortunately, it will not prevent things from potentially breaking if sub-dependency updates (final paragraphs of the article here), which I haven't considered. That is exactly what lockfile meant to prevent from happening.

I've wasted more than enough hours on breaking updates in the past on multiple occasions; I will experiment with using lockfile in monorepo, since I'd rather deal with merge conflicts and organizational headache, than invisible bugs caused by minor updates.

Above said, I'm very much looking forward to any progress on this issue

@migueloller Individual lock-files mean individual Yarn Monorepos. You can't have a lock-file for a Workspace because it breaks deps uniformity in the Monorepo. If you want to do so, you're going away from the original idea of Yarn Monorepo: it's about unifying, hoisting, and reducing deps into a flat list in the root instead of having different versions (and even whole subtrees) in different workspaces.

@the-spyke but the original issue is about exactly the opposite. A lock file per workspace.

I fail to understand how you can't have a lockfile per workspace.

It breaks deps uniformity in the Monorepo? Sure for development. But the whole purpose of shared dependencies goes out the window when you must deploy lightweight micro services from each of your workspaces

Shared, hoisted deps only makes sense in development.

For a workspace to live in Docker, it requires a lockfile.

@dan-cooke As you can see, I had this issue in 2018 too, but now I have different opinion.

You're saying Docker and Microservices. But what if I develop a regular npm package? I have no production dependencies subtree to pin, because they will be provided by the end user according to my dependencies specification. So what I want is to maximize my development experience, and that is what Monorepos and Yarn Workspaces do perfectly.

Same time, if you're developing microservices (MSs) there are 2 possible situations:

  1. Independent projects. Some MSs are in development, some weren't touched in years. In this case they are completely independent. It is possible to have UserService using one version of a dependency and MessagesService using a different one. That's no longer the easy world where you just link folders from Workspaces into the root node_modules. So, no point in having a root lock-file. Create separate files (roots) and manage them independently. That's called a Multirepo in the Yarn docs. But now what you're saying is "I want to run tasks in different folders from the root folder for convenience", and that's a completely different topic.

  2. Projects with unified dependencies like Jest/Babel/etc. This is what Workspaces were made for, but in MSs there are additional requirements. During CI stages like linting and testing all works fine, because it works the same as on a developer machine: deps installed by Yarn into the root node_modules and flattened out. Just with the addition that you probably cache the yarn install phase to speed up concurrent builds.

    In production it's completely different: starting with the fact that you only need deps for one workspace, and ending with how to install that utils package: should it be linked or downloaded as a tarball? So, what you really need is not lock-files per Workspace, but a command like yarn install --prod <workspace> that you can run specifying a Workspace, and it will install only production deps while ignoring other unreferenced Workspaces. Like if my data WS depends on the utils WS, but not on the logging WS, then logging itself and its deps should not appear in node_modules. A similar result, but a completely different approach to a "lock-file per workspace".

    If you're publishing built packages into a repository (npm, Artifactory, GitHub), you can get similar behavior by just copying the lock-file into a Workspace and doing yarn install --prod there (sketched just after this list). It should warn about an outdated file, but instead of recreating it from scratch with fresh versions it should just remove excess deps from it (just tried, and it looks legit). Should be even better and more robust when using an Offline Mirror.

    And in the end you have Focused Workspaces implemented exactly for Multirepos.
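A sketch of that copy-the-lockfile flow, assuming a packages/data workspace (the path is illustrative; the pruning behavior is as described above):

cp yarn.lock packages/data/yarn.lock
cd packages/data
# warns that the lockfile is outdated, then trims it down to this package's deps
yarn install --production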

So, what I was saying is that maybe the issue isn't quite what it looks like.

@the-spyke
I see what you are saying. I think it definitely does come down to how this result is achieved. Perhaps a lockfile per workspace is not actually the best way to achieve the desired result. You make some good points.

It's definitely not a "one size fits all" solution anyway.

@the-spyke, you bring up good points. Perhaps more thought needs to be put into what problems Yarn workspaces were designed to solve, and whether using them to manage large monorepos is aligned with that design.

I'm curious as to how you would solve a scenario like this:

.
└── packages
    ├── app1
    ├── app2
    ├── lib1
    ├── lib2
    └── lib3

lib3 is a shared library and is depended on by app1 and app2. lib1 is only used by app1 and lib2 is only used by app2. Based on your suggestions, lib1 and app1 should be in their own workspace with their own lockfile and same with lib2 and app2. Now, the question is what to do with lib3 if _both_ app1 and app2 depend on it? Perhaps one could make it so that both workspaces (app1 and app2) add lib3 to their workspace and then run yarn install in each app? Does Yarn allow this? Would there be any conflicts if one wanted to run both app1 and app2 in development locally (perhaps app1 is a React app and app2 a GraphQL API)? This sounds like it could work.

The next question to ask would be "How does one get the benefits of hoisting with this method?" For example, if app1 and app2 share a lot of common dependencies, it would be nice to hoist them. I can see how this might be out of scope, though, and is a problem to be solved by Yarn PnP (it doesn't copy files to node_modules and instead has a shared cache).

I'm going to give this a shot and will report back. If this ends up working, then perhaps we've just been using Yarn workspaces wrong all along...

EDIT: I tried it out, and it does work.

I've changed my stance now and realize that while having an individual lockfile per workspace might be the first thing that comes to mind when managing an entire monorepo with Yarn workspaces, it might not be the right question. A better question might be "Is Yarn workspaces designed to manage a monorepo?". The answer, as usual, is "it depends".

If you're Babel and you have a single team working on the monorepo and everything is meant to change in lockstep, then yes, this is what Yarn workspaces was designed for. But if you're an organization with multiple teams and you're using a monorepo, you likely don't want to manage the entire monorepo with a single Yarn workspace root. You probably just want to use Yarn's default behavior or multiple Yarn workspace roots within the monorepo. This will be determined by what apps you're building, how many teams there are, etc.

For us, it became clear that for each deployable entity (in our case there's a Dockerfile), we want a separate yarn install done for each one (whether it's a workspace root or not). This provides clarity around code ownership, allows for isolated deployments that happen at different cadences, solves caching issues with lockfiles and Docker, etc. There are a few downsides to this, though:

  • What about duplicated node_modules packages? This is a common class of problems with monorepos and while Yarn workspaces help with hoisting, it's not a general monorepo solution. There are other solutions, though. For example, Yarn PnP takes care of this. You could also use Lerna without Yarn and use the --hoist option.
  • What about the utility of running commands across workspaces during development? Again, Yarn workspaces lets you do this, but that doesn't mean one should make the entire monorepo a Yarn workspace root. Building the necessary tooling and scripts will be different for each team and depends on their monorepo. Yarn workspaces probably weren't designed as a monorepo task runner. One might try to bend Yarn workspaces a bit to do this job (i.e., run NPM scripts across the monorepo using yarn workspace ...), but it's important to keep in mind that a single workspace root for the entire monorepo probably won't give you what you need unless you're like Babel, Jest, React, etc.

There's a whole other host of problems that come with running a monorepo. For example, what about tracking dependencies and only rebuilding things that changed to save time in CI? Yarn workspaces could help there by letting you query the dependency graph. Lerna does this, for example, to allow topological sorting of commands being run. Yarn v2 actually lets you query the dependency graph as well. The package manager PNPM also does that. But I would argue that depending on the complexity of the monorepo one might want to try tools built for that (not package managers) like Bazel, Pants, Buck, etc.

@migueloller From your requirements I see that you don't need strictly independent packages or other exotic things, and you do want slimmer developer installs. In such a case you should start with a regular Yarn Monorepo: a single root and all packages as workspaces. You'll have faster installation times and lower disk usage, and app1 will use the locally linked lib1 and lib3. The only downside will be more frequent CI cache invalidation, because the addition of a devDep to lib1 will update the shared yarn.lock. But usually you don't update dependencies often enough to worry much about this tradeoff.

lib1 may depend on lodash@^4.5.0 and lib2 may depend on lodash@^4.10.0. In the case of a Monorepo you want a single version of lodash being used, so Yarn will install the latest version compatible with both specifiers (some lodash@4.x) hoisted to the root node_modules. And in case of an update you're updating that single unified version, so all workspaces always stay on the same page. This is the desired behavior.

There're also situations where independent teams develop independent projects. In this case proj1 may want to stay at an older lodash version, with proj2 having a newer one and updating at its own cadence. Of course, you can achieve this by using stricter specifiers like lodash@~4.5.0, but this may still update a 2nd-level dependency too early. So, on the whole, those projects may be completely unrelated, just happening to live inside one git repo. In this case there is no reason to bind them as a Yarn Monorepo and trade off independence for a shared lock-file. Just treat them as what they are: separate projects with their own independent lives. And that is called a Multirepo. On Unix all your directories are under /, but it doesn't mean that all possible JS projects on your PC should be a Monorepo :-)

Building the minimal possible Docker image for production use is completely unrelated to Yarn, but you may force Yarn to reuse a development artifact called yarn.lock to help you with this task too.

@the-spyke, in this example I only used 3 workspaces but in our actual repo we have over 20 workspaces with a combination of libraries and deployed workloads for both the front-end and back-end. The way I see it now is that we have a monorepo (or what you call multirepo) where it will probably make sense to have multiple Yarn workspace roots with independent lockfiles. We're thinking of using a deployable unit as the unit of separation, it aligns nicely with lockfiles.

I think for me, what makes this work very nicely is the fact that Yarn workspaces supports paths outside the workspace root, even though the initial blog post says otherwise. For example, you can have this:

{
  "workspaces": [
    "../lib1",
    "../lib3"
  ]
}

We have the same use case as @migueloller and one possible idea is for Yarn to support multiple sets of workspaces, like this:

{
  "workspaces": {
    "frontend-app": ["frontend", "common"],
    "backend-app": ["backend", "common"]
  }
}

Yarn would maintain two additional lock files (I imagine the main yarn.lock would still exist):

.
└── monorepo/
    ├── yarn.frontend-app.lock
    ├── yarn.backend-app.lock
    └── packages/
        ├── frontend
        ├── backend
        └── common

When building a Docker image e.g. for frontend, we'd create a context (e.g., via tar) that includes this:

.
└── <Docker build context>/
    ├── yarn.frontend-app.lock
    └── packages/
        ├── frontend
        └── common

What I didn't think about deeply is whether it's possible to install (link in node_modules) the right versions of dependencies if frontend and backend lock different versions. But purely from the high-level view, two-dimensional Yarn workspaces is probably what we're after.

(Something similar was also posted here.)

It looks like you don't need a lock file per workspace, but instead you require node_modules per workspace for deployment.

@gfortaine, if you read the discussion you will realize that that's actually not the case. The reason for having a separate lockfile has nothing to do with installation, but instead having a lockfile that only changes when a specific package changes. A top-level lockfile will change with _every_ workspace dependency change, but a lockfile scoped to a single package will only change when _that_ package's dependencies change.

It might be worth mentioning that this can be done in user-land. Using the @yarnpkg/lockfile package one can parse the top-level lockfile, and then using yarn workspaces info one can determine workspace dependencies. Using this information, together with the package.json of each workspace, one can generate a lockfile per workspace. Then, one could set that up as a postinstall script in the repo to keep those individual lockfiles in sync with the top-level one.

I might take a stab at building this and report back with my findings.
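A rough sketch of what that script could look like, assuming Yarn 1.x (the shape of the yarn workspaces info --json output varies across Yarn versions, and optionalDependencies are ignored here, so treat it as a starting point rather than a finished tool):

// generate-workspace-lockfiles.js
const fs = require('fs');
const path = require('path');
const { execSync } = require('child_process');
const lockfile = require('@yarnpkg/lockfile');

// Top-level lockfile parsed into { "name@range": { version, dependencies, ... } }
const { object: locked } = lockfile.parse(fs.readFileSync('yarn.lock', 'utf8'));

// Workspace map: { "<name>": { location, workspaceDependencies, ... } }
let info = JSON.parse(execSync('yarn --silent workspaces info --json').toString());
if (info.data) info = JSON.parse(info.data); // some Yarn versions wrap the map in a log entry

for (const [name, ws] of Object.entries(info)) {
  const pkg = JSON.parse(fs.readFileSync(path.join(ws.location, 'package.json'), 'utf8'));
  const subset = {};
  const queue = Object.entries({ ...pkg.dependencies, ...pkg.devDependencies });

  // Transitively copy every lockfile entry reachable from this workspace's deps.
  while (queue.length > 0) {
    const [dep, range] = queue.pop();
    const key = `${dep}@${range}`;
    if (subset[key]) continue;
    const entry = locked[key];
    if (!entry) continue; // e.g. a sibling workspace, which has no entry in yarn.lock
    subset[key] = entry;
    queue.push(...Object.entries(entry.dependencies || {}));
  }

  fs.writeFileSync(path.join(ws.location, 'yarn.lock'), lockfile.stringify(subset));
  console.log(`${name}: wrote ${Object.keys(subset).length} lockfile entries`);
}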

It looks like I've just found an existing implementation, albeit for pnpm: @pnpm/make-dedicated-lockfile.

Hope this helps 👍

My need for this behavior (versioning per workspace, but still have lockfiles in each package) is that I have a nested monorepo, where a subtree is exported to another repo entirely, so must remain independent. Right now I'm stuck with lerna/npm and some custom logic to attempt to even out versions. Would be nice if yarn (I guess v2, given that's where nested support lies?) could manage all of them at once, but leave the correct subset of the global pinning in each.

A postinstall script could attempt to manage the lock files directly, I suppose. Sounds complicated, but would be nice either way.
