Ember.js: [2.15.0] - Ember build out of memory errors while using CI tools

Created on 7 Sep 2017  ·  23 Comments  ·  Source: emberjs/ember.js

Parallel babel transpiling was introduced in ember 2.15. (https://github.com/babel/broccoli-babel-transpiler#number-of-jobs)

By default, broccoli-babel-transpiler uses system resources (CPUs) to determine the number of jobs that can be run in parallel.

Most modern CI tools use docker to help isolate builds from other builds running on a given server. This allows them to use large VMs to run lots of builds that are largely isolated from each other. For example, the VMs that Circle CI builds run on usually have 36 CPU cores, but the builds themselves are limited to two.

The issue comes in when attempting to use traditional means to determine what resources are available to the program. For example, to determine the number of CPUs using node you might run node -e "console.log(require('os').cpus().length);". However, this actually reports the resources of the host instance, not the limited resources available to the docker container. As a result, whatever is running thinks it has access to 36 cores when it really only has two.
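To see the mismatch described above from inside a container, one option is to compare what Node reports with the CPU limit recorded in the Linux cgroup files. The following is a minimal sketch, not something from the original report: the helper names (parseCpuMax, parseCfsQuota, containerCpuLimit, effectiveCpuCount) are hypothetical, and the file paths assume the usual cgroup v2 and v1 mount points. Note that a container restricted through CPU shares rather than a hard quota may show no limit here at all.

```javascript
const fs = require('fs');
const os = require('os');

// Parse cgroup v2 `cpu.max` content, e.g. "200000 100000" or "max 100000".
// Returns the limit in cores, or null when unlimited or unparseable.
function parseCpuMax(text) {
  const [quota, period] = String(text).trim().split(/\s+/);
  if (quota === 'max') return null;
  const q = Number(quota);
  const p = Number(period);
  if (!(q > 0) || !(p > 0)) return null;
  return q / p;
}

// Parse cgroup v1 quota and period values; a quota of -1 means "no limit".
function parseCfsQuota(quotaText, periodText) {
  const q = Number(String(quotaText).trim());
  const p = Number(String(periodText).trim());
  if (!(q > 0) || !(p > 0)) return null;
  return q / p;
}

// Read a file, returning null if it does not exist or cannot be read.
function readOrNull(path) {
  try {
    return fs.readFileSync(path, 'utf8');
  } catch (e) {
    return null;
  }
}

// Assumed paths: standard cgroup v2 and v1 locations on Linux.
function containerCpuLimit() {
  const v2 = readOrNull('/sys/fs/cgroup/cpu.max');
  if (v2 !== null) return parseCpuMax(v2);
  const quota = readOrNull('/sys/fs/cgroup/cpu/cpu.cfs_quota_us');
  const period = readOrNull('/sys/fs/cgroup/cpu/cpu.cfs_period_us');
  if (quota !== null && period !== null) return parseCfsQuota(quota, period);
  return null;
}

// Use the smaller of the reported CPU count and the cgroup limit, never below 1.
function effectiveCpuCount() {
  const reported = Math.max(1, os.cpus().length);
  const limit = containerCpuLimit();
  if (limit === null) return reported;
  return Math.max(1, Math.min(reported, Math.floor(limit)));
}

console.log('os.cpus().length:', os.cpus().length);
console.log('cgroup CPU limit:', containerCpuLimit());
console.log('effective CPU count:', effectiveCpuCount());
```

On a machine without cgroup limits (or on a non-Linux system) the limit prints as null and the effective count falls back to what os.cpus() reports.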

I've created an example repo:
https://github.com/mwisner/ember-circleci-example in which you can see both passed and failed build history (https://circleci.com/gh/mwisner/ember-circleci-example) (https://travis-ci.org/mwisner/ember-circleci-example/builds) (I have not fixed the travis build yet.)

Tested CI tools
-- Circle CI 2.0
-- Circle CI 1.0
-- Travis CI (failing with the default provided Travis CI config)

Parallel jobs docs: https://github.com/babel/broccoli-babel-transpiler#number-of-jobs

When errors do come up, they look something like the output below. However, in many situations (such as Travis CI) the build simply times out after 10 minutes without any information.

#!/bin/bash -eo pipefail
ember test
Could not start watchman
Visit https://ember-cli.com/user-guide/#watchman for more info.
Building
'instrument' is imported from external module 'ember-data/-debug' but never used
/usr/local/bin/node[1116]: ../src/node_file.cc:598:void node::InternalModuleReadFile(const v8::FunctionCallbackInfo<v8::Value>&): Assertion `(numchars) >= (0)' failed.
fs.js:682
  var r = binding.read(fd, buffer, offset, length, position);
                  ^

Error: ENOMEM: not enough memory, read
    at Object.fs.readSync (fs.js:682:19)
    at tryReadSync (fs.js:480:20)
    at Object.fs.readFileSync (fs.js:509:19)
    at Object.Module._extensions..js (module.js:579:20)
    at Module.load (module.js:488:32)
    at tryModuleLoad (module.js:447:12)
    at Function.Module._load (module.js:439:3)
    at Module.require (module.js:498:17)
    at require (internal/module.js:20:19)
    at /home/circleci/app/node_modules/esutils/lib/utils.js:31:23
fs.js:682
  var r = binding.read(fd, buffer, offset, length, position);
                  ^

Error: ENOMEM: not enough memory, read
    at Object.fs.readSync (fs.js:682:19)
    at tryReadSync (fs.js:480:20)
    at Object.fs.readFileSync (fs.js:509:19)
    at Object.Module._extensions..js (module.js:579:20)
    at Module.load (module.js:488:32)
    at tryModuleLoad (module.js:447:12)
    at Function.Module._load (module.js:439:3)
    at Module.require (module.js:498:17)
    at require (internal/module.js:20:19)
    at Object.<anonymous> (/home/circleci/app/node_modules/ember-power-select/node_modules/babel-core/lib/transformation/transformers/index.js:43:22)
fs.js:682
  var r = binding.read(fd, buffer, offset, length, position);
                  ^

Error: ENOMEM: not enough memory, read
    at Object.fs.readSync (fs.js:682:19)
    at tryReadSync (fs.js:480:20)
    at Object.fs.readFileSync (fs.js:509:19)
    at Object.Module._extensions..js (module.js:579:20)
    at Module.load (module.js:488:32)
    at tryModuleLoad (module.js:447:12)
    at Function.Module._load (module.js:439:3)
    at Module.require (module.js:498:17)
    at require (internal/module.js:20:19)
    at Object.<anonymous> (/home/circleci/app/node_modules/debug/src/node.js:14:28)
 1: node::Abort() [/usr/local/bin/node]
 2: node::Assert(char const* const (*) [4]) [/usr/local/bin/node]
 3: 0x12e49fa [/usr/local/bin/node]
 4: v8::internal::FunctionCallbackArguments::Call(void (*)(v8::FunctionCallbackInfo<v8::Value> const&)) [/usr/local/bin/node]
 5: 0xb45e2c [/usr/local/bin/node]
 6: v8::internal::Builtin_HandleApiCall(int, v8::internal::Object**, v8::internal::Isolate*) [/usr/local/bin/node]
 7: 0x1cbf812040c7
fs.js:682
  var r = binding.read(fd, buffer, offset, length, position);
                  ^

Error: ENOMEM: not enough memory, read
    at Object.fs.readSync (fs.js:682:19)
    at tryReadSync (fs.js:480:20)
    at Object.fs.readFileSync (fs.js:509:19)
    at Object.Module._extensions..js (module.js:579:20)
    at Module.load (module.js:488:32)
    at tryModuleLoad (module.js:447:12)
    at Function.Module._load (module.js:439:3)
    at Module.require (module.js:498:17)
    at require (internal/module.js:20:19)
    at Object.<anonymous> (/home/circleci/app/node_modules/debug/src/node.js:14:28)
fs.js:682
  var r = binding.read(fd, buffer, offset, length, position);
                  ^

Error: ENOMEM: not enough memory, read
    at Object.fs.readSync (fs.js:682:19)
    at tryReadSync (fs.js:480:20)
    at Object.fs.readFileSync (fs.js:509:19)
    at Object.Module._extensions..js (module.js:579:20)
    at Module.load (module.js:488:32)
    at tryModuleLoad (module.js:447:12)
    at Function.Module._load (module.js:439:3)
    at Module.require (module.js:498:17)
    at require (internal/module.js:20:19)
    at Object.<anonymous> (/home/circleci/app/node_modules/regenerator/node_modules/ast-types/lib/node-path.js:6:12)
cleaning up
cleaning up...
Build failed.
The Broccoli Plugin: [BroccoliMergeTrees: Addon#treeFor (ember-concurrency - addon)] failed with:
Error: Worker terminated unexpectedly
    at ChildProcess.<anonymous> (/home/circleci/app/node_modules/workerpool/lib/WorkerHandler.js:177:17)
    at emitTwo (events.js:106:13)
    at ChildProcess.emit (events.js:194:7)
    at Process.ChildProcess._handle.onexit (internal/child_process.js:215:12)

The broccoli plugin was instantiated at: 
    at BroccoliMergeTrees.Plugin (/home/circleci/app/node_modules/broccoli-plugin/index.js:7:31)
    at new BroccoliMergeTrees (/home/circleci/app/node_modules/broccoli-merge-trees/index.js:16:10)
    at Function.BroccoliMergeTrees [as _upstreamMergeTrees] (/home/circleci/app/node_modules/broccoli-merge-trees/index.js:10:53)
    at mergeTrees (/home/circleci/app/node_modules/ember-cli/lib/broccoli/merge-trees.js:85:33)
    at Class.treeFor (/home/circleci/app/node_modules/ember-cli/lib/models/addon.js:526:30)
    at addons.reduce (/home/circleci/app/node_modules/ember-cli/lib/models/addon.js:383:26)
    at Array.reduce (native)
    at Class.eachAddonInvoke (/home/circleci/app/node_modules/ember-cli/lib/models/addon.js:380:24)
    at Class.treeFor (/home/circleci/app/node_modules/ember-cli/lib/models/addon.js:515:22)
    at project.addons.reduce (/home/circleci/app/node_modules/ember-cli/lib/broccoli/ember-app.js:559:25)


Exited with code 1

The "Workaround" solution, provided by @rwjblue, is to explicitly define the number of jobs you wish to use for parallel transpiling by taking advantage of the JOBS ENV var. (https://github.com/mwisner/ember-circleci-example/blob/09c63e11c34d4cdfe602b63166b71e6f31e30f3c/.circleci/config.yml#L42)
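The idea behind the workaround can be sketched as follows: a tool picks its worker count from the JOBS environment variable when it is set to a valid number, and only falls back to the detected CPU count otherwise. This is an illustration of the pattern, not the actual broccoli-babel-transpiler source; the function name resolveJobs is hypothetical.

```javascript
const os = require('os');

// Hypothetical helper: prefer an explicit JOBS value, else use the CPU count.
function resolveJobs(env, cpuCount) {
  const requested = Number.parseInt(env.JOBS, 10);
  // Accept only positive integers; anything else falls back to detection.
  if (Number.isInteger(requested) && requested >= 1) return requested;
  return Math.max(1, cpuCount);
}

// With JOBS=1 the host's 36 reported cores are ignored.
console.log(resolveJobs({ JOBS: '1' }, 36)); // prints 1
// Without JOBS, the (possibly misleading) detected count is used.
console.log(resolveJobs({}, 36)); // prints 36
// What this process would pick right now.
console.log(resolveJobs(process.env, os.cpus().length));
```

The second call shows why the default is a problem in a container: nothing tells the tool that only two of the 36 reported cores are usable.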

Has Reproduction

All 23 comments

It does appear that there are some people who do have karma tests working with this step:

https://discuss.circleci.com/t/running-browser-tests/10998/9

After the last few hours, I haven't been able to get this to work successfully yet, but I'm pretty confident this isn't specific to ember.js. Closing for now.

I'm having this same issue and have been unable to resolve.

I chatted with @mwisner and @kiwiupover a bit about this over the weekend and we figured out that it has to do with the new parallelism that was added to broccoli-babel-transpiler. It defaults to parallelizing to the number of CPUs currently present. Unfortunately, on CircleCI this shows as 36 CPUs but the job itself is limited to 2 concurrent processes (and also limited in available RAM).

The fix here is to set the JOBS environment variable to 1 to essentially disable the parallelism on CI.

@rwjblue thanks for the context here!

Even with Circle's parallelism, I would think (because they're split off into containers) that the JOBS would always need to be 1, then. So by "to essentially disable the parallelism on CI", I'm assuming that you mean to disable the broccoli-babel-transpiler parallelism and not Circle's containerized parallelism. Is that right?

@eric-hu does this also seem like a correct reading to you? Is this something we should expect a fix from Circle on so that the CPUs actually available in the job are shown? Does this prevent any sort of parallelism on Circle's side (by splitting tests with something like ember-exam for example)?

I'm not sure if we can expect a fix from circleci themselves... I'm not 100% sure, but I think it's more of a Docker problem than a circle problem specifically. Or maybe a Node + Docker issue, with Node not being able to accurately detect the cpu/mem limitations imposed by the docker container.

I have created this repo for experimentation purposes: https://github.com/mwisner/ember-circleci-example.

It includes circleci 2.0 (working w/ the JOBS=1 workaround provided by @rwjblue) (https://github.com/mwisner/ember-circleci-example/blob/master/.circleci/config.yml)

Along with the public circle builds: https://circleci.com/gh/mwisner/ember-circleci-example/83

However, I've also added the repo to travis ci, which also uses Docker for builds. I haven't fixed the travis config file yet, but you can see the travis builds are failing with the provided travis config (https://travis-ci.org/mwisner/ember-circleci-example/builds).

Thanks @mwisner. Was I correct in thinking that the JOBS=1 does _not_ mean we can't parallelize the builds themselves?

@JoshSmith Yeah, if I understand correctly it's to disable parallelism within broccoli-babel-transpiler, not circle itself. So in theory using circleci parallelism + JOBS=1 would be fine?

But personally, I haven't experimented with using circleci's parallelism functionality. So I'm not 100% sure what a configuration file for that would look like.

Makes sense! I wanted to be sure I was just drawing a clear line between what seemed like two distinct uses of "parallelism" here, so I think I'm finally on the same page.

I noticed you were setting JOBS=1 manually in the config prior to ember test, but it seems like that could be set at the ENV var level within Circle, perhaps without issue?

It's also not specific to testing. The OOM issue is caused during the build phase of ember test. I've tested just running ember build as well with the same OOM result.

@JoshSmith I do believe setting JOBS=1 at the env var level would work as well but I haven't confirmed.

I noticed there is a pattern for setting env vars for ember cli commands for a few addons:

https://github.com/ember-cli/broccoli-viz#usage
https://github.com/kategengler/ember-cli-code-coverage#usage

So I was just going off of those usage patterns.
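The inline pattern (JOBS=1 ember test) sets the variable only for that one command. The same effect can be sketched from a Node script by passing a modified environment to the child process. This is a minimal sketch with a hypothetical helper name (runWithJobs); the demonstration spawns the current Node binary rather than ember so that it runs anywhere.

```javascript
const { spawnSync } = require('child_process');

// Run a command with JOBS set only in the child's environment,
// mirroring the shell form `JOBS=1 <command>`.
function runWithJobs(command, args, jobs) {
  return spawnSync(command, args, {
    env: { ...process.env, JOBS: String(jobs) },
    encoding: 'utf8',
  });
}

// Demonstration: the child process sees JOBS=1 regardless of the parent's value.
const result = runWithJobs(
  process.execPath,
  ['-e', 'console.log(process.env.JOBS)'],
  1
);
console.log(result.stdout.trim()); // prints 1
```

In a real CI script the command would be something like runWithJobs('ember', ['test'], 1), most likely with stdio set to 'inherit' so the build output streams to the log. Setting JOBS once in the CI provider's environment settings, as discussed in the thread, avoids the wrapper entirely.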

Without knowing _anything_ about how broccoli-babel-transpiler actually works, it's hard for me to say whether or not Circle (or Docker, or whomever) could provide a fix for it. My layman's _feeling_ – not thinking – around this would be that if the transpiler could take explicit instructions on how many cores are available, then we could perhaps avoid the problem of pure inference here. Again, this comes from a place of deep ignorance.

Just to update, I can run these builds successfully with JOBS set to 1 in Circle's environmental settings. Thanks again @rwjblue, @mwisner, and @eric-hu.

I just ran into this same issue on CircleCI, and JOBS=1 also fixed it for me, thanks all :v:

@eric-hu does this also seem like a correct reading to you? Is this something we should expect a fix from Circle on so that the CPUs actually available in the job are shown? Does this prevent any sort of parallelism on Circle's side (by splitting tests with something like ember-exam for example)?

@JoshSmith there are two concepts of parallelism to keep in mind for CircleCI 2.0:

A. Per-command parallelism, limited to the number of cores available to a container group. By default, each container group is allocated 2 CPU shares, which guarantees they get 2 CPU cores. There's a premium Configurable Resources feature that lets you choose a larger/smaller share (1, 4, 8 off the top of my head).

B. CircleCI parallelism, which you can think of as "how many machines [1] do I want to split this across?". This is useful for test isolation, when you might want to run 2 tests at the same time and both write to a database. This is less useful for, say, transpiling your assets; you probably want all your tests to run with the transpiled assets.

Regarding A: since you have 2 cores guaranteed available by default, you may be able to run the command you want with JOBS=2. This may speed up execution, but I haven't checked if it works.

Regarding B: even with JOBS=1, you can still use CircleCI parallelism to speed up your test suite.

Regarding "who should fix this", I've seen this as a long-running issue with multiple containerization tools. The CircleCI 2.0 and 1.0 containerization tools, Docker and LXC respectively, leak information about the host system for many common Linux commands, like the ones used to check how many cores are available. It's been this way for a number of years; I think if there were a simple fix it would have been solved by now.

Further complicating things, CircleCI changed the CPU core availability model from 1.0 to 2.0. In 1.0, you got a fixed number of cores for a job. In 2.0, you get assigned CPU shares to guarantee your minimum number of cores. If you're on a fully utilized host, you'll get at least that many cores. If you're running on an under-utilized host, you'll have more cores available to you.

Tools like broccoli-babel-transpiler tend to assign the number of cores they'll use just once, though, and the available resources may change over the life of the program's execution. It's best to just hard-code your CI scripts to use the guaranteed resources available.

[1] Your code may not be running on N machines for N parallelism. But you can think of it this way, as they're effectively isolated from one another.
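The advice to hard-code the guaranteed resources can be sketched as a small rule: when running on CI, cap the worker count at the number of cores the provider guarantees, and otherwise use what the machine reports. This is an illustrative sketch only. The names jobsForEnvironment and GUARANTEED_CORES are hypothetical, the value 2 is taken from the default mentioned above, and the check assumes the CI provider sets a CI environment variable.

```javascript
const os = require('os');

// Assumption: cores guaranteed to a job (2 is the CircleCI 2.0 default
// mentioned in this thread; adjust for other plans or providers).
const GUARANTEED_CORES = 2;

// Hypothetical helper: cap parallel jobs at the guaranteed cores when on CI.
function jobsForEnvironment(env, detectedCpus, guaranteed = GUARANTEED_CORES) {
  // Assumption: the CI provider exports CI=true (or CI=1).
  const onCi = env.CI === 'true' || env.CI === '1';
  if (!onCi) return Math.max(1, detectedCpus);
  return Math.max(1, Math.min(detectedCpus, guaranteed));
}

// On CI the host's 36 reported cores are capped at the guaranteed 2.
console.log(jobsForEnvironment({ CI: 'true' }, 36)); // prints 2
// On a developer machine the detected count is used as-is.
console.log(jobsForEnvironment({}, 8)); // prints 8
// What this process would pick right now.
console.log(jobsForEnvironment(process.env, os.cpus().length));
```

The result could then be exported as JOBS before invoking the build. Because the cap is fixed up front, it does not try to take advantage of extra cores on an under-utilized host, which matches the recommendation to rely only on guaranteed resources.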

Just tried JOBS=2 on my app and the build was successful as well. Should that be the canonical advice here?

No noticeable time difference between those values in my small sample FWIW.

@rwjblue I know you recommended having this issue in the ember-cli repo. However, I opened this issue when I first discovered the problem, and it looks like it was found before I could open another in the ember-cli repo... Would you like me to open another issue there and just reference this conversation? Do you know of any way to easily move it?

I reopened this issue because, after doing some additional testing, the provided .travis.yml config that ships with ember is also running into this issue.

While I understand that ember doesn't support circle, I do think that it would be nice to at least have the issue fixed in the shipped .travis.yml file.

I've also updated the title and description to make it a little more generic and not scoped specifically to circleCI

@eric-hu many thanks for the detailed advice here. Very helpful for the community to understand what's going on in detail. It would be great to see a canonical example in the documentation on a per-framework basis, although I understand and appreciate the time this would take.

@bgentry thanks for reporting on the time difference. I was hoping that it would speed up the build times. I'm going to set JOBS=2, as well, but am a bit disappointed about this since my build times are by far my biggest hindrance in speeding up Circle jobs.

Oh wow, I’m glad I finally found this conversation, because this has been happening to me too. I didn’t know what to look for at first because npm test was timing out for me on Travis, with no feedback. It wasn’t until I tried overriding timeouts (which you’d normally have to write in to support to do) that I got an ENOMEM, which finally led me here.

Changing to JOBS=2 npm test has caused my builds to pass again (or at least fail for the right reasons 😆) so thanks all!

Maybe this was only happening to me on Travis because the application has heavy dependencies, but it was difficult to debug and I just ignored it for a long while, so it does seem worth considering how to handle this in the Ember CLI blueprint or otherwise address it.

In my testing, a default out-of-the-box ember project without any changes does pass fine; it is the introduction of a bunch of dependencies that ends up causing the error.

I'm not sure what is considered a 'lot' of dependencies in an ember project. But with travis being the de-facto way to do CI with ember addons, I think as people start upgrading / making new addons people are going to see this more and more.

I recently started working on upgrading all the dependencies for the ember-burger-menu project and am getting this error.
Example:
https://travis-ci.org/offirgolan/ember-burger-menu/builds/275031562?utm_source=github_status&utm_medium=notification
https://github.com/offirgolan/ember-burger-menu/pull/95

Oh! Thank you @mwisner for posting this issue and @rwjblue for the temporary solution! I just spent a few hours trying to understand why my builds are failing. Setting JOBS to 1 does the trick 🎉

Closing as JOBS=1 was updated as the default in ember-cli a while ago. Sorry for the troubles...

I just had to add JOBS=1 to fix this issue on 3.16.0. Was there a regression?
