Yarn: Need a big optimization for Windows

Created on 13 Oct 2016  ·  80 Comments  ·  Source: yarnpkg/yarn

Do you want to request a _feature_ or report a _bug_?

Feature

What is the current behavior?

I've run a number of tests comparing installation speed on MacOS and Windows. The results suggest that yarn is far less optimized for Windows. For example, here is a comparison of installing react-native:


Test Machines:

  • ThinkPad X1 Carbon 4, 1TB PCI-E SSD, 16GB Memory
  • MacBook Air 2014, 256GB SSD, 4GB Memory

No cache & same network environment


MacOS

npm: 1m 31s

yarn: 39s


Windows

npm: 2m 24s

yarn: 2m 19s


So, it seems yarn has no advantage over npm on Windows. Has anyone else run into this?

Please mention your node.js, yarn and operating system version.
nodejs: 6.8.0
yarn: 0.15.1
OS: Windows 10 14393.321 & MacOS 10.12


+1

Hi @OshotOkill! Thanks for trying Yarn. Are you using Cygwin or WSL ("Bash on Ubuntu on Windows")? Both are known to have pretty bad disk IO performance.

Also, React Native has a huge number of files so copying them into node_modules is pretty slow, and disk IO for lots of little files on Windows is generally slower than Mac OS (which itself is slower than Linux ext4). We do have a task to experiment with hardlinks (#499) which should improve perf in this scenario.

> No cache & same network environment

The main improvement with Yarn is when you have a warm cache (ie. after you've installed a package at least once), but with React Native, the huge number of files will also be a cause of some of the slowness.

@Daniel15 Nope, I'm not using Cygwin/MinGW/MSYS2 or WSL (the latter fails due to a knotty bug).

From your description, I assume the problem is caused by the file system (NTFS), right? Even with a warm cache, the copying process still runs much slower than on MacOS.

I hope the dev team can come up with a solution ASAP. Thanks.

I'm seeing the same.

Do install, wipe node_modules, install

MacBookPro takes 17 seconds, my Windows machine takes 122 seconds.

Somebody pointed out this might be related to anti-virus software scanning node_modules and the global yarn cache continuously. Can you try disabling it for those folders?

@cpojer I guess they are right. I don't have any anti-virus software on my machine except the pre-installed Windows Defender, so I excluded the global cache folder and my project folder from scanning and ran some tests:

Default: 128.08s

No scanning of cache folder: 104.43s

No scanning of project folder: 78.28s

No scanning of cache folder & project folder: 53.48s

It's still 10+ seconds slower than on the Mac, but that is a significant boost.

I think this should be mentioned in the official docs.

> Somebody pointed out this might be related to anti-virus software scanning node_modules and the global yarn cache continuously.

Good catch! I totally forgot about this as I already have c:\src whitelisted on my computers.

@OshotOkill - Would you like to submit a pull request adding a note about antivirus apps to the website, in the Windows installation instructions? Here's the file you'd need to edit: https://github.com/yarnpkg/website/blob/master/en/docs/_installations/windows.md (you can edit it directly on Github). It'd be appreciated 😄

I wasn't as meticulous as @OshotOkill, but I added exceptions for my source and my node install folder, and then specifically exempted the yarn, npm and node binaries and now my fresh install time on Windows is down to 50 seconds from 122 seconds.

@Daniel15 PR is ready. Apologies for my poor English.

The PR has been merged. Closing this issue.

This is still painfully slow on Windows, even after deactivating anti-virus and Windows Defender. I don't think it's just an environment issue (like this anti-virus solution); it looks like yarn copies all the files one by one, even if you install some unrelated dependency.

Why not just delete/copy the files that need to change? If I already had webpack installed and it isn't modified when I install rimraf, it shouldn't have to be copied again from the cache to the local node_modules folder.

I have created a StackOverflow question about this too: http://stackoverflow.com/questions/40566222/yarn-5x-slower-on-windows

By the way in my (dual-booted) Ubuntu benchmarks I was using the same NTFS drive as the one Windows normally runs on; and it's still fast there.

Adding node.exe to Windows Defender exclusions gave me a huge performance boost http://126kr.com/article/1884rsed7l

I'll definitely try this out!

It did seem to improve the speed a bit: 212 -> 170 seconds.
So it helps, but there is still room for improvement, because it's still more than 3x slower than on Linux.

Another issue I have noticed - Indexing service on Windows tries to index every file in node_modules.
I don't really need it at all, so I disabled it http://www.softwareok.com/?seite=faq-Windows-10&faq=53 and gained another performance boost.

My windows isn't set to index the path in question, so that still doesn't solve the issue.

So to sum up, there are 4 ways to improve performance:

  • Whitelist the project folder in your AV
  • Whitelist the Yarn cache directory (%LocalAppData%\Yarn) in your AV
  • Add node.exe to the Windows Defender exclusions
  • Disable the Windows indexing service for the node_modules folder

@Altiano yes, but it's still not enough to get performance even close to Mac/Linux

Seems kinda sketchy that you'd have to disable AV or indexing on directories to make yarn as fast or faster than npm. After all, you don't have to do this for npm. I decided to give yarn a shot because it stated it was fast and the offline installs made coding without a network connection a plausible thing. Is there no way to optimize the linking?

According to some issues that relate here and the comments above, I'd like to reopen this issue in order to gather some other solutions.

Personally, I suggest listing the hardware configuration of your test machine and uploading some related screenshots. Many factors other than Yarn itself could make a big difference between platforms, e.g. the benchmark performance of the SSD in a MacBook is usually much better than that in a Windows machine.

@OshotOkill like I said earlier, I got 3.5x slower performance on Windows vs Linux with indexing and Windows Defender disabled for the relevant directories, on the same regular PC on the same _ntfs_ drive. The fact that it's much faster on Linux even on ntfs says a lot, I think.

Let's get to the reason for this.
Could it be NTFS working slower on a large number of files being moved during install?

Can anyone share a way to repro this on a single machine?
For example, a particular package.json installed on a Windows laptop takes X seconds but running in VirtualBox Ubuntu X-20% seconds.

@amcsi @bestander As is often the case, EXT4/XFS are faster while copying large amounts of small files. However, NTFS is not that much slower. I just cleaned the cache and tested again by using the latest version of Yarn and Node (0.19.1 & 7.5.0):

[screenshot]

The result is really close to a MacBook while installing react-native. All I did was just whitelist the related folders and Node.exe process.

I was having this problem myself until I whitelisted the node.exe and yarn.exe processes in Windows Defender, along with my project directory. I haven't disabled search indexing at all, nor have I whitelisted the Yarn cache directory. Install times went from 190+ seconds on a medium-sized project to about 25 seconds from a clean cache. My Ubuntu machine is just a little bit faster than that, but only by 5-10 seconds.

Fresh Yarn install

Hardware config:
512gb SSD
12gb RAM
AMD FX-8350 8-Core CPU @ 4.01ghz
Windows 10 64-bit, build 14986.

I just did some quick tests on my own system. I've got Linux Mint and Windows 10 dual booted off the same SSD. I cleaned my yarn cache, deleted node_modules and ran yarn on this vue project.

Linux Mint: _12.22s_

Windows 10 (No white listing): _64.32s_

Windows 10 (With white listing): _42.58s_

These were the Windows Defender exclusions I had active: [screenshot]

While white listing did seem to have a significant effect, it still didn't come close to matching the speed on Linux.

EDIT: For @bestander, here's my normalized data:

| OS | Calculation | Normalized Data |
|---|---|---|
| Linux Mint | 12.22 / 12.22 | 1 |
| Windows 10 | 64.32 / 12.22 | 5.2635 |
| Windows 10 (With white listing) | 42.58 / 12.22 | 3.4845 |

@keawade I had 26.48s to install your project from a clean cache, and 13.58s to install it with the cache.

Just spitballing here, I'm using the Yarn.cmd from the MSI installer and it looks like you're using Yarn installed from NPM. I wonder if there's maybe a discrepancy between them?

@nozzlegear While that might be possible, I think that is less likely than it being due to differing internet connections.

We need to eliminate network from this.
Currently I can test this repo on a latest Windows 10 with "Linux on Windows" feature enabled.
Both via CMD and Bash, installation with primed caches takes about 27-29 seconds on a 2-core i7 processor.

@keawade, can you run the same test with node_modules removed but caches in place?

I can't install a second OS on the device I have yet.
Can anyone check if running Windows and Linux in a Virtual box give different results?

I've built current master with timestamps https://github.com/yarnpkg/yarn/releases/download/v0.21.0-pre/yarn-0.21.0-0.js

Can you use it for installing with --verbose flag?

E.g.

node /Users/bestander/work/yarn/artifacts/yarn-0.21.0-0.js install --verbose

It should give timestamps to all the FS operations

Data without cleaning caches

_Note: This data is being recorded on a dual booted system. All hardware is identical for these tests._

| OS | Avg Time | Normalized |
|-----------------------------|----------|------------|
| Linux Mint | 5.598s | 1.00000 |
| Windows 10 (w/ White list) | 12.119s | 2.16488 |
| Windows 10 (w/o White list) | 31.578s | 5.64094 |

_Avg Time is the average across a set of 10 tests_

Raw Linux Mint Data

[5.47, 5.40, 5.84, 5.96, 5.55, 5.48, 5.40, 5.57, 5.81, 5.50]

Raw Windows 10 Data

With White Listing

[11.91, 11.87, 11.88, 12.07, 11.81, 12.02, 12.39, 12.49, 12.28, 12.47]

Without White Listing

[30.85, 31.52, 31.39, 31.46, 31.14, 31.41, 34.24, 31.09, 31.40, 31.28]
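The averages and normalized values in the table above can be reproduced from the raw data. A minimal Python sketch of that arithmetic follows; the variable names are illustrative and not part of the original test script:

```python
from statistics import mean

# Raw timings in seconds, copied from the lists above (10 runs each).
raw = {
    "Linux Mint": [5.47, 5.40, 5.84, 5.96, 5.55, 5.48, 5.40, 5.57, 5.81, 5.50],
    "Windows 10 (w/ White list)": [11.91, 11.87, 11.88, 12.07, 11.81, 12.02, 12.39, 12.49, 12.28, 12.47],
    "Windows 10 (w/o White list)": [30.85, 31.52, 31.39, 31.46, 31.14, 31.41, 34.24, 31.09, 31.40, 31.28],
}

# Average each set of runs, then divide by the Linux Mint average
# so that the fastest platform has a normalized value of 1.
averages = {name: mean(samples) for name, samples in raw.items()}
baseline = averages["Linux Mint"]
normalized = {name: avg / baseline for name, avg in averages.items()}

for name in raw:
    print(f"{name}: avg {averages[name]:.3f}s, normalized {normalized[name]:.5f}")
```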

Methodology

I used this PowerShell script to generate all the data shown here. The script clones this repo and runs 10 iterations of the command yarn, deleting node_modules after each iteration.
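The PowerShell script itself is not reproduced in this thread. As a rough illustration of the methodology described (run the install command N times, deleting node_modules after each run), a Python sketch could look like the following. The function name `time_installs` and its parameters are hypothetical, not taken from the original script:

```python
import shutil
import subprocess
import time
from pathlib import Path


def time_installs(command, project_dir, iterations=10):
    """Run `command` inside project_dir `iterations` times.

    Returns the wall-clock duration of each run in seconds.
    node_modules is deleted after every run so that each iteration
    starts from the same on-disk state.
    """
    project_dir = Path(project_dir)
    durations = []
    for _ in range(iterations):
        start = time.perf_counter()
        subprocess.run(command, cwd=project_dir, check=True)
        durations.append(time.perf_counter() - start)
        # ignore_errors: the command may not have created node_modules at all
        shutil.rmtree(project_dir / "node_modules", ignore_errors=True)
    return durations
```

It might be called as `time_installs(["yarn"], "path/to/clone")`. On Windows the launcher is usually a `.cmd` file, so the command may need to be `["yarn.cmd"]` there.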

@bestander, I've updated the previous post with the Windows data.

Great, thanks for more data.
Can you try the --verbose version with yarn.js with time stamps for both OS?
It would give us a good idea where time is spent.

Whew, that is a lot of logging! Do you want 10 runs for each OS / white list combination or is one for each good enough?

@bestander Here you go! One of each.

Side note: Turns out if you try to upload ~30mb of raw text to a single gist collection you get an nginx 405 error. 😆

~Linux Mint~
~Windows 10 with exclusions and with clean~
~Windows 10 with exclusions and without _clean_~
~Windows 10 with clean and without _exclusions_~
~Windows 10 without _exclusions_ and _without_ clean~

VerboseLogs.tar.gz

EDIT: Removing gists and uploading the compressed files.

> Turns out if you try to upload ~30mb of raw text to a single gist collection you get an nginx 405 error. 😆

You could compress the files (bzip2 or 7-Zip) and attach them here... Plain text compresses very well :)

@Daniel15 Good point, here are the compressed files: VerboseLogs.tar.gz

1 run would be fine :)

I compared LinuxMint.txt vs Windows10NoClean.txt

Linux:

  • linking phase starts at 1.156 seconds
  • all folders inside node_modules created at 1.968
  • last file copied at 3.873 seconds
  • builds are done in another 3 seconds

Windows

  • linking phase starts at 2.779 seconds
  • all folders inside node_modules created at 4.83
  • last file copied at 32.853
  • builds are done in another 3 seconds

Obviously verbose logging affects execution time on Windows (12 -> 35 seconds) but not on Linux (same 6 seconds).

From the benchmarks I found on the internet Linux EXT3 FS usually outperforms NTFS when a lot of files are copied.
I wonder if this is the limit we have to face.

@keawade, are the speeds different when using npm@3 on Windows and Linux?

A few ideas:

  • Windows may be bad at concurrent copy, we copy files in 4 threads. Maybe do it single threaded?
  • Maybe use robocopy wrapper in Windows https://github.com/mikeobrien/node-robocopy
  • we use readstream.pipe.writestream to copy files, maybe it is inefficient on Windows

If you are eager to experiment, replace 4 with 1 in https://github.com/yarnpkg/yarn/blob/master/src/util/fs.js#L322 and see if single threaded copying gets faster on windows
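To make the experiment concrete, here is a small Python sketch of copying a tree with a bounded number of concurrent workers. It is not Yarn's code (Yarn is written in JavaScript and the linked fs.js is not shown here); `copy_tree_concurrently` and its `workers` parameter are hypothetical names that only illustrate what changing 4 to 1 means:

```python
import shutil
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def copy_tree_concurrently(src, dst, workers=4):
    """Copy every file under src into dst using at most `workers` threads.

    Returns the number of files copied. Empty directories are not recreated.
    """
    src, dst = Path(src), Path(dst)
    files = [p for p in src.rglob("*") if p.is_file()]

    # Create the directory skeleton first so copy workers never race on mkdir.
    for directory in {dst / p.relative_to(src).parent for p in files}:
        directory.mkdir(parents=True, exist_ok=True)

    def copy_one(path):
        shutil.copyfile(path, dst / path.relative_to(src))

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() drains the iterator so that any copy error is raised here.
        list(pool.map(copy_one, files))
    return len(files)
```

Timing this with `workers=1` and `workers=4` on the same tree gives a rough idea of whether concurrency helps or hurts on a given file system.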

Thread Tests

Per @bestander's request, I forked yarnpkg/yarn and modified line 322 of src/util/fs.js, replacing the 4 with a 1. I then used yarn run build to build the project and ran 10 tests with that build using the yarn.cmd that was compiled by the build. These are the results.

| | Avg Time | Normalized |
|----------------------------|----------|------------|
| Windows 10 (w/ White list) | 12.119s | 1.00000 |
| Single copy thread | 16.927s | 1.39673 |
| Single copy thread + Clean | 42.268s | 3.48775 |

_Avg Time is the average across a set of 10 tests_

It looks like using only a single thread to copy the files results in slightly slower install times.

Raw Data

Windows 10 (w/ White list)

This data is from a previous test

Single Copy Thread

[15.72, 17.43, 15.16, 17.21, 17.83, 17.47, 16.68, 16.58, 16.93, 18.26]

Single Copy Thread + Clean

[37.68, 40.10, 43.20, 46.18, 40.84, 40.58, 39.69, 47.93, 42.45, 44.03]

Thanks, @keawade.
Can you verify my assumption that NTFS might be slower at copying large number of files than Linux FS?

Please measure copying a fully installed node_modules folder to another location via the terminal, on both Linux Mint and Windows 10.

It is also necessary to test the copy using robocopy with the option /mt (multi-threaded copies)

I'd also like to report a possibly reported bug, wherein every single yarn add or yarn remove takes about 30-40 minutes. It apparently copies ALL the dependencies again, and since I'm on Windows, this takes a long time. See linked issue:

https://github.com/yarnpkg/yarn/issues/2460

@kumarharsh #2458 It took me 28s to finish the installation.

Also, don't forget to whitelist the project folders as well, not only the cache.

Copy Tests

I used this script to run 10 iterations of copies on both Linux Mint and Windows 10. I copied this repo after running yarn in the directory. These are my results.

| OS | Avg Time | Normalized |
|------------|------------|------------|
| Linux Mint | 1527.4620 | 1.00000 |
| Windows 10 | 53676.3155 | 35.14085 |

That time difference is crazy. These copies were done copying files from one location to another on _the same SSD_.

Raw Data

Linux Mint

TotalMilliseconds
-----------------
        1515.3961
        1513.9469
        1540.3275
        1527.2777
        1514.6029
        1521.3711
        1512.0628
        1547.8331
        1518.1499
        1563.6521

Windows 10

TotalMilliseconds
-----------------
       55729.4968
       55915.5972
       53427.5155
       51624.6760
       52191.4177
       53556.4542
       53562.5533
       53527.9015
       53610.6127
       53616.9302

I don't have time right now to test robocopy but I can get that data this evening after work.

Robocopy Test

I used this script to run 10 iterations of copies on both Linux Mint and Windows 10. I copied this repo after running yarn in the directory. These are my results.

| OS | Avg Time | Normalized |
|-----------------------|------------|------------|
| Linux Mint | 1527.4620 | 1.00000 |
| Windows 10 | 53676.3155 | 35.14085 |
| Windows 10 (Robocopy) | 58089.7457 | 38.03024 |

Robocopy performed slightly worse than a regular copy.

Raw Data

The Linux Mint and Windows 10 values are from the previous tests

TotalMilliseconds
-----------------
       56935.3304
       58234.8084
       57838.7956
       56731.7850
       58380.1805
       58097.6040
       59161.0365
       59062.9404
       58363.5527
       58091.4234

@keawade, can you verify that file indexing and Defender don't interfere with the copy?
Afaik it can get involved even for a cp command.

Check what is active in Task Manager when the copy is done.
And maybe just turn off those services for a test

Indexing and Defender Tests

I performed tests under the following conditions:

  • With disabled Windows Defender
  • With disabled Windows Indexing service
  • With _both_ disabled Windows Defender and Windows Indexing service
  • With _both_ disabled Windows Defender and Windows Indexing service _and_ cleaning the Yarn cache

To disable Windows Defender, I toggled off Real-time protection under the Windows Defender settings panel.

To disable Windows indexing, I stopped the Windows Search service in the Services control panel.

_Note: When Windows Defender was enabled, no exclusions were listed_

I used this script to run 10 iterations of copies on both Linux Mint and Windows 10. I copied this repo after running yarn in the directory. These are my results.

Summary

It looks like while Windows indexing (Search service) does have an impact on copy operations and Yarn, the larger impact comes from Windows Defender.

Copying

| OS | Avg Time | Normalized |
|--------------------------------------------|------------|------------|
| Linux Mint | 1527.4620 | 1.00000 |
| Windows 10 (No defender) | 7301.4307 | 4.78011 |
| Windows 10 (No indexing) | 10307.0794 | 6.74787 |
| Windows 10 (No defender, no indexing) | 7044.1393 | 4.61166 |
| Windows 10 Robo (No defender, no indexing) | 10094.8358 | 6.60889 |

Fully disabling indexing and antivirus provides a huge boost to performance when copying files.

Yarn

Since the results above were so pronounced, I figured we could probably use data on Yarn's performance under these conditions as well.

I used this script to run 10 iterations of yarn on both Linux Mint and Windows 10. I cloned this repo and ran yarn in the directory.

| OS | Avg Time | Normalized |
|---------------------------------------------|----------|------------|
| Linux Mint | 5.5980 | 1.00000 |
| Windows 10 (No defender) | 16.5450 | 2.95552 |
| Windows 10 (No indexing) | 38.5170 | 6.88049 |
| Windows 10 (No defender, no indexing) | 16.8490 | 3.00982 |
| Windows 10 Clean (No defender, no indexing) | 30.7730 | 5.49714 |

Raw Data

The Linux Mint values are from the previous tests.

Windows 10 Copy-Item

[7053.7702, 7163.6924, 7081.5366, 7131.2731, 6887.5165, 6960.7251, 6999.6528, 7051.1932, 7046.8592, 7065.1741]

Windows 10 Robocopy

[10096.4991, 10290.1073, 10350.6061, 9999.0552, 10294.0660, 10024.2568, 9949.6786, 9878.1346, 9801.2121, 10264.7418]

Windows 10 Yarn

[16.81, 16.23, 16.29, 16.48, 19.03, 16.27, 17.64, 16.64, 16.05, 17.05]

Windows 10 Yarn Clean

[47.46, 27.83, 28.31, 27.87, 28.90, 30.70, 31.17, 27.97, 28.77, 28.75]

Windows 10 Yarn Indexing Disabled

[38.47, 38.63, 38.37, 38.82, 38.05, 38.54, 38.44, 37.90, 39.02, 38.93]

Windows 10 Copy Indexing Disabled

[10222.4855, 10063.3654, 10152.2953, 10151.6155, 10316.7628, 10705.8277, 10199.5391, 10624.1961, 10308.2336, 10326.4731]

Windows 10 Yarn Windows Defender Disabled

[17.03, 16.21, 16.76, 16.43, 16.19, 16.71, 16.23, 16.30, 17.37, 16.22]

Windows 10 Copy Windows Defender Disabled

[7273.9684, 7427.1726, 7409.7312, 7417.4478, 7164.8717, 7427.4655, 7321.0481, 7292.2561, 7159.4540, 7120.8913]

That is some solid research, @keawade, thanks for sharing all the data.
The data suggests that the raw filesystem performance is the bottleneck for yarn installations on Windows.
I am not sure if Yarn can do anything here unless there is some smart copy command that works around the limitation

@keawade thanks for taking the trouble to compile those numbers! @bestander since npm doesn't seem to face these same problems while copying (perpetual scanning), could it be that yarn is not signed? Maybe Windows Defender doesn't trust yarn at the same level as npm. Just a thought...

@kumarharsh, we'll need to measure the difference between npm and yarn then.
Maybe npm is copying fewer files (Yarn's hoisting is not optimized for smallest node_modules tree).
And it would be great if we could automatically whitelist yarn via installer.

> maybe yarn is not signed? Could be that windows defender is not trusting yarn the same level as npm.

I don't think scripts can be signed (with the exception of PowerShell scripts which do support Authenticode signatures), so I don't think Yarn and npm would differ in that regard. Yarn's installer is Authenticode signed just like npm's.

> And it would be great if we could automatically whitelist yarn via installer.

I feel like automatically touching a virus scanning whitelist would result in virus scanners marking the installer as malware. It seems like a risky thing to do. Perhaps we could automatically blacklist the directory in the search indexer, though.

@keawade I have tested robocopy with the options /E /MT (multi-threaded copies).

| Copy method | Avg time |
|----------------------|----------|
| Copy-Item -Recurse | 20219 |
| Robocopy /E | 26652 |
| Robocopy /E /MT | 9043 |

Raw data (windows 10)

Copy-Item -Recurse

[19494.3827, 19471.0148, 19573.9441, 19896.9619, 19413.0355, 20050.4264, 19370.4315, 22959.5867, 20969.9693, 20994.3076]

Robocopy /E

[26522.4862, 26489.6131, 26654.8518, 26910.1073, 26536.042, 26836.0344, 26682.3544, 26408.4497, 26883.7998, 26605.5189]

Robocopy /E /MT

[9274.1374, 9125.6525, 9292.1629, 9014.8979, 8947.7882, 8985.4369, 8742.3616, 8915.4609, 8938.8326, 9200.9616]

> I don't think scripts can be signed (with the exception of PowerShell scripts which do support Authenticode signatures), so I don't think Yarn and npm would differ in that regard. Yarn's installer is Authenticode signed just like npm's.

Sounds reasonable.
Can we double check this?
If running npm install do Indexer and Defender show up in Task Manager?

NPM Defender and Indexing Tests

I recorded the time it took to run npm install on this repo under the same set of conditions as the Indexing and Defender Tests for Yarn and Windows copy methods.

| OS | Avg Time | Normalized |
|---------------------------------------------|----------|------------|
| Linux Mint (Yarn) | 5.5980 | 1.00000 |
| Linux Mint (NPM) | 28.9793 | 5.17672 |
| Windows 10 (No defender) | 42.6296 | 7.61514 |
| Windows 10 (No indexing) | 53.8791 | 9.62470 |
| Windows 10 (No defender, no indexing) | 37.9727 | 6.78326 |
| Windows 10 (No alterations) | 58.5047 | 10.45100 |

Summary

Looks like NPM is also impacted by Windows Defender.

Raw Data

NPM (Linux Mint)

[29.2353468, 35.6938315, 31.2105951, 30.9298704, 36.5016868, 31.8017671, 30.6387978, 32.3466556, 31.4340427]

NPM (Windows)

[61.2370640, 63.8799427, 62.3602369, 54.0541606, 55.1055082, 59.8259424, 56.7668692, 61.1153600, 54.7739699, 55.9277175]

NPM with Windows Defender Disabled

[41.1666621, 45.6951565, 43.1979249, 43.9185817, 40.8516877, 42.3445648, 43.5419790, 43.5084263, 45.0731120, 36.9975436]

NPM with Windows Indexing Disabled

[61.1470203, 58.6288137, 52.2553500, 52.4279906, 53.5446943, 54.2839412, 51.1620714, 52.1045756, 51.6424888, 51.5937462]

NPM with Windows Defender and Windows Indexing Disabled

[37.1311942, 37.7022530, 38.4630113, 37.5750357, 38.1434941, 37.2711589, 37.2249454, 39.4748951, 38.5522905, 38.1883537]

I suppose that we will need to work around the limitation of Windows being slower at disk IO by reducing the number of IO operations in general and educating people about Indexing and Defender.
Replacing copy with robocopy seems like a good idea, too.

> reducing the number of IO operations in general

This is a good idea in general, and will help perf everywhere. It could also be pretty beneficial for people building on servers with slow hard drives.

Looking forward to a PR that replaces fs stream piping with robocopy on Windows here.

-- Update
However, this might not be optimal because copyBulk has some extra logic like exclusions that might not be translated into a single robocopy command.

Does anybody know why this happens for me (every time)?

[screenshot]

To expand on the post:
On my Windows machine - every single yarn add or yarn rm re-copies all the node modules in my project, which makes every change to package.json take an excruciatingly long time to complete. That progress-bar for 160k dependencies comes every time, and it crawls like a turtle stranded in an oilfield. Observe the timings on the yarn rm paper operation I just did before the yarn add - 1000 seconds!

And cancelling one of those add/rm operations is not possible, as it messes up the node_modules folder and any subsequent yarn install/npm install won't install all the dependencies - which ultimately means that I end up doing a rm -r node_modules/ and starting all over again. This single reason is painful enough to stop me from using yarn install at all.

I think you have badly hoisted node_modules, this bug is going to be fixed in #2676

With @bestander's introduction of hardlinking in #2620 (Which works fine in Windows 7 without administrator privileges), my overall installation times, and node_modules/ size, dropped.

Without hardlinking:

Done in 167.76s.

real    2m49.633s
user    0m0.229s
sys     0m1.368s

du -sh node_modules/
216M    node_modules/

With hardlinking:

Done in 58.07s.

real    0m59.967s
user    0m0.183s
sys     0m1.369s

du -sh node_modules/
189M    node_modules/
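For readers unfamiliar with the idea: a hardlink adds a second directory entry for data that already exists on disk, so no file contents are duplicated. The Python sketch below shows the general technique with a copy fallback. It is only an illustration and not the implementation from #2620; `link_or_copy` is a hypothetical helper:

```python
import os
import shutil


def link_or_copy(cached_file, target):
    """Make `target` a hardlink to `cached_file`, or copy it if linking fails.

    Returns "linked" or "copied" so callers can see which path was taken.
    """
    # Remove a stale file first, since os.link refuses to overwrite.
    if os.path.lexists(target):
        os.remove(target)
    try:
        os.link(cached_file, target)
        return "linked"
    except OSError:
        # Linking can fail, e.g. when cache and project live on different
        # volumes or the file system does not support hardlinks.
        shutil.copy2(cached_file, target)
        return "copied"
```

One consequence of this design is that a linked file in node_modules and its cached original share the same data, so editing one in place changes the other.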

Wait for 0.21.1, it will have @kittens' fix to hoisting.
Should be even faster

I am on Win7 /w Yarn v0.21.3

[3/4] Linking dependencies...
Done in 947.71s. 

Had to wait this amount of time adding any new package with yarn add ...
Defender off
Indexing disabled

I have some other AV running, so I am just following the steps mentioned above by @Altiano:

Whitelist project folder from AV
Whitelist the Yarn cache directory (%LocalAppData%\Yarn) from AV

Will update on this one.

@kuncevic, what would be a clean install time for the project?
What is the size of node_modules folder in files?
How does it compare to npm install?

@bestander this is the same problem with me. Any yarn add or yarn remove takes equal time - about , even after @kittens' hoisting fixes.

Every time, this happens:

  1. First, the Fetching packages step runs (takes about 30s).
  2. Then, linking dependencies pauses for about 1 or 2 minutes.
  3. Then, it resumes with the next step, and installs(?) 63k files again.

As I said, this happens every single time I run yarn add or yarn remove. It does not matter whether the dependency I'm installing depends on any other dependency. A simple npm install for installing a new dependency or upgrading an existing one takes a fraction of this time. Things did improve by 2x with @kittens' hoisting fixes, but the time taken is still too much.

@bestander if you want a reproducible case, please clone this repo: https://github.com/kumarharsh/yarn-bug, and run yarn install, and then yarn add react-helmet.

Yarn is preserving determinism every time it runs add/remove, so it needs to check if any dependency got hoisted to the root of node_modules when dependencies change.
That is why it runs full linking phase.

Fetching dependencies - download, you can't optimise it.
First linking phase (1561 operations) - it creates all the folders for all the dependencies.
Second linking phase (63K operations) - it copies the files from cache to node_modules.

Yarn optimises file copy operations by checking if the files are the same before doing the copy.
We might want to profile this area better and see if we can decrease number of unnecessary IO.
Maybe on Windows copying would be faster than checking?
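The check-before-copy idea can be sketched in Python as below. Which attributes Yarn actually compares is not stated in this thread; comparing size and modification time is an assumption made for illustration, and `copy_if_changed` is a hypothetical helper:

```python
import os
import shutil


def copy_if_changed(src, dst):
    """Copy src to dst unless dst already looks identical.

    Returns True if a copy was performed, False if it was skipped.
    """
    try:
        s, d = os.stat(src), os.stat(dst)
        # Assumption: same size and same whole-second mtime means unchanged.
        if s.st_size == d.st_size and int(s.st_mtime) == int(d.st_mtime):
            return False
    except FileNotFoundError:
        pass  # dst does not exist yet, so it has to be copied
    # copy2 also copies the mtime, which lets the next check skip this file.
    shutil.copy2(src, dst)
    return True
```

The trade-off raised in the question is visible here: every skipped copy still costs stat calls on both files, so if those are slow on a given system, copying unconditionally could come out ahead.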

What about npm, how fast does it do a clean install?

A clean install for npm (npm install) takes 552301.1944ms.
Installing an additional dependency (npm install weird) takes 57023.7593ms. (Most of this time is wasted in paperjs trying to install canvas as a dep - but this time would be common for both npm & yarn)

A clean install for yarn (yarn install) takes 612698.4915ms.
Installing an additional dependency (yarn add weird) takes 495633.0307ms.

npm version 3.10.9
yarn version 0.21.3

@bestander @kumarharsh Yarn doesn't optimise file copy operations on Windows due to a libuv/nodejs bug (see #2958 for a potential fix in yarn code). The bug isn't present on node 7.1+, so you can make your second command (yarn add) a lot faster just by upgrading node.

Using the Windows file copy operation is also a little faster than using the node API to copy files (see #2960 for a potential PR) and would speed up yarn install a little, but I don't know if it would match npm (I didn't test).

Just updated to 7.8.0

nvm install 7.8.0
npm install npm -g (came with 4.4.4)
nvm use 7.8.0
`git clone https://github.com/angular/material2`
cd material2
yarn install - Done in 210.22s.
rimraf node_modules
yarn install - Done in 180.66s.
rimraf node_modules
yarn install - Done in 181.11s.

However, yarn add rimraf got done in 20.52s. But why is yarn install taking so long after removing node_modules?

p.s.

rimraf node_modules
npm install - Done in 332.4s
rimraf node_modules
npm install - Done in 402s
rimraf node_modules
npm install - Done in 489.6s

@kuncevic Nice to see that upgrading node works for yarn add :)

Regarding the empty node_modules case, a good thing to do is to measure how much of the time is due to yarn and how much is due to the file system, hard drive and anti-virus.
To test that, I copied the full node_modules (as generated by yarn, not npm) of material2 somewhere in the yarn cache:

for /f "delims=" %i in ('yarn cache dir') do set yarncachedir=%i
xcopy /E /Y /I /Q node_modules %yarncachedir%\x-temp

And then for each test I cleaned node_modules & ran either yarn install, npm install or an xcopy from the previously created folder :

rd node_modules /s /q
powershell -Command "Measure-Command { xcopy /E /Y /I /Q %yarncachedir%\x-temp node_modules}"

And took the total seconds.

Results

Here are the results on 3 PCs

  • 🏠 Home PC: Samsung 950 Pro NVMe, ESET Nod32
  • 🏢 Work PC: Samsung 850 EVO SATA, TrendMicro OfficeScan that I can't disable
  • 🍎 MacBook pro: 2015 version, on macos, no anti virus

| |yarn 🏠|npm 🏠|xcopy 🏠|yarn 🏢|npm 🏢|xcopy 🏢|yarn 🍎|npm 🍎|
|-|-|-|-|-|-|-|-|-|
|AV disabled|34s|90s|23s|-|-|-|32s|92s|
|AV exclude cache & code|38s|104s|29s|-|-|-|-|-|
|AV exclude cache only|43s|-|31s|-|-|-|-|-|
|AV full|48s|122s|32s|100s|274s|236s|-|-|

Each time AV was enabled, it was topping the CPU chart during yarn install or xcopy (on my home PC it took 30% of total CPU at most, but on my work PC it filled one core for xcopy and all my cores for yarn).
xcopy is slower than yarn on my work PC, I suspect because it doesn't copy files in parallel while yarn does (that shouldn't matter for IO-bound operations, but AVs make it a CPU-bound operation, and xcopy wasn't written to fight so much stupidity 😄).

In conclusion

  • yarn is faster than npm and can even be faster than xcopy when AV makes file copy CPU bound
  • Windows on a good SSD isn't really slower than a MacBook Pro 2015 (which already has a good SSD), even if it's hard to compare, as not exactly the same packages are installed and not all post-install scripts do the same thing
  • Some changes could be made in yarn to sidestep that (symlinking files?), but essentially copying lots of small files is slow
  • On Windows, AV can make it slower; mine adds 30% when enabled in both source & dest folders 😞
  • Corporate AVs can be an order of magnitude slower than home AVs and kill performance enough for any copy operation to be painful (when they make the naturally IO-bound operation CPU bound) 😡

Adding the npm and yarn cache folders and node.exe to Defender's exclusion list should be enough; of course, none of this should be in indexed folders. Now yarn add / rm takes 7 secs.

Thanks everyone, a significant optimization for Windows landed in 0.24 https://github.com/yarnpkg/yarn/pull/3234#issuecomment-297552326

@vbfox Can you please add version numbers for npm and yarn in your benchmark?

this is still a piece of shit for MacOSX

I'm still experiencing some crazy install times. yarn add seems to install and link everything (all items in package.json, ~30k dependencies) all the time.

Linux versions:

$ yarn -v
1.3.2
$ node -v
v8.9.3

Windows versions:

> conemu-cyg-64.exe --version
ConEmu cygwin/msys connector version 1.2.2
> wslbridge.exe --version
wslbridge 0.2.3
> Get-ItemProperty 'HKLM:\SOFTWARE\Microsoft\Windows NT\CurrentVersion' | Select-Object ProductName, CurrentMajorVersionNumber, CurrentMinorVersionNumber, ReleaseId, CurrentBuild, CurrentBuildNumber, BuildLabEx


ProductName               : Windows 10 Pro Insider Preview
CurrentMajorVersionNumber : 10
CurrentMinorVersionNumber : 0
ReleaseId                 : 1709
CurrentBuild              : 17025
CurrentBuildNumber        : 17025
BuildLabEx                : 17025.1000.amd64fre.rs_prerelease.171020-1626

I've got two (and a half) questions:

  1. What's the accepted solution to the issue in this thread? Was it #3234 or tweaking Windows Defender?

    • If the solution was tweaking Windows Defender, is there a complete writeup of what to do somewhere?
  2. Is my issue actually related to this thread, or should I create a new one?

Thanks for raising a new issue, I'll respond there

It's been almost an hour now and I'm still waiting for this command to finish. I've followed the points mentioned above by @Altiano, but nothing works.

Do we have any alternative for this? For example, can I use npm i -g . instead? Will it act the same way, or will I have to make some changes because this code uses yarn workspace?

So finally, after struggling for 2-3 hours, I had to use `npm i -g .` instead of `yarn global add file:.` and npm worked like a charm.
