React-pdf: Webpack production bundling react-pdf is super slow

Created on 20 Oct 2017  ·  36 Comments  ·  Source: wojtekmaj/react-pdf

Hello,

Am I the only one experiencing slow build times in production mode? It takes Webpack about 180 seconds to build and minify the pdf and worker JS files, which are probably a peer dependency?

Has anyone found a working solution to ship it to production without such long bundling times?

Cheers

help wanted question

All 36 comments

Hey!
Nope, you're not the only one. PDF.js is a really big library (around 2 MB before, 800 KB after minification) and it simply takes a long time to parse all of that.
It might help a little to use flags like cacheDirectory: true, but I wouldn't expect a big improvement.

If anyone has any ideas on how to improve this situation though, I'd be happy to hear!

Well, this is sad actually :( I've been experimenting with solutions, but so far no luck. I've been trying to bundle this lib into a separate file and bundle it only once. Of course it's not a very maintainable approach, but cutting down 180 seconds would be worth it; as this library doesn't update too often, it would suffice to check the version diff now and then. Will get back to you if I come up with a better solution.
And cacheDirectory is not an option in my case: every deployment we do is a clean project initialisation from scratch.

@mvirbicianskas try to use parcel-bundler, it's not ideal, but it's blazingly fast!

Hey, thanks for the recommendation, will take a look, but I don't think it's Webpack's problem; it's the UglifyJS plugin's speed.

Either way, Parcel is using uglify-es, a much newer solution than Webpack's. It may improve build speeds.

Please be aware though that I don't officially support Parcel just yet, but I'm definitely on it; super excited about Parcel as much as the rest of the community is!

In our case it's taking more than 15 minutes after adding react-pdf, we're using Rails with Webpacker, so I don't think using Parcel is an option for us. Does anybody have any other suggestions? I went into configuration hell trying to get Happypack and AutoDllPlugin working without success. Is there any other pure Webpack-based solution that would work for this? This one issue is making us have to reconfigure different parts of our stack and our deployment flow.

Here are my suggestions (not only for you @obahareth but you may give some a go):

  • In UglifyJsPlugin, which you are likely using during the production build, you could exclude the pdf.js and pdf.worker.js files from minification using the exclude option.
  • pdfjs-dist comes with pre-minified versions of its libs, but its entry files do not use them by default. You can use NormalModuleReplacementPlugin to load pdf.min.js instead of pdf.js and pdf.worker.min.js instead of pdf.worker.js. Make sure to exclude the minified files from minification using the option mentioned in the bullet point above.
  • Try to make use of two other UglifyJsPlugin options: cache, to ensure you're not processing the same, unchanged file over and over, and parallel, to use multiple processes.
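
The exclude, cache and parallel suggestions above could be wired up roughly like this. It is only a sketch, assuming a version of uglifyjs-webpack-plugin that supports these three options; `createMinifier` and `PDFJS_ASSETS` are hypothetical names, and the plugin class is passed in as an argument so the snippet has no hard dependency. As far as I know, `exclude` is matched against emitted asset file names, so it only helps if pdf.js ends up in files of its own.

```javascript
// Matches emitted files such as pdf.js, pdf.min.js, pdf.worker.js and
// pdf.worker.min.js. Assumption: your build emits pdf.js under names like
// these; adjust the pattern to your actual output.
const PDFJS_ASSETS = /(^|[\\/])pdf(\.worker)?(\.min)?\.js$/;

// UglifyJsPlugin is injected, e.g. require('uglifyjs-webpack-plugin').
function createMinifier(UglifyJsPlugin) {
  return new UglifyJsPlugin({
    exclude: PDFJS_ASSETS, // skip minifying the pdf.js assets
    cache: true,           // reuse results for unchanged files between builds
    parallel: true,        // minify in multiple processes
  });
}

// In webpack.config.js (production build):
//   plugins: [createMinifier(require('uglifyjs-webpack-plugin'))]
```

The cache option will probably not help on deployments that start from a clean checkout every time, since the cache directory would not survive between builds.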

Let me know if anything had an especially good impact on your build performance!

A webpack solution

HTML

<script src="//cdn.jsdelivr.net/npm/[email protected]/build/pdf.min.js"></script>
<script src="//cdn.jsdelivr.net/npm/[email protected]/web/pdf_viewer.min.js"></script>
<script>
PDFJS.workerSrc = "//cdn.jsdelivr.net/npm/[email protected]/build/pdf.worker.min.js";
</script>

Webpack external configure

{
  "externals": {
    "pdfjs-dist": "pdfjsDistWebPdfViewer",
    "pdfjs-dist/lib/web/pdf_link_service": "pdfjsDistWebPdfViewer.PDFJS"
  }
}

Import

import {Document, Page} from 'react-pdf/dist/entry';

The result improved: build time went from 6 min to 25 sec.

real 0m25.909s
user 0m24.120s
sys 0m0.448s

Parcel is not a solution but an alternative which is quite new and has more issues.

You can easily switch to uglify-es in your webpack config; webpack 4.8 and its docs should provide the needed information.

Personally I would not use CDNs but create a copy task and set the externals. Also Google Closure compiler can produce smaller files of pdf.js.
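
A copy task like the one suggested here could be as small as the Node script below. It is a sketch rather than a tested recipe: `copyPdfjs` and the `public/vendor` target are hypothetical, and the file names assume that the `build/pdf.min.js` and `build/pdf.worker.min.js` files seen in the CDN URLs earlier in this thread also exist in your installed pdfjs-dist.

```javascript
const fs = require('fs');
const path = require('path');

// Pre-minified files shipped with pdfjs-dist (assumed layout, see note above).
const FILES = ['pdf.min.js', 'pdf.worker.min.js'];

// Copies the pre-minified pdf.js files from srcDir to destDir and
// returns the list of written paths.
function copyPdfjs(srcDir, destDir) {
  fs.mkdirSync(destDir, { recursive: true }); // needs Node 10.12 or newer
  return FILES.map((name) => {
    const target = path.join(destDir, name);
    fs.copyFileSync(path.join(srcDir, name), target);
    return target;
  });
}

// Example: run before webpack, then load the copies via <script> tags.
//   copyPdfjs('node_modules/pdfjs-dist/build', 'public/vendor');
```

Combined with an externals entry, this keeps pdf.js out of the webpack build while still serving it from your own host instead of a CDN.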

The externals solution just circumvents the issue by referencing an "external" script which is not part of the bundling.

I highly advise not bundling such big libraries; use the CommonsChunkPlugin, AggressiveSplittingPlugin + externals.
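
CommonsChunkPlugin is the webpack 3 way of splitting out shared code; on webpack 4 the same idea is expressed with `optimization.splitChunks`, which is what the sketch below uses instead. The chunk name `pdfjs` and the `baseConfig` mentioned in the comment are hypothetical, and this only moves pdf.js into its own file; it does not by itself skip minification.

```javascript
// webpack 4 config fragment: put everything from pdfjs-dist into its own chunk.
const splitPdfjs = {
  optimization: {
    splitChunks: {
      cacheGroups: {
        pdfjs: {
          test: /[\\/]node_modules[\\/]pdfjs-dist[\\/]/, // match modules by path
          name: 'pdfjs', // hypothetical chunk name
          chunks: 'all', // consider both sync and async imports
          enforce: true, // ignore the default size and count thresholds
        },
      },
    },
  },
};

// In webpack.config.js:
//   module.exports = Object.assign({}, baseConfig, splitPdfjs);
```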

@DanielRuf Would you like to share more information about your experience with webpack 4 and react-pdf,
such as compile times?

It would be useful to know whether webpack 4 is good to go.

BTW, if anyone is interested in enabling uglify-es with webpack 3, you can just install a newer version of uglifyjs-webpack-plugin and set it in the config, just like @DanielRuf is talking about.
Here is a sample: https://github.com/react-atomic/reshow/blob/master/packages/reshow-app/src/webpack.client.js

webpack 4 in general is great and subsequent builds will be faster thanks to the new cache.

Also see https://github.com/webpack-contrib/uglifyjs-webpack-plugin#uglifyjs-webpack-plugin

Regarding pdf.js: I highly suggest using the externals option for this; webpack will then exclude it from the bundle during bundle generation.

Why not use the dist min file of pdf.js and load it as an extra script tag, as was already recommended? This provides the best performance during development.

@DanielRuf Yes, I use the externals solution and shared my use case above; it gave a large improvement (from 6 min to 25 sec).

I'm still evaluating webpack 4, so I'd be happy to see an expert like you share real numbers that would help more developers align on webpack 4.

For me, the main issue is that webpack 4 still has some minor problems with web workers, but that's not related to react-pdf.
https://github.com/react-atomic/reshow/issues/4

I can do some benchmarking (also with different babel and browserslist settings) in the coming weeks if it would be helpful.

"externals": { "pdfjs-dist": "pdfjsDistWebPdfViewer", "pdfjs-dist/lib/web/pdf_link_service": "pdfjsDistWebPdfViewer.PDFJS" }

Your trick, and basically everything that was mentioned here, doesn't work. I've been playing with this for the last 12 hours. Nothing.

I think this is not the global variable which is exported from pdf.js.

See https://webpack.js.org/configuration/externals/

Did you try pdfjsDistBuildPdf?
And the viewer should be pdfjsDistWebPdfViewer.

All in all, we can only help if we have the code and config files of your project.
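
One way to settle which global the loaded script actually defines is to check in the browser console before touching the externals config. A small sketch, with the candidate names taken from this thread only; `findDefinedGlobals` is a hypothetical helper, and which name exists depends on the pdfjs-dist version and files you load:

```javascript
// Global names mentioned in this thread; which one exists depends on the
// pdfjs-dist version and on which script files were loaded.
const CANDIDATES = ['PDFJS', 'pdfjsLib', 'pdfjsDistBuildPdf', 'pdfjsDistWebPdfViewer'];

// Returns the candidate names that are defined on the given scope
// (window in the browser).
function findDefinedGlobals(scope, names = CANDIDATES) {
  return names.filter((name) => typeof scope[name] !== 'undefined');
}

// In the browser console, after the <script> tags have loaded:
//   findDefinedGlobals(window);
```

Whatever name comes back is the one that belongs on the right-hand side of the matching externals entry.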

I've tried HillLiu's exact example. It doesn't break the code, but I don't get any build boost, especially when I use react-pdf/build/entry.webpack (needed for the worker) to get the page/doc. I've tried other versions as well; they didn't work.

I've also tried pdfjsDistBuildPdf and several others.

I played a lot with externals and managed to make externals of all the needed dependencies, and still no build boost.

I used NormalModuleReplacementPlugin as well to use the native minified files, and excluded them from minification as well... no build boost.

And you are right about the code example. I will do that when I wake up; it's a bit late.

webpack --progress --profile probably shows some more stats.

If it does not work in your example and there is no difference, then something seems to be wrong here.

A simple benchmark with a quick sample (without externals).

Hash: 67461963d3c7a380ebe2
Version: webpack 4.28.3
Time: 13478ms
Built at: 01/02/2019 7:55:14 AM
                 Asset     Size  Chunks                    Chunk Names
               main.js  435 KiB       0  [emitted]  [big]  main
vendors~pdfjsWorker.js  725 KiB       1  [emitted]  [big]  vendors~pdfjsWorker
Entrypoint main [big] = main.js
[20] (webpack)/buildin/global.js 472 bytes {0} [built]
[34] ./src/index.js 3.76 KiB {0} [built]
[40] zlib (ignored) 15 bytes {0} [optional] [built]
[41] fs (ignored) 15 bytes {0} [built]
[42] http (ignored) 15 bytes {0} [built]
[43] https (ignored) 15 bytes {0} [built]
[46] (webpack)/buildin/module.js 497 bytes {0} [built]
    + 77 hidden modules

With the externals:

externals: {
    "pdfjs-dist": "pdfjsLib",
    "pdfjs-dist/build/pdf.worker.js": "pdfjsWorker"
}

yarn run v1.12.3
$ webpack
Hash: 5efe836747d41e38755a
Version: webpack 4.28.3
Time: 2987ms
Built at: 01/02/2019 8:12:40 AM
  Asset      Size  Chunks             Chunk Names
main.js  93.8 KiB       0  [emitted]  main
Entrypoint main = main.js
[18] external "pdfjsLib" 42 bytes {0} [built]
[31] ./src/index.js 3.76 KiB {0} [built]
    + 64 hidden modules

With the profile flag:

yarn run v1.12.3
$ webpack --progress --profile
1156ms building                                                                 
2ms finish module graph                             
0ms sealing                                
0ms basic dependencies optimization 
2ms dependencies optimization                           
1ms advanced dependencies optimization 
0ms after dependencies optimization 
4ms chunk graph 
0ms after chunk graph                          
0ms optimizing 
0ms basic module optimization 
0ms module optimization 
0ms advanced module optimization 
1ms after module optimization 
1ms basic chunk optimization                             
0ms chunk optimization 
4ms advanced chunk optimization                         
0ms after chunk optimization 
1ms module and chunk tree optimization 
0ms after module and chunk tree optimization 
0ms basic chunk modules optimization 
1ms chunk modules optimization                           
0ms advanced chunk modules optimization 
1ms after chunk modules optimization 
0ms module reviving                 
1ms module order optimization                                
0ms advanced module order optimization 
0ms before module ids 
0ms module ids 
2ms module id optimization 
0ms chunk reviving                 
1ms chunk order optimization                               
0ms before chunk ids 
1ms chunk id optimization                          
0ms after chunk id optimization 
1ms record modules                 
0ms record chunks                 
9ms hashing 
1ms content hashing                         
0ms after hashing 
0ms record hash 
0ms module assets processing 
8ms chunk assets processing 
1ms additional chunk assets processing 
0ms recording 
0ms additional asset processing 
19ms chunk asset optimization             
1ms after chunk asset optimization 
0ms asset optimization 
0ms after asset optimization 
0ms after seal 
3ms emitting 
1ms after emitting                  
Hash: 5efe836747d41e38755a
Version: webpack 4.28.3
Time: 1239ms
Built at: 01/02/2019 8:17:21 AM
  Asset      Size  Chunks             Chunk Names
main.js  93.8 KiB       0  [emitted]  main
Entrypoint main = main.js
[18] external "pdfjsLib" 42 bytes {0} [built]
     [31] 771ms -> [24] 34ms -> factory:38ms building:93ms dependencies:171ms = 1107ms
[31] ./src/index.js 3.76 KiB {0} [built]
     factory:56ms building:715ms = 771ms
    + 64 hidden modules

Some info about my system:

osquery> SELECT cpu_brand, cpu_physical_cores, cpu_logical_cores, physical_memory, hardware_model FROM system_info;
+-------------------------------------------+--------------------+-------------------+-----------------+-----------------+
| cpu_brand                                 | cpu_physical_cores | cpu_logical_cores | physical_memory | hardware_model  |
+-------------------------------------------+--------------------+-------------------+-----------------+-----------------+
| Intel(R) Core(TM) i7-3740QM CPU @ 2.70GHz | 4                  | 8                 | 17179869184     | MacBookPro10,1  |
+-------------------------------------------+--------------------+-------------------+-----------------+-----------------+

You can find the code at https://github.com/DanielRuf/webpack-react-pdf

The viewer generally includes more code; it is the viewer app used in Firefox.
Let me know if your setup is different and which version of react-pdf you use. You can see the imports at https://unpkg.com/[email protected]/dist/pdf.worker.entry.js for the big worker.

@yanivkalfa I think it depends on how you use react-pdf.

You could take a look at my import sample:
https://github.com/wojtekmaj/react-pdf/issues/93#issuecomment-369196394

and check the JS:
https://cdn.jsdelivr.net/npm/[email protected]/dist/entry.js

Webpack should replace pdfjs-dist with an empty object,
so you could also inspect the webpack bundle size. If the size does not change,
it means your use case differs from mine.
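
Besides comparing bundle sizes, the module list in webpack's stats output shows more directly whether pdfjs-dist is still being bundled. A rough sketch, assuming a stats file produced with `webpack --profile --json > stats.json` and its top-level `modules` array with `name` and `size` fields; `findPdfjsModules` is a hypothetical helper:

```javascript
// Returns the bundled modules whose path points into pdfjs-dist, largest first.
// An empty result suggests pdf.js is no longer part of the bundle.
function findPdfjsModules(stats) {
  return (stats.modules || [])
    .filter((m) => /node_modules[\\/]pdfjs-dist[\\/]/.test(m.name))
    .map((m) => ({ name: m.name, size: m.size }))
    .sort((a, b) => b.size - a.size);
}

// Usage:
//   console.log(findPdfjsModules(require('./stats.json')));
```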

And as mentioned by @DanielRuf, if you use a newer version, you probably need to change the externals parameter to the new one.

@DanielRuf

A simple benchmark with a quick sample (without externals).
You can find the code at https://github.com/DanielRuf/webpack-react-pdf

Which files do you link in the HTML? Which variable do you use?
And lastly, will that also work for the worker?

Which files do you link in the HTML?

The bundled ones.
I did no further setup as I did not get any example project to reproduce the issue. In general webpack is fast and in my case processed the big files. See the profiling information https://github.com/wojtekmaj/react-pdf/issues/93#issuecomment-450799319

@DanielRuf so you didn't link the files in the HTML like this:

<script src="//cdn.jsdelivr.net/npm/[email protected]/build/pdf.min.js"></script>
<script src="//cdn.jsdelivr.net/npm/[email protected]/web/pdf_viewer.min.js"></script>
<script>
PDFJS.workerSrc = "//cdn.jsdelivr.net/npm/[email protected]/build/pdf.worker.min.js";
</script>

and then use the externals?

This does not make much difference for the bundling process.

Hey guys,

In our case the problem is the bundle size of pdf.js and pdf.worker.js. The latter is bigger than the whole index.js of the application, making page loads pretty slow:

image

Is there anything we can do in this case?

In our case the problem is the bundle size of pdf.js and pdf.worker.js. The latter is bigger than the whole index.js of the application, making page loads pretty slow:

Please see the previous comments =) This was already mentioned, along with a few solutions.

https://github.com/wojtekmaj/react-pdf/issues/93#issuecomment-388542120

I was also struggling with the huge bundle size as soon as I included react-pdf. I'm sorry to say, but none of the solutions provided were satisfying. They were either too hard to configure, half-baked, or they just "moved" the problem to a different spot. Ultimately I refactored my app to use vanilla pdf.js in a local