Pdf.js: add paper size to document information

Created on 15 Feb 2016  ·  29 comments  ·  Source: mozilla/pdf.js

Would it be possible to add the paper size to the document information?

This is important for knowing whether the document uses Letter or A4 paper size (among others).

1-viewer 2-feature

And can it include portrait/landscape?

What is the expected behavior if multiple page sizes/orientations are used, i.e., some pages are A4 portrait, but some are A3 landscape for example? How do other viewers do that?

And we also need to deal with internationalization and localization, e.g.:

  • "A4" or "Letter" needs to be translated
  • 10cm×15cm should be shown in inches for the US

Re: rotation. I'm going to check what PR 8043 does for me. That may fix my problem. It looks like I really need to read PR 8043 and #6103 properly, and make sure I'm on a late release with those fixes applied.

Re: multi-size. Dunno if any print backends support that without splitting the printout into multiple print jobs. If you don't split the print job, you have "print on biggest" or "scale to smallest" as sane options.

CUPS includes a media size header with every print job, so Linux and macOS are limited to one size per print job, as far as I know.

Re: localisation.
The printer has a list of possible and available paper sizes.
The OS print dialog knows the locale of the user, and a list of printers.
The OS print dialog doesn't know the original size.
The pdf viewer knows the locale, but not a list of printers, so not a list of available paper.
The pdf viewer knows the original PDF size.
The pdf generator knows the server and client locales.
The user hates "pc load letter"

Re: rotation. I'm going to check what PR 8043 does for me.
Re: multi-size. Dunno if any print backends support that without splitting the printout i..
Re: localisation... The printer has a list of possible and available paper sizes...

@berenddeschouwer I don't think any of the above is related to the OP.

What is the expected behavior if multiple page sizes/orientations are used, i.e., some pages are A4 portrait, but some are A3 landscape for example? How do other viewers do that?

@timvandermeij, SumatraPDF displays the size of the browsed page (the one that has the number in the toolbar).

And we also need to deal with internationalization and localization, e.g.:

  • "A4" or "Letter" needs to be translated
  • 10cm×15cm should be shown in inches for the US

@yurydelendik, I don’t think A4 should be translated.

And as far as I know, page dimensions are specified in points inside the PDF document. So PDF.js should convert them to the default dimension unit from the OS.
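
To make the unit question concrete, here is a minimal sketch of the conversion, assuming the default PDF unit of 1/72 inch per point. The helper name `pointsTo` is made up for illustration and is not part of PDF.js.

```javascript
// Assumption: default PDF user space, where 1 point = 1/72 inch.
const POINTS_PER_INCH = 72;
const MM_PER_INCH = 25.4;

// Hypothetical helper: convert a length in points to inches or millimetres.
function pointsTo(unit, points) {
  const inches = points / POINTS_PER_INCH;
  switch (unit) {
    case "in":
      return inches;
    case "mm":
      return inches * MM_PER_INCH;
    default:
      throw new Error(`Unsupported unit: ${unit}`);
  }
}

// US Letter is 612 × 792 pt.
console.log(pointsTo("in", 612), pointsTo("in", 792)); // 8.5 11

// A4 is not a whole number of points, so round before displaying.
console.log(Math.round(pointsTo("mm", 595.276)), Math.round(pointsTo("mm", 841.89))); // 210 297
```

Rounding is left to the caller because inches and millimetres usually want different precision.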

@ousia what is the expected page size for document at https://github.com/mozilla/pdf.js/blob/master/test/pdfs/sizes.pdf?

@ousia what is the expected page size for document at https://github.com/mozilla/pdf.js/blob/master/test/pdfs/sizes.pdf?

@yurydelendik, I don’t understand your question.

Size is a property related to the actual page, not to the document.

In PDF terms, /MediaBox is an entry in the /Page dictionary, not in the /Catalog (/Pages is the entry there).

Which should be the page size in the document you mention?
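
As a rough illustration of why the size belongs to the page, here is a sketch that derives a width and height from each page's /MediaBox. The `pages` array and the `mediaBoxSize` helper are hypothetical stand-ins, not PDF.js objects, and the sketch ignores /Rotate and /CropBox.

```javascript
// Hypothetical helper: a MediaBox is [llx, lly, urx, ury] in points.
function mediaBoxSize(mediaBox) {
  const [x1, y1, x2, y2] = mediaBox;
  // Math.abs guards against boxes whose corners are listed in reverse order.
  return { width: Math.abs(x2 - x1), height: Math.abs(y2 - y1) };
}

// Each page carries its own box, so sizes are computed page by page.
const pages = [
  { mediaBox: [0, 0, 612, 792] }, // Letter, portrait
  { mediaBox: [0, 0, 1191, 842] }, // roughly A3, landscape
];

const sizes = pages.map(page => mediaBoxSize(page.mediaBox));
console.log(sizes);
```

With mixed sizes like these there is no single "document size" to report, only the size of whichever page is being looked at.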

_Acrobat Reader_ displays the dimensions of the displayed page. Here you have it.

[Screenshots: pageone, pagetwo, pagethree]

Thank you for providing this information! Now that this is clear, I think this should be good to implement. I'm marking this as a good beginner bug since it should not be too hard.

I can knock this out if nobody is working on it.

@loganhuskins It's yours!

I think I am pretty close, I want to use getPagesOverview() from base_viewer.

Should I add import { BaseViewer } from './base_viewer'; to pdf_document_properties and then use BaseViewer.getPagesOverview()[0] as the pageSize value that I am planning to display? This doesn't seem to work. Is there any other util method that I can reuse to get this information? Sorry for hijacking this from Logan.
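
One possible reason the call fails, as far as I can tell, is that getPagesOverview() is an instance method, so it has to be called on a constructed viewer rather than on the imported class. The sketch below uses a stand-in class (`FakeViewer`, invented here) to show the difference; it is not the real BaseViewer.

```javascript
// Stand-in for illustration only; this is NOT the real BaseViewer.
class FakeViewer {
  constructor(pages) {
    this._pages = pages;
  }

  // Instance method: it needs a constructed viewer to have pages to report.
  getPagesOverview() {
    return this._pages.map(({ width, height }) => ({ width, height }));
  }
}

// On the class itself the method does not exist, so calling it would throw.
console.log(typeof FakeViewer.getPagesOverview); // undefined

// On an instance it works.
const viewer = new FakeViewer([{ width: 612, height: 792 }]);
console.log(viewer.getPagesOverview()[0]); // { width: 612, height: 792 }
```

If that is the cause, the properties dialog would need access to the existing viewer instance, or to the page object itself, rather than a fresh import of the class.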

I've been wrapped up with work, so if you're taking this over I'm good.

Hi,
Is this what you are looking to do?

[Screenshot: compressed tracemonkey pldi 09 pdf]

@linton-Portman, in that case adding _Letter, portrait_ would be great.

I mean, with standard paper sizes it is useful to provide the standard paper name.

See what _Evince_ does:

[Screenshot: evince-properties]

And _SumatraPDF_:

[Screenshot: sumatrapdf-properties]

I mean, with standard paper sizes it is useful to provide the standard paper name.

That means we need to properly localize all of these names. Is there a short list of popular paper sizes?

Hi,
From what I have read so far, the differences are in paper size (US Letter = 8.50 × 11.00 in, A4 = 8.27 × 11.69 in). The way I was going to approach it was a series of if/else statements checking for these sizes, which then determines what text is displayed. The same sizes also tell you whether the page is portrait or landscape.

This will be a lot of if/else statements, depending on how many paper sizes are catered for.
I am a beginner/novice, so maybe there is a better way :)
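
A minimal sketch of that if/else idea, assuming sizes in inches and a small tolerance for rounding. All function names and the 0.02 in tolerance are invented for illustration.

```javascript
// Hypothetical helpers; sizes are in inches.
function orientationOf(width, height) {
  // A square page is reported as portrait here, which is an arbitrary choice.
  return width > height ? "landscape" : "portrait";
}

function isCloseTo(a, b, tolerance = 0.02) {
  return Math.abs(a - b) <= tolerance;
}

function nameOf(width, height) {
  // Compare short side and long side so orientation does not matter here.
  const shortSide = Math.min(width, height);
  const longSide = Math.max(width, height);
  if (isCloseTo(shortSide, 8.5) && isCloseTo(longSide, 11)) {
    return "Letter";
  } else if (isCloseTo(shortSide, 8.27) && isCloseTo(longSide, 11.69)) {
    return "A4";
  }
  return null; // unknown size
}

console.log(nameOf(11, 8.5), orientationOf(11, 8.5)); // Letter landscape
console.log(nameOf(8.27, 11.69), orientationOf(8.27, 11.69)); // A4 portrait
```

Sorting the sides first halves the number of branches, since each paper size is checked once instead of once per orientation.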

[Screenshots: a4 portrait pdf, letter portrait pdf]

Note that the A4 size is better expressed in mm, e.g. 210 mm × 297 mm.

@yurydelendik Oh yes, I can do that also.

Here you have a list (that I adapted from ConTeXt).

Standard European paper sizes:

[A0] [width=841mm, height=1189mm]
[A1] [width=594mm, height=841mm]
[A2] [width=420mm, height=594mm]
[A3] [width=297mm, height=420mm]
[A4] [width=210mm, height=297mm]
[A5] [width=148mm, height=210mm]
[A6] [width=105mm, height=148mm]
[A7] [width=74mm, height=105mm]
[A8] [width=52mm, height=74mm]
[A9] [width=37mm, height=52mm]
[A10] [width=26mm, height=37mm]

Standard European less common paper sizes:

[B0] [width=1000mm, height=1414mm]
[B1] [width=707mm, height=1000mm]
[B2] [width=500mm, height=707mm]
[B3] [width=353mm, height=500mm]
[B4] [width=250mm, height=353mm]
[B5] [width=176mm, height=250mm]
[B6] [width=125mm, height=176mm]
[B7] [width=88mm, height=125mm]
[B8] [width=62mm, height=88mm]
[B9] [width=44mm, height=62mm]
[B10] [width=31mm, height=44mm]

Standard European envelope sizes:

[C0] [width=917mm, height=1297mm]
[C1] [width=648mm, height=917mm]
[C2] [width=458mm, height=648mm]
[C3] [width=324mm, height=458mm]
[C4] [width=229mm, height=324mm]
[C5] [width=162mm, height=229mm]
[C6] [width=114mm, height=162mm]
[C7] [width=81mm, height=114mm]
[C8] [width=57mm, height=81mm]
[C9] [width=40mm, height=57mm]
[C10] [width=28mm, height=40mm]

CD cover paper size:

[CD] [width=120mm, height=120mm]

American standard paper sizes:

[letter] [width=8.5in, height=11in]
[ledger] [width=11in, height=17in]
[tabloid] [width=17in, height=11in]

[legal] [width=8.5in, height=14in]
[folio] [width=8.5in, height=13in]
[executive] [width=7.25in, height=10.5in]

Different sets of envelopes:

[envelope 9] [width=8.88in, height=3.88in]
[envelope 10] [width=9.5in, height=4.13in]
[envelope 11] [width=10.38in, height=4.5in]
[envelope 12] [width=11.0in, height=4.75in]
[envelope 14] [width=11.5in, height=5.0in]
[monarch] [width=7.5in, height=3.88in]
[check] [width=8.58in, height=3.88in]
[DL] [width=110mm, height=220mm]
[E4] [width=280mm, height=400mm]
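
A list like this could be turned into a lookup table instead of a long chain of conditions. The sketch below covers only a small subset of the entries; `PAPER_SIZES`, `findPaperName` and the 1 mm tolerance are assumptions made for illustration.

```javascript
const MM_PER_INCH = 25.4;

// A small subset of the list above, stored as [short side, long side] in mm.
const PAPER_SIZES = {
  A3: [297, 420],
  A4: [210, 297],
  A5: [148, 210],
  letter: [8.5 * MM_PER_INCH, 11 * MM_PER_INCH],
  legal: [8.5 * MM_PER_INCH, 14 * MM_PER_INCH],
};

// Hypothetical helper: return the first name whose sides match within
// `toleranceMm`, or null. Orientation is ignored by sorting the two sides.
function findPaperName(widthMm, heightMm, toleranceMm = 1) {
  const [shortSide, longSide] = [widthMm, heightMm].sort((a, b) => a - b);
  for (const [name, [w, h]] of Object.entries(PAPER_SIZES)) {
    if (
      Math.abs(shortSide - w) <= toleranceMm &&
      Math.abs(longSide - h) <= toleranceMm
    ) {
      return name;
    }
  }
  return null;
}

console.log(findPaperName(297, 210)); // A4
console.log(findPaperName(215.9, 279.4)); // letter
console.log(findPaperName(100, 200)); // null
```

Because the sides are sorted, entries such as ledger and tabloid, which share the same dimensions and differ only in orientation, would collapse into one row in a table like this.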

I think it is useful to add whether the orientation is portrait or landscape.

@linton-Portman, from what I see in your screenshots, I have a question:

Wouldn’t the following info be better?

Letter, portrait (8.5 × 11 in)

I mean, these following adaptations:

  • Use paper name first.

  • Followed by _portrait_ or _landscape_.

    BTW, if the separation before is a comma, I think the word should begin with a lowercase letter.

  • Write the dimensions inside the parentheses.

    • It's better to use the proper Unicode sign _×_ (instead of _x_).

    • Which paper size requires which dimensions is specified in the previous list.

    • I think it’s better to use abbreviations for length units (_in_ or _mm_ in this case).

If that weren’t possible, _portrait_ or _landscape_ should go inside the parentheses.
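
The layout suggested above could be produced by a small formatter along these lines. `formatPageSize` is a hypothetical name, and the fallback for pages without a known paper name is only one reading of the last point.

```javascript
// Hypothetical formatter for the layout proposed above:
//   "<name>, <orientation> (<width> × <height> <unit>)"
function formatPageSize({ name, width, height, unit }) {
  const orientation = width > height ? "landscape" : "portrait";
  const dimensions = `${width} × ${height} ${unit}`;
  // Assumption: without a paper name, the orientation moves into parentheses.
  return name
    ? `${name}, ${orientation} (${dimensions})`
    : `${dimensions} (${orientation})`;
}

console.log(formatPageSize({ name: "Letter", width: 8.5, height: 11, unit: "in" }));
// Letter, portrait (8.5 × 11 in)

console.log(formatPageSize({ name: null, width: 200, height: 100, unit: "mm" }));
// 200 × 100 mm (landscape)
```

In a localized viewer the template itself would presumably live in the translation files, so the word order and the multiplication sign could vary per language.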

Many thanks for your contribution.

@linton-Portman, using switch might be a better alternative to if... else statements.

I cannot code, so experienced coders may give better advice. (I simply replied to your question, since it remained unanswered).

Here you have a list (that I adapted from ConTeXt).

All of that needs to be translated. Personally, I would prefer to find the 5-10 most used page sizes and give them names.

Letter, portrait (8.5 × 11 in)

Because of that, "8.5 in × 11 in, Letter/Portrait" or just "100mm x 200mm" might be a better choice here.

It's better to use the proper Unicode sign × (instead of x).

Yes, but it's also up to the localizers for each specific culture.

I cannot code, so experienced coders may give better advice.

I'm sure core contributors will have an open mind about the way this problem is addressed. The main concern here is that a solution must not be complex and must not add a burden to future maintenance of the project.

Personally, as a first step, I would just add page size information without "nice" names (to avoid their localization), maybe (auto) detect units. The second step would be discussion of the page size name and orientation.
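
For the unit auto-detection mentioned in that first step, one possible approach is to pick the unit from the user's locale. The region list and the `preferredUnit` helper below are assumptions for illustration, not what PDF.js does.

```javascript
// Assumption for illustration: only these regions prefer inches.
const INCH_REGIONS = new Set(["US", "LR", "MM"]);

// Hypothetical helper: choose a display unit from a BCP 47 locale tag.
function preferredUnit(localeTag) {
  // Intl.Locale parses the tag; region is undefined for tags like "en".
  const { region } = new Intl.Locale(localeTag);
  return region && INCH_REGIONS.has(region) ? "in" : "mm";
}

console.log(preferredUnit("en-US")); // in
console.log(preferredUnit("de-DE")); // mm
```

A tag without a region falls back to millimetres in this sketch, which may not be what a US user with a plain "en" locale expects.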

I gave it a try, since there has been no communication here for two months or so. Apologies if I should have asked first.
The PR adds the page size to the view in pt, which is a start. The next steps (as described here) would be converting these to cm/mm/inch/... and trying to find a matching standard paper size. Neither of these is implemented in this PR.

The pull request above implemented most of this feature. What remains to be done is showing the page size of the currently visible page instead of only from the first one. Optionally letter/portrait and default page sizes (such as A4) can be added, but I don't see that as mandatory for resolving this issue.

More improvements have been made in #9577, which also implemented showing the page size of the currently visible page. For this issue, only letter/portrait and default page sizes remain.

The final part is implemented by the pull request above (page orientation and page size names). Closing as fixed.
