pdf.js relies on urls to contain the 'pdf' extension

Created on 12 Feb 2014  ·  13 Comments  ·  Source: mozilla/pdf.js

When the server doesn't provide the Content-Disposition header, pdf.js relies on URLs containing the 'pdf' extension to determine the file name. But URLs are locators, not names.
Steps to reproduce:

mv web/compressed.tracemonkey-pldi-09.pdf web/compressed.tracemonkey-pldi-09
sed -i 's/compressed.tracemonkey-pldi-09.pdf/compressed.tracemonkey-pldi-09/g' web/viewer.js
firefox web/viewer.html

Now click on download. You will be offered a 'document.pdf' file. The name should be something more meaningful.
The bug also happens when I serve the file without the pdf extension from my Apache web server.

Proposed solution:
Use the title of the PDF (as this viewer.js code does). The title is also used by Firefox for the File -> "Save Page As" functionality when displaying an HTML page like http://en.wikipedia.org/wiki/Internet_media_type .

Not every HTML web page ends in .html. A document's type is specified by its MIME type, not by its extension.
However, most PDF files have the pdf extension, and most PDFs online also have a good-to-store name in the URL.
I don't know whether the new retrieval method should overwrite the url retrieval, or be a fallback to it.
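
To make the proposal concrete, here is a minimal sketch of the fallback variant: keep the URL-derived name when the last path segment ends in .pdf, otherwise fall back to the document title, and only then to 'document.pdf'. The function name chooseDownloadName and its parameters are hypothetical and are not part of pdf.js; treating the title as a fallback rather than an override is an assumption made for illustration.

```javascript
// Hypothetical helper, not pdf.js's actual API.
// Assumption: the PDF title is a fallback, not an override, of the URL name.
function chooseDownloadName(url, title) {
  // Drop the query string and fragment, then take the last path segment.
  const path = url.split(/[?#]/)[0];
  const lastSegment = path.substring(path.lastIndexOf("/") + 1);

  // 1. The URL already carries a usable name.
  if (/\.pdf$/i.test(lastSegment)) {
    try {
      return decodeURIComponent(lastSegment);
    } catch {
      return lastSegment; // Malformed percent-encoding: use it as-is.
    }
  }

  // 2. Fall back to the title, replacing characters that are
  //    problematic in file names on common file systems.
  const cleaned = (title || "").replace(/[\/\\:*?"<>|\x00-\x1f]/g, "_").trim();
  if (cleaned) {
    return /\.pdf$/i.test(cleaned) ? cleaned : cleaned + ".pdf";
  }

  // 3. Last resort: the generic name the viewer offers today.
  return "document.pdf";
}
```

With the renamed file from the steps above, a call such as chooseDownloadName("http://localhost/web/compressed.tracemonkey-pldi-09", someTitle) would return the title plus ".pdf" instead of 'document.pdf'.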

See also #3455.

1-core 5-good-beginner-bug

All 13 comments

@timvandermeij

Any update on this? It has been open for more than two years now. Because my file param is a server call that sends a pdf file back, the pdf viewer is not able to detect the name of the file because it seems to be looking for a .pdf extension and so I'm stuck with "document.pdf" when downloading and "untitled.pdf" in the window bar when viewing.

It would be handy if we could also specify a "title" in the URI as well as the "file" such as .../pdf-viewer/viewer.html?file="..."&title="..."

I know that currently work is being done in #7554 to support the Content-Disposition header, which is a way to solve this issue. I do agree, however, that document.pdf is not the best possible name and we might need to improve the function for getting the (file)name. Patches for this are welcome, so I'm labeling this as a good beginner bug as it should not be too hard to implement.

@timvandermeij Excellent, thank you. I believe supporting Content-Disposition would actually fix my issue.

I agree. As I was going through the code, I noticed it should not be too difficult to just add another URL param for the filename. I'll give it a try in the next few days. Thanks.

"Patches for this are welcome, so I'm labeling this as a good beginner bug as it should not be too hard to implement."

@timvandermeij Please remember that in PR #4956 we purposely moved away from letting various hash parameters affect the viewer (unless debugging is enabled, see https://github.com/mozilla/pdf.js/wiki/Debugging-pdf.js).
Hence I do not think that we should make it possible to specify the title using a hash parameter!

Especially considering that it would be non-standard (in the context of http://www.adobe.com/content/dam/Adobe/en/devnet/acrobat/pdfs/pdf_open_parameters.pdf), and compared to the Content-Disposition approach in PR #7554, it really doesn't add much value.

Sorry, I should have been clearer. I meant that patches are welcome for improving the function that determines the file name from the URL. I think we can do better there instead of only relying on the file extension. I agree that we should not add more hash parameters.

What's the status of this issue? Is this still open?

@anirudhrb It is still open. There was an attempt to implement this in #7554; would you like to contribute to that?

@yurydelendik Yes, I would like to contribute. What is expected in a PR for this issue?

@anirudhrb, you may just take the above PR as a base, since it gets the remoting of data somewhat right. We would expect a small patch with unit tests. We don't need fully spec-compliant Content-Disposition parsing, just enough to get the filename.
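
As a rough illustration of what "just enough to get the filename" might look like, here is a minimal sketch. It is not the code from #7554 or #9379; the function name filenameFromContentDisposition is hypothetical, and the sketch deliberately handles only the plain filename parameter and the UTF-8 form of filename*, ignoring other charsets and parameter continuations.

```javascript
// Hypothetical, deliberately non-spec-complete sketch; not pdf.js code.
// Returns the filename from a Content-Disposition header value, or null.
function filenameFromContentDisposition(header) {
  if (!header) {
    return null;
  }

  // Extended form: filename*=UTF-8''percent-encoded-name
  // Assumption: only the UTF-8 charset is handled here.
  const ext = /filename\*\s*=\s*([^']*)'[^']*'([^;]+)/i.exec(header);
  if (ext && /^utf-8$/i.test(ext[1].trim())) {
    try {
      return decodeURIComponent(ext[2].trim());
    } catch {
      // Malformed percent-encoding: fall through to the plain form.
    }
  }

  // Plain form: filename="quoted name" or filename=token
  const plain =
    /(?:^|;)\s*filename\s*=\s*(?:"((?:[^"\\]|\\.)*)"|([^;\s]+))/i.exec(header);
  if (plain) {
    const value =
      plain[1] !== undefined
        ? plain[1].replace(/\\(.)/g, "$1") // Undo backslash escapes.
        : plain[2];
    return value || null;
  }

  return null;
}
```

A caller would presumably try this first and fall back to the URL-based name when it returns null, which matches the order described at the top of this issue.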

@yurydelendik I have started working on this. This is my first attempt at contributing to an open-source project. I'll need some time to get comfortable with the codebase. :)

@yurydelendik, @timvandermeij Could I take this issue up, if that's okay with you all?

There is a pull request above which looks like the right direction, but there has been no more activity for it. If you're interested in fixing up that one, that sounds good. I'll ask if the original author is still planning to work on it.

Fixed in #9379.
