Pdf.js: The viewer.js can not take online PDF stream

Created on 30 Dec 2011  ·  29 Comments  ·  Source: mozilla/pdf.js

The viewer.js can not take online PDF stream.

By default, viewer.js has:

var kDefaultURL = 'compressed.tracemonkey-pldi-09.pdf';

Now point it at a PDF served online:
http://www.liferay.com/documents/31578/11925632/sample.pdf

by setting:

var kDefaultURL = 'http://www.liferay.com/documents/31578/11925632/sample.pdf';

In Firefox 9.0.1 and Chrome 16.0.912.63, it throws this error:

"
PDF.JS Build: 9161c2e
Message: Unexpected server response of 0.
".

All 29 comments

Related issues are #522, #586 and #842

As noted in those issues, this is something that users will have to fix on their own, using a proxy or CORS.

Hi Brendan,

What is the main reason that "the user will have to fix on their own using a proxy or CORS"?

It is important that the PDF can come from a local upload, from a file on the same server, or from a remote server as an "http://" stream.

Just as an image can be loaded from any URL, the PDF reader should support HTTP URLs.

Thanks

@jonasyuandotcom The browser lets you fetch the PDF over HTTP from the same server. However, it protects the user from getting/sending data to foreign servers. Those servers must send CORS HTTP headers to lift this restriction.

Since your server-side proxy will be located on the same server as the viewer, the browser will allow those requests.

@notmasteryet Thanks. It works when the PDF is on the same server, like:

var kDefaultURL = '/pdf-reader-web/sample.pdf';

Hi Jonas,

We won't be implementing this on our end because browser security restrictions make it impossible. See http://en.wikipedia.org/wiki/XMLHttpRequest#Cross-domain_requests

Brendan

Hi @brendandahl

I was wondering if there has been any update since 2011. Is it still impossible to fix the CORS problems?

Thank you, Tim!

Hi @timvandermeij. Thanks for your response. I've tried many solutions, but I'm still not able to enable CORS on my web server. Do you have a Git example?

@Dassine Here you go: http://mozilla.github.io/pdf.js/web/viewer.html?file=//async5.org/moz/pdfjs.pdf -- the PDF viewer loads http://async5.org/moz/pdfjs.pdf. Notice that async5.org allows mozilla.github.io to get files. Otherwise, a web browser has to block access to a remote file for security reasons. This is standard practice on the web, and there is nothing PDF.js can do to circumvent browser security.

If you are embedding a browser control in a desktop/mobile application, you can request the binary data using OS/framework APIs and pass it to PDF.js as a Uint8Array.
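
A minimal sketch of the Uint8Array route, assuming the host application hands the PDF to the page as a base64 string. That bridge and the helper names base64ToUint8Array and openPdfFromBytes are hypothetical. The getDocument({ data }) form is the one used by current PDF.js builds; builds from the time of this thread exposed the call as PDFJS.getDocument instead, so check the version you ship.

```javascript
// Hypothetical helper: decode a base64 string (as handed over by the
// embedding application) into the raw bytes of the PDF.
function base64ToUint8Array(base64) {
  const binary = atob(base64); // one character per byte
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return bytes;
}

// Hypothetical helper: hand the bytes to PDF.js instead of a URL, so the
// library never issues a cross-origin request itself.
// `pdfjs` is the PDF.js library object (for example the global pdfjsLib).
function openPdfFromBytes(bytes, pdfjs) {
  return pdfjs.getDocument({ data: bytes });
}
```

In a page that has loaded PDF.js, this could be called as openPdfFromBytes(base64ToUint8Array(payload), pdfjsLib), where payload is whatever the native side passed in.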

Thanks @yurydelendik for your links. I know that PDF.js doesn't manage CORS. I've tried the solutions sent by @timvandermeij and others, but they failed. I'm looking for the right modifications to make once the pdf.js repo has been downloaded. Thanks

@yurydelendik I am also having trouble loading a remote PDF file, but the error only happens in Chrome.
PDF.js v1.0.1040 (build: 997096f)
Message: Unexpected server response (0) while retrieving PDF "http://bhpr.hrsa.gov/healthworkforce/rnsurveys/rnsurveyfinal.pdf".

You can see that the PDF I am trying to load is on another server over which I have no control. I would still like to show this PDF in my viewer.js.

I don't have much experience with CORS. I created a crossdomain.xml file with the least restrictive settings on my server, but even then it doesn't work.

@hashbyte You will need a proxy for the server. A very simple proxy (developed by me) is CORS Anywhere. Simply prepend the URL of the proxy before the URL to the PDF file, e.g.

"https://cors-anywhere.herokuapp.com/" + 
"http://bhpr.hrsa.gov/healthworkforce/rnsurveys/rnsurveyfinal.pdf" =
"https://cors-anywhere.herokuapp.com/http://bhpr.hrsa.gov/healthworkforce/rnsurveys/rnsurveyfinal.pdf"

Then URL-encode this URL and put it in the file parameter, and you will get a link that can open any page: https://mozilla.github.io/pdf.js/web/viewer.html?file=https%3A%2F%2Fcors-anywhere.herokuapp.com%2Fhttp%3A%2F%2Fbhpr.hrsa.gov%2Fhealthworkforce%2Frnsurveys%2Frnsurveyfinal.pdf

Note: If the URL for the PDF does not contain any percent signs or & characters, then an easier way to quickly get a link is to just prepend the viewer URL before the link (so, without URL-encoding first). Only do this if you manually type the URL (e.g. when doing a quick test):
https://mozilla.github.io/pdf.js/web/viewer.html?file=https://cors-anywhere.herokuapp.com/http://bhpr.hrsa.gov/healthworkforce/rnsurveys/rnsurveyfinal.pdf

Note: The CORS Anywhere demo is only provided to demonstrate the feature. If you are going to use this feature on a site with many visitors, please host the CORS Anywhere instance yourself, to avoid putting an unfair load on the public demo server. If I notice that the performance of CORS Anywhere is crawling due to abuse, your origin will be blacklisted. When hosting CORS Anywhere yourself, you can restrict access to your site only via the originWhitelist configuration parameter to avoid this kind of abuse.
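
The URL construction described above can be sketched as follows. The function name buildViewerUrl is made up for illustration; the three URLs are the ones used in this comment.

```javascript
// Hypothetical helper: prepend the proxy to the PDF URL, then URL-encode
// the result and place it in the viewer's `file` query parameter.
function buildViewerUrl(viewerUrl, proxyUrl, pdfUrl) {
  const proxiedPdfUrl = proxyUrl + pdfUrl;
  return viewerUrl + '?file=' + encodeURIComponent(proxiedPdfUrl);
}

const link = buildViewerUrl(
  'https://mozilla.github.io/pdf.js/web/viewer.html',
  'https://cors-anywhere.herokuapp.com/',
  'http://bhpr.hrsa.gov/healthworkforce/rnsurveys/rnsurveyfinal.pdf'
);
console.log(link);
```

Using encodeURIComponent rather than plain concatenation keeps percent signs and & characters in the PDF URL from being misread as part of the viewer's own query string.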

Hello,
I get this error when I try to load a PDF from a foreign URL:

Error: file origin does not match viewer's
throw new Error('file origin does not match viewer\'s');

please help!

"I get this error when I try to load a PDF from a foreign URL"

@gildassamuel see #6916 for details.

@jonasyuandotcom If you manage the file yourself, you can put the file and PDF.js on the same file server.

Hey, I have followed the instructions to set an "Access-Control-Allow-Origin" header on the file server, but I keep getting this error:
[screenshot: error message]
The HTTP headers are as follows:
[screenshot: HTTP headers]
Any pointers would be appreciated. Even if you are not sure, could you suggest some possible causes for the problem? Thank you very much!

@yjguoo The error message and the headers do not add up.

I think that the redirect target is lacking the expected headers.

Visit chrome://net-internals/#events and repeat the steps to see the actual headers of the blocked redirect in the log.

Hi Rob, thanks for the quick response,

I notice that when I manually enter the requested URL "https://files.dev52.slack.com/files-pri/T076SHX5W-F07CGBKK2/git-for-beginners-handout.pdf", I get different response headers from the file server:
[screenshot: response headers]
Notice that the response headers redirect me to a different location, and at that new location I get a status of 200 OK.
[screenshot: response at the new location]
My first question: is there a difference between manually typing the URL in the browser and requesting it through XMLHttpRequest?
BTW, I am using the default pdf.js viewer (HTML, CSS, JS). All functionality works except requesting a PDF from a different origin (i.e. the cross-origin request issue).
Second question: do you think it's a problem on my end, or with the way the default viewer.js/pdf.js is doing the XMLHttpRequest?

Thank you :)

"My first question: is there a difference between manually typing the URL in the browser and requesting it through XMLHttpRequest?"

Yes, especially with cross-origin requests. A cross-origin request is only accepted if CORS allows it. The browser sends the request with an Origin header, and the server can use this to decide whether to approve the request (by including the requesting origin in the Access-Control-Allow-Origin response header).

Also, by default credentials are not included in cross-origin requests. To include cookies, the server must respond with Access-Control-Allow-Credentials: true and the XHR request must have the withCredentials attribute set to true.
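
The server-side decision described above could look roughly like this. It is a sketch, not code from PDF.js or any particular server: the function name corsHeadersFor and the allowlist parameter are made up for illustration.

```javascript
// Hypothetical helper: decide which CORS response headers to send for a
// request, based on its Origin header and a list of origins you trust.
function corsHeadersFor(requestOrigin, allowedOrigins, allowCredentials) {
  const headers = {};
  if (!requestOrigin || !allowedOrigins.includes(requestOrigin)) {
    // No CORS headers: the browser will block the cross-origin read.
    return headers;
  }
  // Echo the requesting origin. The "*" wildcard cannot be combined
  // with credentialed requests, so a specific origin is returned.
  headers['Access-Control-Allow-Origin'] = requestOrigin;
  // The response differs per origin, so caches must key on it.
  headers['Vary'] = 'Origin';
  if (allowCredentials) {
    headers['Access-Control-Allow-Credentials'] = 'true';
  }
  return headers;
}
```

On the client side, the matching step is to set withCredentials to true on the XHR before sending it; without that, cookies are left out even if the server sends both headers.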

"Second question: do you think it's a problem on my end, or with the way the default viewer.js/pdf.js is doing the XMLHttpRequest?"

I think that your server must be configured differently.

See the documentation on MDN for more info: https://developer.mozilla.org/en-US/docs/Web/HTTP/Access_control_CORS
Or read the specification of CORS: https://www.w3.org/TR/2014/REC-cors-20140116/

Hi Rob,

I think I fixed the issue by also setting Access-Control-Allow-Credentials: true. Thank you so much for the help <3. However, I now have another problem regarding a redirect, and I wonder if you could give me some pointers. I make the XHR request using URL#1 (the issue that you helped me fix). Then I get a redirect to URL#2 (status 302). Both URL#1 and URL#2 point to the same server, which causes the following error:
[screenshot: error message]
I tried to set the same response headers for URL#2 as I did for URL#1, but I realized that they are both on the same server and the Origin is null.

I think it's because both URLs point to the same server that the redirect origin is null, but I'm not too sure. How would I go about adding headers for two different origins when both URLs point to the same file server? I tried using a wildcard like *, but apparently that is not allowed :( Again, thank you!

After a cross-origin redirect, Chrome sets the Origin header to the "null" value instead of the actual URL - https://crbug.com/154967

You could conditionally return an Access-Control-Allow-Origin with value null, but only if you want all websites to be able to read that resource. If not (which is most likely), then you have to avoid the redirect. For example, by directly sending the request to the destination URL (if you don't know the URL in advance, add a new API endpoint to your server that returns the destination URL).

Thank you! I'll look into it :)

Hey Rob, do you think there is another way to initialize the PDF viewer or pdf.js with the src already set and avoid the XHR request altogether?

@yjguoo You can base64-encode the PDF data and use a data-URL. For large PDF files, this results in a worse user experience (=blank page without progress bar) because encoding the data as base64 increases the file size by 33%. I recommend to continue using XHR for this reason.
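
A rough sketch of the data-URL approach and its size cost, assuming the PDF bytes are already available as a Uint8Array. The function name pdfBytesToDataUrl is hypothetical.

```javascript
// Hypothetical helper: turn raw PDF bytes into a data-URL that can be
// used in place of a regular URL.
function pdfBytesToDataUrl(bytes) {
  let binary = '';
  const chunkSize = 0x8000; // convert in slices to keep argument lists small
  for (let i = 0; i < bytes.length; i += chunkSize) {
    binary += String.fromCharCode.apply(null, bytes.subarray(i, i + chunkSize));
  }
  return 'data:application/pdf;base64,' + btoa(binary);
}

// Base64 emits 4 characters for every 3 input bytes, which is where the
// roughly 33% size increase comes from.
const sampleBytes = new Uint8Array(3000);
const dataUrl = pdfBytesToDataUrl(sampleBytes);
const payloadLength = dataUrl.length - 'data:application/pdf;base64,'.length;
console.log(payloadLength / sampleBytes.length);
```

The whole payload also has to be present before the viewer can start, which is why there is no progress bar to show.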

Hi, I just tested this and it works if CORS is enabled, but I see that the file is not fetched in chunks/ranges when it is very big.

Solved by adding this to .htaccess:
Header set Accept-Ranges bytes
Header set Access-Control-Allow-Origin "*"
Header set Access-Control-Allow-Methods "GET"
Header set Access-Control-Allow-Headers "Content-Type, Range"
Header set Access-Control-Expose-Headers "Accept-Ranges, Content-Encoding, Content-Length, Content-Range"
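
To make the "chunks/ranges" behaviour concrete: with headers like these in place, a client can ask for one slice of the file at a time through the HTTP Range request header, which is why Range has to be allowed for cross-origin requests. The sketch below only builds that header value. The function name is hypothetical and the 65536-byte chunk size is an assumption for illustration, not a value taken from this thread.

```javascript
// Hypothetical helper: build the Range header value for the n-th chunk
// of a file. Byte ranges are inclusive on both ends.
function rangeHeaderForChunk(chunkIndex, chunkSize, totalLength) {
  const start = chunkIndex * chunkSize;
  if (start >= totalLength) {
    return null; // nothing left to request
  }
  const end = Math.min(start + chunkSize, totalLength) - 1;
  return 'bytes=' + start + '-' + end;
}

// Assumed chunk size of 65536 bytes and a 200000-byte file.
console.log(rangeHeaderForChunk(0, 65536, 200000));
console.log(rangeHeaderForChunk(3, 65536, 200000));
```

The client needs to read Content-Length and Content-Range from the response to know where it is in the file, which is what the Access-Control-Expose-Headers line makes possible across origins.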

Thank you!!!

@Rob--W That was very helpful. Is there any way to make this work from inside the project (for example, by changing the XHR headers inside worker.js)?

Hi, for anyone still having problems with this, I solved it with:

https://drive.google.com/viewerng/viewer?embedded=true&url=http://www.africau.edu/images/default/sample.pdf
