Gunicorn: UnicodeEncodeError: 'ascii' codec can't encode characters in position 151-165: ordinal not in range(128)

Created on 8 Mar 2016  ·  13 Comments  ·  Source: benoitc/gunicorn

I use nginx + gunicorn to deploy a Flask app. When I access a view that calls send_file(), an error occurs; the traceback follows. The same code works fine when I run Flask standalone.
[2016-03-08 09:52:10 +0800] [4048] [ERROR] Error handling request /rest/api/admin/v1.2/shop/order/export
Traceback (most recent call last):
File "/app/soft/python3.5/lib/python3.5/site-packages/gunicorn/workers/async.py", line 52, in handle
self.handle_request(listener_name, req, client, addr)
File "/app/soft/python3.5/lib/python3.5/site-packages/gunicorn/workers/ggevent.py", line 163, in handle_request
super(GeventWorker, self).handle_request(*args)
File "/app/soft/python3.5/lib/python3.5/site-packages/gunicorn/workers/async.py", line 110, in handle_request
resp.write_file(respiter)
File "/app/soft/python3.5/lib/python3.5/site-packages/gunicorn/http/wsgi.py", line 396, in write_file
self.write(item)
File "/app/soft/python3.5/lib/python3.5/site-packages/gunicorn/http/wsgi.py", line 327, in write
self.send_headers()
File "/app/soft/python3.5/lib/python3.5/site-packages/gunicorn/http/wsgi.py", line 323, in send_headers
util.write(self.sock, util.to_bytestring(header_str, "ascii"))
File "/app/soft/python3.5/lib/python3.5/site-packages/gunicorn/util.py", line 508, in to_bytestring
return value.encode(encoding)
UnicodeEncodeError: 'ascii' codec can't encode characters in position 151-165: ordinal not in range(128)

All 13 comments

The key arguments to the method are:
attachment_filename=file_name.encode('utf-8'),as_attachment=True, conditional=True)

That _looks_ like an error encoding a header. Are you sending any abnormal headers on this request? Do they contain unicode strings?

@zhguokai ping.

I'm having the same problem.

In Django, I have a view with the following method:

def download_anexo( request ):
    id   = request.GET[ 'id' ]

    anexo_arquivo = AnexoArquivo.objects.get( id = id )

    response = HttpResponse( anexo_arquivo.arquivo )
    response[ 'Content-Disposition' ] = 'attachment; filename=%s' % anexo_arquivo.nome

    return response

anexo_arquivo.nome can have a value such as "TAC nº 009.2016 - Mandaçaia Fest.doc". Because of the non-ASCII characters, a UnicodeEncodeError occurs, like this:

backend_1 | [2016-03-22 16:59:33 -0300] [9] [ERROR] Error handling request /novaintranet/api/transparencia/download_anexo/?id=3395768
backend_1 | Traceback (most recent call last):
backend_1 | File "/usr/local/lib/python3.5/site-packages/gunicorn/workers/sync.py", line 130, in handle
backend_1 | self.handle_request(listener, req, client, addr)
backend_1 | File "/usr/local/lib/python3.5/site-packages/gunicorn/workers/sync.py", line 177, in handle_request
backend_1 | resp.write(item)
backend_1 | File "/usr/local/lib/python3.5/site-packages/gunicorn/http/wsgi.py", line 327, in write
backend_1 | self.send_headers()
backend_1 | File "/usr/local/lib/python3.5/site-packages/gunicorn/http/wsgi.py", line 323, in send_headers
backend_1 | util.write(self.sock, util.to_bytestring(header_str, "ascii"))
backend_1 | File "/usr/local/lib/python3.5/site-packages/gunicorn/util.py", line 511, in to_bytestring
backend_1 | return value.encode(encoding)
backend_1 | UnicodeEncodeError: 'ascii' codec can't encode characters in position 213-214: ordinal not in range(128)
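
The failure in the traceback can be reproduced without gunicorn or Django. This is only a minimal sketch: it builds the header value the same way the view does and encodes it as ASCII, which is what the `util.to_bytestring(header_str, "ascii")` frame in the traceback does. The variable names are illustrative.

```python
# Header value built the same way as in the Django view above.
nome = "TAC nº 009.2016 - Mandaçaia Fest.doc"
header_value = "attachment; filename=%s" % nome

error = None
try:
    # Encoding the header as ASCII fails on the non-ASCII characters.
    header_value.encode("ascii")
except UnicodeEncodeError as exc:
    error = exc
    print(exc)
```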

@rldourado it seems like you have a similar issue and from this comment it seems like @zhguokai was able to fix the issue by encoding the headers.

You may also want to look at RFC 6266 and see this issue from flask.

My understanding right now is that there is no bug in gunicorn here.
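
As a rough illustration of the RFC 6266 approach mentioned above, the header can carry a plain ASCII `filename` as a fallback plus a percent-encoded UTF-8 `filename*`. The function name `content_disposition` and the fallback strategy (dropping whatever cannot be reduced to ASCII) are assumptions made for this sketch, not what Flask, Django, or gunicorn do internally.

```python
import unicodedata
from urllib.parse import quote


def content_disposition(filename):
    """Build an attachment header value that is pure ASCII (hypothetical helper)."""
    # ASCII fallback: decompose accented characters, then drop anything
    # that still is not ASCII.
    fallback = (
        unicodedata.normalize("NFKD", filename)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Double quotes and backslashes would need escaping inside a quoted
    # string, so this sketch simply removes them.
    fallback = fallback.replace('"', "").replace("\\", "")
    # Extended parameter: the original name as percent-encoded UTF-8.
    encoded = quote(filename, safe="")
    return "attachment; filename=\"{}\"; filename*=UTF-8''{}".format(
        fallback, encoded
    )


header = content_disposition("Mandaçaia Fest.doc")
print(header)
# The result contains only ASCII characters, so this does not raise.
header.encode("ascii")
```

Clients that understand `filename*` are expected to prefer it and show the original name; others fall back to the plain `filename`.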

@rldourado were you able to fix your issue?

In case it's Content-Disposition header use urllib.parse.quote to quote the utf8 string.

Closing the issue, as this bug happens outside of gunicorn.

Try this:

    from urllib.parse import quote

    # Percent-encode the name so the header value contains only ASCII.
    response["Content-Disposition"] = \
            "attachment; " \
            "filename={ascii_filename};" \
            "filename*=UTF-8''{utf_filename}".format(
                ascii_filename=quote(filename),
                utf_filename=quote(filename)
            )

Inspired by @codeAndxv, I had to take it one step further and modify the file name. This only works around the problem. Also, I'm using Django.

        import os
        from django.utils.text import slugify

        # Split once into the base name and the extension.
        file_name, extension = os.path.splitext(str(filename))

        # slugify() reduces the base name to ASCII-safe characters.
        content_disposition =  \
            "attachment; " \
            "filename={ascii_filename}{extension};" \
            "filename*=UTF-8''{utf_filename}{extension}".format(
                ascii_filename=slugify(file_name),
                utf_filename=slugify(file_name),
                extension=extension
            )

The key arguments to the method are:
attachment_filename=file_name.encode('utf-8'),as_attachment=True, conditional=True)

That is not correct! You should use:

from urllib.parse import quote

quote(filename)
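
To make the effect of `quote` concrete, here is a small sketch using part of the file name from earlier in the thread. It shows that the quoted value is plain ASCII and that `unquote` recovers the original name.

```python
from urllib.parse import quote, unquote

filename = "Mandaçaia Fest.doc"
quoted = quote(filename)
print(quoted)  # Manda%C3%A7aia%20Fest.doc

# The quoted value contains only ASCII, so encoding it does not raise.
quoted.encode("ascii")

# Percent-encoding is reversible: the original name can be recovered.
restored = unquote(quoted)
print(restored == filename)  # True
```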

This is a bug. I should not have to use quote, which makes filenames unreadable, for it to work properly.

@x011 if your header values can be encoded as latin1, then you could upgrade to Gunicorn 20. We relaxed the strictness on header encoding in the last release. If your header values cannot be encoded as latin1, then your issue is with the HTTP specification and not Gunicorn and there is nothing we can do in Gunicorn to change it. Here is some reading if you want more information: https://dzone.com/articles/utf-8-in-http-headers
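
A quick way to tell which of the two cases described above applies is to try encoding the header value as latin-1. This is a sketch only: `fits_latin1` is a hypothetical helper, and the second file name is an invented example of a value outside latin-1.

```python
def fits_latin1(value):
    """Return True if the string can be encoded as latin-1 (hypothetical helper)."""
    try:
        value.encode("latin-1")
    except UnicodeEncodeError:
        return False
    return True


# "º" and "ç" both exist in latin-1.
print(fits_latin1("TAC nº 009.2016 - Mandaçaia Fest.doc"))  # True

# Invented example: Chinese characters are outside latin-1.
print(fits_latin1("订单导出.xlsx"))  # False
```

Values in the second category still need percent-encoding, whatever the gunicorn version.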
