Kivy: Encoding problem on 'utf-8' kv file on Windows

Created on 16 Feb 2016  ·  24 Comments  ·  Source: kivy/kivy

I'm developing a multi-platform app for Linux and Windows 7. All my files were first written on Linux and encoded as utf-8, but when I open the same project on Windows, the kv file is read using the cp1252 encoding. The same thing does not seem to happen to my .py files, maybe because I'm using Python 3.

As a consequence, the Unicode characters written in the kv file don't render correctly in the Kivy app. The string 'Título' shows up as TÃ-tulo.

My settings are: Kivy=1.9.1, Python=3.4.4, Windows 7 x64 Home Premium.

Also my python was installed using Anaconda, but this is probably unrelated.

To reproduce the problem:

Write a kv file encoded with utf-8:

# test.kv
<myButton@Button>:
    text: 'Título'

In the Python interpreter or a .py script:

import kivy
from kivy.lang import Builder
from kivy.uix.button import Button

Builder.load_file('test.kv')
class myButton(Button):
    pass

print( myButton().text == 'Título' ) # False
print( myButton().text.encode('cp1252').decode() == 'Título' ) # True
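
The mismatch can also be reproduced without Kivy. The sketch below is only an illustration using the standard library: it encodes the string as UTF-8, the way the kv file is stored on disk, and then decodes those bytes as cp1252, the way they appear to be read on Windows. The variable names are made up for the example.

```python
# Illustration only: simulate a UTF-8 file being read as cp1252.
original = 'Título'

# Bytes as stored on disk in a UTF-8 encoded kv file.
raw = original.encode('utf-8')

# What a reader that assumes cp1252 gets from those bytes.
misread = raw.decode('cp1252')
print(misread == original)   # False

# Reversing the wrong decode recovers the intended text.
repaired = misread.encode('cp1252').decode('utf-8')
print(repaired == original)  # True
```

The 'í' takes two bytes in UTF-8, and cp1252 maps each of them to a separate character, which is why the misread string is one character longer than the original.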

The multi-platform workaround I found was this:

# test.kv
<myButton@Button>:
    text: str(b'T\xc3\xadtulo'.decode())

All 24 comments

FYI - a slightly easier way to work around this is to use Unicode literals:

<MyButton@Button>:
    text: u'T\u00edtulo'

Thank you, I'll use that =)

For anyone else using this workaround: you can find the escape sequence for a Unicode character like this:

>>> hex(ord('ã'))
'0xe3'
>>> u'\u00e3'
'ã'
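
For strings with more than one or two special characters, looking them up by hand gets tedious. A small helper along these lines could generate the escaped form; `to_unicode_escapes` is a hypothetical name, not part of Kivy or the standard library.

```python
def to_unicode_escapes(text):
    """Return text with every non-ASCII character written as a \\u or \\U escape."""
    parts = []
    for ch in text:
        code = ord(ch)
        if code < 128:
            parts.append(ch)                 # plain ASCII stays readable
        elif code <= 0xFFFF:
            parts.append('\\u%04x' % code)   # e.g. í -> \u00ed
        else:
            parts.append('\\U%08x' % code)   # characters outside the BMP
    return ''.join(parts)


print(to_unicode_escapes('Título'))  # T\u00edtulo
```

The output can be pasted into the kv file inside a u'...' literal, as in the workaround above.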

Don't you think there should be a better way? (hearing Hettinger speaking...)

Maybe a # header that would tell how Unicode characters should be handled?

Even if this trick works, it makes text really hard to read when there are a lot of Unicode characters.

Is this only a Windows bug?

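Well, this isn't a recommended way to handle encodings, but you can use it. It worked for me in py2.7 with kivy 1.8.0. It won't work as-is on py3, where reload is no longer a builtin and sys.setdefaultencoding is gone. Files were saved as utf-8, and the characters that would otherwise need \u... escapes were written directly, as in u'ä'.

    # Python 2 only
    import sys
    reload(sys)  # restores setdefaultencoding, which site.py removes at startup
    sys.setdefaultencoding("utf-8")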

Yes @ChristianTremblay, this is only a Windows bug. It happens because the default encoding on Windows is cp1252, so Kivy reads the .kv file as if it were encoded that way. The solution proposed by @KeyWeeUsr may well help. I haven't tried it, but it might be cleaner than the other workaround proposed.

I agree, a good solution would be to allow specifying the encoding in the .kv file, the way we can in Python:

# -*- coding: utf-8 -*-
<MyWidget>:
    # ...

What if utf-8 were mandatory for kv files?

    def load_file(self, filename, **kwargs):
        '''Insert a file into the language builder and return the root widget
        (if defined) of the kv file.

        :parameters:
            `rulesonly`: bool, defaults to False
                If True, the Builder will raise an exception if you have a root
                widget inside the definition.
        '''
        filename = resource_find(filename) or filename
        if __debug__:
            trace('Builder: load file %s' % filename)
        with open(filename, 'r', encoding='utf-8') as fd:
            kwargs['filename'] = filename
            data = fd.read()

            # remove bom ?
            if PY2:
                if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
                    raise ValueError('Unsupported UTF16 for kv files.')
                if data.startswith((codecs.BOM_UTF32_LE, codecs.BOM_UTF32_BE)):
                    raise ValueError('Unsupported UTF32 for kv files.')
                if data.startswith(codecs.BOM_UTF8):
                    data = data[len(codecs.BOM_UTF8):]

            return self.load_string(data, **kwargs)

This is found in kivy/lang/builder.py line 275 and up

@ChristianTremblay, I think this would not be for the best. Some users might still use text editors with default encodings different from 'utf-8', for example the default Windows encoding 'cp1252'. We must provide a solution for both.

The best would probably be to mimic Python's behavior:

  • Expect the programmer to specify the encoding on the first line.
  • Otherwise, fall back to Python's default source encoding (on Python 3 that would always be 'utf-8', I think).
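
The two rules above could be sketched roughly as follows. This is not how Kivy reads kv files; `read_kv_with_declared_encoding` is a hypothetical helper, and the pattern is modelled on the one PEP 263 gives for Python source files (PEP 263 also inspects the second line, which this sketch does not).

```python
import re

# Same pattern shape that PEP 263 describes for Python source files.
_CODING_RE = re.compile(br'^[ \t\f]*#.*?coding[:=][ \t]*([-\w.]+)')


def read_kv_with_declared_encoding(path, default='utf-8'):
    """Hypothetical helper: decode a kv file using a '# coding:' first line if present."""
    with open(path, 'rb') as f:
        raw = f.read()
    # Only the first line is checked for a declaration.
    first_line = raw.split(b'\n', 1)[0]
    match = _CODING_RE.match(first_line)
    # Fall back to the default when no declaration is found.
    encoding = match.group(1).decode('ascii') if match else default
    return raw.decode(encoding)
```

The returned text could then be handed to Builder.load_string(), so files saved as cp1252 and files saved as utf-8 would both load correctly as long as the former declare their encoding.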

Having a problem on Android:

An unanticipated UnicodeDecodeError occurred: 'ascii' codec can't decode byte 0xef in position 564: ordinal not in range(128)
Traceback (most recent call last):  
  File "./pages.py", line 21, in <module>
    Builder.load_file('pages.kv')
  File "/data/data/.../files/app/crystax_python/site-packages/kivy/lang/builder.py", line 290, in load_file
    data = fd.read()
  File "/data/data/.../files/app/crystax_python/stdlib.zip/encodings/ascii.py", line 26, in decode
    return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xef in position 564: ordinal not in range(128)

((boggle)) Seems to work on my Linux box though.

The default should definitely be utf-8, though not forced as mandatory.

@mixmastamyk I had the exact same problem.
My app works fine, but when I build the APK and run it on my Android phone I get:
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 352: ordinal not in range(128)
This is because I have a couple of "É" characters in my file. Too bad, I was coding in Python 3 precisely to avoid this kind of annoying thing.

Right. Looks like Builder.load_file() should default to utf8 and/or have an encoding parameter. In the meantime I did this:

from kivy.lang import Builder

with open(filename, encoding='utf8') as f:
    Builder.load_string(f.read())

Make sure to disable the auto-load for the kv file, or the error will still be thrown and it will look as if this didn't work. The '\u2026' escape form can also work if you only need a character or two.
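
The same workaround can be wrapped in a pair of small functions. This is a sketch, not a Kivy API: `read_kv_text` and `load_kv_utf8` are hypothetical names, and the 'utf-8-sig' codec is chosen on the assumption that some editors on Windows save UTF-8 files with a BOM. The `filename` keyword is the same one load_file() passes to load_string() in the snippet quoted earlier in this thread.

```python
def read_kv_text(path, encoding='utf-8-sig'):
    """Read a kv file as text; 'utf-8-sig' also drops a leading BOM if there is one."""
    with open(path, encoding=encoding) as f:
        return f.read()


def load_kv_utf8(path):
    """Hypothetical wrapper: pass explicitly decoded text to Kivy's Builder."""
    # Imported lazily so read_kv_text can be used (and tested) without Kivy.
    from kivy.lang import Builder
    # load_file passes the file name on the same way, as a 'filename' keyword.
    return Builder.load_string(read_kv_text(path), filename=path)
```

Calling load_kv_utf8('pages.kv') in place of Builder.load_file('pages.kv') should then avoid the platform default encoding entirely.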

Smart idea, thanks!
I will try this.

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

How convenient, ignore bugs and let them close themselves.

Actually we have been digging up old issues to triage them, no valid and actionable issue will be closed. Though this kind of attitude does not help with the motivation needed to dedicate our afternoons to the project.

I wasn't irritated by the lack of work, but rather by a bot that closes ignored issues. This one might take just one or two lines of code to add an encoding parameter to load_file. I would have done it myself, but there are 70 outstanding pull requests.

A project with scarce manpower shouldn't have this type of bot.

I'd still encourage you to make that pull request. If it's a clean and easy fix, it has a good chance of being merged.

I have the same problem with kivy 1.10. My app runs perfectly on Linux with Python 3 (python3 main.py), but when I debug the deployment on my Android phone, the app crashes :/ Very annoying if you want to build something of quality.

Still having this issue with kivy 1.10.1 and Python 3.6.6 on Windows 10. The current workaround is not to auto-load the .kv file: rename it to something that doesn't load by default, save it with utf8 encoding, and load it as shown in #5154:

from kivy.lang import Builder
with open('MyApp.renamed.kv', encoding='utf8') as f: 
    Builder.load_string(f.read())

@carasuca's solution is not working for me.

kivy 1.10.1 python 3.5.3 windows 7
same behaviour on windows 10 and python 3.6.5

Same code works perfectly on osx and linux

EDIT: the problem is in the #:include kv language directive.
@carasuca's solution works if and only if you have a single kv file and you load it using Builder.load_string(f.read()). If that kv file uses #:include anotherfile.kv, the included file gets loaded with the wrong charset.
Solution 1: put all your kv code into a single file.
Solution 2:

from kivy.lang import Builder

for kvfile in ['file1.kv', 'file2.kv']:
    with open(kvfile, encoding='utf8') as f:
        Builder.load_string(f.read())

Well, I had the same error with words like "Número" or "veículo". I tried the following code:

from kivy.lang import Builder
with open('myApp.kv', encoding='utf8') as f:
    Builder.load_string(f.read())

But this caused another problem: when I ran the app, I could see two different labels overlapping.
The solution was to save the kv file in a subdirectory and then open it like this:

with open('./kvfile/myApp.kv', encoding='utf-8') as f:
    Builder.load_string(f.read())

This solved the problem because the auto-load no longer finds the kv file in the same directory as main.py, so the widgets are not created twice.
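
For context, the auto-load picks its file by name. As far as the Kivy documentation for App.load_kv describes it, the file is expected next to the .py file that defines the app class and is named after the lowercased class name without a trailing 'App'. The sketch below only approximates that naming rule; `default_kv_filename` is a hypothetical helper, not a Kivy function.

```python
def default_kv_filename(app_class_name):
    """Approximate the kv file name Kivy looks for next to the app's .py file."""
    name = app_class_name.lower()
    if name.endswith('app'):
        name = name[:-3]          # drop the trailing 'App'
    return name + '.kv'


print(default_kv_filename('ShowcaseApp'))  # showcase.kv
```

Any kv file whose name or directory differs from that default is left alone by the auto-load, which is why renaming or moving the file avoids the double load.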

@piontk Hey, I thought this would be my salvation, but when I try it, it says 'encoding' is not a valid parameter for 'open'. Why? (Also, sorry if this is a dumb question, it's my first app ever)
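
One possible explanation, assuming the script is being run with Python 2: there the built-in open() has no encoding parameter, while io.open() does and behaves like Python 3's open(). A minimal sketch that should run on both versions (`read_text` is just a placeholder name):

```python
import io


def read_text(path, encoding='utf-8'):
    """Placeholder helper: read a file as text on Python 2 and Python 3 alike."""
    # io.open accepts 'encoding' on Python 2.6+; on Python 3 it is the built-in open.
    with io.open(path, encoding=encoding) as f:
        return f.read()


# With Kivy installed, the result can be passed to Builder.load_string:
# Builder.load_string(read_text('./kvfile/myApp.kv'))
```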

OK, I know this is not ideal, but if none of these options work for you, you can assign the string to a variable in the main app class in your .py file, which is utf-8 encoded, and then access it from your .kv file.

# .py
from kivy.app import App

class MainApp(App):
    struser = 'Nome de usuário'

# .kv
Label:
    text: app.struser

Worked for me.
