How to display UTF-8 in the Windows console

I'm using Python 2.6 on Windows 7

I borrowed some code from here: Python, Unicode, and the Windows console

My goal is to be able to display UTF-8 strings in the Windows console.

Apparently, in Python 2.6, sys.setdefaultencoding() is no longer supported.

However, I wrote reload(sys) before I tried to use it and it magically didn't error.

This code does NOT raise an error, but it shows garbled characters instead of Japanese text. I believe the problem is that I have not successfully changed the code page of the Windows console.

These are my attempts, but they don't work:

reload(sys)
sys.setdefaultencoding('utf-8')

print os.popen('chcp 65001').read()

sys.stdout.encoding = 'cp65001'

Perhaps win32console can be used to change the code page? I tried the code from the page I linked, but it also raised an error from win32console. Maybe that code is obsolete.

Here's my code, which doesn't raise an error but prints garbled characters:

#coding=<utf8>
import os
import sys
import codecs



reload(sys)
sys.setdefaultencoding('utf-8')
sys.stdout = codecs.getwriter('utf8')(sys.stdout)
sys.stderr = codecs.getwriter('utf8')(sys.stderr)

#print os.popen('chcp 65001').read()
print(sys.stdout.encoding)
sys.stdout.encoding = 'cp65001'
print(sys.stdout.encoding)

x = raw_input('press enter to continue')

a = 'こんにちは世界'#.decode('utf8')
print a

x = raw_input()

I know you state you're using Python 2.6, but if you're able to use Python 3.3 you'll find that this is finally supported.

Use the command chcp 65001 before starting Python.

See http://docs.python.org/dev/whatsnew/3.3.html#codecs

In Python 3.6 it's no longer even necessary to use the chcp command, since Python bypasses the byte-level console interface entirely and uses a native Unicode interface instead. See PEP 528: Change Windows console encoding to UTF-8.
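For what it's worth, here is a rough sketch (assuming Python 3; the helper name stream_can_show is made up for illustration) of how to check whether the current stdout encoding can represent a string before printing it. On Python 3.6+ attached to a Windows console, sys.stdout.encoding should report utf-8, so the check should pass there.

```python
import sys


def stream_can_show(text, stream=None):
    """Return True if the stream's declared encoding can represent text.

    Hypothetical helper for illustration; not part of any library.
    """
    if stream is None:
        stream = sys.stdout
    # Streams without a usable encoding attribute are treated as ASCII-only.
    encoding = getattr(stream, 'encoding', None) or 'ascii'
    try:
        text.encode(encoding)
    except (UnicodeEncodeError, LookupError):
        return False
    return True


greeting = 'こんにちは世界'
print(sys.stdout.encoding)
if stream_can_show(greeting):
    print(greeting)
else:
    # Fall back to ASCII escapes so the script never raises on old consoles.
    print(greeting.encode('ascii', 'backslashreplace').decode('ascii'))
```

Note that this only tells you whether the encoding can carry the characters. Whether they render as glyphs still depends on the console font.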

As noted in the comments by @mbom007, it's also important to make sure the console is configured with a font that supports the characters you're trying to display.

Never ever ever use setdefaultencoding. If you want to write unicode strings to stdio, encode them explicitly. Monkeying around with setdefaultencoding will cause stdlib modules and third-party modules alike to break in horrible subtle ways by allowing implicit conversion between str and unicode when it shouldn't happen.

Yes, the problem is most likely that your code page isn't set properly. However, using os.popen won't change the code page; it'll spawn a new shell, change that shell's code page, and then immediately exit without affecting your console at all. I'm not personally very familiar with Windows, so I couldn't tell you how to change your console's code page from within your Python program.
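One possibility, offered only as an untested sketch: the Win32 functions GetConsoleOutputCP, SetConsoleOutputCP and SetConsoleCP act on the console attached to the calling process, so calling them through ctypes should affect your own console rather than a child shell. The function name set_console_code_page is hypothetical, and whether Python's output then behaves correctly under code page 65001 is a separate question, so treat this as an assumption.

```python
import sys

UTF8_CODE_PAGE = 65001


def set_console_code_page(code_page=UTF8_CODE_PAGE):
    """Try to switch the attached console's input and output code pages.

    Returns the previous output code page (0 if no console is attached),
    or None when not running on Windows. Hypothetical helper for illustration.
    """
    if sys.platform != 'win32':
        return None
    import ctypes  # imported lazily; ctypes.windll only exists on Windows
    kernel32 = ctypes.windll.kernel32
    previous = kernel32.GetConsoleOutputCP()
    # Both calls return nonzero on success; failures are ignored in this sketch.
    kernel32.SetConsoleOutputCP(code_page)
    kernel32.SetConsoleCP(code_page)
    return previous
```

Keeping the returned value lets you restore the original code page with set_console_code_page(previous) before the program exits.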

The way to properly display Unicode data via UTF-8 from Python, as mentioned before, is to explicitly encode your strings before printing them: print s.encode('utf-8')
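A minimal sketch of that explicit-encoding approach, written so it should run on both Python 2 and Python 3. The helper name write_utf8 is my own; on Python 3 the bytes go to sys.stdout.buffer, because print would show the bytes repr instead.

```python
# -*- coding: utf-8 -*-
import io
import sys


def write_utf8(text, stream=None):
    """Encode text explicitly as UTF-8 and write the bytes to a byte stream.

    Hypothetical helper for illustration; returns the bytes that were written.
    """
    if stream is None:
        # Python 3 exposes the byte-level stream as sys.stdout.buffer;
        # on Python 2, sys.stdout itself accepts byte strings.
        stream = getattr(sys.stdout, 'buffer', sys.stdout)
    data = (text + u'\n').encode('utf-8')
    stream.write(data)
    return data


# Demonstrate on an in-memory byte stream instead of the real console.
demo = io.BytesIO()
write_utf8(u'こんにちは世界', demo)
print(len(demo.getvalue()))  # 7 characters * 3 bytes each + newline = 22
```

Calling write_utf8(u'こんにちは世界') with no stream argument sends the same bytes to standard output. The console still has to interpret them as UTF-8 for the text to display correctly.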

Changing the console code page is unnecessary, and it won't work anyway (in particular, setting it to 65001 runs into a Python bug). See this question for details, and for how to print Unicode characters to the console regardless of the code page.

Windows doesn't support UTF-8 in a console properly. The only way I know of to display Japanese in the console is by changing (on XP) Control Panel's Regional and Language Options, Advanced Tab, Language for non-Unicode Programs to Japanese. After rebooting, open a console and run "chcp" to find out the Japanese console's code page. Then either print Unicode strings or byte strings explicitly encoded in the correct code page.
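As a sketch of the last step, assuming chcp reports 932 (the usual code page on a Japanese-locale system; check what your own console says), encoding for a given code page could look like this. The helper name encode_for_code_page is made up for illustration.

```python
# -*- coding: utf-8 -*-


def encode_for_code_page(text, code_page, errors='replace'):
    """Encode text with the Python codec matching a Windows code page number.

    Hypothetical helper for illustration. Characters the code page cannot
    represent become '?' when errors='replace'.
    """
    codec = 'cp%d' % code_page
    return text.encode(codec, errors)


greeting = u'こんにちは世界'

# Code page 932 can represent the Japanese text, so it round-trips.
japanese_bytes = encode_for_code_page(greeting, 932)
print(japanese_bytes.decode('cp932') == greeting)

# Code page 437 (US English consoles) cannot, so every character is replaced.
print(encode_for_code_page(greeting, 437))
```

The second call shows why the system setting matters: under a code page with no Japanese characters, explicit encoding has nothing to map the text to.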
