Python 3: How to specify stdin encoding
Python 3 does not expect ASCII from sys.stdin. It'll open stdin in text mode and make an educated guess as to what encoding is used. That guess may come down to ASCII, but that is not a given. See the sys.stdin documentation on how the codec is selected.
Like other file objects opened in text mode, the sys.stdin object derives from the io.TextIOBase base class; it has a .buffer attribute pointing to the underlying buffered IO instance (which in turn has a .raw attribute).
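To make the layering concrete, here is a small sketch that inspects a file opened in text mode rather than sys.stdin itself, since stdin may be replaced by test runners or IDEs. The helper name describe_layers and the temporary file are illustrative assumptions only.

```python
import io
import os
import tempfile


def describe_layers(text_stream):
    """Return the class names of the text, buffered, and raw layers."""
    buffered = text_stream.buffer   # binary buffered layer
    raw = buffered.raw              # unbuffered raw layer
    return (
        type(text_stream).__name__,
        type(buffered).__name__,
        type(raw).__name__,
    )


# Use a throwaway file as a stand-in for a text-mode stream.
fd, path = tempfile.mkstemp()
os.close(fd)
try:
    with open(path, "r", encoding="utf-8") as handle:
        layers = describe_layers(handle)
        is_text_base = isinstance(handle, io.TextIOBase)
finally:
    os.remove(path)

print(layers)
```

On a regular interpreter session, sys.stdin normally has the same three-layer shape, although the concrete classes can differ depending on how the process was started.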
Wrap the sys.stdin.buffer attribute in a new io.TextIOWrapper() instance to specify a different encoding:
import io
import sys
input_stream = io.TextIOWrapper(sys.stdin.buffer, encoding='utf-8')
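The effect of the wrapper's encoding argument can be seen without touching the real stdin. The sketch below uses an io.BytesIO object as a stand-in for sys.stdin.buffer; the helper decode_stream and the sample payload are made up for illustration.

```python
import io


def decode_stream(binary_stream, encoding="utf-8"):
    """Wrap a binary stream in a text layer and read everything from it."""
    wrapper = io.TextIOWrapper(binary_stream, encoding=encoding)
    return wrapper.read()


# The same bytes, decoded with two different codecs.
payload = "naïve café\n".encode("utf-8")
as_utf8 = decode_stream(io.BytesIO(payload), "utf-8")
as_latin1 = decode_stream(io.BytesIO(payload), "latin-1")

print(repr(as_utf8))
print(repr(as_latin1))
```

Decoding with the wrong codec does not necessarily raise an error; with latin-1 every byte maps to some character, so the result is silently garbled text.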
Alternatively, set the PYTHONIOENCODING environment variable to the desired codec when running python.
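As a rough illustration of the environment variable approach, the following sketch launches a child interpreter with PYTHONIOENCODING set and feeds it a single non-ASCII byte. The child script and the choice of latin-1 are assumptions made for the demonstration.

```python
import os
import subprocess
import sys

# The child reads all of stdin as text and prints an ASCII-safe repr of it.
child_code = "import sys; print(ascii(sys.stdin.read()))"

# Copy the current environment and override the stdio codec for the child.
env = dict(os.environ, PYTHONIOENCODING="latin-1")

completed = subprocess.run(
    [sys.executable, "-c", child_code],
    input=b"\xe9",            # a single byte: 'é' in latin-1
    stdout=subprocess.PIPE,
    env=env,
    check=True,
)
result = completed.stdout.decode("ascii").strip()
print(result)
```

The variable has to be set before the interpreter starts; changing os.environ inside an already running process does not affect its existing sys.stdin.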
From Python 3.7 onwards, you can also reconfigure the existing std* wrappers, provided you do it at the start (before any data has been read):
# Python 3.7 and newer
import sys
sys.stdin.reconfigure(encoding='utf-8')
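The same reconfigure() method exists on any io.TextIOWrapper, so its behaviour can be tried on an in-memory stream. This is a minimal sketch assuming Python 3.7 or newer; the io.BytesIO object stands in for the real stdin buffer.

```python
import io

# A wrapper that starts out with the "wrong" codec for its contents.
raw_bytes = io.BytesIO("héllo\n".encode("utf-8"))
stream = io.TextIOWrapper(raw_bytes, encoding="ascii")

# Switch the codec before anything has been read from the stream.
stream.reconfigure(encoding="utf-8")
text = stream.read()

print(repr(text))
```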