What is the proper way to determine if an object is a bytes-like object in Python?

Solution 1:

There are a few approaches you could use here.

Duck typing

Since Python is duck typed, you could simply do as follows (which seems to be the way usually suggested):

try:
    data = data.decode()
except (UnicodeDecodeError, AttributeError):
    pass

Alternatively, you could use hasattr(data, "decode") as you describe, and it would probably be fine. Either way, this assumes that the object's .decode() method returns a string and has no nasty side effects.

I personally recommend either the exception or hasattr method, but whatever you use is up to you.
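As a rough sketch of the hasattr variant mentioned above, the check could be wrapped in a small helper. The name to_text and the UTF-8 default are assumptions made for illustration, not part of the original answer.

```python
def to_text(data, encoding="utf-8"):
    """Return data as str, decoding it first if it exposes a .decode() method."""
    # In Python 3, bytes and bytearray have .decode(); str does not.
    if hasattr(data, "decode"):
        return data.decode(encoding)
    # Anything without .decode() is passed through unchanged.
    return data


print(to_text(b"hello"))
print(to_text(bytearray(b"hello")))
print(to_text("hello"))
```

Unlike the try/except version, this lets a UnicodeDecodeError propagate, so invalid input is reported rather than silently left as bytes.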

Use str()

This approach is uncommon, but is possible:

data = str(data, "utf-8")

Other encodings are permissible, just as with the .decode() method of bytes and bytearray. You can also pass a third parameter to specify error handling.
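To illustrate the encoding and error-handling parameters, here is a minimal sketch with made-up sample data. It also shows one caveat worth knowing: calling str() with an encoding on something that is already a str raises TypeError.

```python
utf8_bytes = b"caf\xc3\xa9"    # "café" encoded as UTF-8
latin1_bytes = b"caf\xe9"      # "café" encoded as Latin-1, which is invalid UTF-8

decoded = str(utf8_bytes, "utf-8")
# The third argument selects the error handler; "replace" substitutes U+FFFD.
replaced = str(latin1_bytes, "utf-8", "replace")
# Choosing the matching encoding decodes the same bytes cleanly.
recovered = str(latin1_bytes, "latin-1")

# str() with an encoding only accepts bytes-like input.
try:
    str("already text", "utf-8")
    rejected_str = False
except TypeError:
    rejected_str = True

print(ascii(decoded), ascii(replaced), ascii(recovered), rejected_str)
```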

Single-dispatch generic functions (Python 3.4+)

Python 3.4 and above include a nifty feature called single-dispatch generic functions, via functools.singledispatch. This is a bit more verbose, but it's also more explicit:

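from functools import singledispatch

@singledispatch
def func(data):
    # Generic implementation: used for any type without a registered handler
    data = data.decode()
    ...

@func.register(str)
def _(data):
    # data is already a string, so no decoding is needed
    ...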

You could also make special handlers for bytearray and bytes objects if you so chose.

Beware: single-dispatch functions dispatch only on the type of the first argument! This is an intentional feature, see PEP 443.
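The special handlers for bytes and bytearray mentioned above might look roughly like this. The function name to_text, the UTF-8 encoding, and the choice to raise TypeError in the generic case are all assumptions for the sake of the example.

```python
from functools import singledispatch


@singledispatch
def to_text(data):
    # Generic fallback: reached for any type without a registered handler.
    raise TypeError("cannot convert {} to text".format(type(data).__name__))


@to_text.register(bytes)
@to_text.register(bytearray)
def _(data):
    # Both bytes and bytearray provide .decode().
    return data.decode("utf-8")


@to_text.register(str)
def _(data):
    # Already text, so return it unchanged.
    return data


print(to_text(b"hello"))
print(to_text(bytearray(b"hello")))
print(to_text("hello"))
```

Making the generic implementation raise, rather than calling .decode() blindly, means unsupported types fail with a clear message.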

Solution 2:

You can use:

isinstance(data, (bytes, bytearray))

Both types need to be listed because they have different base classes:

>>> bytes.__base__
<type 'basestring'>
>>> bytearray.__base__
<type 'object'>

A bytes object is an instance of basestring:

>>> by = bytes()
>>> isinstance(by, basestring)
True

However,

>>> buf = bytearray()
>>> isinstance(buf, basestring)
False

The code above was tested under Python 2.7.

Unfortunately, under Python 3.4 both types share the same base class (and basestring no longer exists):

>>> bytes.__base__
<class 'object'>
>>> bytearray.__base__
<class 'object'>
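For Python 3, a sketch of the isinstance check, plus a broader test for anything that supports the buffer protocol, could look like this. Both helper names are hypothetical. The second relies on memoryview() raising TypeError for objects that do not expose a buffer.

```python
import array


def is_bytes_or_bytearray(obj):
    # Narrow check: only the two built-in byte sequence types (and subclasses).
    return isinstance(obj, (bytes, bytearray))


def supports_buffer_protocol(obj):
    # Broader check: memoryview() accepts any object exposing the buffer protocol.
    try:
        memoryview(obj).release()
    except TypeError:
        return False
    return True


print(is_bytes_or_bytearray(b"abc"), is_bytes_or_bytearray("abc"))
print(supports_buffer_protocol(array.array("b", [1, 2])), supports_buffer_protocol("abc"))
```

The narrow check is usually enough when the next step is to call .decode(), since objects such as array.array or memoryview support the buffer protocol but have no .decode() method.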

Solution 3:

>>> content = b"hello"
>>> text = "hello"
>>> type(content)
<class 'bytes'>
>>> type(text)
<class 'str'>
>>> type(text) is str
True
>>> type(content) is bytes
True
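One thing to keep in mind with the type(...) is ... comparison is that it rejects subclasses, whereas isinstance accepts them. A small sketch, using a made-up subclass named TaggedBytes:

```python
class TaggedBytes(bytes):
    """Hypothetical bytes subclass, used only to show the difference."""


payload = TaggedBytes(b"hello")

exact_match = type(payload) is bytes        # exact type comparison
instance_match = isinstance(payload, bytes)  # also accepts subclasses

print(exact_match, instance_match)
```

If subclasses should be treated as bytes too, isinstance is the safer choice.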