Converting int to bytes in Python 3

I was trying to build this bytes object in Python 3:

b'3\r\n'

so I tried the obvious (for me), and found a weird behaviour:

>>> bytes(3) + b'\r\n'
b'\x00\x00\x00\r\n'

Apparently:

>>> bytes(10)
b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'

I've been unable to see any pointers on why the bytes conversion works this way reading the documentation. However, I did find some surprise messages in this Python issue about adding format to bytes (see also Python 3 bytes formatting):

http://bugs.python.org/issue3982

This interacts even more poorly with oddities like bytes(int) returning zeroes now

and:

It would be much more convenient for me if bytes(int) returned the ASCIIfication of that int; but honestly, even an error would be better than this behavior. (If I wanted this behavior - which I never have - I'd rather it be a classmethod, invoked like "bytes.zeroes(n)".)

Can someone explain me where this behaviour comes from?


From python 3.2 you can do

>>> (1024).to_bytes(2, byteorder='big')
b'\x04\x00'

https://docs.python.org/3/library/stdtypes.html#int.to_bytes

def int_to_bytes(x: int) -> bytes:
    return x.to_bytes((x.bit_length() + 7) // 8, 'big')
    
def int_from_bytes(xbytes: bytes) -> int:
    return int.from_bytes(xbytes, 'big')

Accordingly, x == int_from_bytes(int_to_bytes(x)). Note that the above encoding works only for unsigned (non-negative) integers.

For signed integers, the bit length is a bit more tricky to calculate:

def int_to_bytes(number: int) -> bytes:
    return number.to_bytes(length=(8 + (number + (number < 0)).bit_length()) // 8, byteorder='big', signed=True)

def int_from_bytes(binary_data: bytes) -> Optional[int]:
    return int.from_bytes(binary_data, byteorder='big', signed=True)

That's the way it was designed - and it makes sense because usually, you would call bytes on an iterable instead of a single integer:

>>> bytes([3])
b'\x03'

The docs state this, as well as the docstring for bytes:

 >>> help(bytes)
 ...
 bytes(int) -> bytes object of size given by the parameter initialized with null bytes

You can use the struct's pack:

In [11]: struct.pack(">I", 1)
Out[11]: '\x00\x00\x00\x01'

The ">" is the byte-order (big-endian) and the "I" is the format character. So you can be specific if you want to do something else:

In [12]: struct.pack("<H", 1)
Out[12]: '\x01\x00'

In [13]: struct.pack("B", 1)
Out[13]: '\x01'

This works the same on both python 2 and python 3.

Note: the inverse operation (bytes to int) can be done with unpack.