What's the exact usage of __reduce__ in Pickler

I know that in order to be picklable, a class has to override the __reduce__ method, and that it has to return a string or a tuple.

How does this function work? What is the exact usage of __reduce__? When will it be used?


When you try to pickle an object, some of its attributes might not serialize well. One example is an open file handle: pickle won't know how to handle it and will raise an error.

You can tell the pickle module how to handle these kinds of objects directly within the class itself. Let's look at an example of an object with a single attribute, an open file handle:

import pickle

class Test(object):
    def __init__(self, file_path="test1234567890.txt"):
        # An open file in write mode
        self.some_file_i_have_opened = open(file_path, 'wb')

my_test = Test()
# Now, watch what happens when we try to pickle this object:
pickle.dumps(my_test)

It should fail and give a traceback:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  --- snip snip a lot of lines ---
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/copy_reg.py", line 70, in _reduce_ex
    raise TypeError, "can't pickle %s objects" % base.__name__
TypeError: can't pickle file objects
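
The traceback above comes from Python 2.7. As a minimal sketch, assuming Python 3 and a temporary directory for the file (the tempfile path is an assumption, not part of the original example), the same failure can be observed and caught like this. The exact error message differs between versions, so the sketch only looks at the exception type:

```python
import os
import pickle
import tempfile


class Test(object):
    def __init__(self, file_path):
        # An open file in write mode, as in the example above
        self.some_file_i_have_opened = open(file_path, 'wb')


# Hypothetical temporary location, so the sketch leaves the working directory alone
path = os.path.join(tempfile.mkdtemp(), "test1234567890.txt")
my_test = Test(path)

error = None
try:
    pickle.dumps(my_test)
except TypeError as exc:
    # Python 3 also raises TypeError for the open file object
    error = exc
finally:
    my_test.some_file_i_have_opened.close()

print(type(error).__name__)
```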

However, had we defined a __reduce__ method in our Test class, pickle would have known how to serialize this object:

import pickle

class Test(object):
    def __init__(self, file_path="test1234567890.txt"):
        # Used later in __reduce__
        self._file_name_we_opened = file_path
        # An open file in write mode
        self.some_file_i_have_opened = open(self._file_name_we_opened, 'wb')

    def __reduce__(self):
        # Return a tuple: the callable to invoke (here, the class itself)
        # and a tuple of arguments to pass to it when re-creating the object
        return (self.__class__, (self._file_name_we_opened, ))

my_test = Test()
saved_object = pickle.dumps(my_test)
# Just print the representation of the string of the object,
# because it contains newlines.
print(repr(saved_object))

On Python 2 (the version shown in the traceback), this should give you something like: "c__main__\nTest\np0\n(S'test1234567890.txt'\np1\ntp2\nRp3\n.", which can be used to recreate the object, complete with a freshly opened file handle:

print(vars(pickle.loads(saved_object)))
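
To see when __reduce__ is actually used, here is a small sketch that records which methods run during a round trip. The calls list and the temporary path are assumptions added for illustration. It suggests that pickle.dumps invokes __reduce__, and that pickle.loads then calls the returned callable with the returned arguments, which runs __init__ a second time:

```python
import os
import pickle
import tempfile

calls = []  # records which hooks ran, purely for illustration


class Test(object):
    def __init__(self, file_path):
        calls.append("__init__")
        self._file_name_we_opened = file_path
        self.some_file_i_have_opened = open(file_path, 'wb')

    def __reduce__(self):
        calls.append("__reduce__")
        return (self.__class__, (self._file_name_we_opened,))


# Hypothetical temporary location for the file
path = os.path.join(tempfile.mkdtemp(), "test1234567890.txt")

original = Test(path)
saved_object = pickle.dumps(original)  # triggers __reduce__
restored = pickle.loads(saved_object)  # calls Test(path) again

print(calls)  # ['__init__', '__reduce__', '__init__']

# Close both handles so nothing is left open
original.some_file_i_have_opened.close()
restored.some_file_i_have_opened.close()
```

Note that the restored object reopens the file in 'wb' mode, so anything already written to that path is truncated.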

In general, the __reduce__ method needs to return a tuple with at least two elements:

  1. A callable that will be called to create the object. In this case, self.__class__
  2. A tuple of arguments to pass to that callable. In the example it's a single string, which is the path to the file to open.
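
The first element does not have to be the class itself. As a rough sketch, with a hypothetical Point class and make_point factory that are not part of the original example, any callable that pickle can find by name at load time should work:

```python
import pickle


def make_point(x, y):
    # Hypothetical module-level factory; pickle stores a reference to it by name
    return Point(x, y)


class Point(object):
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __reduce__(self):
        # (callable, args): on load, pickle calls make_point(self.x, self.y)
        return (make_point, (self.x, self.y))


restored = pickle.loads(pickle.dumps(Point(1, 2)))
print(restored.x, restored.y)
```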

Consult the docs for a detailed explanation of what else the __reduce__ method can return.
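
One of those additional forms is an optional third element holding the object's state. A minimal sketch, assuming a hypothetical Counter class with no __setstate__ method: in that case the state should be a dictionary, and pickle merges it into the new object's __dict__ after calling the callable.

```python
import pickle


class Counter(object):
    def __init__(self, name):
        self.name = name
        self.count = 0

    def __reduce__(self):
        # Third element: state applied after Counter(name) has been called.
        # With no __setstate__ defined, the dict is merged into __dict__.
        return (self.__class__, (self.name,), {"count": self.count})


c = Counter("hits")
c.count = 5

restored = pickle.loads(pickle.dumps(c))
print(restored.name, restored.count)
```

This keeps the constructor arguments small while still carrying over attributes that changed after construction.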