Let a class behave like it's a list in Python
I have a class which is essentially a collection/list of things, but I want to add some extra functions to this list. What I would like is the following:
- I have an instance li = MyFancyList(). The variable li should behave as if it were a list whenever I use it as a list: [e for e in li], li.extend(...), for e in li.
- Plus, it should have some special functions like li.fancyPrint(), li.getAMetric(), li.getName().
I currently use the following approach:
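class MyFancyList:
    def __init__(self):
        self.li = []  # the wrapped list
    def __iter__(self):
        return iter(self.li)  # __iter__ must return an iterator, not the list itself
    def fancyFunc(self):
        pass  # do something fancy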
This is OK for use as an iterable, as in [e for e in li], but I do not get the full list behavior, such as li.extend(...).
My first guess is to have MyFancyList inherit from list. But is that the recommended Pythonic way to do it? If yes, what should I consider? If not, what would be a better approach?
If you want only part of the list behavior, use composition (i.e. your instances hold a reference to an actual list) and implement only the methods necessary for the behavior you desire. These methods should delegate the work to the actual list any instance of your class holds a reference to, for example:
def __getitem__(self, item):
    return self.li[item]  # delegate to li.__getitem__
Implementing __getitem__ alone will give you a surprising number of features, for example iteration and slicing.
>>> class WrappedList:
...     def __init__(self, lst):
...         self._lst = lst
...     def __getitem__(self, item):
...         return self._lst[item]
...
>>> w = WrappedList([1, 2, 3])
>>> for x in w:
...     x
...
1
2
3
>>> w[1:]
[2, 3]
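Iteration works here because, when a class has no __iter__, Python falls back to the older sequence protocol: it calls __getitem__ with 0, 1, 2, ... until IndexError is raised. The following is a minimal sketch of what that fallback does and does not provide; the class simply mirrors WrappedList from the session above.

```python
class WrappedList:
    def __init__(self, lst):
        self._lst = lst

    def __getitem__(self, item):
        return self._lst[item]  # delegate indexing and slicing to the real list


w = WrappedList([1, 2, 3])

items = list(w)    # iteration calls __getitem__(0), (1), ... until IndexError
has_two = 2 in w   # membership tests fall back to the same iteration

try:
    len(w)         # len() needs __len__, which this class does not define
    len_works = True
except TypeError:
    len_works = False
```

So __getitem__ alone covers indexing, slicing, iteration and membership tests, but anything that relies on __len__ still has to be delegated explicitly.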
If you want the full behavior of a list, inherit from collections.UserList. UserList is a pure-Python class that implements the full list interface by wrapping a real list, which it stores in its data attribute.
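As a rough sketch of how that could look for the class in the question: the method names fancyPrint and getAMetric come from the question, but their bodies here are placeholders invented for illustration, not a recommended implementation.

```python
from collections import UserList


class MyFancyList(UserList):
    def fancyPrint(self):
        # Placeholder formatting; self.data is the wrapped plain list.
        print(" | ".join(repr(e) for e in self.data))

    def getAMetric(self):
        # Placeholder metric: simply the number of elements.
        return len(self.data)


li = MyFancyList([1, 2])
li.append(3)
li.extend([4, 5])

doubled = [e * 2 for e in li]   # iterates like a normal list
combined = li + [6]             # UserList.__add__ builds a new instance of the subclass
```

Unlike a direct list subclass, the result of li + [6] is again a MyFancyList, because UserList constructs results through the instance's own class.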
So why not inherit from list directly?
One major problem with inheriting directly from list (or any other builtin written in C) is that the code of the builtins may or may not call special methods overridden in classes defined by the user. Here's a relevant excerpt from the PyPy docs:
Officially, CPython has no rule at all for when exactly overridden method of subclasses of built-in types get implicitly called or not. As an approximation, these methods are never called by other built-in methods of the same object. For example, an overridden __getitem__ in a subclass of dict will not be called by e.g. the built-in get method.
Another quote, from Luciano Ramalho's Fluent Python, page 351:
Subclassing built-in types like dict or list or str directly is error-prone because the built-in methods mostly ignore user-defined overrides. Instead of subclassing the built-ins, derive your classes from UserDict, UserList and UserString from the collections module, which are designed to be easily extended.
... and more, page 370+:
Misbehaving built-ins: bug or feature? The built-in dict , list and str types are essential building blocks of Python itself, so they must be fast — any performance issues in them would severely impact pretty much everything else. That’s why CPython adopted the shortcuts that cause their built-in methods to misbehave by not cooperating with methods overridden by subclasses.
After playing around a bit, the issues with the list builtin seem to be less critical (I tried to break it in Python 3.4 for a while but did not find any really obvious unexpected behavior), but I still wanted to post a demonstration of what can happen in principle, so here's one with a dict and a UserDict:
>>> from collections import UserDict
>>> class MyDict(dict):
...     def __setitem__(self, key, value):
...         super().__setitem__(key, [value])
...
>>> d = MyDict(a=1)
>>> d
{'a': 1}
>>> class MyUserDict(UserDict):
...     def __setitem__(self, key, value):
...         super().__setitem__(key, [value])
...
>>> m = MyUserDict(a=1)
>>> m
{'a': [1]}
As you can see, the __init__ method from dict ignored the overridden __setitem__ method, while the __init__ method from our UserDict did not.
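The get example from the PyPy excerpt can be sketched the same way. The class names below are made up for illustration, and the behavior shown for the dict subclass is what CPython does; other implementations may differ.

```python
from collections import UserDict


class DoublingDict(dict):
    def __getitem__(self, key):
        return super().__getitem__(key) * 2


class DoublingUserDict(UserDict):
    def __getitem__(self, key):
        return super().__getitem__(key) * 2


d = DoublingDict(a=1)
u = DoublingUserDict(a=1)

d_item = d["a"]      # goes through the override
d_get = d.get("a")   # the C-level dict.get bypasses the overridden __getitem__

u_item = u["a"]      # goes through the override
u_get = u.get("a")   # UserDict.get is written in Python and calls self[key]
```

With the dict subclass, indexing and get disagree about the value; with the UserDict subclass they stay consistent.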
The simplest solution here is to inherit from the list class:
class MyFancyList(list):
    def fancyFunc(self):
        pass  # do something fancy
You can then use the MyFancyList type as a list, and use its specific methods.
Inheritance introduces a strong coupling between your object and list. The approach you implemented is basically a proxy object.
Which approach to choose depends heavily on how you will use the object. If it has to be a list, then inheritance is probably a good choice.
EDIT: as pointed out by @acdr, some methods that return a copy of the list should be overridden so that they return a MyFancyList instead of a list.
A simple way to implement that:
class MyFancyList(list):
    def fancyFunc(self):
        pass  # do something fancy
    def __add__(self, *args, **kwargs):
        return MyFancyList(super().__add__(*args, **kwargs))
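Note that __add__ is not the only method that hands back a plain list. The sketch below, which is an illustration under CPython's behavior rather than a complete fix, also wraps slices and shows two methods that still return list because they were not overridden.

```python
class MyFancyList(list):
    def __add__(self, other):
        # Re-wrap the plain list produced by list.__add__.
        return type(self)(super().__add__(other))

    def __getitem__(self, index):
        result = super().__getitem__(index)
        # Only slices produce a new list; single items are returned as-is.
        if isinstance(index, slice):
            return type(self)(result)
        return result


a = MyFancyList([1, 2, 3])

added = a + [4]       # MyFancyList, thanks to the __add__ override
tail = a[1:]          # MyFancyList, thanks to the __getitem__ override
first = a[0]          # a plain element

repeated = a * 2      # still a plain list: __mul__ was not overridden
copied = a.copy()     # still a plain list: copy() was not overridden
```

Each such method has to be handled individually, which is part of what makes subclassing list directly fragile.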
If you don't want to redefine every method of list, I suggest the following approach:
class MyList:
    def __init__(self, list_):
        self.li = list_
    def __getattr__(self, method):
        return getattr(self.li, method)
This makes methods like append, extend and so on work out of the box. Beware, however, that magic methods (e.g. __len__, __getitem__, etc.) will not work this way, because Python looks up special methods on the class rather than on the instance and so bypasses __getattr__. You should at least redeclare them like this:
class MyList:
    def __init__(self, list_):
        self.li = list_
    def __getattr__(self, method):
        return getattr(self.li, method)
    def __len__(self):
        return len(self.li)
    def __getitem__(self, item):
        return self.li[item]
    def fancyPrint(self):
        pass  # do whatever you want...
Note that if you want to override a method of list (extend, for instance), you can simply declare your own, so that the call never passes through the __getattr__ method. For instance:
class MyList:
    def __init__(self, list_):
        self.li = list_
    def __getattr__(self, method):
        return getattr(self.li, method)
    def __len__(self):
        return len(self.li)
    def __getitem__(self, item):
        return self.li[item]
    def fancyPrint(self):
        pass  # do whatever you want...
    def extend(self, list_):
        pass  # your own version of extend
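To see why the magic methods need to be redeclared, here is a small sketch using a proxy that defines only __getattr__ (the class name BareProxy is made up for this illustration):

```python
class BareProxy:
    def __init__(self, list_):
        self.li = list_

    def __getattr__(self, method):
        # Only called when normal attribute lookup fails.
        return getattr(self.li, method)


p = BareProxy([1, 2, 3])
p.append(4)                 # regular method: forwarded through __getattr__

explicit_len = p.__len__()  # explicit attribute access is forwarded too

try:
    len(p)                  # implicit lookup happens on the type, skipping __getattr__
    implicit_ok = True
except TypeError:
    implicit_ok = False
```

The explicit call succeeds while len(p) raises TypeError, which is why __len__, __getitem__ and friends must be defined on the class itself.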
Based on the two example methods you included in your post (fancyPrint, getAMetric), it doesn't seem that you need to store any extra state in your lists. If this is the case, you're best off simply declaring these as free functions and ignoring subtyping altogether; this completely avoids problems like list vs UserList, fragile edge cases like return types for __add__, unexpected Liskov issues, etc. Instead, you can write your functions, write unit tests for their output, and rest assured that everything will work exactly as intended.
As an added benefit, this means your functions will work with any iterable types (such as generator expressions) without any extra effort.
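A minimal sketch of the free-function approach follows. The function names and what they compute are assumptions made for illustration, since the question does not say what fancyPrint or getAMetric actually do.

```python
def fancy_format(items):
    """Hypothetical stand-in for fancyPrint: build a display string."""
    return " | ".join(str(item) for item in items)


def get_a_metric(items):
    """Hypothetical metric: the arithmetic mean, computed in a single pass."""
    total = 0
    count = 0
    for item in items:
        total += item
        count += 1
    return total / count if count else 0.0


from_list = get_a_metric([1, 2, 3])
from_generator = get_a_metric(x * x for x in range(4))  # no list needed
formatted = fancy_format((1, 2, 3))                      # tuples work as well
```

Because the functions only iterate over their argument, they accept lists, tuples and generator expressions alike.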