Accessing dict keys like an attribute?

I find it more convenient to access dict keys as obj.foo instead of obj['foo'], so I wrote this snippet:

class AttributeDict(dict):
    def __getattr__(self, attr):
        return self[attr]
    def __setattr__(self, attr, value):
        self[attr] = value

However, I assume that there must be some reason that Python doesn't provide this functionality out of the box. What would be the caveats and pitfalls of accessing dict keys in this manner?

Solution 1:

Update - 2020

Since this question was asked almost ten years ago, quite a bit has changed in Python itself.

While the approach in my original answer is still valid for some cases (e.g. legacy projects stuck on older versions of Python, or cases where you really need to handle dictionaries with very dynamic string keys), I think that in general the dataclasses introduced in Python 3.7 are the obvious/correct solution for the vast majority of AttrDict use cases.
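For readers who have not used dataclasses, here is a minimal sketch of what that looks like. The User class and its fields are made up for illustration; they are not from the question.

```python
from dataclasses import dataclass, asdict


@dataclass
class User:  # hypothetical record type, for illustration only
    name: str
    age: int = 0


user = User(name="alice", age=30)
user.age += 1                 # plain attribute access, no dict subclass needed
as_plain_dict = asdict(user)  # convert back to a dict when one is required
print(user.name, user.age, as_plain_dict)
```

Unlike AttrDict, the set of fields is fixed up front, so this only fits when the keys are known in advance.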

Original answer

The best way to do this is:

class AttrDict(dict):
    def __init__(self, *args, **kwargs):
        super(AttrDict, self).__init__(*args, **kwargs)
        self.__dict__ = self

Some pros:

It actually works!
No dictionary class methods are shadowed (e.g. .keys() works just fine, unless of course you assign some value to it; see below)
Attributes and items are always in sync
Trying to access a non-existent key as an attribute correctly raises AttributeError instead of KeyError
Supports [Tab] autocompletion (e.g. in Jupyter & IPython)
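To illustrate why the AttributeError point matters, here is a small sketch comparing the snippet from the question with AttrDict. It relies on the fact that getattr() with a default only swallows AttributeError; the variable names are just for this example.

```python
class AttributeDict(dict):
    """The snippet from the question."""
    def __getattr__(self, attr):
        return self[attr]

    def __setattr__(self, attr, value):
        self[attr] = value


class AttrDict(dict):
    """The version from this answer."""
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.__dict__ = self


naive = AttributeDict(foo=1)
shared = AttrDict(foo=1)

# AttrDict: a missing attribute raises AttributeError, so the default is used.
fallback = getattr(shared, "missing", None)

# Question's snippet: __getattr__ raises KeyError, which getattr() does not catch.
try:
    getattr(naive, "missing", None)
    leaked = False
except KeyError:
    leaked = True

print(fallback, leaked)
```

The same applies to hasattr() in Python 3, which also only treats AttributeError as "not there".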

Cons:

Methods like .keys() stop working if incoming data contains a key with the same name and overwrites them
Causes a memory leak in Python < 2.7.4 / Python3 < 3.2.3
Pylint goes bananas with E1123(unexpected-keyword-arg) and E1103(maybe-no-member)
For the uninitiated it seems like pure magic.

A short explanation on how this works

Most Python objects internally store their attributes in a dictionary that is named __dict__ (the exceptions are objects without one, such as instances of classes that define __slots__).
There is no requirement that the internal dictionary __dict__ be "just a plain dict", so we can assign an instance of any dict subclass to it.
In our case we simply assign the AttrDict() instance we are instantiating (as we are in __init__).
By calling super()'s __init__() method we make sure that it (already) behaves exactly like a dictionary, since that method runs all the dictionary initialization code.
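The explanation above can be checked interactively. A short sketch, using the Python 3 spelling of super() and some throwaway keys:

```python
class AttrDict(dict):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.__dict__ = self  # attribute storage and item storage are one object


d = AttrDict(spam=1)
assert d.__dict__ is d        # the instance is its own attribute dictionary

d.eggs = 2                    # attribute write shows up as an item
d["ham"] = 3                  # item write shows up as an attribute
assert d["eggs"] == 2
assert d.ham == 3

del d.spam                    # deleting the attribute removes the key as well
assert "spam" not in d
```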

One reason why Python doesn't provide this functionality out of the box

As noted in the "cons" list, this combines the namespace of stored keys (which may come from arbitrary and/or untrusted data!) with the namespace of builtin dict method attributes. For example:

d = AttrDict()
d.update({'items': ["jacket", "necktie", "trousers"]})
for k, v in d.items():    # TypeError: 'list' object is not callable
    print("Never reached!")

Solution 2:

You can use any legal string character in a key if you use subscript (bracket) notation, for example obj['!#$%^&*()_']. Attribute notation only works for keys that are valid Python identifiers.
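As a rough illustration, the sketch below checks which keys of a dict could be written with dot notation at all. usable_as_attribute is a hypothetical helper written for this example, not something from a library.

```python
import keyword


def usable_as_attribute(key):
    # Hypothetical helper: True only if `obj.<key>` would be valid syntax.
    return (
        isinstance(key, str)
        and key.isidentifier()
        and not keyword.iskeyword(key)
    )


d = {"!#$%^&*()_": 1, "first name": 2, "class": 3, "spam": 4}

# Every key works with d[key]; only some could ever be spelled d.key.
dotted = [k for k in d if usable_as_attribute(k)]
print(dotted)
```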

Solution 3:

Wherein I Answer the Question That Was Asked

Why doesn't Python offer it out of the box?

I suspect that it has to do with the Zen of Python: "There should be one -- and preferably only one -- obvious way to do it." This would create two obvious ways to access values from dictionaries: obj['key'] and obj.key.

Caveats and Pitfalls

These include a possible lack of clarity and confusion in the code. For example, the following could confuse someone else who has to maintain your code at a later date, or even you, if you haven't looked at it for a while. Again, from the Zen: "Readability counts!"

>>> KEY = 'spam'
>>> d[KEY] = 1
>>> # Several lines of miscellaneous code here...
... assert d.spam == 1

If d is instantiated or KEY is defined or d[KEY] is assigned far away from where d.spam is being used, it can easily lead to confusion about what's being done, since this isn't a commonly-used idiom. I know it would have the potential to confuse me.

Additionally, if you change the value of KEY as follows (but forget to change d.spam), you now get:

>>> KEY = 'foo'
>>> d[KEY] = 1
>>> # Several lines of miscellaneous code here...
... assert d.spam == 1
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
AttributeError: 'C' object has no attribute 'spam'

IMO, not worth the effort.

Other Items

As others have noted, you can use any hashable object (not just a string) as a dict key. For example,

>>> d = {(2, 3): True,}
>>> assert d[(2, 3)] is True
>>>

is legal, but

>>> C = type('C', (object,), {(2, 3): True})
>>> d = C()
>>> assert d.(2, 3) is True
  File "<stdin>", line 1
  d.(2, 3)
    ^
SyntaxError: invalid syntax
>>> getattr(d, (2, 3))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: getattr(): attribute name must be string
>>>

is not. Dict keys give you access to the entire range of printable characters, or to other hashable objects, which you do not have when accessing an object attribute. This makes possible such magic as a cached object metaclass, like the recipe from the Python Cookbook (Ch. 9).

Wherein I Editorialize

I prefer the aesthetics of spam.eggs over spam['eggs'] (I think it looks cleaner), and I really started craving this functionality when I met the namedtuple. But the convenience of being able to do the following trumps it.

>>> KEYS = 'spam eggs ham'
>>> VALS = [1, 2, 3]
>>> d = {k: v for k, v in zip(KEYS.split(' '), VALS)}
>>> assert d == {'spam': 1, 'eggs': 2, 'ham': 3}
>>>

This is a simple example, but I frequently find myself using dicts in different situations than I'd use obj.key notation (e.g., when I need to read prefs in from an XML file). In other cases, where I'm tempted to instantiate a dynamic class and slap some attributes on it for aesthetic reasons, I continue to use a dict for consistency in order to enhance readability.

I'm sure the OP has long since resolved this to his satisfaction, but if he still wants this functionality, then I suggest he download one of the packages from PyPI that provides it:

Bunch is the one I'm more familiar with. Subclass of dict, so you have all that functionality.
AttrDict also looks pretty good, but I'm not as familiar with it and haven't looked through the source in as much detail as I have Bunch.
Addict is actively maintained and provides attr-like access and more.
As noted in the comments by Rotareti, Bunch has been deprecated, but there is an active fork called Munch.

However, in order to improve readability of his code I strongly recommend that he not mix his notation styles. If he prefers this notation then he should simply instantiate a dynamic object, add his desired attributes to it, and call it a day:

>>> C = type('C', (object,), {})
>>> d = C()
>>> d.spam = 1
>>> d.eggs = 2
>>> d.ham = 3
>>> assert d.__dict__ == {'spam': 1, 'eggs': 2, 'ham': 3}
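A note that is not part of the original answer: since Python 3.3 the standard library ships types.SimpleNamespace, which does roughly what the dynamic class above does without the type(...) call. A minimal sketch:

```python
from types import SimpleNamespace

d = SimpleNamespace(spam=1, eggs=2)  # initial attributes as keyword arguments
d.ham = 3                            # more can be added later

# vars() returns the underlying attribute dictionary
assert vars(d) == {"spam": 1, "eggs": 2, "ham": 3}
print(d.spam, d.eggs, d.ham)
```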

Wherein I Update, to Answer a Follow-Up Question in the Comments

In the comments (below), Elmo asks:

What if you want to go one deeper? (referring to type(...))

While I've never had this use case (again, I tend to use nested dicts, for consistency), the following code works:

>>> C = type('C', (object,), {})
>>> d = C()
>>> for x in 'spam eggs ham'.split():
...     setattr(d, x, C())
...     i = 1
...     for y in 'one two three'.split():
...         setattr(getattr(d, x), y, i)
...         i += 1
...
>>> assert d.spam.__dict__ == {'one': 1, 'two': 2, 'three': 3}
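If the nested data already exists as a dict, the same idea can be sketched recursively. to_namespace is a hypothetical helper written for this example (it assumes all dict keys are strings that are valid identifiers), and it uses types.SimpleNamespace from the standard library instead of the type(...) class above.

```python
from types import SimpleNamespace


def to_namespace(value):
    # Hypothetical helper: recursively turn dicts into attribute-style objects.
    # Assumes every dict key is a string that is a valid identifier.
    if isinstance(value, dict):
        return SimpleNamespace(**{k: to_namespace(v) for k, v in value.items()})
    if isinstance(value, list):
        return [to_namespace(v) for v in value]
    return value  # leave scalars and other objects untouched


nested = {"spam": {"one": 1, "two": 2, "three": 3}, "eggs": [{"x": 1}]}
d = to_namespace(nested)

assert d.spam.two == 2
assert d.eggs[0].x == 1
assert vars(d.spam) == {"one": 1, "two": 2, "three": 3}
```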