Is there a clever way to pass the key to defaultdict's default_factory?
A class has a constructor which takes one parameter:
class C(object):
    def __init__(self, v):
        self.v = v
        ...
Somewhere in the code, it is useful for values in a dict to know their keys.
I want to use a defaultdict with the key passed to newly created default values:
d = defaultdict(lambda : C(here_i_wish_the_key_to_be))
Any suggestions?
Solution 1:
It hardly qualifies as clever - but subclassing is your friend:
from collections import defaultdict

class keydefaultdict(defaultdict):
    def __missing__(self, key):
        if self.default_factory is None:
            raise KeyError(key)
        else:
            # Pass the missing key to the factory and store the result.
            ret = self[key] = self.default_factory(key)
            return ret

d = keydefaultdict(C)
d[x]  # returns C(x)
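To check how the subclass behaves, here is a minimal self-contained sketch. C is the class from the question; the keys 'spam' and 'eggs' are just illustrative values. It also shows that only item access goes through `__missing__`, while `in` and `.get()` do not create anything.

```python
from collections import defaultdict


class C(object):
    def __init__(self, v):
        self.v = v


class keydefaultdict(defaultdict):
    def __missing__(self, key):
        if self.default_factory is None:
            raise KeyError(key)
        # Unlike the stock defaultdict, hand the missing key to the factory.
        ret = self[key] = self.default_factory(key)
        return ret


d = keydefaultdict(C)
first = d['spam']              # missing key: C('spam') is created and stored
second = d['spam']             # key exists now: the stored instance comes back
has_eggs = 'eggs' in d         # membership test does not call __missing__
got_eggs = d.get('eggs')       # neither does .get(); it returns None

print(first.v, first is second, len(d))  # spam True 1
```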
Solution 2:
No, there is not.
The defaultdict implementation cannot be configured to pass the missing key to the default_factory out of the box. Your only option is to implement your own defaultdict subclass, as suggested by @JochenRitzel, above.
But that isn't "clever" or nearly as clean as a standard library solution would be (if it existed). Thus the answer to your succinct, yes/no question is clearly "No".
It's too bad the standard library is missing such a frequently needed tool.
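To see the limitation concretely, here is a small sketch (C is the class from the question, 'spam' is an arbitrary key). The stock defaultdict calls default_factory with no arguments, so a factory that requires the key fails on the first miss:

```python
from collections import defaultdict


class C(object):
    def __init__(self, v):
        self.v = v


d = defaultdict(C)   # C needs one argument, but the factory receives none
error = None
try:
    d['spam']
except TypeError as exc:
    error = exc      # keep a reference; exc is cleared when the block ends
    print("default_factory was called without the key:", exc)
```

Nothing is stored under 'spam' in this case, because the factory raised before the assignment could happen.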
Solution 3:
I don't think you need defaultdict here at all. Why not just use the dict.setdefault method?
>>> d = {}
>>> d.setdefault('p', C('p')).v
'p'
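That will, of course, create a new instance of C on every call, even when the key already exists, because the argument C('p') is evaluated before setdefault is called. If that is an issue, I think the simpler approach will do: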
>>> d = {}
>>> if 'e' not in d: d['e'] = C('e')
It would be quicker than the defaultdict or any other alternative as far as I can see.
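The difference between the two approaches can be made visible by counting constructor calls. This is only a sketch: the `created` counter is a hypothetical addition to C for the demonstration and is not part of the question's class.

```python
class C(object):
    created = 0  # hypothetical counter, added only for this demonstration

    def __init__(self, v):
        C.created += 1
        self.v = v


d = {}
for _ in range(3):
    d.setdefault('p', C('p'))   # C('p') is built before setdefault even runs
eager = C.created               # every call built an instance

C.created = 0
e = {}
for _ in range(3):
    if 'e' not in e:            # an instance is built only on a miss
        e['e'] = C('e')
lazy = C.created

print(eager, lazy)  # 3 1
```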
ETA regarding the speed of the in test vs. using a try-except clause:
>>> import timeit
>>> def g():
...     d = {}
...     if 'a' in d:
...         return d['a']
...
>>> timeit.timeit(g)
0.19638929363557622
>>> def f():
...     d = {}
...     try:
...         return d['a']
...     except KeyError:
...         return
...
>>> timeit.timeit(f)
0.6167065411074759
>>> def k():
...     d = {'a': 2}
...     if 'a' in d:
...         return d['a']
...
>>> timeit.timeit(k)
0.30074866358404506
>>> def p():
...     d = {'a': 2}
...     try:
...         return d['a']
...     except KeyError:
...         return
...
>>> timeit.timeit(p)
0.28588609450770264
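The functions above rebuild the dict on every call, so dict construction is part of what gets timed. Below is a rough sketch of the same comparison with the dicts built once up front. The names `lookup_in`, `lookup_try`, `present` and `absent` are made up for this example, and the numbers will vary by machine and Python version, so no expected timings are shown.

```python
import timeit

present = {'a': 2}   # the key exists: the "hit" case
absent = {}          # the key is missing: the "miss" case


def lookup_in(d):
    if 'a' in d:
        return d['a']


def lookup_try(d):
    try:
        return d['a']
    except KeyError:
        return None


results = {}
for label, d in (('hit', present), ('miss', absent)):
    for func in (lookup_in, lookup_try):
        # number is kept small so the script finishes quickly
        seconds = timeit.timeit(lambda: func(d), number=100000)
        results[(func.__name__, label)] = seconds
        print("{:<10} {:<4} {:.4f}s".format(func.__name__, label, seconds))
```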
Solution 4:
Here's a working example of a dictionary that automatically adds a value. The demonstration task is finding duplicate files in /usr/include. Note that customizing the dictionary PathDict only requires four lines:
import os

class FullPaths:
    def __init__(self, filename):
        self.filename = filename
        self.paths = set()

    def record_path(self, path):
        self.paths.add(path)

class PathDict(dict):
    def __missing__(self, key):
        # Create the value from the missing key and store it.
        ret = self[key] = FullPaths(key)
        return ret

if __name__ == "__main__":
    pathdict = PathDict()
    for root, _, files in os.walk('/usr/include'):
        for f in files:
            path = os.path.join(root, f)
            pathdict[f].record_path(path)
    for fullpath in pathdict.values():
        if len(fullpath.paths) > 1:
            print("{} located in {}".format(fullpath.filename, ','.join(fullpath.paths)))
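If /usr/include is not available on your machine, the same idea can be tried against a throwaway directory. This is only a sketch: the helper `find_duplicates` and the file names `common.h` and `unique.h` are invented for the example, and the temporary tree is created and removed by `tempfile`.

```python
import os
import tempfile


class FullPaths:
    def __init__(self, filename):
        self.filename = filename
        self.paths = set()

    def record_path(self, path):
        self.paths.add(path)


class PathDict(dict):
    def __missing__(self, key):
        ret = self[key] = FullPaths(key)
        return ret


def find_duplicates(top):
    """Return {filename: sorted paths} for names found in more than one place."""
    pathdict = PathDict()
    for root, _, files in os.walk(top):
        for f in files:
            pathdict[f].record_path(os.path.join(root, f))
    return {fp.filename: sorted(fp.paths)
            for fp in pathdict.values() if len(fp.paths) > 1}


with tempfile.TemporaryDirectory() as top:
    # Build a tiny tree: common.h appears twice, unique.h once.
    for sub, name in (('a', 'common.h'), ('b', 'common.h'), ('b', 'unique.h')):
        os.makedirs(os.path.join(top, sub), exist_ok=True)
        open(os.path.join(top, sub, name), 'w').close()
    duplicates = find_duplicates(top)

print(sorted(duplicates))  # ['common.h']
```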