Recipe 4.13. Extracting a Subset of a Dictionary

Credit: David Benjamin

Problem

You want to extract from a larger dictionary only that subset of it that corresponds to a certain set of keys.

Solution

If you want to leave the original dictionary intact:

def sub_dict(somedict, somekeys, default=None):
    return dict([ (k, somedict.get(k, default)) for k in somekeys ])

If you want to remove from the original the items you're extracting:

def sub_dict_remove(somedict, somekeys, default=None):
    return dict([ (k, somedict.pop(k, default)) for k in somekeys ])

Two examples of these functions' use and effects:

>>> d = {'a': 5, 'b': 6, 'c': 7}
>>> print sub_dict(d, 'ab'), d
{'a': 5, 'b': 6} {'a': 5, 'b': 6, 'c': 7}
>>> print sub_dict_remove(d, 'ab'), d
{'a': 5, 'b': 6} {'c': 7}

Discussion

In Python, I use dictionaries for many purposes: database rows, primary and compound keys, variable namespaces for template parsing, and so on. So, I often need to create a dictionary that is based on another, larger dictionary, but only contains the subset of the larger dictionary corresponding to some set of keys. In most use cases, the larger dictionary must remain intact after the extraction; sometimes, however, I need to remove from the larger dictionary the subset that I'm extracting. This recipe's solution shows both possibilities. The only difference is that you use method get when you want to avoid affecting the dictionary that you are getting data from, and method pop when you want to remove the items you're getting.

If some item k of somekeys is not in fact a key in somedict, this recipe's functions put k as a key in the result anyway, with a default value (which I pass as an optional argument to either function, with a default value of None). So, the result is not necessarily a subset of somedict. This behavior is the one I've found most useful in my applications.
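To make the missing-key behavior concrete, here is a small sketch that repeats sub_dict from the Solution and asks it for a key, 'z', that the source dictionary lacks. The sample data and the 'n/a' default are illustrative assumptions, and print is called with parentheses only so that the snippet also runs on later Python versions.

```python
def sub_dict(somedict, somekeys, default=None):
    # Same function as in the Solution, repeated so the example is self-contained.
    return dict([(k, somedict.get(k, default)) for k in somekeys])

d = {'a': 5, 'b': 6, 'c': 7}            # illustrative sample data

# 'z' is not a key in d, so it still shows up in the result, mapped to None.
result = sub_dict(d, ['a', 'z'])
print(result == {'a': 5, 'z': None})    # True

# An explicit default replaces None for every missing key.
custom = sub_dict(d, ['a', 'z'], default='n/a')
print(custom == {'a': 5, 'z': 'n/a'})   # True
```

Note that the result contains 'z' even though d never did, which is why the result is not necessarily a subset of the original.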

You might prefer to get an exception for "missing keys"; that would help alert you to a bug in your program, in cases in which you know that every k in somekeys should definitely also be a key in somedict. Remember, "Errors should never pass silently. Unless explicitly silenced," to quote The Zen of Python, by Tim Peters (enter the statement import this at an interactive Python prompt to read or re-read this delightful summary of Python's design principles). So, if a missing key is an error, from the point of view of your application, then you do want to get an exception that alerts you to that error at once, if it ever occurs. If this is what you want, you can get it with minor modifications to this recipe's functions:

def sub_dict_strict(somedict, somekeys):
    return dict([ (k, somedict[k]) for k in somekeys ])

def sub_dict_remove_strict(somedict, somekeys):
    return dict([ (k, somedict.pop(k)) for k in somekeys ])

As you can see, these strict variants are even simpler than the originals, a good indication that Python likes to raise exceptions when unexpected behavior occurs!
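The following sketch, with illustrative sample data, shows the strict variants raising KeyError for a missing key. It also points out one consequence worth keeping in mind: because sub_dict_remove_strict pops keys one at a time, any keys that precede the missing one have already been removed from the original dictionary by the time the exception is raised.

```python
def sub_dict_strict(somedict, somekeys):
    return dict([(k, somedict[k]) for k in somekeys])

def sub_dict_remove_strict(somedict, somekeys):
    return dict([(k, somedict.pop(k)) for k in somekeys])

d = {'a': 5, 'b': 6, 'c': 7}            # illustrative sample data

# Indexing with a missing key raises KeyError; d is left untouched.
try:
    sub_dict_strict(d, ['a', 'z'])
    strict_raised = False
except KeyError:
    strict_raised = True
print(strict_raised)                     # True

# The removing variant also raises, but 'a' was popped before 'z' failed.
try:
    sub_dict_remove_strict(d, ['a', 'z'])
    remove_raised = False
except KeyError:
    remove_raised = True
print(remove_raised)                     # True
print('a' in d)                          # False
```

If a partially emptied dictionary would be a problem in your application, one option is to check that all the keys are present before popping any of them.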

Alternatively, you might prefer missing keys to be simply omitted from the result. This, too, requires just minor modifications:

def sub_dict_select(somedict, somekeys):
    return dict([ (k, somedict[k]) for k in somekeys if k in somedict])

def sub_dict_remove_select(somedict, somekeys):
    return dict([ (k, somedict.pop(k)) for k in somekeys if k in somedict])

The if clause in each list comprehension does all we need to distinguish these _select variants from the _strict ones.
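A short sketch of the _select variants in use, again with illustrative sample data: the missing key 'z' is simply left out of the result, and no exception is raised.

```python
def sub_dict_select(somedict, somekeys):
    return dict([(k, somedict[k]) for k in somekeys if k in somedict])

def sub_dict_remove_select(somedict, somekeys):
    return dict([(k, somedict.pop(k)) for k in somekeys if k in somedict])

d = {'a': 5, 'b': 6, 'c': 7}            # illustrative sample data

# 'z' is absent from d, so it is omitted from the result.
picked = sub_dict_select(d, ['a', 'z'])
print(picked == {'a': 5})                # True

# The removing variant pops only the keys that are actually present.
removed = sub_dict_remove_select(d, ['b', 'z'])
print(removed == {'b': 6})               # True
print(d == {'a': 5, 'c': 7})             # True
```

With these variants the result is always a subset of the original dictionary's items.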

In Python 2.4, you can use generator expressions, instead of list comprehensions, as the arguments to dict in each of the functions shown in this recipe. Just change the syntax of the calls to dict, from dict([. . .]) to dict(. . .) (removing the brackets adjacent to the parentheses) and enjoy the resulting slight simplification and acceleration. However, these variants would not work in Python 2.3, which has list comprehensions but not generator expressions.
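As a sketch of that change, here are generator-expression versions of two of the functions. The names sub_dict_gen and sub_dict_select_gen are hypothetical, chosen only to keep them distinct from the list-comprehension originals; the sample data is likewise illustrative.

```python
def sub_dict_gen(somedict, somekeys, default=None):
    # Generator expression passed directly to dict: no intermediate list is built.
    return dict((k, somedict.get(k, default)) for k in somekeys)

def sub_dict_select_gen(somedict, somekeys):
    # The if clause works in a generator expression just as in a list comprehension.
    return dict((k, somedict[k]) for k in somekeys if k in somedict)

d = {'a': 5, 'b': 6, 'c': 7}            # illustrative sample data

with_default = sub_dict_gen(d, ['a', 'z'])
print(with_default == {'a': 5, 'z': None})   # True

only_present = sub_dict_select_gen(d, ['a', 'z'])
print(only_present == {'a': 5})              # True
```

The behavior is the same as that of the bracketed versions; only the intermediate list disappears.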

See Also

Library Reference and Python in a Nutshell documentation on dict.



Python Cookbook
ISBN: 0596007973
Year: 2004
Pages: 420