Recipe 5.15. Sorting Names and Separating Them by Initials

Credit: Brett Cannon, Amos Newcombe

Problem

You want to write a directory for a group of people, and you want that directory to be grouped by the initials of their last names and sorted alphabetically.

Solution

Python 2.4's new itertools.groupby function makes this task easy:

import itertools

def groupnames(name_iterable):
    sorted_names = sorted(name_iterable, key=_sortkeyfunc)
    name_dict = {}
    for key, group in itertools.groupby(sorted_names, _groupkeyfunc):
        name_dict[key] = tuple(group)
    return name_dict

pieces_order = {2: (-1, 0), 3: (-1, 0, 1)}

def _sortkeyfunc(name):
    ''' name is a string with first and last names, and an optional middle
        name or initial, separated by spaces; returns a string in order
        last-first-middle, as wanted for sorting purposes. '''
    name_parts = name.split()
    return ' '.join([name_parts[n] for n in pieces_order[len(name_parts)]])

def _groupkeyfunc(name):
    ''' returns the key for grouping, i.e. the last name's initial. '''
    return name.split()[-1][0]

Discussion

In this recipe, name_iterable must be an iterable whose items are strings containing names in the form first-middle-last, with middle being optional and the parts separated by whitespace. The result of calling groupnames on such an iterable is a dictionary whose keys are the last names' initials; each corresponding value is a tuple of all the names whose last name starts with that initial, in sorted order.
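To make the shape of the result concrete, here is a small usage sketch. The recipe's functions are repeated so the snippet runs on its own; the sample names are invented for illustration and are not part of the original recipe.

```python
import itertools

# The recipe's functions, repeated here so the snippet is self-contained.
pieces_order = {2: (-1, 0), 3: (-1, 0, 1)}

def _sortkeyfunc(name):
    name_parts = name.split()
    return ' '.join([name_parts[n] for n in pieces_order[len(name_parts)]])

def _groupkeyfunc(name):
    return name.split()[-1][0]

def groupnames(name_iterable):
    sorted_names = sorted(name_iterable, key=_sortkeyfunc)
    name_dict = {}
    for key, group in itertools.groupby(sorted_names, _groupkeyfunc):
        name_dict[key] = tuple(group)
    return name_dict

# Hypothetical sample data: "first last" or "first middle last".
names = ['Ada Lovelace', 'Alan M Turing', 'Grace Hopper',
         'Barbara Liskov', 'Edsger W Dijkstra']

directory = groupnames(names)

# Print the directory one initial at a time, in alphabetical order.
for initial in sorted(directory):
    print('%s: %s' % (initial, ', '.join(directory[initial])))
# D: Edsger W Dijkstra
# H: Grace Hopper
# L: Barbara Liskov, Ada Lovelace
# T: Alan M Turing
```

Note that the dictionary itself carries no ordering of the initials, so the loop sorts the keys when printing.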

Auxiliary function _sortkeyfunc splits a name that's a single string, either "first last" or "first middle last," and reorders the parts into a list that starts with the last name, followed by the first name, plus the middle name or initial, if any, at the end. Then, the function returns this list rejoined into a string. The resulting string is the key we want to use for sorting, according to the problem statement. Python 2.4's built-in function sorted takes just this kind of function (to call on each item to get the sort key) as the value of its optional parameter named key.

Auxiliary function _groupkeyfunc takes a name in the same form and returns the last name's initial, the key on which, again according to the problem statement, we want to group.

This recipe's primary function, groupnames, uses the two auxiliary functions and Python 2.4's sorted and itertools.groupby to solve our problem, building and returning the required dictionary.
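One detail worth illustrating is why groupnames sorts before it groups: itertools.groupby only merges adjacent items that share a key. The sketch below uses hypothetical names and a simplified last-name sort key (not the recipe's _sortkeyfunc) to show the difference.

```python
import itertools

def initial(name):
    # Same grouping rule as the recipe's _groupkeyfunc.
    return name.split()[-1][0]

def last_name(name):
    # Simplified sort key for this demonstration only.
    return name.split()[-1]

# Hypothetical sample names, deliberately interleaved by initial.
names = ['Tim Peters', 'Alex Martelli', 'Fredrik Pettersson', 'Anna Martin']

# Without sorting, each run of equal keys becomes its own group,
# so 'P' and 'M' each show up twice.
unsorted_groups = [(k, list(g)) for k, g in itertools.groupby(names, initial)]
# [('P', ['Tim Peters']), ('M', ['Alex Martelli']),
#  ('P', ['Fredrik Pettersson']), ('M', ['Anna Martin'])]

# After sorting by last name, equal initials are adjacent.
sorted_groups = [(k, list(g))
                 for k, g in itertools.groupby(sorted(names, key=last_name), initial)]
# [('M', ['Alex Martelli', 'Anna Martin']),
#  ('P', ['Tim Peters', 'Fredrik Pettersson'])]
```

In groupnames, skipping the sort would be worse than producing duplicate groups: each later run with the same initial would overwrite the earlier entry in name_dict.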

If you need to code this task in Python 2.3, you can use the same two support functions and recode function groupnames itself. In 2.3, it is more convenient to do the grouping first and the sorting separately on each group, since no groupby function is available in Python 2.3's standard library:

def groupnames(name_iterable):
    name_dict = {}
    # first pass: collect the names into lists, keyed by last-name initial
    for name in name_iterable:
        key = _groupkeyfunc(name)
        name_dict.setdefault(key, []).append(name)
    # second pass: sort each group by decorate-sort-undecorate
    for k, v in name_dict.iteritems():
        aux = [(_sortkeyfunc(name), name) for name in v]
        aux.sort()
        name_dict[k] = tuple([n for _, n in aux])
    return name_dict
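The same group-first, sort-afterward idea can also be written for current Python versions, where dict.iteritems no longer exists. The following is only a sketch of that variant, not part of the original recipe: the function name groupnames_grouping_first is made up for illustration, collections.defaultdict stands in for setdefault, and sorted with key replaces the decorate-sort-undecorate step. The helper functions are copied from the Solution, and the sample names are hypothetical.

```python
from collections import defaultdict

pieces_order = {2: (-1, 0), 3: (-1, 0, 1)}

def _sortkeyfunc(name):
    name_parts = name.split()
    return ' '.join([name_parts[n] for n in pieces_order[len(name_parts)]])

def _groupkeyfunc(name):
    return name.split()[-1][0]

def groupnames_grouping_first(name_iterable):
    # Hypothetical name: groups first, then sorts each group separately.
    groups = defaultdict(list)
    for name in name_iterable:
        groups[_groupkeyfunc(name)].append(name)
    # Build a plain dict of tuples, each sorted in last-first-middle order.
    return dict((k, tuple(sorted(v, key=_sortkeyfunc)))
                for k, v in groups.items())

# Hypothetical sample names.
names = ['Guido van Rossum', 'Tim Peters', 'Anna Ravenscroft', 'Alex Martelli']
result = groupnames_grouping_first(names)
# {'R': ('Anna Ravenscroft', 'Guido van Rossum'),
#  'P': ('Tim Peters',),
#  'M': ('Alex Martelli',)}
```

Unlike the groupby version, this variant does not need the whole input sorted up front; it sorts only within each initial's group.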

See Also

Recipe 19.21; Library Reference (Python 2.4) docs on module itertools.



Python Cookbook
ISBN: 0596007973
Year: 2004
Pages: 420
