Recipe 19.6. Dividing an Iterable into n Slices of Stride n

Credit: Gyro Funch, Alex Martelli

Problem

You have an iterable p and need to get the n non-overlapping extended slices of stride n, which, if the iterable were a sequence supporting extended slicing, would be p[0::n], p[1::n], and so on up to p[n-1::n].
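To make the target concrete, here is a small sketch of what those extended slices look like for a list of ten integers and n = 3. It is written for Python 3 (range rather than xrange), and the variable names are illustrative only:

```python
# Illustrative data (an assumption, not from the recipe): ten integers, stride 3.
p = list(range(10))
n = 3

# The n non-overlapping extended slices of stride n.
slices = [p[i::n] for i in range(n)]

for i, s in enumerate(slices):
    print("p[%d::%d] = %r" % (i, n, s))
# p[0::3] = [0, 3, 6, 9]
# p[1::3] = [1, 4, 7]
# p[2::3] = [2, 5, 8]
```

Every item of p lands in exactly one slice, and the slices differ in length by at most one item.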

Solution

While extended slicing would return sequences of the same type we start with, it's much more sensible to specify a strider function that, instead, solves this problem by returning a list of lists:

def strider(p, n):
    """ Split an iterable p into a list of n sublists, repeatedly taking
        the next element of p and adding it to the next sublist.  Example:
        >>> strider('abcde', 3)
        [['a', 'd'], ['b', 'e'], ['c']]
        In other words, strider's result is equal to:
            [list(p[i::n]) for i in xrange(n)]
        if iterable p is a sequence supporting extended-slicing syntax.
    """
    # First, prepare the result, a list of n separate lists
    result = [ [  ] for x in xrange(n) ]
    # Loop over the input, appending each item to one of
    # result's lists, in "round robin" fashion
    for i, item in enumerate(p):
        result[i % n].append(item)
    return result
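The recipe's code targets Python 2, where xrange is a built-in. As a minimal sketch, assuming the only change needed for Python 3 is replacing xrange with range, the same function can be written and exercised as follows (the name strider_py3 is hypothetical, chosen here to avoid clashing with the recipe's own name):

```python
def strider_py3(p, n):
    """Python 3 rendering of the recipe's strider (hypothetical name)."""
    # One empty sublist per stride offset.
    result = [[] for _ in range(n)]
    # Deal the items out round-robin; i % n selects the target sublist.
    for i, item in enumerate(p):
        result[i % n].append(item)
    return result

# A string is a sequence, so this matches the docstring example.
print(strider_py3('abcde', 3))   # [['a', 'd'], ['b', 'e'], ['c']]

# A generator cannot be sliced, but it can still be dealt out.
print(strider_py3((x * x for x in range(7)), 2))   # [[0, 4, 16, 36], [1, 9, 25]]
```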

Discussion

The function in this recipe takes an iterable p and pulls it apart into a user-defined number n of pieces (specifically, function strider returns a list of sublists), distributing p's items into what would be the n extended slices of stride n if p were a sequence.

If we were willing to sacrifice generality, forcing argument p to be a sequence supporting extended slicing, rather than a generic iterable, we could use a very different approach, as the docstring of strider indicates:

def strider1(p, n):
    return [list(p[i::n]) for i in xrange(n)]

Depending on our exact needs, with such a strong constraint on p, we might omit the list call that makes each subsequence into a list, and/or code a generator to avoid consuming extra memory to materialize the whole list of results at once:

def strider2(p, n):
    for i in xrange(n):
        yield p[i::n]

or, equivalently:

import itertools
def strider3(p, n):
    return itertools.imap(lambda i: p[i::n], xrange(n))

or, in Python 2.4, with a generator expression:

def strider4(p, n):
    return (p[i::n] for i in xrange(n))

However, none of these alternatives accepts a generic iterable as p; each demands a full-fledged sequence.
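The trade-off can be seen in a short sketch. It assumes Python 3 (range instead of xrange) and uses a hypothetical function name, slices_of, for the generator-expression variant. With the list call omitted, each slice keeps the type of p, but a generator passed as p fails as soon as the first slice is requested:

```python
def slices_of(p, n):
    # Hypothetical name; Python 3 form of the generator-expression variant.
    return (p[i::n] for i in range(n))

# Slices of a str are str, slices of a tuple are tuples.
from_str = list(slices_of('abcde', 3))
from_tuple = list(slices_of((1, 2, 3, 4, 5), 2))
print(from_str)     # ['ad', 'be', 'c']
print(from_tuple)   # [(1, 3, 5), (2, 4)]

# A generator is iterable but not sliceable, so the slicing approach fails.
gen = (c for c in 'abcde')
raised = False
try:
    list(slices_of(gen, 3))
except TypeError as exc:
    raised = True
    print("TypeError:", exc)
```

The error surfaces only when the result is consumed, because the slicing expression inside the generator expression is evaluated lazily.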

Back to this recipe's exact specs, the best way to enhance the recipe is to recode it to avoid low-level fiddling with indices. While doing arithmetic on indices is conceptually quite simple, it can get messy and indeed is notoriously error prone. We can do better by a generous application of module itertools from the Python Standard Library:

import itertools
def strider5(p, n):
    result = [ [  ] for x in itertools.repeat(0, n) ]
    resiter = itertools.cycle(result)
    for item, sublist in itertools.izip(p, resiter):
        sublist.append(item)
    return result

This strider5 version uses three functions from module itertools. All of the functions in module itertools return iterable objects, and, as we see in this case, their results are therefore typically used in for loops. Function repeat yields an object, repeatedly, a given number of times, and here we use it instead of the built-in function xrange to control the list comprehension that builds the initial value for result. Function cycle takes an iterable object and returns an iterator that walks over that iterable object repeatedly and cyclically; in other words, cycle performs exactly the round-robin effect that we need in this recipe. Function izip is essentially like the built-in function zip, except that it returns an iterator and thus avoids the memory-consumption overhead that zip incurs by building its whole result list in memory at once.
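A brief sketch of these building blocks in isolation may help. It assumes Python 3, where itertools.izip no longer exists and the built-in zip is itself lazy; itertools.islice is used here only to take a finite look at the endless cycle:

```python
import itertools

# repeat(obj, times) yields the same object a fixed number of times.
repeated = list(itertools.repeat(0, 3))
print(repeated)      # [0, 0, 0]

# cycle never ends by itself, so islice bounds it for display.
cycled = list(itertools.islice(itertools.cycle('abc'), 7))
print(cycled)        # ['a', 'b', 'c', 'a', 'b', 'c', 'a']

# Python 3's zip returns an iterator (as izip did in Python 2),
# so pairing with an endless cycle is safe: pairs are built on demand.
pairs = zip(range(5), itertools.cycle('xy'))
first_pair = next(pairs)
print(first_pair)    # (0, 'x')
```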

This version achieves deep elegance and conceptual simplicity (although you may need to gain some familiarity with itertools before you agree that this version is simple!) by forgoing all index arithmetic and leaving all of the handling of the round-robin issues to itertools.cycle. resiter, per se, is a nonterminating iterator, but the function deals effortlessly with that. Specifically, since we use resiter together with p as arguments to izip, termination is assured (assuming, of course, that p does terminate!) by the semantics of izip, which, just like built-in function zip, stops iterating as soon as any one of its arguments is exhausted.
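The termination argument can be checked with a small sketch. It assumes Python 3, so the built-in zip stands in for itertools.izip; the names strider5_py3 and countdown are hypothetical and introduced only for this illustration:

```python
import itertools

def strider5_py3(p, n):
    """Python 3 rendering of strider5 (hypothetical name)."""
    result = [[] for _ in itertools.repeat(0, n)]
    # cycle hands back the very same sublist objects, over and over,
    # so appending through resiter mutates the lists held in result.
    resiter = itertools.cycle(result)
    # zip stops when p is exhausted, even though resiter never ends.
    for item, sublist in zip(p, resiter):
        sublist.append(item)
    return result

def countdown(k):
    # A one-shot iterator (hypothetical helper): not a sequence, no slicing.
    while k > 0:
        yield k
        k -= 1

print(strider5_py3(countdown(7), 3))   # [[7, 4, 1], [6, 3], [5, 2]]
print(strider5_py3('abcde', 3))        # [['a', 'd'], ['b', 'e'], ['c']]
```

Both calls return normally: the loop ends when p runs out, which is exactly the guarantee described above.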

See Also

The itertools module is part of the Python Standard Library and is documented in the Library Reference portion of Python's online documentation; the Library Reference and Python in a Nutshell docs about the built-ins zip and xrange, and extended-form slicing of sequences.



Python Cookbook
ISBN: 0596007973
Year: 2004
Pages: 420
