Recipe 4.8. Transposing Two-Dimensional Arrays

Credit: Steve Holden, Raymond Hettinger, Attila Vàsàrhelyi, Chris Perkins

Problem

You need to transpose a list of lists, turning rows into columns and vice versa.

Solution

You must start with a list whose items are lists all of the same length, such as:

arr = [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]]

A list comprehension offers a simple, handy way to transpose such a two-dimensional array:

print [[r[col] for r in arr] for col in range(len(arr[0]))]
[[1, 4, 7, 10], [2, 5, 8, 11], [3, 6, 9, 12]]

A faster though more obscure alternative (with exactly the same output) exploits the built-in function zip in a slightly strange way:

print map(list, zip(*arr))
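For readers on Python 3, where print is a function and both zip and map return lazy iterators rather than lists, the two approaches above might be sketched as follows. The function names transpose_comprehension and transpose_zip are illustrative assumptions, not part of the recipe.

```python
def transpose_comprehension(arr):
    """Transpose a rectangular list of lists with a nested list comprehension."""
    return [[row[col] for row in arr] for col in range(len(arr[0]))]


def transpose_zip(arr):
    """Transpose by unpacking the rows into zip.

    In Python 3, map is lazy, so list() is needed to materialize the result.
    """
    return list(map(list, zip(*arr)))


arr = [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]]
print(transpose_comprehension(arr))  # [[1, 4, 7, 10], [2, 5, 8, 11], [3, 6, 9, 12]]
print(transpose_zip(arr))            # same output
```

Both functions assume a non-empty list whose rows all have the same length.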

Discussion

This recipe shows a concise yet clear way to turn rows into columns, and also a faster though more obscure way. List comprehensions work well when you want to be clear yet concise, while the alternative solution exploits the built-in function zip in a way that is definitely not obvious.

Sometimes data just comes at you the wrong way. For instance, if you use Microsoft's ActiveX Data Objects (ADO) database interface, the GetRows method, despite its name, actually appears to return database columns in Python. This happens because of array element-ordering differences between Python and Microsoft's preferred implementation language (Visual Basic). This recipe's two solutions to this common kind of problem let you choose between clarity and speed.

In the list comprehension solution, the inner comprehension varies what is selected from (the row), while the outer comprehension varies the selector (the column). This process achieves the required transposition.
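To make the roles of the two comprehensions concrete, the same logic can be unrolled into explicit loops. This is only an illustrative sketch in Python 3; the name transpose_loops is hypothetical and does not appear in the recipe.

```python
def transpose_loops(arr):
    """Explicit-loop equivalent of the nested list comprehension."""
    result = []
    for col in range(len(arr[0])):   # outer loop: varies the selector (the column)
        new_row = []
        for r in arr:                # inner loop: varies what is selected from (the row)
            new_row.append(r[col])
        result.append(new_row)
    return result


arr = [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]]
print(transpose_loops(arr))  # [[1, 4, 7, 10], [2, 5, 8, 11], [3, 6, 9, 12]]
```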

In the zip-based solution, we use the *a syntax to pass each item (row) of arr to zip, in order, as a separate positional argument. zip returns a list of tuples, which directly achieves the required transposition; we then apply list to each tuple, via the single call to map, to obtain a list of lists, as required. Since we don't use zip's result as a list directly, we could get a further slight improvement in performance by using itertools.izip instead (because izip does not materialize its result as a list in memory, but rather yields it one item at a time):

import itertools
print map(list, itertools.izip(*arr))

but, in this specific case, the slight speed increase is probably not worth the added complexity.
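The effect of the star syntax in the zip call can be checked by spelling out the arguments by hand. The sketch below assumes Python 3, where zip is itself lazy (much as itertools.izip is in Python 2) and itertools.izip no longer exists, so list() is used to materialize the tuples.

```python
arr = [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]]

# zip(*arr) passes each row of arr as a separate positional argument...
unpacked = list(zip(*arr))
# ...so it is equivalent to writing the rows out explicitly.
spelled_out = list(zip(arr[0], arr[1], arr[2], arr[3]))

print(unpacked == spelled_out)  # True
print(unpacked)                 # [(1, 4, 7, 10), (2, 5, 8, 11), (3, 6, 9, 12)]

# zip yields tuples; converting each one gives the list of lists.
transposed = [list(t) for t in unpacked]
print(transposed)               # [[1, 4, 7, 10], [2, 5, 8, 11], [3, 6, 9, 12]]
```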

The *args and **kwds Syntax

*args (actually, * followed by any identifier; most usually, you'll see args or a as the identifier that's used) is Python syntax for accepting or passing arbitrary positional arguments. When you receive arguments with this syntax (i.e., when you place the star syntax within a function's signature, in the def statement for that function), Python binds the identifier to a tuple that holds all positional arguments not "explicitly" received. When you pass arguments with this syntax, the identifier can be bound to any iterable (in fact, it could be any expression, not necessarily an identifier, as long as the expression's result is an iterable).

**kwds (again, the identifier is arbitrary, most often kwds or k) is Python syntax for accepting or passing arbitrary named arguments. (Python sometimes calls named arguments keyword arguments, which they most definitely are not: just try to use a keyword, such as pass, for, or yield, as an argument name, and you'll see. Unfortunately, this confusing terminology is, by now, ingrained in the language and its culture.) When you receive arguments with this syntax (i.e., when you place the double-star syntax within a function's signature, in the def statement for that function), Python binds the identifier to a dict, which holds all named arguments not "explicitly" received. When you pass arguments with this syntax, the identifier must be bound to a dict (in fact, it could be any expression, not necessarily an identifier, as long as the expression's result is a dict).

Whether in defining a function or in calling it, make sure that both *a and **k come after any other parameters or arguments. If both forms appear, then place the **k after the *a.
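A short sketch may help tie the receiving and passing sides together. It is written for Python 3, and the function show and the variable names are hypothetical, chosen only for illustration.

```python
def show(first, *args, **kwds):
    """Return the arguments as received: args is a tuple, kwds is a dict."""
    return first, args, kwds


positional = [2, 3]                 # any iterable can be passed with *
named = {"sep": "-", "end": "!"}    # a dict is passed with **

# *positional comes after the explicit argument, and **named comes last.
result = show(1, *positional, **named)
print(result)  # (1, (2, 3), {'sep': '-', 'end': '!'})

# The starred item may be any expression whose result is an iterable.
squares = show(*(x * x for x in range(3)))
print(squares)  # (0, (1, 4), {})
```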


If you're transposing large arrays of numbers, consider Numeric Python and other third-party packages. Numeric Python defines transposition and other axis-swinging routines that will make your head spin.

See Also

The Reference Manual and Python in a Nutshell sections on list displays (the other name for list comprehensions) and on the *a and **k notation for positional and named argument passing; built-in functions zip and map; Numeric Python (http://www.pfdubois.com/numpy/).



Python Cookbook
ISBN: 0596007973
EAN: 2147483647
Year: 2004
Pages: 420
