14.5 List Comprehensions
Because mapping operations over sequences and collecting results is such a common task in Python coding, Python 2.0 sprouted a new feature, the list comprehension expression, that can make this even simpler than using map and filter. Technically, this feature is not tied to functions, but we've saved it for this point in the book, because it is usually best understood by analogy to function-based alternatives.
14.5.1 List Comprehension Basics
Let's work through an example that demonstrates the basics. Python's built-in ord function returns the integer ASCII code of a single character:
>>> ord('s')
115
The chr built-in is the inverse: it returns the character for an ASCII code integer. Now, suppose we wish to collect the ASCII codes of all characters in an entire string. Perhaps the most straightforward approach is to use a simple for loop, and append results to a list:
>>> res = [ ]
>>> for x in 'spam':
...     res.append(ord(x))
...
>>> res
[115, 112, 97, 109]
Now that we know about map, we can achieve similar results with a single function call without having to manage list construction in the code:
>>> res = map(ord, 'spam')             # Apply func to seq.
>>> res
[115, 112, 97, 109]
But as of Python 2.0, we get the same results from a list comprehension expression:
>>> res = [ord(x) for x in 'spam']     # Apply expr to seq.
>>> res
[115, 112, 97, 109]
List comprehensions collect the results of applying an arbitrary expression to a sequence of values, and return them in a new list. Syntactically, list comprehensions are enclosed in square brackets (to remind you that they construct a list). In their simple form, within the brackets, you code an expression that names a variable, followed by what looks like a for loop header that names the same variable. Python collects the expression's results, for each iteration of the implied loop.
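The three approaches shown so far can be placed side by side. This is a minimal sketch, assuming a Python 3 interpreter, where map returns an iterator rather than a list; the list(...) wrapper is our addition and is not needed under the Python 2 releases this section describes. The variable names are illustrative only.

```python
# Three ways to collect the character codes of a string.
text = 'spam'

# 1. Manual loop: build the result list step by step.
loop_result = []
for ch in text:
    loop_result.append(ord(ch))

# 2. map: apply ord to each item (list() forces the result in Python 3).
map_result = list(map(ord, text))

# 3. List comprehension: an expression plus an implied loop.
comp_result = [ord(ch) for ch in text]

print(loop_result == map_result == comp_result)       # True
print(comp_result)                                    # [115, 112, 97, 109]

# chr reverses ord, so the original string can be rebuilt.
print(''.join([chr(code) for code in comp_result]))   # spam
```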
The effect of the example so far is similar to both the manual for loop, and the map call. List comprehensions become more handy, though, when we wish to apply an arbitrary expression to a sequence:
>>> [x ** 2 for x in range(10)]
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
Here, we've collected the squares of the numbers 0 to 9. To do similar work with a map call, we would probably invent a little function to implement the square operation. Because we won't need this function elsewhere, it would typically be coded inline, with a lambda:
>>> map((lambda x: x**2), range(10))
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
This does the same job, and is only a few keystrokes longer than the equivalent list comprehension. For more advanced kinds of expressions, though, list comprehensions will often require less typing. The next section shows why.
14.5.2 Adding Tests and Nested Loops
List comprehensions are more general than shown so far. For instance, you can code an if clause after the for, to add selection logic. List comprehensions with if clauses can be thought of as analogous to the filter built-in of the prior section: they skip sequence items for which the if clause is not true. Here are both schemes picking up even numbers from 0 to 4; as with map, the filter call uses a little lambda function for the test expression. For comparison, the equivalent for loop is shown here as well:
>>> [x for x in range(5) if x % 2 == 0]
[0, 2, 4]
>>> filter((lambda x: x % 2 == 0), range(5))
[0, 2, 4]
>>> res = [ ]
>>> for x in range(5):
...     if x % 2 == 0: res.append(x)
...
>>> res
[0, 2, 4]
All of these are using modulus (remainder of division) to detect evens: if there is no remainder after dividing a number by two, it must be even. The filter call is not much longer than the list comprehension here either. However, the combination of an if clause and an arbitrary expression gives list comprehensions the effect of a filter and a map, in a single expression:
>>> [x**2 for x in range(10) if x % 2 == 0]
[0, 4, 16, 36, 64]
This time, we collect the squares of the even numbers from 0 to 9: the for loop skips numbers for which the attached if clause on the right is false, and the expression on the left computes squares. The equivalent map call would be more work on our part: we would have to combine filter selections with map iteration, making for a noticeably more complex expression:
>>> map((lambda x: x**2), filter((lambda x: x % 2 == 0), range(10)))
[0, 4, 16, 36, 64]
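To make the correspondence concrete, the sketch below runs both spellings and checks that they agree. It assumes Python 3, where map and filter return iterators, so the outer list(...) call is our addition; the variable names are illustrative only.

```python
numbers = range(10)

# filter selects the evens, then map squares them.
functional = list(map(lambda x: x ** 2,
                      filter(lambda x: x % 2 == 0, numbers)))

# The comprehension states the same thing in one expression:
# the expression on the left, the if test on the right.
comprehension = [x ** 2 for x in numbers if x % 2 == 0]

print(functional)                    # [0, 4, 16, 36, 64]
print(functional == comprehension)   # True
```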
In fact, list comprehensions are even more general still. You may code nested for loops, and each may have an associated if test. The general structure of list comprehensions looks like this:
[ expression for target1 in sequence1 [if condition]
             for target2 in sequence2 [if condition] ...
             for targetN in sequenceN [if condition] ]
When for clauses are nested within a list comprehension, they work like equivalent nested for loop statements. For example, the following:
>>> res = [x+y for x in [0,1,2] for y in [100,200,300]]
>>> res
[100, 200, 300, 101, 201, 301, 102, 202, 302]
has the same effect as the substantially more verbose equivalent statements:
>>> res = [ ]
>>> for x in [0,1,2]:
...     for y in [100,200,300]:
...         res.append(x+y)
...
>>> res
[100, 200, 300, 101, 201, 301, 102, 202, 302]
Although list comprehensions construct a list, remember that they can iterate over any sequence type. Here's a similar bit of code that traverses strings instead of lists of numbers, and so collects concatenation results:
>>> [x+y for x in 'spam' for y in 'SPAM']
['sS', 'sP', 'sA', 'sM', 'pS', 'pP', 'pA', 'pM',
'aS', 'aP', 'aA', 'aM', 'mS', 'mP', 'mA', 'mM']
Finally, here is a much more complex list comprehension. It illustrates the effect of attached if selections on nested for clauses:
>>> [(x,y) for x in range(5) if x%2 == 0 for y in range(5) if y%2 == 1]
[(0, 1), (0, 3), (2, 1), (2, 3), (4, 1), (4, 3)]
This expression pairs each even number from 0 to 4 with each odd number from 0 to 4. The if clauses filter out items in each sequence iteration. Here's the equivalent statement-based code; to derive it, nest the list comprehension's for and if clauses inside each other. The result is longer, but perhaps clearer:
>>> res = [ ]
>>> for x in range(5):
...     if x % 2 == 0:
...         for y in range(5):
...             if y % 2 == 1:
...                 res.append((x, y))
...
>>> res
[(0, 1), (0, 3), (2, 1), (2, 3), (4, 1), (4, 3)]
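The left-to-right nesting rule can be checked mechanically. The following sketch wraps both forms in functions and compares them for a few sizes; the function names and the limit parameter are hypothetical, introduced here only for illustration. The last line shows one consequence of the nesting: a later for clause may refer to the target of an earlier one, just as an inner loop statement can.

```python
def pairs_with_loops(limit):
    """Pair each even x with each odd y, both drawn from range(limit)."""
    result = []
    for x in range(limit):
        if x % 2 == 0:
            for y in range(limit):
                if y % 2 == 1:
                    result.append((x, y))
    return result


def pairs_with_comprehension(limit):
    """Same pairs; the clauses appear in the order the statements nest."""
    return [(x, y)
            for x in range(limit) if x % 2 == 0
            for y in range(limit) if y % 2 == 1]


# Both forms agree for several sizes, including the empty case.
for limit in (0, 1, 5, 8):
    assert pairs_with_loops(limit) == pairs_with_comprehension(limit)

print(pairs_with_comprehension(5))
# [(0, 1), (0, 3), (2, 1), (2, 3), (4, 1), (4, 3)]

# The inner clause may use the outer target, as a nested loop would.
print([(x, y) for x in range(4) for y in range(x)])
# [(1, 0), (2, 0), (2, 1), (3, 0), (3, 1), (3, 2)]
```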
The map and filter equivalent would be wildly complex and nested, so we won't even try showing it here. We'll leave its coding as an exercise for Zen masters, ex-LISP programmers, and the criminally insane.
14.5.3 Comprehending List Comprehensions
With such generality, list comprehensions can quickly become, well, incomprehensible, especially when nested. Because of that, our advice would normally be to use simple for loops when getting started with Python, and map calls in most other cases (unless they get too complex). The "Keep It Simple" rule applies here, as always; conciseness is a much less important goal than readability.
However, there is currently a substantial performance advantage to the extra complexity in this case: based on tests run under Python 2.2, map calls are roughly twice as fast as equivalent for loops, and list comprehensions are usually very slightly faster than map. This speed difference arises because map and list comprehensions run at C language speed inside the interpreter, rather than stepping through Python for loop code within the PVM.
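Relative speeds like these are best measured on your own interpreter rather than taken on faith. Here is a minimal timing sketch using the standard timeit module; the three function names, the sample data, and the repeat count are our own choices, and the code assumes Python 3 (hence the list(...) around map). No timings are quoted here, because the figures depend on the Python version and the machine, and the ratios observed under Python 2.2 may not hold elsewhere.

```python
import timeit


def with_loop(data):
    # Explicit for loop with append.
    result = []
    for x in data:
        result.append(abs(x))
    return result


def with_map(data):
    # map with a built-in function; list() forces the result in Python 3.
    return list(map(abs, data))


def with_comprehension(data):
    # List comprehension.
    return [abs(x) for x in data]


data = list(range(-500, 500))

# Time 200 calls of each variant on the same input.
for func in (with_loop, with_map, with_comprehension):
    seconds = timeit.timeit(lambda: func(data), number=200)
    print('%-20s %.4f seconds' % (func.__name__, seconds))
```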
Because for loops make logic more explicit, we recommend them in general on grounds of simplicity. That said, map, and especially list comprehensions, are worth knowing if your application's speed is an important consideration. In addition, because map and list comprehensions are both expressions, they can show up syntactically in places that for loop statements cannot, such as in the bodies of lambda functions, within list and dictionary literals, and more. Still, you should try to keep your map calls and list comprehensions simple; for more complex tasks, use full statements instead.
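As a brief illustration of the places an expression can go where a statement cannot, the sketch below embeds comprehensions and map calls in a lambda body, a dictionary literal, and a list literal. The names squares_upto, table, and nested are hypothetical, and the list(...) wrappers around map assume Python 3.

```python
# Inside a lambda body: a for statement would be a syntax error here.
squares_upto = lambda n: [x ** 2 for x in range(n)]
print(squares_upto(5))   # [0, 1, 4, 9, 16]

# Inside a dictionary literal.
table = {
    'evens': [x for x in range(10) if x % 2 == 0],
    'codes': list(map(ord, 'spam')),
}
print(table['evens'])    # [0, 2, 4, 6, 8]
print(table['codes'])    # [115, 112, 97, 109]

# Inside a list literal.
nested = [[x for x in range(3)], list(map(str, range(3)))]
print(nested)            # [[0, 1, 2], ['0', '1', '2']]
```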
Here's a more realistic example of list comprehensions and map in action. Recall that the file readlines method returns lines with their \n end-line character at the end:
>>> open('myfile').readlines( )
['aaa\n', 'bbb\n', 'ccc\n']
If you don't want the end-line, you can slice off all lines in a single step, with either a list comprehension or a map call:
>>> [line[:-1] for line in open('myfile').readlines( )]
['aaa', 'bbb', 'ccc']
>>> [line[:-1] for line in open('myfile')]
['aaa', 'bbb', 'ccc']
>>> map((lambda line: line[:-1]), open('myfile'))
['aaa', 'bbb', 'ccc']
The last two of these make use of file iterators (which essentially means you don't need a method call to grab all the lines, in iteration contexts such as these). The map call is just slightly longer than the list comprehension, but neither has to manage result list construction explicitly.
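One caveat about the slicing idiom: line[:-1] removes the last character whatever it is, so a final line that lacks a trailing \n loses a real character. The sketch below shows the difference using the string rstrip method instead. It assumes Python 3, and the open_sample helper is hypothetical: it uses io.StringIO as a stand-in for open('myfile') so the example runs without a file on disk.

```python
import io


def open_sample():
    # Stand-in for open('myfile'); the last line has no trailing newline.
    return io.StringIO('aaa\nbbb\nccc')


# Slicing drops the last character unconditionally.
sliced = [line[:-1] for line in open_sample()]
print(sliced)     # ['aaa', 'bbb', 'cc']

# rstrip('\n') removes the end-line only when it is present.
stripped = [line.rstrip('\n') for line in open_sample()]
print(stripped)   # ['aaa', 'bbb', 'ccc']
```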
List comprehensions can also be used as a sort of column projection operation. Python's standard SQL database API returns query results as a list of tuples: the list is the table, tuples are rows, and items in tuples are column values, much like the following list:
listoftuple = [('bob', 35, 'mgr'), ('mel', 40, 'dev')]
A for loop could pick up all values from a selected column manually, but map and list comprehensions can do it in a single step, and faster:
>>> [age for (name, age, job) in listoftuple]
[35, 40]
>>> map((lambda (name, age, job): age), listoftuple)
[35, 40]
Both of these make use of tuple assignment to unpack row tuples in the list. See other books and resources for more on Python's database API.
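For a version that runs against a real query result, here is a small sketch using the sqlite3 module from the standard library with an in-memory database. The choice of sqlite3, the table name people, and the column names are our assumptions; the text above does not name a particular database module. The code also assumes Python 3, where a lambda cannot unpack a tuple in its argument list, so the map variant indexes the row instead.

```python
import sqlite3

# Build a throwaway in-memory table holding the same two rows.
conn = sqlite3.connect(':memory:')
conn.execute('create table people (name text, age integer, job text)')
conn.executemany('insert into people values (?, ?, ?)',
                 [('bob', 35, 'mgr'), ('mel', 40, 'dev')])

# fetchall returns the query result as a list of row tuples.
rows = conn.execute(
    'select name, age, job from people order by name').fetchall()
print(rows)           # [('bob', 35, 'mgr'), ('mel', 40, 'dev')]

# Column projection with tuple unpacking in the for clause.
ages = [age for (name, age, job) in rows]
print(ages)           # [35, 40]

# The map variant indexes each row rather than unpacking it.
ages_via_map = list(map(lambda row: row[1], rows))
print(ages_via_map)   # [35, 40]

conn.close()
```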