A.4 Flow Control

Depending on how you count them, Python has about a half-dozen flow control mechanisms, which makes it much simpler than most programming languages. Fortunately, Python's collection of mechanisms is well chosen, with a high but not obsessively high degree of orthogonality between them.

From the point of view of this appendix, exception handling is mostly one of Python's flow control techniques. In a language like Java, an application is probably considered "happy" if it does not throw any exceptions at all, but Python programmers find exceptions less "exceptional": a perfectly good design might exit a block of code only when an exception is raised.

Two additional aspects of the Python language are not usually introduced in terms of flow control, but nonetheless amount to such when considered abstractly. Both functional programming style operations on lists and Boolean shortcutting are, at the heart, flow control constructs.

A.4.1 if/then/else Statements

Choice between alternate code paths is generally performed with the if statement and its optional elif and else components. An if block is followed by zero or more elif blocks; at the end of the compound statement, zero or one else blocks occur. An if statement is followed by a Boolean expression and a colon. Each elif is likewise followed by a Boolean expression and colon. The else statement, if it occurs, has no Boolean expression after it, just a colon. Each statement introduces a block containing one or more statements (indented on the following lines or on the same line, after the colon).

Every expression in Python has a Boolean value, including every bare object name or literal. Any empty container (list, dict, tuple) is considered false; an empty string or Unicode string is false; the number 0 (of any numeric type) is false. As well, an instance whose class defines a .__nonzero__() or .__len__() method is false if these methods return a false value. Without these special methods, every instance is true. Much of the time, Boolean expressions consist of comparisons between objects, where comparisons actually evaluate to the canonical objects "0" or "1". Comparisons are <, >, ==, >=, <=, <>, !=, is, is not, in, and not in. Sometimes the unary operator not precedes such an expression.

Only one block in an "if/elif/else" compound statement is executed during any pass; if multiple conditions hold, the first one that evaluates as true is followed. For example:

 >>> if 2+2 <= 4:
 ...   print "Happy math"
 ...
 Happy math
 >>> x = 3
 >>> if x > 4: print "More than 4"
 ... elif x > 3: print "More than 3"
 ... elif x > 2: print "More than 2"
 ... else: print "2 or less"
 ...
 More than 2
 >>> if isinstance(2, int):
 ...     print "2 is an int"     # 2.2+ test
 ... else:
 ...     print "2 is not an int"

Python has no "switch" statement to compare one value with multiple candidate matches. Occasionally, the repetition of an expression being compared on multiple elif lines looks awkward. A "trick" in such a case is to use a dict as a pseudo-switch. The following are equivalent, for example:

 >>> if var.upper() == 'ONE':     val = 1
 ... elif var.upper() == 'TWO':   val = 2
 ... elif var.upper() == 'THREE': val = 3
 ... elif var.upper() == 'FOUR':  val = 4
 ... else:                        val = 0
 ...
 >>> switch = {'ONE':1, 'TWO':2, 'THREE':3, 'FOUR':4}
 >>> val = switch.get(var.upper(), 0)
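
The pseudo-switch can also be wrapped in a small function. The following is a minimal sketch in Python 3 syntax (an assumption; the session above is Python 2). The names SWITCH and to_number are hypothetical, chosen only for illustration:

```python
# Dict-as-switch sketch; SWITCH and to_number are illustrative names.
SWITCH = {'ONE': 1, 'TWO': 2, 'THREE': 3, 'FOUR': 4}

def to_number(var, default=0):
    """Look up a word case-insensitively; `default` plays the role of else."""
    return SWITCH.get(var.upper(), default)

print(to_number('two'))    # 2
print(to_number('seven'))  # 0
```

One design point worth noting: the dict evaluates var.upper() once, whereas the elif chain repeats the call on every branch it tests.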

A.4.2 Boolean Shortcutting

The Boolean operators or and and are "lazy." That is, an expression containing or or and evaluates only as far as it needs to determine the overall value. Specifically, if the first disjunct of an or is true, the value of that disjunct becomes the value of the expression, without evaluating the rest; if the first conjunct of an and is false, its value likewise becomes the value of the whole expression.

Shortcutting is formally sufficient for switching and is sometimes more readable and concise than "if/elif/else" blocks. For example:

 >>> if this:          # 'if' compound statement
 ...     result = this
 ... elif that:
 ...     result = that
 ... else:
 ...     result = 0
 ...
 >>> result = this or that or 0 # boolean shortcutting

Compound shortcutting is also possible, but not necessarily easy to read; for example:

 >>> (cond1 and func1()) or (cond2 and func2()) or func3() 
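
To make the lazy evaluation visible, here is a rough sketch in Python 3 syntax (an assumption; the text targets Python 2). The helpers noisy and pick are hypothetical names invented for this illustration. The last line shows a known caveat of compound shortcutting: if the selected value is itself false, the expression falls through to the next alternative.

```python
calls = []

def noisy(value):
    """Record that this operand was evaluated, then return it unchanged."""
    calls.append(value)
    return value

# 'or' stops at the first true operand, so noisy('b') is never called.
result = noisy('a') or noisy('b')

# 'and' stops at the first false operand and returns that operand.
stopped = noisy(0) and noisy('c')

def pick(cond, value, fallback):
    """Compound shortcutting in the style of (cond and value) or fallback."""
    return (cond and value) or fallback

# Caveat: a false 'value' is skipped even though cond is true.
surprise = pick(True, 0, 'fallback')

print(result, stopped, calls, surprise)  # a 0 ['a', 0] fallback
```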

A.4.3 for/continue/break Statements

The for statement loops over the elements of a sequence. In Python 2.2+, looping utilizes an iterator object (which may not have a predetermined length) but standard sequences like lists, tuples, and strings are automatically transformed to iterators in for statements. In earlier Python versions, a few special functions like xreadlines() and xrange() also act as iterators.

Each time a for statement loops, a sequence/iterator element is bound to the loop variable. The loop variable may be a tuple with named items, thereby creating bindings for multiple names in each loop. For example:

 >>> for x,y,z in [(1,2,3),(4,5,6),(7,8,9)]: print x, y, z, '*',
 ...
 1 2 3 * 4 5 6 * 7 8 9 *

A particularly common idiom for operating on each item in a dictionary is:

 >>> for key,val in dct.items():
 ...     print key, val, '*',
 ...
 1 2 * 3 4 * 5 6 *

When you wish to loop through a block a certain number of times, a common idiom is to use the range() or xrange() built-in functions to create ad hoc sequences of the needed length. For example:

 >>> for _ in range(10):
 ...     print "X",      # '_' is not used in body
 ...
 X X X X X X X X X X

However, if you find yourself binding over a range just to repeat a block, this often indicates that you have not properly understood the loop. Usually repetition is a way of operating on a collection of related things that could instead be explicitly bound in the loop, not just a need to do exactly the same thing multiple times.

If the continue statement occurs in a for loop, the next loop iteration proceeds without executing later lines in the block. If the break statement occurs in a for loop, control passes past the loop without executing later lines (except the finally block if the break occurs in a try).
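
As a small illustration of these rules, including the parenthetical about finally, consider the following sketch. It is written in Python 3 syntax (an assumption; the text describes Python 2), and the function name scan and its sentinel values are hypothetical:

```python
def scan(items):
    """Collect items until 'stop', skipping None; log every cleanup."""
    log = []
    for item in items:
        try:
            if item is None:
                continue       # skip to the next iteration
            if item == 'stop':
                break          # leave the loop entirely
            log.append(item)
        finally:
            # Runs on normal completion, on continue, and on break alike.
            log.append('cleanup')
    return log

print(scan(['a', None, 'stop', 'b']))
# ['a', 'cleanup', 'cleanup', 'cleanup']
```

The item 'b' is never reached, yet the finally block runs once for each of the three iterations that started.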

A.4.4 map(), filter(), reduce(), and List Comprehensions

Much like the for statement, the built-in functions map(), filter(), and reduce() perform actions based on a sequence of items. Unlike a for loop, these functions explicitly return a value resulting from this application to each item. Each of these three functional programming style functions accepts a function object as a first argument and one or more sequences as subsequent arguments.

The map() function returns a list of items of the same length as the input sequence, where each item in the result is a "transformation" of one item in the input. Where you explicitly want such transformed items, use of map() is often both more concise and clearer than an equivalent for loop; for example:

 >>> nums = (1,2,3,4)
 >>> str_nums = []
 >>> for n in nums:
 ...     str_nums.append(str(n))
 ...
 >>> str_nums
 ['1', '2', '3', '4']
 >>> str_nums = map(str, nums)
 >>> str_nums
 ['1', '2', '3', '4']

If the function argument of map() accepts (or can accept) multiple arguments, multiple sequences can be given as later arguments. If such multiple sequences are of different lengths, the shorter ones are padded with None values. The special value None may be given as the function argument, producing a sequence of tuples of elements from the argument sequences.

 >>> nums = (1,2,3,4)
 >>> def add(x, y):
 ...     if x is None: x=0
 ...     if y is None: y=0
 ...     return x+y
 ...
 >>> map(add, nums, [5,5,5])
 [6, 7, 8, 4]
 >>> map(None, (1,2,3,4), [5,5,5])
 [(1, 5), (2, 5), (3, 5), (4, None)]
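
The padding behavior described here is specific to the Python 2 map(). For readers on Python 3 (an assumption beyond what the text covers), map() stops at the shortest input and does not accept None as the function, so the same results are usually obtained with itertools.zip_longest. A minimal sketch of that substitute technique:

```python
from itertools import zip_longest

def add(x, y):
    """Add two values, treating a None padding value as zero."""
    if x is None: x = 0
    if y is None: y = 0
    return x + y

nums = (1, 2, 3, 4)

# zip_longest pads the shorter input with None, like Python 2's map(None, ...).
padded_pairs = list(zip_longest(nums, [5, 5, 5]))
sums = [add(x, y) for x, y in padded_pairs]

# Python 3's own map() simply truncates to the shortest input.
truncated = list(map(add, nums, [5, 5, 5]))

print(padded_pairs)  # [(1, 5), (2, 5), (3, 5), (4, None)]
print(sums)          # [6, 7, 8, 4]
print(truncated)     # [6, 7, 8]
```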

The filter() function returns a list of those items in the input sequence that satisfy a condition given by the function argument. The function argument must accept one parameter, and its return value is interpreted as a Boolean (in the usual manner). For example:

 >>> nums = (1,2,3,4)
 >>> odds = filter(lambda n: n%2, nums)
 >>> odds
 (1, 3)

Both map() and filter() can use function arguments that have side effects, thereby making it possible but not usually desirable to replace every for loop with a map() or filter() function. For example:

 >>> for x in seq:
 ...     # bunch of actions
 ...     pass
 ...
 >>> def actions(x):
 ...     # same bunch of actions
 ...     return 0
 ...
 >>> filter(actions, seq)
 []

Some epicycles are needed for the scoping of block variables and for break and continue statements. But as a general picture, it is worth being aware of the formal equivalence between these very different-seeming techniques.

The reduce() function takes as a function argument a function with two parameters. In addition to a sequence second argument, reduce() optionally accepts a third argument as an initializer. For each item in the input sequence, reduce() combines the previous aggregate result with the item, until the sequence is exhausted. While reduce(), like map() and filter(), has a loop-like effect of operating on every item in a sequence, its main purpose is to create some sort of aggregation, tally, or selection across indefinitely many items. For example:

 >>> from operator import add
 >>> sum = lambda seq: reduce(add, seq)
 >>> sum([4,5,23,12])
 44
 >>> def tastes_better(x, y):
 ...     # some complex comparison of x, y
 ...     # either return x, or return y
 ...     # ...
 ...
 >>> foods = [spam, eggs, bacon, toast]
 >>> favorite = reduce(tastes_better, foods)
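
The optional initializer mentioned above is not shown in the session, so here is a hedged sketch of it. It uses Python 3 syntax (an assumption), where reduce() lives in the functools module rather than among the built-ins. The functions total and longest are hypothetical stand-ins for the tally and selection uses the text describes:

```python
from functools import reduce
from operator import add

def total(seq):
    """Tally: the initializer 0 makes an empty sequence return 0."""
    return reduce(add, seq, 0)

def longest(words):
    """Selection: keep whichever of the two candidates is strictly longer."""
    return reduce(lambda best, w: w if len(w) > len(best) else best, words, '')

print(total([4, 5, 23, 12]))                       # 44
print(total([]))                                   # 0
print(longest(['spam', 'eggs', 'bacon', 'toast'])) # bacon
```

Without the initializer, reduce() on an empty sequence raises a TypeError, which is one practical reason to supply it.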

List comprehensions (listcomps) are a syntactic form that was introduced with Python 2.0. It is easiest to think of list comprehensions as a sort of cross between for loops and the map() or filter() functions. That is, like the functions, listcomps are expressions that produce lists of items, based on "input" sequences. But listcomps also use the keywords for and if that are familiar from statements. Moreover, it is typically much easier to read a compound list comprehension expression than it is to read corresponding nested map() and filter() functions.

For example, consider the following small problem: You have a list of numbers and a string of characters; you would like to construct a list of all pairs that consist of a number from the list and a character from the string, but only if the ASCII ordinal is larger than the number. In traditional imperative style, you might write:

 >>> bigord_pairs = []
 >>> for n in (95,100,105):
 ...     for c in 'aei':
 ...         if ord(c) > n:
 ...             bigord_pairs.append((n,c))
 ...
 >>> bigord_pairs
 [(95, 'a'), (95, 'e'), (95, 'i'), (100, 'e'), (100, 'i')]

In a functional programming style you might write the nearly unreadable:

 >>> dupelms=lambda lst,n: reduce(lambda s,t:s+t,
 ...                              map(lambda l,n=n: [l]*n, lst))
 >>> combine=lambda xs,ys: map(None,xs*len(ys), dupelms(ys,len(xs)))
 >>> bigord_pairs=lambda ns,cs: filter(lambda (n,c):ord(c)>n,
 ...                                   combine(ns,cs))
 >>> bigord_pairs((95,100,105),'aei')
 [(95, 'a'), (95, 'e'), (100, 'e'), (95, 'i'), (100, 'i')]

In defense of this FP approach, it has not only accomplished the task at hand, but also provided the general combinatorial function combine() along the way. But the code is still rather obfuscated.

List comprehensions let you write something that is both concise and clear:

 >>> [(n,c) for n in (95,100,105) for c in 'aei' if ord(c)>n]
 [(95, 'a'), (95, 'e'), (95, 'i'), (100, 'e'), (100, 'i')]

As long as you have listcomps available, you hardly need a general combine() function, since it just amounts to repeating the for clause in a listcomp.

Slightly more formally, a list comprehension consists of the following: (1) Surrounding square brackets (like a list constructor, which it is). (2) An expression that usually, but not by requirement, contains some names that get bound in the for clauses. (3) One or more for clauses that bind a name repeatedly (just like a for loop). (4) Zero or more if clauses that limit the results. Generally, but not by requirement, the if clauses contain some names that were bound by the for clauses.
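
The four parts listed above can be pointed out on the bigord_pairs example. This is only an annotated sketch in Python 3 syntax (an assumption); the variable names numbers, letters, pairs, and expected are illustrative:

```python
numbers = (95, 100, 105)
letters = 'aei'

pairs = [                      # (1) surrounding square brackets
    (n, c)                     # (2) expression using names bound below
    for n in numbers           # (3) first for clause
    for c in letters           # (3) second for clause, nested in the first
    if ord(c) > n              # (4) if clause limiting the results
]

# The equivalent nested loops, with clauses in the same left-to-right order.
expected = []
for n in numbers:
    for c in letters:
        if ord(c) > n:
            expected.append((n, c))

print(pairs == expected)  # True
```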

List comprehensions may nest inside each other freely. Sometimes a for clause in a listcomp loops over a list that is defined by another listcomp; once in a while a nested listcomp is even used inside a listcomp's expression or if clauses. However, it is almost as easy to produce difficult-to-read code by excessively nesting listcomps as it is by nesting map() and filter() functions. Use caution and common sense about such nesting.

It is worth noting that list comprehensions are not as referentially transparent as functional programming style calls. Specifically, any names bound in for clauses remain bound in the enclosing scope (or global if the name is so declared). These side effects put a minor extra burden on you to choose distinctive or throwaway names for use in listcomps.

A.4.5 while/else/continue/break Statements

The while statement loops over a block as long as the expression after the while remains true. If an else block is used within a compound while statement, as soon as the expression becomes false, the else block is executed. The else block is chosen even if the while expression is initially false.

If the continue statement occurs in a while loop, the next loop iteration proceeds without executing later lines in the block. If the break statement occurs in a while loop, control passes past the loop without executing later lines (except the finally block if the break occurs in a try). If a break occurs in a while block, the else block is not executed.
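
The interplay of while, else, and break can be sketched as follows. The code is in Python 3 syntax (an assumption), and the function find_power_of_two is a hypothetical example, not something taken from the text:

```python
def find_power_of_two(limit, target):
    """Double n from 1 while n <= limit, reporting how the loop ended."""
    n = 1
    while n <= limit:
        if n == target:
            outcome = 'found'
            break                  # break skips the else block
        n *= 2
    else:
        outcome = 'exhausted'      # runs once the condition is false
    return outcome

print(find_power_of_two(100, 8))   # found
print(find_power_of_two(100, 7))   # exhausted
print(find_power_of_two(0, 1))     # exhausted (condition false at the start)
```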

If a while statement's expression is to go from being true to being false, typically some name in the expression will be re-bound within the while block. At times an expression will depend on an external condition, such as a file handle or a socket, or it may involve a call to a function whose Boolean value changes over invocations. However, probably the most common Python idiom for while statements is to rely on a break to terminate a block. Some examples:

 >>> command = ''
 >>> while command != 'exit':
 ...     command = raw_input('Command > ')
 ...     # if/elif block to dispatch on various commands
 ...
 Command > someaction
 Command > exit
 >>> while socket.ready():
 ...     socket.getdata()  # do something with the socket
 ... else:
 ...     socket.close()    # cleanup (e.g. close socket)
 ...
 >>> while 1:
 ...     command = raw_input('Command > ')
 ...     if command == 'exit': break
 ...     # elif's for other commands
 ...
 Command > someaction
 Command > exit

A.4.6 Functions, Simple Generators, and the yield Statement

Both functions and object methods allow a kind of nonlocality in terms of program flow, but one that is quite restrictive. A function or method is called from another context, enters at its top, executes any statements encountered, then returns to the calling context as soon as a return statement is reached (or the function body ends). The invocation of a function or method is basically a strictly linear nonlocal flow.

Python 2.2 introduced a flow control construct, called generators, that enables a new style of nonlocal branching. If a function or method body contains the statement yield, then it becomes a generator function, and invoking the function returns a generator iterator instead of a simple value. A generator iterator is an object that has a .next() method that returns values. Any instance object can have a .next() method, but a generator iterator's method is special in having "resumable execution."

In a standard function, once a return statement is encountered, the Python interpreter discards all information about the function's flow state and local name bindings. The returned value might contain some information about local values, but the flow state is always gone. A generator iterator, in contrast, "remembers" the entire flow state, and all local bindings, between each invocation of its .next() method. A value is returned to a calling context each place a yield statement is encountered in the generator function body, but the calling context (or any context with access to the generator iterator) is able to jump back to the flow point where this last yield occurred.

In the abstract, generators seem complex, but in practice they prove quite simple. For example:

 >>> from __future__ import generators # not needed in 2.3+
 >>> def generator_func():
 ...     for n in [1,2]:
 ...         yield n
 ...     print "Two yields in for loop"
 ...     yield 3
 ...
 >>> generator_iter = generator_func()
 >>> generator_iter.next()
 1
 >>> generator_iter.next()
 2
 >>> generator_iter.next()
 Two yields in for loop
 3
 >>> generator_iter.next()
 Traceback (most recent call last):
   File "<stdin>", line 1, in ?
 StopIteration

The object generator_iter in the example can be bound in different scopes, and passed to and returned from functions, just like any other object. Any context invoking generator_iter.next() jumps back into the last flow point where the generator function body yielded.

In a sense, a generator iterator allows you to perform jumps similar to the "GOTO" statements of some (older) languages, but still retains the advantages of structured programming. The most common usage for generators, however, is simpler than this. Most of the time, generators are used as "iterators" in a loop context; for example:

 >>> for n in generator_func():
 ...     print n
 ...
 1
 2
 Two yields in for loop
 3

In recent Python versions, the StopIteration exception is used to signal the end of a for loop. The generator iterator's .next() method is implicitly called as many times as possible by the for statement. The name indicated in the for statement is repeatedly re-bound to the values the yield statement(s) return.
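
What the for statement does implicitly can be spelled out by hand. The sketch below uses Python 3 syntax (an assumption), where the built-in next() replaces the .next() method described in the text. The variable names are illustrative:

```python
def generator_func():
    for n in [1, 2]:
        yield n
    yield 3

gen = generator_func()
first = next(gen)        # runs the body up to the first yield
second = next(gen)       # resumes inside the for loop
rest = list(gen)         # resumes again and drains what is left

# A further call finds nothing to resume and raises StopIteration,
# which is the signal a for statement uses to end its loop.
try:
    next(gen)
    exhausted = False
except StopIteration:
    exhausted = True

print(first, second, rest, exhausted)  # 1 2 [3] True
```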

A.4.7 Raising and Catching Exceptions

Python uses exceptions quite broadly and probably more naturally than any other programming language. In fact there are certain flow control constructs that are awkward to express by means other than raising and catching exceptions.

There are two general purposes for exceptions in Python. On the one hand, Python actions can be invalid or disallowed in various ways. You are not allowed to divide by zero; you cannot open (for reading) a filename that does not exist; some functions require arguments of specific types; you cannot use an unbound name on the right side of an assignment; and so on. The exceptions raised by these types of occurrences have names of the form [A-Z].*Error. Catching error exceptions is often a useful way to recover from a problem condition and restore an application to a "happy" state. Even if such error exceptions are not caught in an application, their occurrence provides debugging clues since they appear in tracebacks.

The second purpose for exceptions is for circumstances a programmer wishes to flag as "exceptional." But understand "exceptional" in a weak sense: not as something that indicates a programming or computer error, but simply as something unusual or "not the norm." For example, Python 2.2+ iterators raise a StopIteration exception when no more items can be generated. Most such implied sequences are not of infinite length, however; it is merely the case that they contain a (large) number of items, and they run out only once at the end. It's not "the norm" for an iterator to run out of items, but it is often expected that this will happen eventually.

In a sense, raising an exception can be similar to executing a break statement: both cause control flow to leave a block. For example, compare:

 >>> n = 0
 >>> while 1:
 ...     n = n+1
 ...     if n > 10: break
 ...
 >>> print n
 11
 >>> n = 0
 >>> try:
 ...     while 1:
 ...         n = n+1
 ...         if n > 10: raise "ExitLoop"
 ... except:
 ...     print n
 ...
 11

In two closely related ways, exceptions behave differently than do break statements. In the first place, exceptions could be described as having "dynamic scope," which in most contexts is considered a sin akin to "GOTO," but here is quite useful. That is, you never know at compile time exactly where an exception might get caught (if not anywhere else, it is caught by the Python interpreter). It might be caught in the exception's block, or a containing block, and so on; or it might be in the local function, or something that called it, or something that called the caller, and so on. An exception is a fact that winds its way through execution contexts until it finds a place to settle. The upward propagation of exceptions is quite opposite to the downward propagation of lexically scoped bindings (or even to the earlier "three-scope rule").

The corollary of exceptions' dynamic scope is that, unlike break, they can be used to exit gracefully from deeply nested loops. The "Zen of Python" offers a caveat here: "Flat is better than nested." And indeed it is so: if you find yourself nesting loops too deeply, you should probably refactor (e.g., break loops into utility functions). But if you are nesting just deeply enough, dynamically scoped exceptions are just the thing for you. Consider the following small problem: A "Fermat triple" is here defined as a triple of integers (i,j,k) such that "i**2 + j**2 == k**2". Suppose that you wish to determine if any Fermat triples exist with all three integers inside a given numeric range. An obvious (but entirely nonoptimal) solution is:

 >>> def fermat_triple(beg, end):
 ...     class EndLoop(Exception): pass
 ...     range_ = range(beg, end)
 ...     try:
 ...         for i in range_:
 ...             for j in range_:
 ...                 for k in range_:
 ...                     if i**2 + j**2 == k**2:
 ...                         raise EndLoop, (i,j,k)
 ...     except EndLoop, triple:
 ...         # do something with 'triple'
 ...         return i,j,k
 ...
 >>> fermat_triple(1,10)
 (3, 4, 5)
 >>> fermat_triple(120,150)
 >>> fermat_triple(100,150)
 (100, 105, 145)

By raising the EndLoop exception in the middle of the nested loops, it is possible to catch it again outside of all the loops. A simple break in the inner loop would only break out of the most deeply nested block, which is pointless. One might devise some system for setting a "satisfied" flag and testing for this at every level, but the exception approach is much simpler. Since the except block does not actually do anything extra with the triple, it could have just been returned inside the loops; but in the general case, other actions can be required before a return.
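
For comparison, the same technique might look like this in Python 3 syntax (an assumption; the listing above uses the Python 2 raise and except forms). Here the triple is carried on the exception instance and read back from its args attribute, which is one possible way to "do something with" it:

```python
class EndLoop(Exception):
    """Raised only to escape the nested loops; carries the triple found."""

def fermat_triple(beg, end):
    range_ = range(beg, end)
    try:
        for i in range_:
            for j in range_:
                for k in range_:
                    if i**2 + j**2 == k**2:
                        raise EndLoop((i, j, k))
    except EndLoop as exc:
        # The single constructor argument is available as exc.args[0].
        return exc.args[0]
    return None   # no triple in the range

print(fermat_triple(1, 10))     # (3, 4, 5)
print(fermat_triple(120, 150))  # None
```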

It is not uncommon to want to leave nested loops when something has "gone wrong" in the sense of an "*Error" exception. Sometimes you might only be in a position to discover a problem condition within nested blocks, but recovery still makes better sense outside the nesting. Some typical examples are problems in I/O, calculation overflows, missing dictionary keys or list indices, and so on. Moreover, it is useful to assign except statements to the calling position that really needs to handle the problems, then write support functions as if nothing can go wrong. For example:

 >>> try:
 ...     result = complex_file_operation(filename)
 ... except IOError:
 ...     print "Cannot open file", filename

The function complex_file_operation() should not be burdened with trying to figure out what to do if a bad filename is given to it; there is really nothing to be done in that context. Instead, such support functions can simply propagate their exceptions upwards, until some caller takes responsibility for the problem.

The try statement has two forms. The try/except/else form is more commonly used, but the try/finally form is useful for "cleanup handlers."

In the first form, a try block must be followed by one or more except blocks. Each except may specify an exception or tuple of exceptions to catch; the last except block may omit an exception (tuple), in which case it catches every exception that is not caught by an earlier except block. After the except blocks, you may optionally specify an else block. The else block is run only if no exception occurred in the try block. For example:

 >>> def except_test(n):
 ...     try: x = 1/n
 ...     except IOError: print "IO Error"
 ...     except ZeroDivisionError: print "Zero Division"
 ...     except: print "Some Other Error"
 ...     else: print "All is Happy"
 ...
 >>> except_test(1)
 All is Happy
 >>> except_test(0)
 Zero Division
 >>> except_test('x')
 Some Other Error

An except test will match either the exception actually listed or any descendent of that exception. It tends to make sense, therefore, in defining your own exceptions to inherit from related ones in the exceptions module. For example:

 >>> class MyException(IOError): pass
 >>> try:
 ...    raise MyException
 ... except IOError:
 ...    print "got it"
 ...
 got it
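
Both the try/except/else form and the matching of descendent exceptions can be combined in one sketch. It is written in Python 3 syntax (an assumption), where IOError is an alias of OSError. The names MyError, classify, and fail are hypothetical:

```python
class MyError(OSError):
    """A custom exception that inherits from a related built-in one."""

def classify(action):
    """Call action() and report which branch of the try statement ran."""
    try:
        action()
    except ZeroDivisionError:
        return 'zero division'
    except OSError:              # also matches MyError, a descendent
        return 'os error'
    except Exception:            # catch-all for the remaining errors
        return 'other'
    else:
        return 'ok'              # only when no exception occurred

def fail():
    raise MyError('boom')

print(classify(lambda: 1 / 0))    # zero division
print(classify(fail))             # os error
print(classify(lambda: 1 + 'x'))  # other
print(classify(lambda: None))     # ok
```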

In the try/finally form of the try statement, the finally statement acts as general cleanup code. If no exception occurs in the try block, the finally block runs, and that is that. If an exception was raised in the try block, the finally block still runs, but the original exception is re-raised at the end of the block. However, if a return or break statement is executed in a finally block, or if a new exception is raised in the block (including with the raise statement), the finally block never reaches its end, and the original exception disappears.

A finally statement acts as a cleanup block even when its corresponding try block contains a return, break, or continue statement. That is, even though a try block might not run all the way through, finally is still entered to clean up whatever the try did accomplish. A typical use of this compound statement opens a file or other external resource at the very start of the try block, then performs several actions that may or may not succeed in the rest of the block; the finally is responsible for making sure the file gets closed, whether or not all the actions on it prove possible.

The try/finally form is never strictly needed since a bare raise statement will reraise the last exception. It is possible, therefore, to have an except block end with the raise statement to propagate an error upward after taking some action. However, when a cleanup action is desired whether or not exceptions were encountered, the try/finally form can save a few lines and express your intent more clearly. For example:

 >>> def finally_test(x):
 ...     try:
 ...         y = 1/x
 ...         if x > 10:
 ...             return x
 ...     finally:
 ...         print "Cleaning up..."
 ...     return y
 ...
 >>> finally_test(0)
 Cleaning up...
 Traceback (most recent call last):
   File "<stdin>", line 1, in ?
   File "<stdin>", line 3, in finally_test
 ZeroDivisionError: integer division or modulo by zero
 >>> finally_test(3)
 Cleaning up...
 0
 >>> finally_test(100)
 Cleaning up...
 100
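
A Python 3 rendering of the same test (an assumption; the session above is Python 2) might record the cleanup in a list instead of printing it, so the three paths through finally can be checked afterward. The list name log is illustrative, and floor division is used to reproduce the integer result shown above:

```python
log = []

def finally_test(x):
    try:
        y = 1 // x          # floor division matches the Python 2 result
        if x > 10:
            return x        # finally still runs before the return completes
    finally:
        log.append('Cleaning up...')
    return y

print(finally_test(3))      # 0
print(finally_test(100))    # 100
try:
    finally_test(0)
except ZeroDivisionError:
    print('error propagated after cleanup')
print(len(log))             # 3
```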

A.4.8 Data as Code

Unlike in languages in the Lisp family, it is usually not a good idea to create Python programs that execute data values. It is possible, however, to create and run Python strings during program runtime using several built-in functions. The modules code, codeop, imp, and new provide additional capabilities in this direction. In fact, the Python interactive shell itself is an example of a program that dynamically reads strings as user input, then executes them. So clearly, this approach is occasionally useful.

Other than in providing an interactive environment for advanced users (who themselves know Python), a possible use for the "data as code" model is with applications that themselves generate Python code, either to run later or to communicate with another application. At a simple level, it is not difficult to write compilable Python programs based on templatized functionality; for this to be useful, of course, you would want a program to contain some customization that was determinable only at runtime.

eval(s [,globals=globals() [,locals=locals()]])

Evaluate the expression in string s and return the result of that evaluation. You may specify optional arguments globals and locals to specify the namespaces to use for name lookup. By default, use the regular global and local namespace dictionaries. Note that only an expression can be evaluated, not a statement suite.

Most of the time when a (novice) programmer thinks of using eval() it is to compute some value, often numeric, based on data encoded in texts. For example, suppose that a line in a report file contains a list of dollar amounts, and you would like the sum of these numbers. A naive approach to the problem uses eval():

 >>> line = "$47  $33  $51  $76"
 >>> eval("+".join([d.replace('$', '') for d in line.split()]))
 207

While this approach is generally slow, that is not an important problem. A more significant issue is that eval() runs code that is not known until runtime; potentially line could contain Python code that causes harm to the system it runs on or merely causes an application to malfunction. Imagine that instead of a dollar figure, your data file contained os.rmdir("/"). A better approach is to use the safe type coercion functions int(), float(), and so on.

 >>> nums = [int(d.replace('$', '')) for d in line.split()]
 >>> from operator import add
 >>> reduce(add, nums)
 207
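
The safety difference is easy to demonstrate. In this sketch (Python 3 syntax, an assumption; sum_dollars is a hypothetical helper name), hostile input is rejected with a ValueError rather than executed:

```python
def sum_dollars(line):
    """Sum whitespace-separated amounts such as '$47' using int(), not eval()."""
    return sum(int(field.replace('$', '')) for field in line.split())

print(sum_dollars("$47  $33  $51  $76"))   # 207

try:
    sum_dollars('os.rmdir("/")')
except ValueError:
    print('rejected: not a number')
```
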

exec

The exec statement is a more powerful sibling of the eval() function. Any valid Python code may be run if passed to the exec statement. The format of the exec statement allows optional namespace specification, as with eval():

 exec code [in globals [,locals]] 

For example:

 >>> s = "for i in range(10):\n  print i,\n"
 >>> exec s in globals(), locals()
 0 1 2 3 4 5 6 7 8 9

The argument code may be either a string, a code object, or an open file object. As with eval(), the security dangers and speed penalties of exec usually outweigh any convenience provided. However, where code is clearly under application control, there are occasionally uses for this statement.
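
In Python 3 (an assumption beyond the text, which describes the Python 2 statement), exec is a built-in function, and the namespaces are passed as ordinary arguments. A minimal sketch that keeps the executed code's names in a dictionary of their own, where the names source and namespace are illustrative:

```python
# Code that is clearly under application control.
source = "squares = [i * i for i in range(5)]\n"

# Run it in a separate dictionary instead of the caller's globals.
namespace = {}
exec(source, namespace)

squares = namespace['squares']
print(squares)   # [0, 1, 4, 9, 16]
```

Passing a fresh dictionary keeps names created by the string out of the calling module, but it is not a security boundary; the cautions above still apply.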

__import__(s [,globals=globals() [,locals=locals() [,fromlist]]])

Import the module named s, using namespace dictionaries globals and locals. The argument fromlist may be omitted, but if specified as a nonempty list of strings (e.g., [""]), the fully qualified subpackage will be imported. For normal cases, the import statement is the way you import modules, but in the special circumstance that the value of s is not determined until runtime, use __import__().

 >>> op = __import__('os.path',globals(),locals(),[''])
 >>> op.basename('/this/that/other')
 'other'
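
In current Python versions, the standard library function importlib.import_module() is a commonly used alternative for this runtime case. It is not the technique shown above, only a substitute sketched here in Python 3 syntax (an assumption), and load_module is a hypothetical wrapper name:

```python
import importlib

def load_module(name):
    """Import a module whose dotted name is known only at runtime."""
    return importlib.import_module(name)

op = load_module('os.path')
base = op.basename('/this/that/other')
print(base)   # other
```

Unlike a bare __import__('os.path'), which returns the top-level os package unless a nonempty fromlist is given, import_module() returns the named submodule directly.
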

input([prompt])

Equivalent to eval(raw_input(prompt)), along with all the dangers associated with eval() generally. Best practice is to always use raw_input(), but you might see input() in existing programs.

raw_input([prompt])

Return a string from user input at the terminal. Used to obtain values interactively in console-based applications.

 >>> s = raw_input('Last Name: ')
 Last Name: Mertz
 >>> s
 'Mertz'


Text Processing in Python
ISBN: 0321112547
Year: 2005
Pages: 59
Authors: David Mertz
