11.10. Generators

Earlier, in Chapter 8, we discussed the usefulness of iterators and how they give non-sequence objects a sequence-like iteration interface. They are simple to understand because they have only one method, next(), which is called to get the next item.

However, unless you implement a class as an iterator, iterators do not have much "intelligence." Would it not be more powerful to have a function that somehow "generates" the next value in the iteration and returns it through something as simple as a next() call? That is one motivation for generators.

Another aspect of generators is even more powerful ... the concept of coroutines. A coroutine is an independent function call that can run, be paused or suspended, and be continued or resumed where it left off. There is also communication between the caller and the (called) coroutine. For example, when a coroutine pauses, we can receive an intermediate return value from it; and when we call back into it, we can pass in additional or altered parameters, yet still pick up right where we left off, with all state intact.

Coroutines that are suspended while yielding intermediate values and are resumed multiple times are called generators, and that is exactly what Python generators are. Generators were added to Python in 2.2 and made standard in 2.3 (see PEP 255); although already powerful, they were significantly enhanced in Python 2.5 (see PEP 342). These enhancements bring generators even closer to being full coroutines because values (and exceptions) can now be passed back into a resumed function. Also, a generator can now yield control while waiting for a result from a generator it has called, instead of blocking until that result comes back before it can suspend (and yield a result) itself. Let us take a closer look at generators, starting from the top.

What is a generator Python-wise? Syntactically, a generator is a function with a yield statement. A function or subroutine returns only once, but a generator can pause execution and yield intermediate results; that is the functionality of the yield statement: to return a value to the caller and to pause execution. When the next() method of a generator is invoked, it resumes right where it left off (right after it yielded [a value and] control back to the caller).

When generators were added back in 2.2, they introduced a new keyword, yield, so for backward compatibility you needed to import generators from the __future__ module in order to use them. This was no longer necessary once generators became standard beginning in 2.3.

11.10.1. Simple Generator Features

Generators behave in another manner similar to iterators: when a real return or end-of-function is reached and there are no more values to yield (when calling next()), a StopIteration exception is raised. Here is an example, the simplest of generators:

def simpleGen():
    yield 1
    yield '2 --> punch!'


Now that we have our generator function, let us call it to get and save a generator object (so that we can call its next() method to get successive intermediate values from it):

>>> myG = simpleGen()
>>> myG.next()
1
>>> myG.next()
'2 --> punch!'
>>> myG.next()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
    myG.next()
StopIteration


Since Python's for loops have next() calls and a handler for StopIteration, it is almost always more elegant to use a for loop instead of manually iterating through a generator (or an iterator for that matter):

>>> for eachItem in simpleGen():
...     print eachItem
...
1
2 --> punch!

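For illustration, here is a rough sketch of what that for loop does for you behind the scenes, spelled out with explicit next() calls and a StopIteration handler (using the simpleGen() from above):

myG = simpleGen()
while True:
    try:
        eachItem = myG.next()    # fetch the next yielded value
    except StopIteration:        # no more values; the generator is done
        break
    print eachItem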

Of course that was a silly example: why not use a real iterator for that? More motivation comes from being able to iterate through a sequence that requires the power of a function rather than static objects already sitting in some sequence.

In the following example, we are going to create a random iterator that takes a sequence and returns a random item from that sequence:

from random import randint
def randGen(aList):
    while len(aList) > 0:
        yield aList.pop(randint(0, len(aList) - 1))


The difference is that each item returned is also consumed from that sequence, sort of like a combination of list.pop() and random.choice():

>>> for item in randGen(['rock', 'paper', 'scissors']):
...     print item
...
scissors
rock
paper


We will see a simpler (and infinite) version of this generator as a class iterator coming up in a few chapters when we cover Object-Oriented Programming. A few chapters back, in Section 8.12, we discussed the syntax of generator expressions. The object returned from that syntax is a generator, but the expression itself serves as a shorthand, allowing syntax as simple as that of a list comprehension.

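As a quick refresher on that syntax, a generator expression looks just like a list comprehension except that it is surrounded by parentheses, and it hands back a generator object rather than building the entire list up front:

>>> squares = (x ** 2 for x in range(5))    # generator expression
>>> squares.next()
0
>>> squares.next()
1
>>> [x ** 2 for x in range(5)]              # list comprehension builds the full list
[0, 1, 4, 9, 16]
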
These simple examples should give you an idea of how generators work, but you may be asking, "Where can I use generators in my application?" or, perhaps, "What are the most appropriate places for using this powerful construct?"

The "best" places to use generators are where you are iterating over a large dataset that would be cumbersome to repeat or reiterate over, such as a large disk file or a complex database query. For every row of data, you wish to perform non-elementary operations and processing, but you "do not want to lose your place" as you are cursoring or iterating over it.

You want to grab a wad of data, yield it back to the caller for processing (and perhaps insertion into another database), and then call next() to get the next wad of data, and so forth. The state is preserved across suspends and resumptions, so you can be comfortable that you have a safe environment in which to process your data. Without generators, your application code would likely be one very long function with a very lengthy for loop inside of it.

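As a hypothetical sketch of that pattern (the file name, the comma-separated field layout, and the processAndStore() helper below are made up purely for illustration), such a generator might look something like this:

def parseRecords(fileName):
    # lazily walk a (potentially huge) file one record at a time,
    # yielding each parsed row so the caller never loses its place
    for eachLine in open(fileName):
        fields = eachLine.strip().split(',')    # assume comma-separated rows
        if fields != ['']:                      # skip blank lines
            yield fields

# the caller then processes one "wad" of data at a time, e.g.:
# for row in parseRecords('hugeData.csv'):
#     processAndStore(row)
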
Of course, just because a language has a feature does not mean you have to use it. If there does not appear to be an obvious fit in your application, then do not add any more complexity! You will know when generators are the right thing to use when you come across an appropriate situation.

11.10.2. Enhanced Generator Features

A few enhancements were made to generators in Python 2.5, so in addition to next() to get the next value generated, users can now send values back into generators [send()], they can raise exceptions in generators [throw()], and request that a generator quit [close()].

Due to the two-way action involved with code calling send() to send values to a generator (and the generator yielding values back out), yield is now an expression rather than just a statement, since you may be receiving an incoming object when execution resumes inside the generator. Below is a simple example demonstrating some of these features. Let us take our simple closure example, the counter:

def counter(start_at=0):
    count = start_at
    while True:
        val = (yield count)
        if val is not None:
            count = val
        else:
            count += 1


This generator takes an initial value and counts up by one for each call that continues the generator [next()]. Users also have the option of resetting the count by calling send() with a new value instead of calling next(). This generator runs forever, so if you wish to terminate it, call the close() method. If we run this code interactively, we get the following output:

>>> count = counter(5)
>>> count.next()
5
>>> count.next()
6
>>> count.send(9)
9
>>> count.next()
10
>>> count.close()
>>> count.next()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration

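The session above exercised send() and close() but not throw(). As a minimal sketch, throw() raises the given exception inside the generator at the point where it is paused; since our counter does not catch ValueError, the exception propagates right back out to the caller:

>>> count = counter(5)
>>> count.next()
5
>>> count.throw(ValueError, 'bad count')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 4, in counter
ValueError: bad count

After an exception escapes a generator like this, the generator is finished, just as if it had been closed; any further next() calls raise StopIteration.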

You can read more about generators in PEPs 255 and 342, as well as in this Linux Journal article introducing readers to the new features in Python 2.2:

http://www.linuxjournal.com/article/5597
