Section 8.11. Iterators and the iter() Function


8.11.1. What Are Iterators?

Iterators were added to Python in version 2.2 to give sequence-like objects a sequence-like interface. We formally introduced sequences back in Chapter 6. They are just data structures that you can "iterate" over by using their index starting at 0 and continuing till the final item of the sequence. Because you can do this "counting," iterating over sequences is trivial. Iteration support in Python works seamlessly with sequences but now also allows programmers to iterate through non-sequence types, including user-defined objects.

Iterators come in handy when you are iterating over something that is not a sequence but behaves like one, for example, the keys of a dictionary or the lines of a file. When you loop over an object, you cannot easily tell whether it is an iterator or a sequence. The best part is that you do not have to care, because Python makes it behave like a sequence.
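To make this concrete, here is a minimal sketch in which one and the same for loop walks a list, a dictionary, and a file-like object. The helper name collect() is hypothetical, and io.StringIO stands in for a real file so the example is self-contained.

```python
import io

def collect(iterable):
    """Gather every item a for loop yields from the given object."""
    items = []
    for item in iterable:
        items.append(item)
    return items

from_list = collect(['a', 'b', 'c'])               # a true sequence
from_dict = collect({'x': 1})                      # a mapping: yields its keys
from_file = collect(io.StringIO(u'one\ntwo\n'))    # file-like: yields its lines

print(from_list)
print(from_dict)
```

The loop inside collect() never needs to know which kind of object it was handed.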

8.11.2. Why Iterators?

The defining PEP (234) cites that iterators:

  • Provide an extensible iterator interface.

  • Bring performance enhancements to list iteration.

  • Allow for big performance improvements in dictionary iteration.

  • Allow for the creation of a true iteration interface as opposed to overriding methods originally meant for random element access.

  • Be backward-compatible with all existing user-defined classes and extension objects that emulate sequences and mappings.

  • Result in more concise and readable code that iterates over non-sequence collections (mappings and files, for instance).

8.11.3. How Do You Iterate?

Basically, instead of using an index to count sequentially, an iterator is any object that has a next() method. When the next item is desired, either you or a looping mechanism like for calls the iterator's next() method to get the next value. Once the items have been exhausted, a StopIteration exception is raised, not to indicate an error, but to signal that iteration is done.

Iterators do have some restrictions, however. For example, you cannot move backward, go back to the beginning, or copy an iterator. If you want to iterate over the same objects again (or simultaneously), you have to create another iterator object. It isn't all that bad, however, as there are various tools to help you with using iterators.
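The following sketch illustrates these restrictions. It assumes the built-in next() function, which arrived in Python 2.6; on the earlier releases this section describes, you would write first.next() instead.

```python
colors = ['red', 'green', 'blue']

first = iter(colors)
second = iter(colors)      # a separate iterator with its own position

a = next(first)            # 'red'
b = next(first)            # 'green'
c = next(second)           # 'red' again: second is unaffected by first

remaining = list(first)    # consumes whatever is left: ['blue']
leftover = list(first)     # []: an exhausted iterator stays exhausted

print(remaining)
print(leftover)
```

Because an iterator cannot be rewound, the only way to start over is to ask the original object for a fresh iterator with iter(colors).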

There is a reversed() built-in function that returns an iterator that traverses an iterable in reverse order. The enumerate() BIF also returns an iterator; we saw earlier in the chapter how you can use it in a for loop to iterate over both the index and the item of an iterable. Two new BIFs, any() and all(), made their debut in Python 2.5. They return true if any or all items traversed across an iterator have a Boolean true value, respectively. There is also an entire module called itertools that contains various iterators you may find useful.
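As a quick illustration of these tools, the sketch below applies each one to a small list of names. The data is made up for the example, and itertools.chain() is just one of the many iterators that module offers.

```python
import itertools

names = ['Poe', 'Gaudi', 'Freud']

backward = list(reversed(names))     # items in reverse order
numbered = list(enumerate(names))    # (index, item) pairs

has_short = any(len(n) == 3 for n in names)   # True: 'Poe' has 3 letters
all_short = all(len(n) == 3 for n in names)   # False: 'Gaudi' does not

# itertools.chain() walks several iterables as if they were one.
chained = list(itertools.chain(names, ['Tesla']))

print(backward)
print(numbered)
```

Wrapping each call in list() is only there to display the results; in a for loop you would consume the iterators directly.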

8.11.4. Using Iterators with ...

Sequences

As mentioned before, iterating through Python sequence types is as expected:

>>> myTuple = (123, 'xyz', 45.67)
>>> i = iter(myTuple)
>>> i.next()
123
>>> i.next()
'xyz'
>>> i.next()
45.67
>>> i.next()
Traceback (most recent call last):
  File "", line 1, in ?
StopIteration


If this had been an actual program, we would have enclosed the code inside a try-except block. Sequences now automatically produce their own iterators, so a for loop:

for i in seq:
    do_something_to(i)


under the covers now really behaves like this:

fetch = iter(seq)
while True:
    try:
        i = fetch.next()
    except StopIteration:
        break
    do_something_to(i)


However, your code does not need to change because the for loop itself calls the iterator's next() method (as well as monitors for StopIteration).

Dictionaries

Dictionaries and files are two other Python data types that received the iteration makeover. A dictionary's iterator traverses its keys. The idiom for eachKey in myDict.keys() can be shortened to for eachKey in myDict as shown here:

>>> legends = { ('Poe', 'author'): (1809, 1849, 1976),
...  ('Gaudi', 'architect'): (1852, 1906, 1987),
...  ('Freud', 'psychoanalyst'): (1856, 1939, 1990)
... }
...
>>> for eachLegend in legends:
...    print 'Name: %s\tOccupation: %s' % eachLegend
...    print '  Birth: %s\tDeath: %s\tAlbum: %s\n' \
...    % legends[eachLegend]
...
Name: Freud     Occupation: psychoanalyst
  Birth: 1856   Death: 1939     Album: 1990

Name: Poe       Occupation: author
  Birth: 1809   Death: 1849     Album: 1976

Name: Gaudi     Occupation: architect
  Birth: 1852   Death: 1906     Album: 1987


In addition, three new built-in dictionary methods have been introduced to define the iteration: myDict.iterkeys() (iterate through the keys), myDict.itervalues() (iterate through the values), and myDict.iteritems() (iterate through key/value pairs). Note that the in operator has been modified to check a dictionary's keys. This means the Boolean expression myDict.has_key(anyKey) can be simplified as anyKey in myDict.
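Here is a small sketch of key/value iteration and the in operator on a dictionary. The helper iter_pairs() is hypothetical: it uses iteritems() where the dictionary provides it (the Python 2 releases described here) and otherwise falls back to items(), an assumption made so the example also runs on later interpreters.

```python
legends = {'Poe': 1809, 'Gaudi': 1852, 'Freud': 1856}

def iter_pairs(d):
    """Return an iterator over the (key, value) pairs of dictionary d."""
    if hasattr(d, 'iteritems'):      # Python 2 dictionaries
        return d.iteritems()
    return iter(d.items())           # assumed fallback for later versions

pairs = sorted(iter_pairs(legends))  # sorted only to get a stable order
keys = sorted(legends)               # iterating a dict yields its keys

known = 'Poe' in legends             # replaces legends.has_key('Poe')
unknown = 'Dali' in legends

print(pairs)
print(keys)
```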

Files

File objects produce an iterator that calls the readline() method. Thus, they loop through all lines of a text file, allowing the programmer to replace for eachLine in myFile.readlines() with the simpler for eachLine in myFile:

>>> myFile = open('config-win.txt')
>>> for eachLine in myFile:
...     print eachLine,   # comma suppresses extra \n
...
[EditorWindow]
font-name: courier new
font-size: 10
>>> myFile.close()
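The same idea can be sketched as a self-contained script. Since config-win.txt may not exist on your system, this version first writes a throwaway file with similar contents; the temporary path and the variable names are assumptions made for illustration.

```python
import os
import tempfile

# Create a small temporary file to read back.
fd, path = tempfile.mkstemp(suffix='.txt')
out = os.fdopen(fd, 'w')
out.write('[EditorWindow]\nfont-name: courier new\nfont-size: 10\n')
out.close()

lines = []
my_file = open(path)
try:
    for each_line in my_file:                  # one line per pass
        lines.append(each_line.rstrip('\n'))   # drop the trailing newline
finally:
    my_file.close()
    os.remove(path)                            # clean up the temporary file

print(lines)
```

Iterating over the file object reads one line at a time, whereas readlines() builds the whole list in memory first.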


8.11.5. Mutable Objects and Iterators

Remember that modifying mutable objects while you are iterating over them is not a good idea. This was a problem even before iterators appeared. One popular example is looping through a list and removing items from it if certain criteria are met (or not):

for eachURL in allURLs:
    if not eachURL.startswith('http://'):
        allURLs.remove(eachURL)       # YIKES!!


Of the standard sequence types (strings, lists, and tuples), only lists are mutable, so the danger occurs only there. A sequence's iterator only keeps track of the Nth element you are on, so if you change elements around during iteration, those updates will be reflected as you traverse through the items. If you run out of items, StopIteration will be raised.
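To see why the removal loop is dangerous, consider the sketch below, which uses made-up URLs. Because the iterator tracks only a position, each removal shifts the later items left and the item that slides into the current slot is never examined. Two common workarounds, offered here as suggestions, are to loop over a copy of the list or to build a new list.

```python
all_urls = ['ftp://a', 'ftp://b', 'http://c', 'ftp://d']

# Buggy: 'ftp://b' moves into slot 0 after 'ftp://a' is removed,
# and the iterator has already moved on to slot 1.
buggy = list(all_urls)
for each_url in buggy:
    if not each_url.startswith('http://'):
        buggy.remove(each_url)

# Workaround 1: iterate over a slice copy, modify the original.
safe = list(all_urls)
for each_url in safe[:]:
    if not each_url.startswith('http://'):
        safe.remove(each_url)

# Workaround 2: build a new list with only the items to keep.
filtered = [u for u in all_urls if u.startswith('http://')]

print(buggy)
print(safe)
```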

When iterating through keys of a dictionary, you must not modify the dictionary. Using a dictionary's keys() method is okay because keys() returns a list that is independent of the dictionary. But iterators are tied much more intimately with the actual object and will not let us play that game anymore:

>>> myDict = {'a': 1, 'b': 2, 'c': 3, 'd': 4}
>>> for eachKey in myDict:
...   print eachKey, myDict[eachKey]
...   del myDict[eachKey]
...
a 1
Traceback (most recent call last):
  File "", line 1, in ?
RuntimeError: dictionary changed size during iteration


This will help prevent buggy code. For full details on iterators, see PEP 234.
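If you do need to delete entries while looping, one workable approach is to iterate over an independent copy of the keys. The sketch below wraps keys() in list() so that the copy is made regardless of whether keys() itself returns a list; the even-value test is just a made-up criterion.

```python
my_dict = {'a': 1, 'b': 2, 'c': 3, 'd': 4}

# The loop walks a snapshot of the keys, not the live dictionary.
for each_key in list(my_dict.keys()):
    if my_dict[each_key] % 2 == 0:
        del my_dict[each_key]

# Deleting while iterating over the dictionary itself is refused.
other = {'a': 1, 'b': 2}
try:
    for each_key in other:
        del other[each_key]
    raised = False
except RuntimeError:
    raised = True

print(sorted(my_dict))
print(raised)
```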

8.11.6. How to Create an Iterator

You can take an item and call iter() on it to turn it into an iterator. Its syntax is one of the following:

iter(obj)
iter(func, sentinel)


If you call iter() with one object, it will check if it is just a sequence, for which the solution is simple: It will just iterate through it by (integer) index from 0 to the end. Another way to create an iterator is with a class. As we will see in Chapter 13, a class that implements the __iter__() and next() methods can be used as an iterator.
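As a preview, a class-based iterator might look like the following sketch. The class name Countdown is hypothetical. The __next__ alias is an assumption added so the example also runs on Python 3, where the next() method was renamed.

```python
class Countdown(object):
    """Hypothetical iterator that counts down from start to 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self              # an iterator returns itself

    def next(self):              # the method name described in this section
        if self.current <= 0:
            raise StopIteration  # signals that the items are exhausted
        value = self.current
        self.current -= 1
        return value

    __next__ = next              # assumed alias for Python 3

counts = list(Countdown(3))

total = 0
for n in Countdown(4):           # works directly in a for loop
    total += n

print(counts)
print(total)
```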

If you call iter() with two arguments, it will repeatedly call func to obtain the next value of iteration until that value is equal to sentinel.
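A minimal sketch of the two-argument form follows. It reads lines from a file-like object until it meets a blank line, which serves as the sentinel. The sample text is made up, and io.StringIO stands in for a real file.

```python
import io

stream = io.StringIO(u'alpha\nbeta\n\ngamma\n')

# iter() calls stream.readline() with no arguments on every pass
# and stops as soon as the returned value equals the sentinel.
header = []
for line in iter(stream.readline, u'\n'):
    header.append(line.strip())

rest = stream.readline()         # reading resumes after the blank line

print(header)
```

Note that the sentinel value itself is never handed to the loop body.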



Core Python Programming (2nd Edition)
ISBN: 0132269937
EAN: 2147483647
Year: 2004
Pages: 334
Authors: Wesley J Chun
