11.3 Iterators | Programming for the Javaв„ў Virtual Machine

Iterators are a looping construct in the language Sather, a language developed at the International Computer Science Institute at Berkeley. Iterators are based on an idea in the language CLU. Iterators are designed to be safer than ordinary looping constructs, since creation, succession, and termination operations are defined where the iterator is defined, instead of depending on the code using the iterator to perform them.

11.3.1 Iterators in Sather

This example defines an iterator called fibonacci!, which yields Fibonacci numbers. In Sather, all iterators end with the character !.

 fibonacci! : INT          -- Declare an iterator which returns                           -- integer values    a : INT := 1;          -- Initialize a and b    b : INT := 2;    yield a;               -- Return 1 the first time    yield b;               -- Return 2 the second time    loop                   -- Loop forever        c : INT := a + b;  -- The next value is the sum                           -- of the two previous values        yield c;           -- Return that value        a := b;            -- Copy the values backwards,        b := c;            -- to set up the next iteration    end; end;

An iterator is used as a method is used in Java. A yield is similar to a Java return. A yield statement returns a value, and control returns to the calling procedure. However, when the calling procedure calls the iterator again, the iterator picks up at the same point that it left off, right after the yield.

For example, here's some code that prints out the first ten Fibonacci numbers:

 loop                         -- Loop until quitting    i : INT := range!(1, 10);  -- An iterator that yields values                               -- from 1 to 10 before quitting    f : INT := fibonacci!;     -- Get the next Fibonacci number    #OUT + i + ' ' + f + '\n'; -- Print it out end;

This code uses two iterators, range! and fibonacci!. The loop continues until one of them issues a quit command instead of yielding a value. The fibonacci! iterator never quits, so it's up to range! to terminate the loop.

The first time through the loop, range! yields the value 1, so i is set to 1. Next, it calls fibonacci!, which transfers control to the iterator. The iterator fibonacci! first sets the value of a to 1 and b to 2. The next thing it does is yield the value of a. This causes control to return to the caller, while the iterator is suspended. The value returned by the iterator call is 1, since that was the value of a when it was yielded. It is assigned to f. The expression #OUT + i + ' ' + f + ' \n' prints out the values of i and f, followed by a carriage return.

Next, the loop goes back to the beginning. This time, range! yields the value 2. Calling fibonacci! again, the iterator picks up where it left off after the line yield a. The next thing it does is yield the value of b, which is 2. This is printed out, and the outer loop continues.

The range! iterator yields 3 this time. Calling fibonacci! again, the iterator begins its own loop. It calculates c, which turns out to be 3, and yields that value. The iterator is suspended, right in the middle of the loop, while the outer loop continues with the yielded value 3.

The next time fibonacci! is called, it picks up right after yield c. The value of c is still 3. It copies b into a and c into b. Then it loops again, this time calculating c to be 5. It yields this value of c.

The iterator continues this way, with fibonacci! yielding 8, 13, 21, 34, 55, and 89. After it yields 89, the program loops back and calls range! again. This time, range! finds that it was going to yield 11, which is larger than its maximum range. Instead of yielding a value, it quits. This terminates the entire loop, so fibonacci! is not called again.

Iterators have many other uses. All arrays in Sather have an iterator called elt!, which yields successive elements of the array. This makes it easy to sum up all of the elements in an array:

 a:ARRAY{INT}:= | 43, 49649, 194 |;    -- Create an array of                                       -- integers with 3                                       -- elements total:INT := 0; loop    total:= total + a.elt!;            -- Sum up the elements end;

Each time elt! is called, it yields the next element of the array: first 43, then 49649, then 194. These are accumulated in total. After yielding 194, the elt! iterator quits, which terminates the enclosing loop.

The summation loop can be done even more succinctly with the sum! iterator from the Sather library:

 loop    total := sum!(a.elt!); end;

Each time sum! is called, it yields the sum of all of the values that have been passed in so far. The loop continues until elt! quits. It is not necessary to write a different sum! iterator to deal with linked lists, trees, or other data structures, since the semantics of element access are separated from the semantics of computing a sum. The semantics of summation are encapsulated within sum!, and the semantics of working with array elements are encapsulated within elt!.

This encapsulation prevents many common bugs that occur when loops are used. There is no possibility of failing to initialize the total to 0 or of accessing an invalid array element. By comparison, this is the equivalent Java code:

 int[] a = {43, 49649, 194 }; int total = 0; for(int i = 0; i < a.length; i++)    total += a[i];

In this code, the for loop is responsible for knowing that the array begins at 0 and ends one before a.length. The loop is also responsible for knowing that computation of a sum involves adding numbers successively, starting with 0. If a were a Vector or some other data structure, then code would have to be written differently.

The Sather sum! iterator encapsulates all the semantics of summation, and the elt! iterator encapsulates all the semantics of enumerating all of the elements of a data structure. This makes code not only more concise but also more likely to be correct. You have to write sum! correctly only once, then put it in the library; it can be used many times.

11.3.2 Implementing Iterators

To implement iterators in the Java virtual machine, each iterator must be able to keep the state of its local variables, as well as its current position. One way to do this is to make each iterator its own class, using an instance of the iterator inside the loop. The class keeps track of the state of the iterator with an integer value, using the tableswitch statement to continue where it left off.

Here is one possible implementation of the fibonacci! iterator. To get the next value from the iterator, call the method next. The variables a, b, and c are implemented as fields of the object. Because they are fields instead of local variables, they persist from one invocation of next to the next. The variable state$ is used to keep track of the last place that the next method was before it returned. The initial value of state$ is 0. There are four possible states: one for the beginning of the iterator, and one for each yield.

 .class fibonacci .field private a I                  ; Variables local to the .field private b I                  ; iterator .field private c I .field private state$ I             ; Keeps the state .method public next ()I .throws QuitException    aload_0    getfield fibonacci/state$ I      ; Jump to the current state    tableswitch 0                    ; Initally, the state is 0        state0        state1        state2        state3        default: done ; Start here, since state$ is initialized to 0 state0:    aload_0                          ; Initialize a and b    iconst_1    putfield fibonacci/a I    aload_0    iconst_2    putfield fibonacci/b I    aload_0    getfield fibonacci/a I           ; Compute return value (a)    aload_0    iconst_1    putfield fibonacci/state$ I      ; Change to state 1    ireturn                          ; Yield a ; Pick up here when state$ is 1 state1:    aload_0    getfield fibonacci/b I           ; Compute return value (b)    aload_0    iconst_2    putfield fibonacci/state$ I      ; Change to state 2    ireturn                          ; Yield b ; Begin here when state$ is 2 state2: loop:    aload_0    aload_0                          ; Compute c=a+b    getfield fibonacci/a I    aload_0    getfield fibonacci/b I    iadd    putfield fibonacci/c I    aload_0    getfield fibonacci/c I           ; Get c to return    aload_0                          ; Change to state 3    iconst_3    putfield fibonacci/state$ I    ireturn                          ; Yield c state3:    aload_0                          ; Shuffle b->a and c->b    aload_0    getfield fibonacci/b I    putfield fibonacci/a I    aload_0    aload_0    getfield fibonacci/c I    putfield fibonacci/b I    goto loop                        ; Continue with the loop done:                               ; Default case: quit by    new QuitException                ; throwing an exception    dup    invokespecial QuitException/<init> ()V    athrow .end method .method <init> ()V                  ; Basic constructor    aload_0    invokespecial java/lang/Object/<init> ()V    return .end method

Wherever the Sather code says to yield a value, the next method uses an ireturn. Before the return, the state$ variable is set to the next state. This variable corresponds to a label in the code that picks up right after the ireturn.

The code begins with a tableswitch, which jumps to some place within the body of the code. The first time through, state$ is 0, so the method starts at state0. It initializes a and b, then returns a. Before it returns, it sets state$ to 1. This yields the value of a and prepares the class to resume the next time the iterator is called.

The next time it is called, the tableswitch causes the code to start at state1, which returns b and sets state$ to 2.

The third call begins at state2, which corresponds to the beginning of the loop in the Sather program. The code computes c and returns the value. Before it returns, it sets the state to 3.

On all subsequent calls, the tableswitch causes the program to jump right into the middle of the loop at state 3, right after the ireturn which returned the value of c. It finishes the loop body, which updates a and b. Then it jumps back to the beginning of the loop, which computes and returns the next value of c.

All subsequent calls jump directly to state3, since the state never changes again. The default case at done never happens. Section 11.3.3 shows an iterator that does terminate.

An interesting thing to note is that this program doesn't correspond to any Java-language program, since you cannot write a switch that jumps directly into the middle of a loop. It is possible to write an equivalent method in Java, but the translation is less straightforward than this one.

11.3.3 Iterators with Arguments

The range! iterator is used to yield all numbers within a certain range. Unlike fibonacci!, the range! iterator requires arguments. In Sather, it is implemented as

 range!(min, max:INT) : INT is    -- Define the iterator range!                                  -- It takes 2 arguments, min                                  -- and max, which are both                                  -- integers    x : INT := min;               -- Initialize the loop variable    loop                          -- Loop until quitting        if x > max                -- Quit when x > max             then quit        end;        yield x;                  -- Yield the next value        x := x + 1;               -- Increment x    end; end;

This iterator initializes a loop variable x, adding 1 to x after each yield, until it reaches the upper bound max. Since there is only one yield statement, there are only two states: the beginning of the iterator and after the yield. The Oolong translation is

 .class range .field x I .field state$ I .method public next (II)I    aload_0                   ; Jump to the current state    getfield range/state$ I    tableswitch 0        state0        state1        default: done state0:    aload_0                   ; Initialize x to min    iload_1    putfield range/x I loop:    aload_0    getfield range/x I    iload_2    if_icmpgt done            ; Break the loop if x>max    aload_0    getfield range/x I        ; Get return value (x)    aload_0    iconst_1    putfield range/state$ I   ; Change to state 1    ireturn                   ; Yield x state1:    aload_0                   ; Increment x    aload_0    getfield range/x I    iconst_1    iadd    putfield range/x I    goto loop                 ; Loop again done:    new QuitException         ; Throw a QuitException when    dup                       ; done    invokespecial QuitException/<init> ()V    athrow .end method

In this iterator, min and max are local variables, which are initialized each time the iterator is called. We represent x as a field, since its state is persistent across invocations.

The loop first checks to see if the termination condition is met. If the test fails, then the iterator returns the latest value of x. On subsequent calls to this iterator, control begins at state1, right after the yield. This increments x and jumps back to the termination test. It continues to do this, yielding subsequent values of x, until the termination test succeeds.

When the termination test succeeds, then control passes to done. At done, the method throws a QuitException. This signals to the enclosing loop that it is time to stop calling the iterator.

11.3.4 Using Iterators

To use the iterators implemented here, it is necessary to create an instance of each iterator contained in the program, calling next each time a value is desired. This Sather program uses the fibonacci! and range! iterators to print the first ten Fibonacci numbers.

 loop                          -- Loop until quitting    i : INT := range!(1, 10);  -- An iterator that yields values                               -- from 1 to 10 before quitting    f : INT := fibonacci!;     -- Get the next fibonacci number    #OUT + i + ' ' + f + '\n'; -- Print it out end;

This code can be implemented in Oolong as

 .catch QuitException from loop to end using end .var 1 is fibonacci Lfibonacci; .var 2 is range Lrange; .var 3 is i I .var 4 is f I    new fibonacci                        ; Create the iterators,    dup                                  ; storing them in    invokespecial fibonacci/<init> ()V   ; variables 1 and 2    astore_1    new range    dup    invokespecial range/<init> ()V    astore_2 loop:                                   ; Beginning of the loop    aload_2                              ; Call range! with 1, 10    iconst_1    bipush 10    invokevirtual range/next (II)I    istore_3                             ; Put that into i    aload_1                              ; Call fibonacci!    invokevirtual fibonacci/next ()I    istore 4                             ; Store the result in f ;; Print out i and f (code omitted)    goto loop                            ; Loop again end:    pop                        ; Remove the exception    ;; Control continues here after the loop

The code begins by creating instances of the range and fibonacci iterator classes to represent the two iterators.

After initializing the iterators, the code goes into an infinite loop. It continues until an iterator terminates by throwing a QuitException. When the exception is thrown, it is caught out of the loop at end, and control continues there. The exception itself is unimportant, so it is removed from the stack.

In this code, the variables fibonacci, range, i, and f were represented by locals, since this code never yields. If there were a yield in the loop, some of the variables would have to be represented instead by fields, which would ensure their persistence across yields.