7.9 State Machines and Indirect Jumps

Another control structure commonly found in assembly language programs is the state machine. A state machine uses a state variable to control program flow. The FORTRAN programming language provides this capability with the assigned GOTO statement. Certain variants of C (e.g., GNU's GCC from the Free Software Foundation) provide similar features. In assembly language, the indirect jump provides a mechanism to easily implement state machines.

So what is a state machine? In very basic terms, it is a piece of code^[5] that keeps track of its execution history by entering and leaving certain "states." For the purposes of this chapter, we'll not use a very formal definition of a state machine. We'll just assume that a state machine is a piece of code that (somehow) remembers the history of its execution (its state) and executes sections of code based upon that history.

In a very real sense, all programs are state machines. The CPU registers and values in memory constitute the "state" of that machine. However, we'll use a much more constrained view. Indeed, for most purposes only a single variable (or the value in the EIP register) will denote the current state.

Now let's consider a concrete example. Suppose you have a procedure that you want to perform one operation the first time you call it, a different operation the second time you call it, yet something else the third time you call it, and then something new again on the fourth call. After the fourth call it repeats these four different operations in order. For example, suppose you want the procedure to add EAX and EBX the first time, subtract them on the second call, multiply them on the third, and divide them on the fourth. You could implement this procedure as follows:

 procedure StateMachine; static      State:byte := 0; begin StateMachine;      cmp( State, 0 );      jne TryState1;           // State 0: Add EBX to EAX and switch to state 1:           add( ebx, eax );           inc( State );           exit StateMachine;      TryState1:      cmp( State, 1 );      jne TryState2;           // State 1: subtract ebx from EAX and switch to state 2:           sub( ebx, eax );           inc( State );        // State 1 becomes State 2.           exit StateMachine;      TryState2:      cmp( State, 2 );      jne MustBeState3;           // If this is state 2, multiply EAX by EAX and switch to state 3:           intmul( ebx, eax );           inc( State );       // State 2 becomes State 3.           exit StateMachine;      // If it isn't one of the above states, we must be in state 3      // So divide eax by ebx and switch back to state zero.      MustBeState3:      push( edx );         // Preserve this 'cause it gets whacked by DIV.      xor( edx, edx );     // Zero extend EAX into EDX.      div( ebx, edx:eax);      pop( edx );          // Restore EDX's value preserved above.      mov( 0, State );     // Reset the state back to zero. end StateMachine;

Technically, this procedure is not the state machine. Instead, it is the variable State and the cmp/jne instructions that constitute the state machine.

There is nothing particularly special about this code. It's little more than a switch statement implemented via the if..then..elseif construct. The only thing special about this procedure is that it remembers how many times it has been called^[6] and behaves differently depending upon the number of calls. While this is a correct implementation of the desired state machine, it is not particularly efficient. The astute reader, of course, would recognize that this code could be made a little faster using an actual switch statement rather than the if..then..elseif implementation. However, there is a better way… .

The more common implementation of a state machine in assembly language is to use an indirect jump. Rather than having a state variable that contains a value like zero, one, two, or three, we could load the state variable with the address of the code to execute upon entry into the procedure. By simply jumping to that address, the state machine could save the tests above needed to execute the proper code fragment. Consider the following implementation using the indirect jump:

 procedure StateMachine; static      State:dword := &State0; begin StateMachine;      jmp( State );           // State 0: Add EBX to EAX and switch to state 1:      State0:           add( ebx, eax );           mov( &State1, State );           exit StateMachine;      State1:           // State 1: subtract ebx from EAX and switch to state 2:           sub( ebx, eax );           mov( &State2, State );     // State 1 becomes State 2.           exit StateMachine;      State2:           // If this is state 2, multiply EAX by EAX and switch to state 3:           intmul( ebx, eax );           mov( &State3, State );     // State 2 becomes State 3.           exit StateMachine;      // State 3: divide eax by ebx and switch back to state zero.      State3:           push( edx );               // Preserve this 'cause it gets whacked by DIV.           xor( edx, edx );           // Zero extend EAX into EDX.           div( ebx, edx:eax);           pop( edx );                // Restore EDX's value preserved above.           mov( &State0, State );     // Reset the state back to zero. end StateMachine;

The jmp instruction at the beginning of the StateMachine procedure transfers control to the location pointed at by the State variable. The first time you call StateMachine it points at the State0 label. Thereafter, each subsection of code sets the State variable to point at the appropriate successor code.

^[5]Note that state machines need not be software based. Many state machines' implementation are hardware based.

^[6]Actually, it remembers how many times, MOD 4, that it has been called.