Chapter 11: Concurrent Architectures | Concurrency: State Models and Java Programs

The term architecture is used in this chapter to mean the process structure of a concurrent program together with the way in which the elements of the program interact. For example, the client – server architecture of Chapter 10 is a structure consisting of one or more client processes interacting with a single server process. The interaction is bi-directional consisting of a request from a client to the server and a reply from the server to the client. This organization is at the heart of many distributed computing applications. The client – server architecture can be described independently of the detailed operation of client and server processes. We do not need to consider the service provided by the server or indeed the use the client makes of the result obtained from requesting the service. In describing the concurrent architecture of a program, we can ignore many of the details concerned with the application that the program is designed to implement. The advantage of studying architecture is that we can examine concurrent program structures that can be used in many different situations and applications. In the following, we look at some architectures that commonly occur in concurrent programs.

11.1 Filter Pipeline

A filter is a process that receives a stream of input values, performs some computation on these values and sends a stream of results as its output. In general, a filter can have more than one input stream and produce results on more than one output stream. Filters can easily be combined into larger computations by connecting the output stream from one filter to the input stream of another. Where filters have more than one input and output, they can be arranged into networks with complex topologies. In this section, we restrict the discussion to filters that have a single input and a single output. Such filters can be combined into pipeline networks. Many of the user-level commands in the UNIX operating system are filter processes, for example the text formatting programs tbl, eqn and troff. In UNIX, filter processes can be combined using pipes. A UNIX pipe is essentially a bounded buffer that buffers bytes of data output by one filter until they are input to the next filter. We will see in the following that the pipes that interconnect filters do not always need to include buffering.

To illustrate the use of filter pipelines, we develop a program with this architecture that computes prime numbers. The program is a concurrent implementation of a classic algorithm known as the Primes Sieve of Eratosthenes, after the Greek mathematician who developed it. The algorithm to determine all the primes between 2 and n proceeds as follows. First, write down a list of all the numbers between 2 and n:

 2 3 4 5 6 7 ... n

Then, starting with the first uncrossed-out number in the list, 2, cross out each number in the list which is a multiple of 2:

 2 3  5  7 ...n

Now move to the next uncrossed-out number, 3, and repeat the above by crossing out multiples of 3. Repeat the procedure until the end of the list is reached. When finished, all the uncrossed-out numbers are primes. The primes form a sieve which prevents their multiples falling through into the final list.

11.1.1 Primes Sieve Model

The concurrent version of the primes sieve algorithm operates by generating a stream of numbers. The multiples are removed by filter processes. The outline architecture of the program is depicted in Figure 11.1. It is essentially a process structure diagram from which we have omitted the details of action and process labels.

image from book
Figure 11.1: Primes Sieve process architecture.

The diagram describes the high-level structure of the program. The process GEN generates the stream of numbers from which multiples are filtered by the FILTER processes. To fully capture the architecture of the program, we must also consider how the application-specific processes interact. Interaction between these elements in the example is described by the PIPE processes. In the terminology of Software Architecture, these processes are termed connectors. Connectors encapsulate the interaction between the components of the architecture. Connectors and components are both modeled as processes. The distinction with respect to modeling is thus essentially methodological. We model the pipe connectors in the example as one-slot buffers as shown below:

 const MAX = 9 range NUM = 2..MAX set S = {[NUM],eos} PIPE = (put[x:S]->get[x]->PIPE).

The PIPE process buffers elements from the set S which consist of the numbers 2..MAX and the label eos, which is used to signal the end of a stream of numbers. This end of stream signal is required to correctly terminate the program.

To simplify modeling of the GEN and FILTER processes, we introduce an additional FSP construct – the conditional process. The construct can be used in the definition of both primitive and composite processes.

The process if B then P else Q behaves as the process P if the condition B is true otherwise it behaves as Q. If the else Q is omitted and B is false, then the process behaves as STOP.

The definition of GEN using the conditional process construct is given below:

 GEN       = GEN[2], GEN[x:NUM] = (out.put[x] ->                if x<MAX then                GEN[x+1]                else                (out.put.eos->end->GEN)              ).

The GEN process outputs the numbers 2 to MAX, followed by the signal eos. The action end is used to synchronize termination of the GEN process and the filters. After end occurs, the model re-initializes rather than terminates. This is done so that, if deadlock is detected during analysis, it will be an error and not because of correct termination. The FILTER process records the first value it gets and subsequently filters out multiples of that value from the numbers it receives and forwards to the next filter in the pipeline.

 FILTER = (in.get[p:NUM]->prime[p]->FILTER[p]          |in.get.eos->ENDFILTER          ), FILTER[p:NUM] = (in.get[x:NUM] ->                   if x%p!=0 then                     (out.put[x]->FILTER[p])                   else                     FILTER[p]                 |in.get.eos->ENDFILTER                 ), ENDFILTER     = (out.put.eos->end->FILTER).

The composite process that conforms to the structure given in Figure 11.1 can now be defined as:

 ||PRIMES(N=4) =    ( gen:GEN    ||pipe[0..N-1]:PIPE    ||filter[0..N-1]:FILTER    )/{ pipe[0]/gen.out,        pipe[i:0..N-1]/filter[i].in,        pipe[i:1..N-1]/filter[i-1].out,        end/{filter[0..N-1].end,gen.end}      }@{filter[0..N-1].prime,end}.

Safety analysis of this model detects no deadlocks or errors. The minimized LTS for the model is depicted in Figure 11.2. This confirms that the model computes the primes between 2 and 9 and that the program terminates correctly.

image from book
Figure 11.2: Minimized PRIMES LTS.

In fact, the concurrent version of the primes sieve algorithm does not ensure that all the primes between 2 and MAX are computed: it computes the first N primes where N is the number of filters in the pipeline. This can easily be confirmed by changing MAX to 11 and re-computing the minimized LTS, which is the same as Figure 11.2. To compute all the primes in the range 2 to 11, five filters are required.

Unbuffered Pipes

We have modeled the filter pipeline using single-slot buffers as the pipes connecting filters. Would the behavior of the overall program change if we used unbuffered pipes? This question can easily be answered by constructing a model in which the PIPE processes are omitted and instead, filters communicate directly by shared actions. This model is listed below. The action pipe[i] relabels the out.put action of filter[i-1] and the in.get action of filter[i].

 ||PRIMESUNBUF(N=4) =   (gen:GEN || filter[0..N-1]:FILTER)     /{ pipe[0]/gen.out.put,       pipe[i:0..N-1]/filter[i].in.get,       pipe[i:1..N-1]/filter[i-1].out.put,       end/{filter[0..N-1].end,gen.end}     }@{filter[0..N-1].prime,end}.

The minimized LTS for the above model is depicted in Figure 11.3.

image from book
Figure 11.3: Minimized PRIMESUNBUF LTS.

The reader can see that the LTS of Figure 11.3 is identical to that of Figure 11.2. The behavior of the program with respect to generating primes and termination has not changed with the removal of buffers. Roscoe (1998) describes a program as being buffer tolerant if its required behavior does not change with the introduction of buffers. We have shown here that the primes sieve program is buffer tolerant to the introduction of a single buffer in each pipe. Buffer tolerance is an important property in the situation where we wish to distribute a concurrent program and, for example, locate filters on different machines. In this situation, buffering may be unavoidable if introduced by the communication system.

Abstracting from Application Detail

We showed above that the primes sieve program is tolerant to the introduction of a single buffer in each pipe. The usual problem of state space explosion arises if we try to analyze a model with more buffers. To overcome this problem, we can abstract from the detailed operation of the primes sieve program. Instead of modeling how primes are computed, we concentrate on how the components of the program interact, independently of the data values that they process. The abstract versions of components can be generated mechanically by relabeling the range of values NUM to be a single value. This is exactly the same technique that we used in section 10.2.2 to generate an abstract model of an asynchronous message port. The abstract versions of GEN, FILTER and PIPE are listed below.

 ||AGEN    = GEN/{out.put/out.put[NUM]}. ||AFILTER = FILTER/{out.put/out.put[NUM],                     in.get /in.get.[NUM],                     prime /prime[NUM]                    }. ||APIPE   = PIPE/{put/put[NUM],get/get[NUM]}.

The LTS for the abstract version of a filter process is shown in Figure 11.4. In the detailed version of the filter, the decision to output a value depends on the computation as to whether the value is a multiple of the filter’s prime. In the abstract version, this computation has been abstracted to the non-deterministic choice as to whether, in state (4), after an in.get action, the LTS moves to state (5) and does an out.put action or remains in state (4). This is a good example of how nondeterministic choice is used to abstract from computation. We could, of course, have written the abstract versions of the elements of the primes sieve program directly, rather than writing detailed versions and abstracting mechanically as we have done here. In fact, since FSP, as a design choice, has extremely limited facilities for describing and manipulating data, for complex programs abstraction is usually the best way to proceed.

image from book
Figure 11.4: Minimized AFILTER LTS.

To analyze the primes sieve program with multi-slot pipes, we can use a pipeline of APIPE processes defined recursively as follows:

 ||MPIPE(B=4) =   if B==1 then     APIPE   else     (APIPE/{mid/get} || MPIPE(B-1)/{mid/put})   @{put,get}.

The abstract model for the primes program, with exactly the architecture of Figure 11.1, can now be defined as:

 ||APRIMES(N=4,B=3) =    (gen:AGEN || PRIMEP(N)    || pipe[0..N-1]:MPIPE(B)    || filter[0..N-1]:AFILTER)    /{ pipe[0]/gen.out,       pipe[i:0..N-1]/filter[i].in,       pipe[i:1..N-1]/filter[i-1].out,       end/{filter[0..N-1].end,gen.end}     }.

where PRIMEP is a safety property which we define in the following discussion.

Architectural Property Analysis

We refer to the properties that we can assert for the abstract model of the primes sieve program as architectural properties since they are concerned with the concurrent architecture of the program – structure and interaction – rather than its detailed operation. The general properties we wish to assert at this level are absence of deadlock and eventual termination (absence of livelock). Eventual termination is checked by the progress property END, which asserts that, in all executions of the program, the terminating action end must always occur.

 progress END = {end}

The property specific to the application is that the prime from filter[0] should be produced before the prime from filter[1] and so on. The following safety property asserts this:

 property PRIMEP(N=4)   = PRIMEP[0], PRIMEP[i:0..N]= (when (i<N)                   filter[i].prime->PRIMEP[i+1]                 |end -> PRIMEP                 ).

The property does not assert that all the filters must produce primes before end occurs since the model can no longer determine that there are four primes between 2 and 9.

Analysis of APRIME using LTSA determines that there are no deadlocks, safety violations or progress violations for four filters and three slot pipes. The reader should verify that the safety and progress properties hold for other combinations of filters and buffering. When building this model, it is important that the LTSA option Minimize during composition is set, otherwise the minimized models for the abstracted elements are not built and consequently, the reduction in state space is not realized.

11.1.2 Primes Sieve Implementation

Figure 11.5 is a screen shot of the Primes Sieve applet display. The implementation supports both a buffered and unbuffered implementation of the pipe connector. The figure depicts a run using unbuffered pipes. The box in the top left hand of the display depicts the latest number generated by the thread that implements GEN. The rest of the boxes, at the top, display the latest number received by a filter. The boxes below display the prime used by that filter to remove multiples.

image from book
Figure 11.5: Primes Sieve applet display.

The implementation follows in a straightforward way from the model developed in the previous section. The number generator and filter processes are implemented as threads. As mentioned above, we have provided two implementations for the pipe connector. The classes involved in the program and their inter-relationships are depicted in the class diagram of Figure 11.6. The display is handled by a single class, PrimesCanvas. The methods provided by this class, together with a description of their functions, are listed in Program 11.1.

image from book
Figure 11.6: Primes class diagram.

Program 11.1: PrimesCanvas class.

 class PrimesCanvas extends Canvas {   // display val in an upper box numbered index   // boxes are numbered from the left synchronized void print(int index, int val){...}   // display val in a lower box numbered index   // the lower box indexed by 0 is not displayed synchronized void prime(int index, int val){...}   // clear all boxes synchronized void clear(){...} }

The code for the Generator and Filter threads is listed in Programs 11.2 and 11.3. The implementation of these threads corresponds closely to the detailed models for GEN and FILTER developed in the previous section. Additional code has been added only to display the values generated and processed. To simplify the display and the pipe implementations, the end-of-stream signal is an integer value that does not occur in the set of generated numbers. The obvious values to use are 0, 1, -1 or MAX+1. In the applet class, Primes.EOS is defined to be -1.

Program 11.2: Generator class.

 class Generator extends Thread {   private PrimesCanvas display;   private Pipe<Integer> out;   static int MAX = 50;   Generator(Pipe<Integer> c, PrimesCanvas d)     {out=c; display = d;}   public void run() {     try {       for (int i=2;i<=MAX;++i) {         display.print(0,i);         out.put(i);         sleep(500);       }       display.print(0,Primes.EOS);       out.put(Primes.EOS);     } catch (InterruptedException e){}   } }

Program 11.3: Filter class.

 class Filter extends Thread {   private PrimesCanvas display;   private Pipe<Integer> in,out;   private int index;   Filter(Pipe<Integer> i, Pipe<Integer> o,     int id, PrimesCanvas d)     {in = i; out=o;display = d; index = id;}   public void run() {     int i,p;     try {       p = in.get();       display.prime(index,p);       if (p==Primes.EOS && out!=null) {         out.put(p); return;       }       while(true) {         i= in.get();         display.print(index,i);         sleep(1000);         if (i==Primes.EOS) {           if (out!=null) out.put(i); break;         } else if (i%p!=0 && out!=null)           out.put(i);       }     } catch (InterruptedException e){}   } }

Instead of implementing buffered and unbuffered pipes from scratch, we have reused classes developed in earlier chapters. The synchronous message-passing class, Channel, from Chapter 10 is used to implement an unbuffered pipe and the bounded buffer class, BufferImpl, from Chapter 5 is used to implement buffered pipes. The Pipe interface and its implementations are listed in Program 11.4.

Program 11.4: Pipe, PipeImplUnBuf and PipeImplBuf classes.

 public interface Pipe<T> {   public void put(T o)      throws InterruptedException; // put object into buffer   public T get()      throws InterruptedException; // get object from buffer } // Unbuffered pipe implementation public class PipeImplUnBuf<T> implements Pipe<T> {   Channel<T> chan = new Channel<T>();   public void put(T o)     throws InterruptedException {     chan.send(o);   }   public T get()     throws InterruptedException {     return chan.receive();   } } // Buffered pipe implementation public class PipeImplBuf<T> implements Pipe<T> {   Buffer<T> buf = new BufferImpl<T>(10);   public void put(T o)     throws InterruptedException {     buf.put(o);   }   public T get()     throws InterruptedException {     return buf.get();   } }

The structure of a generator thread and N filter threads connected by pipes is constructed by the go() method of the Primes applet class. The code is listed below:

 private void go(boolean buffered) {     display.clear();     //create channels     ArrayList<Pipe<Integer>> pipes =       new ArrayList<Pipe<Integer>>();     for (int i=0; i<N; ++i)       if (buffered)         pipes.add(new PipeImplBuf<Integer>());       else         pipes.add(new PipeImplUnBuf<Integer>());    //create threads    gen = new Generator(pipes.get(0),display);    for (int i=0; i<N; ++i)      filter[i] = new Filter(pipes.get(i),             i<N-1?pipes.get(i+1):null,i+1,display);      gen.start();      for (int i=0; i<N; ++i) filter[i].start(); }

Why Use Buffering?

We saw from modeling the primes sieve program that it computed the correct result, whether or not the pipes connecting filter processes were buffered. In line with the model, the implementation also works correctly with and without buffers. Why then should we ever use buffering in this sort of architecture when the logical behavior is independent of buffering? The answer is concerned with the execution efficiency of the program.

When a process or thread suspends itself and another is scheduled, the operating system performs a context switch which, as discussed in Chapter 3, involves saving the registers of the suspended process and loading the registers for the newly scheduled process. Context switching consumes CPU cycles and, although the time for a thread switch is much less than that for an operating system process, it is nevertheless an overhead. A concurrent program runs faster if we can reduce the amount of context switching. With no buffering, the generator and filter threads are suspended every time they produce an item until that item is consumed by the next thread in the pipeline. With buffers, a thread can run until the buffer is full. Consequently, in a filter pipeline, buffering can reduce the amount of context switching. In our implementation, this benefit is not actually realized since we have introduced delays for display purposes. However, it is generally the case that a pipeline architecture performs better with buffered pipes.

If filters are located on physically distributed processors, buffering has an additional advantage. When a message is sent over a communication link, there is a fixed processing and transmission overhead that is independent of message size. Consequently, when transmitting a lot of data, it is better to transmit a few large messages rather than many small messages. With buffering in the filter pipeline, it is easy to arrange that a sequence of items be sent in the same message.