2.4 A Simple Execution Model


How a program is actually instantiated and executed on a computer is a complex subject, often requiring several semesters' study of operating systems, programming languages, and compiler theory to sort out all the details. The JVM is no exception: like most computer architectures, it is elegant but complex. Fortunately, for the purposes of this text, the basic behavior of concurrent programs can be explained without building a complete working computer architecture. This section develops such a model, which is used to explain the behavior of programs in this and later chapters. At the end of the section, this simple model is used to explain the basic behavior of current-generation Web servers.

While the model developed in this chapter is true to the spirit of the JVM and explains all the behavior necessary to talk about the threads in this chapter, it does not claim to be completely true to the JVM. In some cases, it alters details when those details add unneeded complexity, and large sections of the JVM are not discussed at all. In order to make it clear that the model used is not the JVM and does not implement the actual JVM memory model, the virtual machine described here is called a simple virtual machine (SVM) and implements a simple memory model (SMM). Readers interested in more in-depth treatment of the JVM or computer architectures in general should refer to the Further Reading section at the end of this chapter.

2.4.1 A Simple Memory Model (SMM)

Like most architectures, the SVM uses several different types of memory when running a program. Each type of memory has a distinct purpose and distinct access rules; together they logically partition memory so that the program can execute correctly. The SMM used by the SVM has four different types of memory:

  1. A program counter (PC) register keeps track of the current instruction being executed; in the JVM this would be an address, but in this chapter it is a line of program code.

  2. A heap is used to store all objects and object data; this area is shared by all the SVM threads.

  3. A method area contains class definitions and compiled program instructions for running the programs; this area is also shared by all SVM threads.

  4. A program context contains information unique to each thread, such as the SVM stack. It can also contain other information, such as the PC associated with this thread. The SVM stack contains information such as the variables local to this method and the return address for the calling method.

Each separate SVM program (called a process) contains one PC, heap, method area, and one or more thread contexts, one for each thread. Exhibit 8 shows graphically how this would look on a computer running two SVM processes, the first with threads main and t1 and the second with threads main, t1, and t2.
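The layout just described can be sketched as plain Java data structures. This is only an illustration of the SMM; the class and field names (SvmProcess, ThreadContext, and so on) are invented for this sketch and are not part of any real JVM.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative model of the SMM: a process with a shared heap and method
// area, plus one context (PC and stack) per thread. Names are hypothetical.
public class SvmSketch {
    static class ThreadContext {
        int pc;                                                 // next line for this thread
        Deque<Map<String, Object>> stack = new ArrayDeque<>();  // activation records
    }

    static class SvmProcess {
        Map<String, Object> heap = new HashMap<>();        // shared by all threads
        Map<String, String> methodArea = new HashMap<>();  // shared class/method definitions
        List<ThreadContext> contexts = new ArrayList<>();  // one context per thread
        int pc;                                            // process PC: the running thread's PC
    }

    public static void main(String[] args) {
        // The first process from Exhibit 8: two threads, main and t1.
        SvmProcess p = new SvmProcess();
        p.contexts.add(new ThreadContext());  // context for "main"
        p.contexts.add(new ThreadContext());  // context for "t1"
        System.out.println("threads in process: " + p.contexts.size());  // prints 2
    }
}
```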

Exhibit 8: Graphic Representation of Two Processes in the SVM


One analogy that may help you to understand this concept of a process is to imagine a single builder who is working on several houses. He can only work on one task at a time. Assume that the builder is building a row of townhouses, using the same plans for each, and that he has a single set of tools for working on the houses; therefore, the builder can share the plans and tools. Because not all material to build the houses is available all the time, the builder moves between the houses, working first on one and then another as material for a specific task for a particular house becomes available.

By our analogy, the builder is the CPU, which can be working on one task at a time. The tools are the objects (heap), and the plans are the instructions we are using (the method area). Each townhouse (a thread) has its own status (PC), and the PC used by the builder (CPU) is the one for the townhouse on which he is currently working. The overall construction site is the process, with shared plans and tools but separate townhouses (threads). When the builder moves from working on one townhouse to working on another, this shift is called a context switch, as the context (townhouse) that is being worked on has changed.

This analogy can be extended to a process. If the builder is working on two different sets of townhouses at different locations, each construction site becomes a process. Each of these sites will have its own plans (method area) and tools (heap), so the plans and tools are not shared between the sites but are shared within the sites. Each site also has a set of townhouses to be constructed (threads). When the builder switches from working on one townhouse to another, either at the current site or at the other site, this change is still referred to as a context switch.

As with any analogy, this one between building townhouses and processes and threads can be carried only so far. At some point, the actual workings of the real system must be understood; extending the analogy without understanding the underlying system is likely to produce an invalid picture of how the real system actually works.

2.4.2 Threads, Processes, and Web Servers

The SMM is a very simple model of memory with too many details left out to be of much use in actually implementing a real computer architecture. However, as will be shown in Section 2.4.3, it provides sufficient detail to explain the behavior of a concurrent program. This section uses it to explain how Web servers work and why mechanisms such as servlets are replacing older technologies such as the common gateway interface (CGI).

Operating systems were first written so that multiple programs could be run at the same time on a single computer. No thought was given to the possibility that a single program might use more than one thread or require more than one context. Each program was considered a single, complete entity that did not need to communicate with any other programs while running. Therefore, these operating systems combined the context into the process, and a process had a single heap, a single method area, and only one thread (or context). If a separate thread was needed, a new process was started. Only much later did programmers realize that they could effectively use more than a single thread in a program. A thread is much more effective than a process for handling multiple contexts, for two reasons:

  • Every process has its own heap and method area and therefore carries much more baggage (or weight) than a thread. Creating a thread creates only a new context, which shares the heap and method area with the context that created it; hence, a thread is more "lightweight" than a process. Threads are often referred to as lightweight processes and processes as heavyweight processes. Because a thread has much less overhead, it is less expensive to create a thread and to switch between two threads in the same process. In particular, creating a thread can require several orders of magnitude fewer resources than creating a process.

  • Because threads share a heap, they can communicate using variables in the heap. This makes communication between threads easier than between processes, because processes do not share any runtime memory. To share data between processes, external data stores, such as files, must be used, and these can be computationally expensive.
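The second point can be demonstrated with standard Java threads: two threads started in the same process share heap objects and can communicate through them directly, with no files or other external stores. The Counter class below is a minimal illustration; its methods are synchronized so that no updates are lost (a problem examined later in this chapter).

```java
// Two threads in one process communicate through a shared heap object.
public class SharedHeapDemo {
    static class Counter {
        private int value = 0;
        synchronized void increment() { value++; }  // synchronized so no update is lost
        synchronized int get() { return value; }
    }

    public static void main(String[] args) throws InterruptedException {
        Counter shared = new Counter();  // lives in the heap, visible to both threads
        Runnable work = () -> {
            for (int i = 0; i < 1000; i++) shared.increment();
        };
        Thread t1 = new Thread(work);
        Thread t2 = new Thread(work);
        t1.start(); t2.start();
        t1.join(); t2.join();  // wait for both threads to finish
        System.out.println("final value: " + shared.get());  // prints 2000
    }
}
```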

When Web servers were first introduced, they processed data from Web pages using programs written in languages such as Perl or C/C++. These early programs were run as processes using the CGI mechanism. CGI always created a new process to run the program, much as early operating systems did, which resulted in two major drawbacks to implementing Web servers with processes:

  • Each time a new request came from a Web page, a new process had to be created to handle it. Because every Web page would produce a request, this required that a process be started for every Web page requested from the server. When the Web was new, the number of requests was small, and this was not a problem; however, as use of the Web grew, some Web servers began to receive a million or more requests an hour. The overhead in creating a process became very large and threatened to swamp even very fast servers.

  • Because all data associated with a request was contained in a process, when the process exited the data was no longer available on the server; however, persistent data between Web pages was necessary for tracking the current status of the user to know what actions to take when a request was processed. Several solutions to this problem were attempted by programmers. The data could be written to a file and retrieved with each request, but access to a disk is slow and bogged down the system. The data needed was sometimes put in hidden fields on the HTML page sent to the client computer, but this data was easily hacked and changed and was simply not secure. Other mechanisms to handle the need for persistent data were tried, but all had significant problems.

The problems of large overhead in starting a process and the inability to share data effectively have been solved by applying technologies that use threads, such as servlets. [1] The big difference between CGI and servlets is that, while CGI uses a separate process for each request, a servlet processes each request as a thread inside a process, which solves both problems encountered with CGI. Because threads require far fewer resources to create and somewhat fewer resources for context switches, servlets and other thread-based technologies require far fewer resources than process-based technologies such as CGI.

Because many threads can share the heap, each request can create a thread and cache information in the heap. When the next request comes from the user, this cached information can be retrieved and the new thread created to process this request. This allows for the use of a simple mechanism for maintaining a persistent store on the server.
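A rough sketch of this idea in plain Java: each request runs as a thread, and per-user state is cached in a map in the shared heap, where it survives after the request thread dies. The names here (handleRequest, sessionHits) are invented for illustration; a real servlet container provides this kind of persistence through its own session API.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Thread-per-request sketch: per-user state cached in the shared heap
// outlives the thread that created it. Names are hypothetical.
public class RequestCacheSketch {
    static final Map<String, Integer> sessionHits = new ConcurrentHashMap<>();

    static void handleRequest(String user) {
        // Cached state persists in the heap between request threads.
        sessionHits.merge(user, 1, Integer::sum);
    }

    public static void main(String[] args) throws InterruptedException {
        Thread r1 = new Thread(() -> handleRequest("alice"));  // first request
        Thread r2 = new Thread(() -> handleRequest("alice"));  // later request, new thread
        r1.start(); r1.join();
        r2.start(); r2.join();
        System.out.println("requests seen for alice: " + sessionHits.get("alice"));  // prints 2
    }
}
```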

Using threads has solved a number of problems encountered when implementing programs to run on Web servers, but it has created many others. Because the requests running on the Web servers are now sharing data inside of a process, all the problems that occur with communicating asynchronous activities are present and need to be accounted for. The last two sections of this chapter outline these problems, and the rest of this textbook is concerned with how to handle such problems.

2.4.3 Program Execution for a Procedural Program

This section explains how asynchronous activities, such as threads and processes, can interleave execution. It begins by explaining how a simple procedural program executes and then uses the same model to show how the SVM can use thread contexts in the execution of a concurrent program. The model of execution is expanded in subsequent sections to explain concepts such as process states, the Java synchronized modifier, and wait, notify, and notifyAll methods.

This discussion begins by considering the procedural program in Exhibit 9. In this figure, the lines in the program have been numbered to show the statement in the program that is being executed in the PC. In a real program, the PC would be set to the address of an instruction in the method area.

Exhibit 9: Procedural Program for Exhibit 10

  1 public class PExec {
  2   public void run() {
  3     int counter = 0;
  4     System.out.println("In run, counter = " + counter);
  5     counter++;
  6     System.out.println("In run, counter = " + counter);
  7     return;
  8   }
  9   public static void main(String args[]) {
 10     PExec pe = new PExec();
 11     pe.run();
 12     return;
 13   }
 14 }

The following details how the execution of the program would proceed. Each step corresponds to a frame in Exhibit 10, which shows the SMM state of the process after that step has been executed. This can lead to some confusion, as the PC always points to the next line to be executed, not the line that was just executed.

Exhibit 10: Steps in Executing a Procedural Program


  1. The program begins execution. The SVM creates the heap and thread context for main, initializing them to null. It also creates the method area and loads the methods for the PExec class into the method area. Finally, it creates a PC. Because the program is still starting up, the PC exists and might even be used, but its value for now is undefined.

  2. The SVM pushes an activation record for the main method on the stack. As part of this activation record, a reference to the PExec object is created; however, the object itself is not yet created, so the reference is null. We will represent this reference using a convention similar to that in C: *pe. The SVM at this point also assigns the process PC and the thread PC to the first executable line in the main method, line 10.

  3. The SVM now executes the code at line 10, which creates an instance of the PExec class in the heap. The *pe reference points to this object in the heap (note that it is no longer null), and the SVM updates the PC in this context and the process PC to point to the next executable line in the program, line 11.

  4. The SVM prepares to call the run method. First, the number of the next line to be executed in main, line 12, is stored in a special variable in the activation record of the main method that we will call the retPC. This is the line where execution will continue when the run method returns; it is not accessible by a user program. The SVM then creates an activation record for the run method, creates the local variable counter declared in that method, and sets counter to 0. Finally, the SVM updates the PC to point to the first line of code in the run method, line 4.

  5. Steps 5, 6, and 7 are combined in Exhibit 10, and the values shown are those present at the end of step 7. In these three steps, the SVM executes lines 4, 5, and 6 of the run method: outputting a line with the counter, incrementing the counter variable, and outputting a second line. At the end of this series of steps the PC points to line 7, and the counter on the stack has been incremented to 1.

  6. The SVM now executes the return from the run method. This causes the run method's activation record to be popped (and deleted) from the stack and the PC to be set to the retPC that was saved when the method was called. The program is now executing back in the main method.

  7. The main method executes a return, and the main method's activation record is popped from the stack. This means that the reference to the PExec object is deleted, but the object itself is deleted later, when the heap is cleaned up by a process called garbage collection. Also, the information in the method area is not deleted. This leaves the process without any context. The PC is now undefined, as the SVM is doing internal housekeeping to clean up the process.

  8. The SVM now destroys this process, and any information left in the heap or the method area will be destroyed when the process is destroyed.
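The bookkeeping in steps 2 through 8 can be modeled with a toy stack of activation records. This is only a sketch of the SVM's behavior under the names used above (ActivationRecord, retPC); these are not real JVM structures.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Toy model of the call in Exhibit 10: activation records hold the locals of
// a method and the retPC of its caller. Names mirror the text, hypothetically.
public class ActivationDemo {
    static class ActivationRecord {
        final String method;
        final int retPC;  // line to resume in the caller when this method returns
        final Map<String, Integer> locals = new HashMap<>();
        ActivationRecord(String method, int retPC) { this.method = method; this.retPC = retPC; }
    }

    public static void main(String[] args) {
        Deque<ActivationRecord> stack = new ArrayDeque<>();
        stack.push(new ActivationRecord("main", -1));            // step 2: main's record
        ActivationRecord run = new ActivationRecord("run", 12);  // step 4: retPC = line 12
        run.locals.put("counter", 0);                            //         counter = 0
        stack.push(run);
        run.locals.merge("counter", 1, Integer::sum);            // steps 5-7: counter++
        int resumeAt = stack.pop().retPC;                        // step 6: return from run
        System.out.println("resume main at line " + resumeAt);   // prints 12
    }
}
```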

This example shows how the SVM can be used to explain the execution of a procedural program. The next section adds a few details to the SVM and explains the execution of a concurrent program.

2.4.4 Program Execution and Context Switching with Threads

The execution of a concurrent process in the SVM is the same as for a procedural program with one major difference. Because a concurrent program has multiple threads, multiple contexts will exist in a concurrent process, and, because each context has its own PC, the SVM must choose which thread context to run at each execution cycle and hence which context's PC to use. Many factors can go into this decision, some of which can only be decided at run-time. For example, if a context has had several statements run without interruption, the SVM may decide to allow another context to run on the next cycle. Because it is impossible for the programmer to know all the factors that will go into deciding what thread context will be selected next, the rule for the SVM will be that at any time any valid and ready thread context can be selected. The programmer should never make any assumptions about what context, and hence what PC, will be selected.

This discussion of threads begins by answering the question of how a new thread context gets created and the thread started. When a thread is created, it is initially in a born state (see Exhibit 11). The start method is then called on the thread, and the thread is moved to the ready state, a state from which it can be run. Often, many threads are ready to run and must be managed by the SVM. This is done by placing these threads in a ready set, [2] as shown in Exhibit 12. The thread that is currently executing using the CPU is in the running state. For a computer with a single CPU, only one thread can be in the running state at any time.

Exhibit 11: SVM Thread States


Exhibit 12: Relationship between Ready and Running States in SVM


When the SVM chooses the next thread to execute, it can choose the thread that is currently in the running state or any thread that is in the ready set; the programmer cannot influence this decision, [3] and any one of these threads is equally likely to be chosen. If the SVM chooses the thread already running, it simply executes the instruction in the process PC. If it chooses a thread from the ready set, the process PC is written to the context for the currently running thread, its state is changed to ready, the PC of the new thread is loaded into the process PC, and the state of the newly executing thread is changed to running. Because the context (PC and stack that are used) has been changed when this occurs, the process of suspending one thread and starting another is referred to as a context switch. When a thread completes running, it is moved to the dead state. In the dead state, a thread still exists but it cannot be executed. The dead state is necessary to allow the SVM to maintain the context of the thread until it is safe to get rid of the thread.
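The born, ready/running, and dead states of the SVM correspond roughly to the NEW, RUNNABLE, and TERMINATED states that real Java threads report through Thread.getState, as this small program shows:

```java
// Observing real Java thread states that parallel the SVM's states:
// NEW ~ born, RUNNABLE ~ ready/running, TERMINATED ~ dead.
public class ThreadStateDemo {
    public static void main(String[] args) throws InterruptedException {
        Thread t = new Thread(() -> { });
        System.out.println(t.getState());  // NEW: the thread exists but cannot yet run
        t.start();                         // moves the thread into the runnable states
        t.join();                          // wait until the thread completes
        System.out.println(t.getState());  // TERMINATED: the thread can no longer be run
    }
}
```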

Using these relationships, the execution of a concurrent program is explained in a manner similar to that used in Exhibit 10. A few things are added to the SMM to be able to support the addition of concurrency. First, the thread context must have a state variable added so that the state of the context can be tracked. Because this variable is used only by the SVM and must only be changed when a thread is moved to a running state or moved to the ready set, it cannot be accessed directly by the programmer. Second, the current ready set and the thread that is running must be maintained by the SVM process.

The rest of this section discusses the execution of the concurrent program shown in Exhibit 13. The steps involved in running this program are given in Exhibit 14 and are covered in the following discussion. As in Exhibit 10, each step in Exhibit 14 shows the state after the step has completed, in this case including the choice of the next thread to be run; thus, the running thread and PC represent the situation for the next step.

Exhibit 13: Concurrent Program for Exhibit 14

  1 public class CExec implements Runnable {
  2   public void run() {
  3     int counter = 0;
  4     System.out.println("In run, counter = " + counter);
  5     counter++;
  6     System.out.println("In run, counter = " + counter);
  7     return;
  8   }
  9   public static void main(String args[]) {
 10     CExec ce = new CExec();
 11     Thread t1 = new Thread(ce);
 12     t1.start();
 13     System.out.println("in main");
 14     return;
 15   }
 16 }

Exhibit 14: Steps in Executing a Concurrent Program


  1. The program begins execution. The SVM creates the heap and thread context for main, initializing them to null. It also creates the method area and loads the methods for the CExec class into the method area. Finally, it creates a PC. Because the program is still starting up, the PC exists and might even be used, but its value for now is undefined.

  2. The SVM pushes an activation record for the main method on the stack. As part of this activation record, a reference to the CExec object is created; however, this object is not yet created so the reference is null. We will represent this reference using a convention similar to that in C: *ce. The SVM at this point also assigns the process PC and the thread PC to the first executable line in the main method (line 10).

  3. The SVM now executes the code at line 10, which creates an instance of the CExec class in the heap. The *ce reference points to this object in the heap, and the SVM updates the PC in this context and the process PC to point to the next executable line in the program (line 11).

  4. The SVM executes the code at line 11, which creates an instance of the Thread class in the heap. When the thread object is created, a context for the thread is also created, but the status is born, and this thread is not yet executable. To actually be eligible to run, a thread must be ready or running; therefore, only one thread, the main thread, can be run and execution continues in the main thread.

  5. Line 12 is executed, which puts the t1 thread in the ready set and sets the status of the t1 thread to ready. The SVM is now free to choose either the main thread or the t1 thread to run. In this example, the t1 thread is chosen, but the choice of t1 is completely arbitrary. Because t1 has been chosen, a context switch from the main thread to the t1 thread is necessary. The current process PC is saved in the context of the main thread, and the process PC is set to the PC from the context of the t1 thread. Finally, the states of the two threads are updated.

  6. Line 4 in the t1 thread is executed, which produces a line of output, part of which is the value of the local variable counter. The SVM now chooses which thread to run. In this example, thread t1 is chosen to be run again, but, again, this is completely arbitrary.

  7. Line 5 in the t1 thread is executed, which increments the local variable counter, located on the stack of thread t1. The SVM now chooses to run the main thread again and performs a context switch.

  8. Line 13 in the main thread is executed and produces a line of output. The SVM chooses to continue to run in the main thread, updating the PC to line 14.

  9. Line 14 is executed, which completes the main thread. This thread is now dead and cannot be executed; however, it cannot yet finish. A thread that creates another thread is called a parent thread, and in the SVM a parent thread must remain until all of its children are dead. Therefore, this thread is moved to the dead state and sits idle until all of its children also die. Only one thread (thread t1) can be run now, so the SVM must select it.

  10. Line 6 in thread t1 is executed and produces an output. The SVM continues to execute in thread t1, updating the PC to line 7.

  11. Line 7 is executed, which completes thread t1. Thread t1 changes to state dead; because it has no children, it can complete and its context is deleted. The main thread now has no children, and its context is also deleted. The program is now free to exit.

2.4.5 Sleeping and Blocking

This model is good enough to explain a program where all the threads are always either ready or running. However, often a thread will be suspended for some period of time, such as when a call is made to Thread.sleep or an I/O is requested. When this occurs, the thread is moved into a sleeping or blocked state until the condition that moved it to this state has been satisfied, as shown in Exhibit 15. When a thread is in a sleeping or blocked state, it cannot be run, so it would be a waste of time to have the CPU check whether or not to run it in each execution cycle. Threads that are sleeping or blocked really do not belong in the ready set, but they still have to be handled and accounted for by the SVM. Therefore, the SVM creates separate queues for threads that are in these states, as shown in Exhibit 16. This method of handling suspended threads will be used later to explain the behavior of the Java wait and notify methods.
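Real Java threads behave the same way: a call to Thread.sleep moves the thread out of the runnable states, which the JVM reports as TIMED_WAITING. In the sketch below, the 200-millisecond pause is only there to give the sleeper time to reach its sleep call; on a heavily loaded machine the printed state could, in principle, differ.

```java
// A sleeping thread leaves the runnable states: Java reports it as
// TIMED_WAITING, the analogue of the SVM's sleep queue.
public class SleepStateDemo {
    public static void main(String[] args) throws InterruptedException {
        Thread sleeper = new Thread(() -> {
            try { Thread.sleep(2000); } catch (InterruptedException e) { /* woken early */ }
        });
        sleeper.start();
        Thread.sleep(200);                       // give the sleeper time to reach sleep()
        System.out.println(sleeper.getState());  // usually TIMED_WAITING
        sleeper.interrupt();                     // wake it early so the program exits promptly
        sleeper.join();
    }
}
```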

Exhibit 15: State Model with Sleep and Blocked States


Exhibit 16: SVM with Sleep and Blocked Queues


2.4.6 Nondeterminism and Concurrency

As shown by the output of Program2.5 (Exhibit 6) and the discussion in Section 2.4.4, because the SVM (and the JVM and most concurrent execution environments) can choose the thread to run at each step in the program execution, it is possible to have more than one possible valid output from a concurrent program. As shown in the discussion of Program2.5 (Exhibit 6), the number of different possible execution paths in a program can be very large even for very simple programs with a very small number of threads. This property of a program being able to produce a number of different execution paths not completely controlled by the programmer is referred to as nondeterminism because the programmer cannot determine the absolute order of the statements.

Programmers who have never programmed with concurrency may not be familiar with nondeterministic behavior in a program. If a procedural program is run a million times on the same input, it will always produce the same answer; however, a concurrent program that runs correctly a million times could produce an error on the million and first run. This is the problem with nondeterminism: a programmer cannot test to show that a program is correct, even for a limited subset of input. The only way to show a program is correct is to execute all possible paths through the program, and hence produce all results that can be produced by any possible interleaving of threads. While this is theoretically possible, it is completely impractical because of the large number of possible ways the threads can interleave. And, if even one of these interleavings is invalid, the possibility exists that the program will produce an invalid answer. The need for a program to always produce a correct answer for a given set of input is referred to as program safety.
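Nondeterminism can be made concrete with a few lines of Java: two threads increment a shared counter without synchronization, so increments can interleave and be lost, and repeated runs of the same program may print different totals. The final value here is deliberately unpredictable.

```java
// Deliberately unsafe: two threads race on a shared, unsynchronized counter.
public class RaceDemo {
    static int counter = 0;  // shared through the heap, no synchronization

    public static void main(String[] args) throws InterruptedException {
        Runnable work = () -> {
            // counter++ is not atomic: it is a read, an add, and a write,
            // and another thread's update can be lost between them.
            for (int i = 0; i < 100_000; i++) counter++;
        };
        Thread t1 = new Thread(work);
        Thread t2 = new Thread(work);
        t1.start(); t2.start();
        t1.join(); t2.join();
        // At most 200000, but often less, and different from run to run.
        System.out.println("counter = " + counter);
    }
}
```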

Maintaining program safety is difficult because problems can become more prevalent as the number of interacting threads is increased. Normally, when a program is tested, a test suite is developed that is more limited than the environment in which the program will actually be run, so some problems occur only when the program is put into production and often can only be repeated in production environments. The problem of unsafe programs is one of the central issues in concurrent programming and is examined in more detail in Section 2.5.

[1]Servlets are not the only technology that uses threads in a server. Languages such as Perl that originally were run only as processes can now be run as threads.

[2]Note that the term set is intentionally used here to show that the threads are not ordered. This is consistent with the terminology used in [LIN99]. The more commonly used term in texts on operating systems is a ready list, but this implies an ordering that is not necessarily present.

[3]In the JVM, the programmer can influence this decision by the use of thread priorities; however, it is incorrect to use priorities to control the interactions between them. Priorities serve no useful purpose in this discussion and so are dropped from the SVM.




Creating Components. Object Oriented, Concurrent, and Distributed Computing in Java