1.4 Concurrency at the Applications Level | UNIX Systems Programming: Communication, Concurrency and Threads

Concurrency occurs at the hardware level because multiple devices operate at the same time. Processors have internal parallelism and work on several instructions simultaneously, systems have multiple processors, and systems interact through network communication. Concurrency is visible at the applications level in signal handling, in the overlap of I/O and processing, in communication, and in the sharing of resources between processes or among threads in the same process. This section provides an overview of concurrency and asynchronous operation.

1.4.1 Interrupts

The execution of a single instruction in a program at the conventional machine level is the result of the processor instruction cycle. During normal execution of its instruction cycle, a processor retrieves an address from the program counter and executes the instruction at that address. (Modern processors have internal parallelism such as pipelines to reduce execution time, but this discussion does not consider that complication.) Concurrency arises at the conventional machine level because a peripheral device can generate an electrical signal, called an interrupt, to set a hardware flag within the processor. The detection of an interrupt is part of the instruction cycle itself. On each instruction cycle, the processor checks hardware flags to see if any peripheral devices need attention. If the processor detects that an interrupt has occurred, it saves the current value of the program counter and loads a new value that is the address of a special function called an interrupt service routine or interrupt handler. After finishing the interrupt service routine, the processor must be able to resume execution of the interrupted program where it left off.
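The cycle just described can be sketched as a toy software model. This is only an illustration under stated assumptions: real processors perform these steps in hardware, and every name here (toy_cpu, toy_step, toy_raise_interrupt, ISR_START, ISR_END) is hypothetical and chosen for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model; real processors implement this cycle in hardware. */
enum {
    ISR_START = 100,  /* assumed address of the interrupt service routine */
    ISR_END   = 101   /* assumed address of its final "return" instruction */
};

struct toy_cpu {
    int  pc;                 /* program counter */
    int  saved_pc;           /* value saved when an interrupt is taken */
    bool interrupt_pending;  /* hardware flag set by a peripheral device */
    bool in_isr;             /* true while the service routine is running */
};

/* A device calls this to request attention. */
void toy_raise_interrupt(struct toy_cpu *cpu) {
    assert(cpu != NULL);
    cpu->interrupt_pending = true;
}

/* One instruction cycle: check the flag first, otherwise execute at pc. */
void toy_step(struct toy_cpu *cpu) {
    assert(cpu != NULL);
    if (cpu->interrupt_pending && !cpu->in_isr) {
        cpu->saved_pc = cpu->pc;        /* remember where to resume */
        cpu->pc = ISR_START;            /* transfer to the service routine */
        cpu->interrupt_pending = false;
        cpu->in_isr = true;
        return;
    }
    if (cpu->in_isr && cpu->pc == ISR_END) {
        cpu->pc = cpu->saved_pc;        /* return to the interrupted program */
        cpu->in_isr = false;
        return;
    }
    cpu->pc++;                          /* "execute" an ordinary instruction */
}
```

In this model, a program stepping through addresses 5, 6, 7 and interrupted after address 5 detours through addresses 100 and 101 and then continues at address 6, so the interrupted program cannot tell from its own program counter that the detour happened.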

An event is asynchronous to an entity if the time at which it occurs is not determined by that entity. The interrupts generated by external hardware devices are generally asynchronous to programs executing on the system. The interrupts do not always occur at the same point in a program's execution, but a program should give a correct result regardless of where it is interrupted. In contrast, an error event such as division by zero is synchronous in the sense that it always occurs during the execution of a particular instruction if the same data is presented to the instruction.

Although the interrupt service routine may be part of the program that is interrupted, the processing of an interrupt service routine is a distinct entity with respect to concurrency. Operating-system routines called device drivers usually handle the interrupts generated by peripheral devices. These drivers then notify the relevant processes, through a software mechanism such as a signal, that an event has occurred.

Operating systems also use interrupts to implement timesharing. Most machines have a device called a timer that can generate an interrupt after a specified interval of time. To execute a user program, the operating system starts the timer before setting the program counter. When the timer expires, it generates an interrupt that causes the CPU to execute the timer interrupt service routine. The interrupt service routine writes the address of the operating system code into the program counter, and the operating system is back in control. When a process loses the CPU in the manner just described, its quantum is said to have expired. The operating system puts the process in a queue of processes that are ready to run. The process waits there for another turn to execute.

1.4.2 Signals

A signal is a software notification of an event. Often, a signal is a response of the operating system to an interrupt (a hardware event). For example, a keystroke such as Ctrl-C generates an interrupt for the device driver handling the keyboard. The driver recognizes the character as the interrupt character and notifies the processes that are associated with this terminal by sending a signal. The operating system may also send a signal to a process to notify it of a completed I/O operation or an error.

A signal is generated when the event that causes the signal occurs. Signals can be generated either synchronously or asynchronously. A signal is generated synchronously if it is generated by the process or thread that receives it. The execution of an illegal instruction or a divide-by-zero may generate a synchronous signal. A Ctrl-C on the keyboard generates an asynchronous signal. Signals (Chapter 8) can be used for timers (Chapter 10), terminating programs (Section 8.2), job control (Section 11.7) or asynchronous I/O (Section 8.8).

A process catches a signal when it executes a handler for the signal. A program that catches a signal has at least two concurrent parts, the main program and the signal handler. Potential concurrency restricts what can be done inside a signal handler (Section 8.6). If the signal handler modifies external variables that the program can modify elsewhere, then proper execution may require that those variables be protected.
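One common way to keep a handler and the main program from interfering is to have the handler do nothing but set a flag, and to block the signal while the main program examines that flag. The following is a minimal sketch of that idea, assuming SIGUSR1 as the signal of interest; the names got_signal, on_sigusr1, install_sigusr1_handler and consume_signal_flag are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

/* Flag shared by the handler and the main program. The C standard
   guarantees that a volatile sig_atomic_t can be written from a handler. */
static volatile sig_atomic_t got_signal = 0;

/* The handler only records that the signal arrived. */
static void on_sigusr1(int signo) {
    (void)signo;
    got_signal = 1;
}

/* Install the handler; returns 0 on success, -1 on failure. */
int install_sigusr1_handler(void) {
    struct sigaction act;

    memset(&act, 0, sizeof(act));
    act.sa_handler = on_sigusr1;
    act.sa_flags = 0;
    if (sigemptyset(&act.sa_mask) == -1)
        return -1;
    return sigaction(SIGUSR1, &act, NULL);
}

/* Read and clear the flag with SIGUSR1 blocked, so the handler cannot
   run between the read and the clear. Returns the value that was read. */
int consume_signal_flag(void) {
    sigset_t block, old;
    int seen, rc;

    sigemptyset(&block);
    sigaddset(&block, SIGUSR1);
    rc = sigprocmask(SIG_BLOCK, &block, &old);
    assert(rc == 0);
    (void)rc;
    seen = got_signal;
    got_signal = 0;
    sigprocmask(SIG_SETMASK, &old, NULL);  /* restore the previous mask */
    return seen;
}
```

A main program would call install_sigusr1_handler once and then call consume_signal_flag at convenient points in its loop. In a multithreaded program, pthread_sigmask would take the place of sigprocmask.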

1.4.3 Input and output

A challenge for operating systems is to coordinate resources that have greatly differing characteristic access times. The processor can perform millions of operations on behalf of other processes while a program waits for a disk access to complete. Alternatively, the process can avoid blocking by using asynchronous I/O or dedicated threads instead of ordinary blocking I/O. The tradeoff is between the additional performance and the extra programming overhead in using these mechanisms.

A similar problem occurs when an application monitors two or more input channels such as input from different sources on a network. If standard blocking I/O is used, an application that is blocked waiting for input from one source is not able to respond if input from another source becomes available.
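One way around this problem is to ask the operating system which descriptor is ready before reading. The sketch below uses the POSIX poll function for that purpose; it is one possible approach rather than the only one (select and dedicated threads are alternatives), and the function names wait_for_either and read_ready_input are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/* Wait until fd_a or fd_b has input (or has reached end-of-file).
   Returns the ready descriptor, preferring fd_a if both are ready.
   Returns -1 on timeout or error, including interruption by a signal.
   A negative timeout_ms waits indefinitely. */
int wait_for_either(int fd_a, int fd_b, int timeout_ms) {
    struct pollfd fds[2];
    int ready;

    assert(fd_a >= 0 && fd_b >= 0);
    fds[0].fd = fd_a;
    fds[0].events = POLLIN;
    fds[0].revents = 0;
    fds[1].fd = fd_b;
    fds[1].events = POLLIN;
    fds[1].revents = 0;

    ready = poll(fds, 2, timeout_ms);
    if (ready <= 0)
        return -1;
    if (fds[0].revents & (POLLIN | POLLHUP))
        return fd_a;
    if (fds[1].revents & (POLLIN | POLLHUP))
        return fd_b;
    return -1;
}

/* Read from whichever descriptor becomes ready first. */
ssize_t read_ready_input(int fd_a, int fd_b, char *buf, size_t len) {
    int fd = wait_for_either(fd_a, fd_b, -1);

    if (fd == -1)
        return -1;
    return read(fd, buf, len);
}
```

With this arrangement the application blocks in a single place, waiting for either source, instead of blocking in a read on one source while the other goes unserviced.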

1.4.4 Processes, threads and the sharing of resources

A traditional method for achieving concurrent execution in UNIX is for the user to create multiple processes by calling the fork function. The processes usually need to coordinate their operation in some way. In the simplest instance they may only need to coordinate their termination. Even the termination problem is more difficult than it might seem. Chapter 3 addresses process structure and management and introduces the UNIX fork, exec and wait system calls.
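As a small illustration of coordinating termination, the following sketch creates one child with fork and has the parent wait for it. The function name run_child_and_wait is hypothetical, and the child does no real work; it simply exits with the requested status.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that exits with `code`, then wait for it in the parent.
   Returns the child's exit status, or -1 if fork or waitpid fails or
   the child did not terminate normally. */
int run_child_and_wait(int code) {
    pid_t pid;
    int status;

    assert(code >= 0 && code <= 255);
    pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0)
        _exit(code);                 /* child: terminate immediately */

    /* Parent: wait for this child, restarting if a signal interrupts. */
    while (waitpid(pid, &status, 0) == -1)
        if (errno != EINTR)
            return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Even this small case shows some of the difficulty: the wait can be interrupted by a signal, and the child may have been terminated by a signal rather than by a normal exit, so the parent has to check both.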

Processes that have a common ancestor can communicate through pipes (Chapter 6). Processes without a common ancestor can communicate by signals (Chapter 8), FIFOs (Section 6.3), semaphores (Sections 14.2 and 15.2), shared address space (Section 15.3) or messages (Section 15.4 and Chapter 18).
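The pipe case can be sketched briefly: because a pipe is created before the fork, parent and child both inherit its descriptors. The function name receive_from_child is hypothetical, and the sketch assumes a short message. It does not restart reads interrupted by signals.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that writes `msg` into a pipe; the parent reads it into
   `buf` as a null-terminated string. Returns the number of bytes
   received, or -1 if the pipe or the child could not be created. */
ssize_t receive_from_child(const char *msg, char *buf, size_t bufsize) {
    int fd[2];
    pid_t pid;
    size_t total = 0;
    ssize_t n;

    assert(msg != NULL && buf != NULL && bufsize > 0);
    if (pipe(fd) == -1)
        return -1;
    pid = fork();
    if (pid == -1) {
        close(fd[0]);
        close(fd[1]);
        return -1;
    }
    if (pid == 0) {                      /* child: writer */
        close(fd[0]);
        if (write(fd[1], msg, strlen(msg)) == -1)
            _exit(1);
        _exit(0);
    }

    close(fd[1]);   /* parent: close the write end so read can see EOF */
    while (total < bufsize - 1 &&
           (n = read(fd[0], buf + total, bufsize - 1 - total)) > 0)
        total += (size_t)n;
    buf[total] = '\0';
    close(fd[0]);
    waitpid(pid, NULL, 0);               /* collect the child's status */
    return (ssize_t)total;
}
```

The parent closes its copy of the write end before reading. Otherwise the read would never report end-of-file, because a writer (the parent itself) would still have the pipe open.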

Multiple threads of execution can provide concurrency within a process. When a program executes, the CPU uses the program counter to determine which instruction to execute next. The resulting stream of instructions is called the program's thread of execution. It is the flow of control for the process. If two distinct threads of execution share a resource within a time frame, care must be taken that these threads do not interfere with each other. Multiprocessor systems expand the opportunity for concurrency and sharing among applications and within applications. When a multithreaded application has more than one thread of execution concurrently active on a multiprocessor system, multiple instructions from the same process may be executed at the same time.
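A shared counter is perhaps the simplest resource on which threads can interfere with each other. The sketch below protects the counter with a POSIX mutex; the names shared_counter, add_to_counter, count_with_threads and MAX_THREADS are hypothetical, and on many systems the program must be built with the compiler's thread option (for example -pthread).

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

enum { MAX_THREADS = 16 };   /* arbitrary limit chosen for this sketch */

struct shared_counter {
    pthread_mutex_t lock;    /* protects value */
    long value;              /* the shared resource */
    long increments;         /* work per thread; read-only once threads start */
};

/* Thread body: add 1 to the shared counter `increments` times. */
static void *add_to_counter(void *arg) {
    struct shared_counter *c = arg;

    for (long i = 0; i < c->increments; i++) {
        pthread_mutex_lock(&c->lock);
        c->value++;                      /* only one thread at a time here */
        pthread_mutex_unlock(&c->lock);
    }
    return NULL;
}

/* Run nthreads threads against one counter and return its final value,
   or -1 if the mutex or any thread could not be created. */
long count_with_threads(int nthreads, long increments) {
    struct shared_counter c;
    pthread_t tids[MAX_THREADS];
    int started = 0;

    assert(nthreads > 0 && nthreads <= MAX_THREADS && increments >= 0);
    if (pthread_mutex_init(&c.lock, NULL) != 0)
        return -1;
    c.value = 0;
    c.increments = increments;

    while (started < nthreads &&
           pthread_create(&tids[started], NULL, add_to_counter, &c) == 0)
        started++;
    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);
    pthread_mutex_destroy(&c.lock);
    return started == nthreads ? c.value : -1;
}
```

Without the lock, two threads could each read the same old value and write back the same new value, losing an increment. With the lock, the final count always equals the number of threads times the increments per thread.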

Until recently there has not been a standard for using threads, and each vendor's thread package behaved differently. A thread standard has now been incorporated into the POSIX standard. Chapters 12 and 13 discuss this new standard.

1.4.5 Multiple processors with shared memory

How many CPUs does a typical home computer have? If you think the answer is one, think again. In early machines, the main CPU handled most of the decision making. As machine design evolved, I/O became more complicated and placed more demands on the CPU. One way of enhancing the performance of a system is to determine which components are the bottlenecks and then improve or replicate these components. The main I/O controllers such as the video controller and disk controller took over some of the processing related to these peripherals, relieving the CPU of this burden. In modern machines, these controllers and other I/O controllers have their own special purpose CPUs.

What if, after all this auxiliary processing has been offloaded, the CPU is still the bottleneck? There are two approaches to improving performance: use a faster CPU or use more CPUs. Admiral Grace Murray Hopper, a pioneer in computer software, often compared computing to the way fields were plowed in the pioneer days: "If one ox could not do the job, they did not try to grow a bigger ox, but used two oxen." It was usually cheaper to add another processor or two than to increase the speed of a single processor. Some problems do not lend themselves to just increasing the number of processors indefinitely. Seymour Cray, a pioneer in computer hardware, is reported to have said, "If you were plowing a field, which would you rather use? Two strong oxen or 1024 chickens?"

The optimal tradeoff between more CPUs and better CPUs depends on several factors, including the type of problem to be solved and the cost of each solution. Machines with multiple CPUs have already migrated to the desktop and are likely to become more common as prices drop. Concurrency issues at the application level are slightly different when there are multiple processors, but the methods discussed in this book are equally applicable in a multiprocessor environment.

1.4.6 The network as the computer

Another important trend is the distribution of computation over a network. Concurrency and communication meet to form new applications. The most widely used model of distributed computation is the client-server model. The basic entities in this model are server processes that manage resources, and client processes that require access to shared resources. (A process can be both a server and a client.) A client process shares a resource by sending a request to a server. The server performs the request on behalf of the client and sends a reply to the client. Examples of applications based on the client-server model include file transfer (ftp), electronic mail, file servers and the World Wide Web. Development of client-server applications requires an understanding of concurrency and communication.
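The request-reply pattern can be sketched on a single machine. The example below is an illustration under loud assumptions: it uses a local socket pair and a forked child as a stand-in for a network connection and a remote server, and the "resource" is a trivial squaring service. The names serve_one_request and request_square are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Server side: read one request, perform it, send the reply, and exit.
   The messages are small enough that this sketch assumes each one
   arrives in a single read. */
static void serve_one_request(int fd) {
    int32_t request;
    int64_t reply;

    if (read(fd, &request, sizeof(request)) != (ssize_t)sizeof(request))
        _exit(1);
    reply = (int64_t)request * request;   /* the requested service */
    if (write(fd, &reply, sizeof(reply)) != (ssize_t)sizeof(reply))
        _exit(1);
    _exit(0);
}

/* Client side: send `value` to a freshly forked server and store the
   reply in *result. Returns 0 on success, -1 on failure. */
int request_square(int32_t value, int64_t *result) {
    int sv[2];
    pid_t pid;
    int ok;

    assert(result != NULL);
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return -1;
    pid = fork();
    if (pid == -1) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    if (pid == 0) {                       /* child acts as the server */
        close(sv[0]);
        serve_one_request(sv[1]);         /* does not return */
    }

    close(sv[1]);                         /* parent acts as the client */
    ok = write(sv[0], &value, sizeof(value)) == (ssize_t)sizeof(value) &&
         read(sv[0], result, sizeof(*result)) == (ssize_t)sizeof(*result);
    close(sv[0]);
    waitpid(pid, NULL, 0);
    return ok ? 0 : -1;
}
```

A real network server would listen on a well-known address, accept connections from many clients, and agree with them on a byte order for the data, none of which this sketch attempts.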

The object-based model is another model for distributed computation. Each resource in the system is viewed as an object with a message-handling interface, allowing all resources to be accessed in a uniform way. The object-based model allows for controlled incremental development and code reuse. Object frameworks define interactions between code modules, and the object model naturally expresses notions of protection. Many of the experimental distributed operating systems such as Argus [74], Amoeba [124], Mach [1], Arjuna [106], Clouds [29] and Emerald [11] are object based. Object-based models require object managers to track the location of the objects in the system.

An alternative to a truly distributed operating system is to provide application layers that run on top of common operating systems to exploit parallelism on the network. The Parallel Virtual Machine (PVM) and its successor, Message Passing Interface (MPI), are software libraries [10, 43] that allow a collection of heterogeneous workstations to function as a parallel computer for solving large computational problems. PVM manages and monitors tasks that are distributed on workstations across the network. Chapter 17 develops a dispatcher for a simplified version of PVM. CORBA (Common Object Request Broker Architecture) is another type of software layer that provides an object-oriented interface to a set of generic services in a heterogeneous distributed environment [104].
