Modes of Multitasking

In the early days of the PC, some people advocated multitasking for the future, but many others scratched their heads in puzzlement: Of what use is multitasking on a single-user personal computer? Well, it turned out that multitasking was something users wanted without really knowing it.

Multitasking Under DOS?

The Intel 8088 microprocessor used in the original PC was not exactly built for multitasking. Part of the problem was inadequate memory management. As multiple programs are started up and ended, a multitasking operating system is often called upon to move memory blocks around to consolidate free space. This was not possible on the 8088 in a manner transparent to applications.

DOS itself didn't help much. Designed to be small and to stay out of the way of applications, DOS supported very little beyond loading programs and providing them with access to the file system.

Still, creative programmers in the early days of DOS found a way to overcome those obstacles, mostly with terminate-and-stay-resident (TSR) programs. Some TSRs, such as print spoolers, hooked into the hardware timer interrupt to perform true background processing. Others, like popup utilities such as SideKick, could perform a type of task switching—suspending an application while the popup was running. DOS was also progressively enhanced to provide support for TSRs.

Some software vendors attempted to graft task-switching or multitasking shells on top of DOS (such as Quarterdeck's DesqView), but only one of these environments eventually achieved large market penetration. That, of course, is Windows.

Nonpreemptive Multitasking

When Microsoft introduced Windows 1.0 in 1985, it was the most sophisticated solution yet devised to go beyond the limitations of DOS. Back then, Windows ran in real mode, but even so, it was able to move memory blocks around in physical memory—a prerequisite for multitasking—in a way that was not quite transparent to applications but almost tolerable.

Multitasking makes a lot more sense in a graphical windowing environment than it does in a command-line single-user operating system. For example, in classical command-line UNIX, it is possible to execute programs off the command line so that they run in the background. However, any display output from the program must be redirected to a file or the output will get mixed up with whatever else the user is doing.

A windowing environment allows multiple programs to run together on the same screen. Switching back and forth becomes trivial, and it is also possible to quickly move data from one program to another; for example, to embed a picture created in a drawing program into a text file maintained by a word processing program. Data transfer has been supported in various ways under Windows, first with the clipboard, later through Dynamic Data Exchange (DDE), and now through Object Linking and Embedding (OLE).

Yet the multitasking implemented in the early versions of Windows was not the traditional preemptive time-slicing found in multiuser operating systems. Those operating systems use a system clock to periodically interrupt one task and restart another. The 16-bit versions of Windows supported something called "nonpreemptive multitasking." This type of multitasking is made possible because of the message-based architecture of Windows. In the general case, a Windows program sits dormant in memory until it receives a message. These messages are often the direct or indirect result of user input through the keyboard or mouse. After processing the message, the program returns control back to Windows.

The 16-bit versions of Windows did not arbitrarily switch control from one Windows program to another based on a timer tick. Instead, any task switching took place when a program had finished processing a message and had returned control to Windows. This nonpreemptive multitasking is also called "cooperative multitasking" because it requires some cooperation on the part of applications. One Windows program could tie up the whole system if it took a long time processing a message.

Although nonpreemptive multitasking was the general rule in 16-bit Windows, some forms of preemptive multitasking were also present. Windows used preemptive multitasking for running DOS programs and also allowed dynamic-link libraries to receive hardware timer interrupts for multimedia purposes.

The 16-bit versions of Windows included several features to help programmers solve—or at least cope with—the limitations of nonpreemptive multitasking. The most notorious is, of course, the hourglass mouse cursor. This is not a solution but just a way of letting the user know that a program is busy working on a lengthy job and that the system will be otherwise unusable for a while. Another partial solution is the Windows timer, which allows a program to receive a message and do some work at periodic intervals. The timer is often used for clock applications and animation.
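For example, a window procedure can ask the system for WM_TIMER messages at fixed intervals. Here's a minimal sketch (ID_TIMER is simply an identifier I've made up for illustration):

#include <windows.h>

#define ID_TIMER 1

LRESULT CALLBACK WndProc (HWND hwnd, UINT message, WPARAM wParam, LPARAM lParam)
{
     switch (message)
     {
     case WM_CREATE:
          SetTimer (hwnd, ID_TIMER, 1000, NULL) ;   // a WM_TIMER message about once a second
          return 0 ;

     case WM_TIMER:
          // do one small piece of periodic work here
          return 0 ;

     case WM_DESTROY:
          KillTimer (hwnd, ID_TIMER) ;
          PostQuitMessage (0) ;
          return 0 ;
     }
     return DefWindowProc (hwnd, message, wParam, lParam) ;
}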

Another solution to the limitations of nonpreemptive multitasking is the PeekMessage function call, as we saw in Chapter 5 in the RANDRECT program. Normally, a program uses the GetMessage call to retrieve the next message from its message queue. However, if there are no messages in the message queue, then GetMessage will not return until a message is present. PeekMessage, on the other hand, returns control to the program even if no messages are pending. Thus, a program can perform a long job and intermix PeekMessage calls in the code. The long job will continue running as long as there are no pending messages for the program or any other program.
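The resulting message loop looks roughly like this (a sketch patterned on the RANDRECT approach; DoAPieceOfTheLongJob is a hypothetical helper that performs one small step of the lengthy task and returns quickly):

MSG msg ;

while (TRUE)
{
     if (PeekMessage (&msg, NULL, 0, 0, PM_REMOVE))
     {
          if (msg.message == WM_QUIT)
               break ;

          TranslateMessage (&msg) ;
          DispatchMessage (&msg) ;
     }
     else
     {
          DoAPieceOfTheLongJob () ;     // runs only when no messages are waiting
     }
}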

PM and the Serialized Message Queue

The first attempt by Microsoft (in collaboration with IBM) to implement multitasking in a quasi-DOS/Windows environment was OS/2 and the Presentation Manager (PM). Although OS/2 certainly supported preemptive multitasking, it often didn't seem as if this preemption was carried over into the Presentation Manager. The problem is that PM serialized user input messages from the keyboard and mouse. What this means is that PM would not deliver a keyboard or mouse message to a program until the previous user input message had been fully processed.

Although keyboard and mouse messages are just a few of the many messages a PM (or Windows) program can receive, most of the other messages are the result of a keyboard or mouse event. For example, a menu command message is the result of the user making a menu selection using the keyboard or mouse. The keyboard or mouse message is not fully processed until the menu command message is processed.

The primary reason for the serialized message queue was to allow predictable "type-ahead" and "mouse-ahead" actions by the user. If one of the keyboard or mouse messages caused a shift in input focus from one window to another, subsequent keyboard messages should go to the window with the new input focus. So, the system doesn't know where to send a subsequent user input message until the previous ones have been processed.

The consensus these days is that one application should not be able to tie up the entire system; that requires a deserialized message queue, which is supported by the 32-bit versions of Windows. If one program is busy doing a lengthy job, you can switch the input focus to another program.

The Multithreading Solution

I've been discussing OS/2 Presentation Manager only because it was the first environment that provided some veteran Windows programmers (such as myself) with their first introduction to multithreading. Interestingly enough, the limitations of PM's implementation of multithreading provided programmers with essential clues to how multithreaded programs should be architected. Even though these limitations have now largely been lifted from the 32-bit versions of Windows, the lessons learned from more limited environments are still quite valid. So let's proceed.

In a multithreaded environment, programs can split themselves into separate pieces, called "threads of execution," that run concurrently. The support of threads turned out to be the best solution to the problem of the serialized message queue in Presentation Manager and continues to make a whole lot of sense under Windows.

In terms of code, a thread is simply represented by a function that might also call other functions in the program. A program begins execution with its main (or primary) thread, which in a traditional C program is the function called main and which in Windows is WinMain. Once running, the program can create new threads of execution by making a system call (CreateThread), specifying the initial thread function. The operating system preemptively switches control among the threads in much the same way it switches control among processes.
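In Win32 terms, that looks roughly like this (a minimal sketch; ThreadProc and StartBackgroundWork are names of my own choosing):

#include <windows.h>

DWORD WINAPI ThreadProc (LPVOID pvParam)        // the thread function
{
     // lengthy background work goes here
     return 0 ;
}

void StartBackgroundWork (void)                 // called from the primary thread
{
     DWORD  dwThreadId ;
     HANDLE hThread ;

     hThread = CreateThread (NULL, 0, ThreadProc, NULL, 0, &dwThreadId) ;

     if (hThread != NULL)
          CloseHandle (hThread) ;               // the thread keeps running; we just don't need the handle

}

(A program that uses the C run-time library in its secondary threads would ordinarily call _beginthread or _beginthreadex rather than CreateThread directly, but the idea is the same.)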

In the OS/2 Presentation Manager, each thread could either create a message queue or not. A PM thread must create a message queue if it wishes to create windows from that thread. A thread that is just doing a lot of data crunching or graphics output needn't create a message queue. Because the non-message-queue threads do not process messages, they cannot hang the system. The only restriction is that a non-message-queue thread cannot send a message to a window in a message-queue thread or make any function call that causes a message to be sent. (They can, however, post messages to message-queue threads.)

Thus, PM programmers learned how to divide their programs into one message-queue thread that created all the windows and processed messages to them, and one or more non-message-queue threads that performed lengthy background tasks. PM programmers also learned about the "1/10-second rule." Basically, they were advised that a message-queue thread should spend no more than 1/10 of a second processing a message. Anything that takes longer should be done in a different thread. If all programmers followed this rule, no PM program could hang the system for more than 1/10 of a second.

Multithreaded Architecture

I said that the limitations of PM provided programmers with essential clues to understanding how to use multiple threads of execution in a program running under a graphical environment. So here's what I recommend for the architecture of your programs: Your primary thread creates all the windows that your program needs, includes all the window procedures for these windows, and processes all the messages for these windows. Any other threads are simply background crunchers. They do not interact with the user except through communication with the primary thread.

One way to think of this is that the primary thread handles user input (and other messages), perhaps creating secondary threads in the process. These additional threads do the non-user-related tasks.

In other words, your program's primary thread is a governor, and your secondary threads are the governor's staff. The governor delegates all the big jobs to his or her staff while maintaining contact with the outside world. Because they are staff members, the secondary threads do not hold their own press conferences. They discreetly do their work, report back to the governor, and await their next assignment.

Threads within a particular program are all parts of the same process, so they share the process's resources, such as memory and open files. Because threads share the program's memory, they also share static variables. However, each thread has its own stack, so automatic variables are unique to each thread. Each thread also has its own processor state (and math coprocessor state) that is saved and restored during thread switches.
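A quick way to see the distinction (a contrived sketch, not anything from the programs in this chapter):

#include <windows.h>

int iShared = 0 ;                     // static storage: one copy, visible to every thread

DWORD WINAPI ThreadFunc (LPVOID pvParam)
{
     int iLocal = 0 ;                 // automatic storage: lives on this thread's own stack

     iShared++ ;                      // every thread increments the same variable
     iLocal++ ;                       // each thread increments its own private copy

     return 0 ;
}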

Thread Hassles

Properly designing, coding, and debugging a complex multithreaded application is conceivably one of the most difficult jobs a Windows programmer can encounter. Because a preemptive multitasking system can interrupt a thread at any point to switch control to another thread, any undesirable interaction between two threads might not be obvious and might show up only occasionally, seemingly on a random basis.

One common bug in a multithreaded program is called a "race condition." This happens when a programmer assumes that one thread will finish doing something—for example, preparing some data—before another thread needs that data. To help coordinate thread activity, operating systems provide various forms of synchronization. One is the semaphore, which allows the programmer to block the execution of a thread at a certain point in the code until another thread signals that it can resume. Similar to semaphores are "critical sections," which are sections of code that cannot be interrupted.
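In Win32 terms, a sketch might look like this, using an event object for the signaling role described above and a critical section to guard the shared data (the variable and function names are my own):

#include <windows.h>

long             lSharedCount ;       // data shared by both threads
CRITICAL_SECTION csCount ;            // guards lSharedCount
HANDLE           hDataReady ;         // signaled when the worker has prepared the data

DWORD WINAPI WorkerThread (LPVOID pvParam)
{
     EnterCriticalSection (&csCount) ;
     lSharedCount++ ;                 // protected access to the shared variable
     LeaveCriticalSection (&csCount) ;

     SetEvent (hDataReady) ;          // tell the waiting thread it can resume
     return 0 ;
}

void WaitForTheData (void)
{
     DWORD  dwThreadId ;
     HANDLE hThread ;

     InitializeCriticalSection (&csCount) ;
     hDataReady = CreateEvent (NULL, FALSE, FALSE, NULL) ;

     hThread = CreateThread (NULL, 0, WorkerThread, NULL, 0, &dwThreadId) ;

     WaitForSingleObject (hDataReady, INFINITE) ;    // block here until the worker signals

     CloseHandle (hThread) ;
     CloseHandle (hDataReady) ;
     DeleteCriticalSection (&csCount) ;
}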

But semaphores can also introduce another common thread-related bug, which is called a "deadlock." This occurs when two threads have blocked each other's execution: each is waiting for the other to proceed before it can continue, so neither ever does.

Fortunately, 32-bit programs are more immune to certain problems involving threads than 16-bit programs. For example, suppose one thread executes the simple statement

lCount++ ;

where lCount is a long 32-bit global variable that is used by other threads. In a 16-bit program, that single statement in C is compiled to two machine code instructions, the first one incrementing the low 16 bits of the variable, and the second adding any carry into the high 16 bits. Suppose the operating system interrupted the thread between those two machine code instructions. If lCount were 0x0000FFFF before the first machine code instruction, then lCount would be zero at the time the thread was interrupted, and that's the value another thread would see. Only when the thread resumed would lCount be incremented to its proper value of 0x00010000.

This is one of those bugs that might cause an operational problem so infrequently that it would never be detected. In a 16-bit program, the proper way to solve it would be to enclose the statement in a critical section, during which the thread cannot be interrupted. In a 32-bit program, however, the statement is fine because it is compiled to a single machine code instruction.

The Windows Advantage

The 32-bit versions of Windows (including Microsoft Windows NT and Windows 98) have a deserialized message queue. The implementation of this seems very good: If a program is taking a long time processing a message, the mouse cursor appears as an hourglass when the mouse is over that program's window but it changes to a normal arrow when positioned over another program's window. A simple click can bring that other window to the foreground.

However, the user is still prevented from working with the program doing the big job because the big job is preventing the program from receiving other messages. This is undesirable. A program should be always open to messages, and that often requires the use of secondary threads.

In Windows NT and Windows 98, there is no distinction between message-queue threads and non-message-queue threads. Each thread gets its own message queue when the thread is created. This reduces some of the awkward rules for threads in a PM program. (However, in most cases you'll want to process input through message procedures in one thread and pass off long jobs to other threads that do not maintain windows. This structure almost always makes the best sense, as we'll see.)

Still more good news: Windows NT and Windows 98 have a function that allows one thread to kill another thread in the same process. As you'll discover when you begin writing multithreaded code, this is sometimes convenient. The early versions of OS/2 did not include a "kill thread" function.

The final good news (at least for this topic) is that Windows NT and Windows 98 have implemented something called "thread local storage" (TLS). To understand this, recall that I mentioned earlier that static variables, both global and local to a function, are shared among threads because they sit in the process's data memory space. Automatic variables, which are always local to a function, are unique to each thread because they occupy space on the stack, and each thread has its own stack.

It is sometimes convenient for two or more threads to use the same function and for these threads to use static variables that are unique to the thread. That's thread local storage. There are a few Windows function calls involved, but Microsoft has also added an extension to the C compiler that makes the use of TLS more transparent to the programmer.
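For example (a minimal sketch assuming the Microsoft compiler's __declspec(thread) storage-class extension), a variable with static storage can be made per-thread like this:

#include <windows.h>

__declspec(thread) int iCallCount = 0 ;      // every thread gets its own copy

DWORD WINAPI CountingThread (LPVOID pvParam)
{
     iCallCount++ ;           // touches only this thread's copy
     return iCallCount ;      // always 1 here, no matter how many threads run this function
}

The function-call route uses TlsAlloc, TlsSetValue, and TlsGetValue to store and retrieve a per-thread pointer instead.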

New! Improved! Now with Threads!

Now that I've made the case for threads, let's put the subject in proper perspective. Sometimes there's a tendency for programmers to use every feature that an operating system has to offer. But the worst case is when your boss comes to your desk and says, "I've heard that this new Whatsit thing is really hot. Let's incorporate some Whatsit in our program." And then you spend a week trying to figure out how (and if) Whatsit can possibly benefit the application.

The point is—it just doesn't make sense to add multithreading to an application that doesn't need it. Some applications just can't benefit from multithreading. If your program displays the hourglass cursor for an annoying period of time, or if it uses the PeekMessage call to avoid the hourglass cursor, then restructuring the program for multithreading is probably a good idea. Otherwise, you're just making things hard for yourself and possibly introducing new bugs into the code.

There are even some cases where the hourglass cursor might be entirely appropriate. I mentioned earlier the 1/10-second rule. Well, loading a large file into memory can take longer than 1/10 second. Does this mean that file-loading routines should be implemented in separate threads? Not necessarily. When a user commands a program to open a file, he or she usually wants that operation to be carried out immediately. Putting the file-loading routines in a separate thread simply adds overhead. It's just not worth it, even if you want to boast to your friends that you write multithreaded programs!


