Figure 14-1 shows a simple multithreaded program using two threads that access a single instance of three separate classes.
Thread A executes the code in class objects 1 and 2, and thread B runs through the code in class objects 2 and 3. Both threads access a common method in the instance of class 2, thereby sharing the code and data that this method uses.
Of course, on a single-processor machine, the two threads don't actually execute simultaneously. What happens is that the operating system interleaves instructions from the two threads to give the impression that the threads are executing together. Whenever control switches from thread A to thread B, the processor saves the context of thread A, restores the context of thread B, and then starts running it. When control switches back to thread A, the same process happens in reverse. Because this all happens so quickly, you get the impression that both threads are executing simultaneously.
Say that you have two threads, each containing just ten source code instructions. How many ways can these two ten-instruction sequences be interleaved together? The answer, in case you don't have a calculator handy, is 184,756! This starts to look worrying: how can you possibly test 184,756 possible code paths? Unfortunately, the situation is actually much worse than this. Threads aren't interleaved at the level of source code, or even CIL, but at the level of assembly code. When a single source statement can translate to dozens of native code instructions, you can see the impossibility of using execution testing to verify that your multithreaded application is working correctly.
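The 184,756 figure is the binomial coefficient C(20, 10): an interleaving of two ten-instruction streams is just a choice of which 10 of the 20 execution slots belong to thread A. Here is a minimal sketch of that counting argument, in Java rather than the chapter's .NET code (the class and method names are illustrative):

```java
// Counts the distinct interleavings of two instruction streams of lengths
// m and n: the binomial coefficient C(m + n, n).
public class Interleavings {
    public static long count(int m, int n) {
        long result = 1;
        // Build up C(m + k, k) incrementally; each intermediate product is
        // exactly divisible, so the arithmetic stays in whole numbers.
        for (int k = 1; k <= n; k++) {
            result = result * (m + k) / k;
        }
        return result;
    }

    public static void main(String[] args) {
        // Two threads of ten instructions each: 184,756 interleavings.
        System.out.println(count(10, 10));
    }
}
```

Note how fast this grows: two twenty-instruction threads already have over a billion interleavings, which is why exhaustive path testing is hopeless.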
So if you can't test a multithreaded program using code coverage tools, how about trying to desk-check it? Is it possible to examine the source code thoroughly enough to predict problems resulting from multiple threads? It's a nice idea that combining lots of brainpower with lots of multithreading experience can help you find and remove these problems, but hard-earned evidence from very experienced developers has demonstrated that this doesn't work. Developers' brains simply aren't equipped to cope with understanding how multiple execution threads can interact with each other.
Because the interleaving of threaded code happens at the level of assembly code, a developer is unable to establish the exact flow of execution, and many of the resulting bugs occur rarely and seemingly at random. Even moving a multithreaded program from a machine with a slow processor to one with a much faster processor can cause bugs to appear or disappear, because the change in processor speed can prevent or provoke "racing" between threads.
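A data race of this kind is easy to reproduce deliberately. The sketch below, in Java rather than the chapter's .NET code, increments one counter without synchronization and one under a lock; the unprotected counter loses updates unpredictably from run to run and machine to machine, which is exactly the timing-dependent behavior described above:

```java
public class RaceDemo {
    public static int unsafeCount;
    public static int safeCount;
    private static final Object lock = new Object();

    public static void run(int perThread) {
        unsafeCount = 0;
        safeCount = 0;
        Runnable work = () -> {
            for (int i = 0; i < perThread; i++) {
                unsafeCount++;               // read-modify-write, not atomic:
                                             // concurrent updates can be lost
                synchronized (lock) {
                    safeCount++;             // the lock serializes this update
                }
            }
        };
        Thread a = new Thread(work);
        Thread b = new Thread(work);
        a.start();
        b.start();
        try {
            a.join();                        // join() also guarantees that the
            b.join();                        // final counts are visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        run(100_000);
        // safeCount is always 200000; unsafeCount is often less,
        // and the shortfall varies between runs
        System.out.println("safe=" + safeCount + " unsafe=" + unsafeCount);
    }
}
```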
Before looking at "racing" and other bug types that can be caused by the use of multiple threads, it's worth examining when multithreading can be useful to you, and when it doesn't give you the benefits that you might think.
One of the best and most common uses of multithreading is to keep the user interface of your program responsive to the user while also performing one or more background tasks. For instance, the end user might ask your application to calculate the current market value of a sophisticated financial derivative instrument. During this lengthy calculation, you want the user to be able to cancel the operation if it's taking too long, perhaps by clicking a Cancel button. This chapter's last example application shows a very similar scenario, and multithreading done properly works very well in this type of situation.
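The usual shape of such a cancellable background task is a worker that polls a cancellation flag between small units of work. Here is a minimal sketch in Java rather than the chapter's .NET code; the names (`PricingTask`, `cancel`) and the placeholder calculation are illustrative, not taken from the chapter's example application:

```java
// A cancellable background calculation: the UI thread sets a flag, and the
// worker checks it between work units rather than being killed abruptly.
public class PricingTask implements Runnable {
    private volatile boolean cancelled = false;   // volatile: written by the
    private volatile boolean finished = false;    // UI thread, read by worker
    private volatile double result = Double.NaN;

    public void cancel() { cancelled = true; }    // e.g. the Cancel button
    public boolean isFinished() { return finished; }
    public double result() { return result; }

    @Override
    public void run() {
        double sum = 0;
        for (int step = 1; step <= 1_000_000; step++) {
            if (cancelled) {
                return;                // stop cleanly, leaving finished false
            }
            sum += 1.0 / step;         // stand-in for the real valuation work
        }
        result = sum;
        finished = true;
    }

    public static void main(String[] args) throws InterruptedException {
        PricingTask task = new PricingTask();
        Thread worker = new Thread(task);
        worker.start();          // the UI thread stays free to handle events;
        worker.join();           // a real UI would not block like this
        System.out.println(task.isFinished());
    }
}
```

Because the worker only stops at points it chooses itself, it can leave shared data in a consistent state, which is the key advantage over forcibly aborting the thread.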
Another good use of multithreading is to keep the user interface updated with the intermediate results of a background task that's running. For instance, in the scenario just discussed, you might want to display intermediate results from the option valuation on the user interface while the calculation continues. If you don't use multithreading and the option calculation runs on the same thread as the user interface, you'll find that the application's window will go blank and won't be repainted. This is because the single thread can't cope with performing both tasks simultaneously.
An associated advantage of multithreading comes when you have a task that's going to take a long time to complete. In this case, you can fire off a background thread to perform the task and forget about it until the task completes at some time in the future. A background thread is terminated automatically when the process that launched it finishes. Some dangers are associated with this automatic termination, because the termination is done through the CLR calling Thread.Abort. For a discussion of the dangers associated with Thread.Abort, please see the section titled "Terminating a Managed Thread" later in this chapter.
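Java's daemon threads are the closest analogue to .NET background threads (where you would set `Thread.IsBackground = true`): the runtime exits without waiting for them. A small sketch, with illustrative names:

```java
public class DaemonDemo {
    // Starts a "fire and forget" worker. Marking it as a daemon means the
    // JVM will exit when all non-daemon threads finish, killing this thread
    // abruptly -- much like a .NET background thread.
    public static Thread startBackground(Runnable task) {
        Thread t = new Thread(task);
        t.setDaemon(true);     // must be set before start()
        t.start();
        return t;
    }

    public static void main(String[] args) {
        Thread t = startBackground(() -> {
            while (true) { /* periodic housekeeping would go here */ }
        });
        System.out.println(t.isDaemon());
        // main returns, the JVM exits, and the daemon thread dies without
        // any cleanup -- so it must not hold locks, files, or half-written data
    }
}
```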
Yet another advantageous use of multithreading is to spawn a new thread for each user request to a server application. This allows multiple users to be serviced without the delay that might happen if the user requests are serialized and processed just one at a time. The built-in thread pool supplied by .NET is often an excellent solution to this situation.
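A thread pool bounds the number of worker threads while still letting requests overlap. The sketch below uses Java's `ExecutorService` as a stand-in for the .NET thread pool; the per-request work (squaring a number) is a trivial placeholder:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class RequestPool {
    // Services each "request" on a pooled thread; a fixed pool of four
    // bounds concurrency instead of spawning one thread per request.
    public static List<Integer> handleAll(List<Integer> requests) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (int r : requests) {
                futures.add(pool.submit(() -> r * r));  // placeholder work
            }
            List<Integer> results = new ArrayList<>();
            for (Future<Integer> f : futures) {
                results.add(f.get());    // results come back in request order
            }
            return results;
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();             // let the pool threads wind down
        }
    }
}
```

Reusing pooled threads also avoids paying the thread creation cost on every request, which matters when requests are short-lived.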
If your application is doing input/output (I/O) work, such as accessing a disk, a printer, or the network, these resources can impose unpredictable delays. Multiple threads can help to prevent I/O latency from affecting other parts of your application.
You can use threads to isolate critical subsystems of your application from noncritical subsystems. Because most thread exceptions won't propagate out of the thread, this prevents an error in, say, the printing subsystem of your application from affecting the radiation dosage monitoring subsystem.
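This containment can be demonstrated directly. In the Java sketch below (illustrative names; Java's behavior here mirrors the containment described above), one thread throws an uncaught exception, yet the other thread runs to completion unaffected:

```java
public class IsolationDemo {
    public static boolean survivorRuns() {
        final boolean[] survived = { false };

        Thread failing = new Thread(() -> {
            // Simulates a crash in a noncritical subsystem.
            throw new IllegalStateException("printing subsystem failed");
        });
        // The exception terminates only its own thread; a handler lets us
        // log and contain it instead of printing a stack trace to stderr.
        failing.setUncaughtExceptionHandler((t, e) -> { /* log and contain */ });

        Thread healthy = new Thread(() -> survived[0] = true);

        failing.start();
        healthy.start();
        try {
            failing.join();
            healthy.join();    // join() makes the write to survived[] visible
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return survived[0];
    }
}
```

The caveat is the flip side of the same property: because the exception never reaches the spawning thread, you must install explicit logging or monitoring, or the failure can go completely unnoticed.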
A final reason for using multithreading is to establish the priority of an application's competing tasks. You can set the priority of a thread when it's created, so an important task can be assigned a high priority while less important tasks can be given a lower priority.
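In Java the equivalent knob is `Thread.setPriority` (in .NET it is the `Thread.Priority` property); a brief sketch with illustrative task names:

```java
public class PriorityDemo {
    public static Thread[] create() {
        Thread monitoring = new Thread(() -> { /* critical monitoring work */ });
        Thread housekeeping = new Thread(() -> { /* batch cleanup work */ });

        // A hint to the scheduler, not a guarantee: higher-priority threads
        // are merely favored when both are runnable.
        monitoring.setPriority(Thread.MAX_PRIORITY);
        housekeeping.setPriority(Thread.MIN_PRIORITY);

        return new Thread[] { monitoring, housekeeping };
    }
}
```

Treat priorities as a tuning aid only: relying on them for correctness (for example, assuming the low-priority thread "can't" run at a critical moment) is itself a source of timing bugs.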
This section presents some of the disadvantages of writing multithreaded code. There's certainly no need to use multithreading just because it's there and it's cool.
Using multithreading on a single-processor machine to process multiple tasks where each task takes approximately the same time isn't always very effective. For example, you might decide to spawn ten threads within your program in order to process ten separate tasks. If each task takes approximately 1 minute to process, and you use ten threads to do this processing, you won't have access to any of the task results for the whole 10 minutes. If instead you processed the same tasks using just a single thread, you would see the first result in 1 minute, the next result 1 minute later, and so on. If you can make use of each result without having to rely on all of the results being ready simultaneously, the single thread might be the better way of implementing the program.
If you launch a large number of threads within a process, the overhead of thread housekeeping and context switching can become significant. The processor will spend considerable time in switching between threads, and many of the threads won't be able to make progress. In addition, a single process with a large number of threads means that threads in other processes will be scheduled less frequently and won't receive a reasonable share of processor time.
If multiple threads have to share many of the same resources, you're unlikely to see performance benefits from multithreading your application. Many developers see multithreading as some sort of magic wand that gives automatic performance benefits. Unfortunately, multithreading isn't the magic wand that it's sometimes perceived to be. If you're using multithreading for performance reasons, you should measure your application's performance closely in several different situations, rather than just relying on some nonexistent magic.
Coordinating thread access to common data can be a big performance killer. Achieving good performance with multiple threads isn't easy when using a coarse locking plan, because this leads to low concurrency and threads waiting for access. Alternatively, a fine-grained locking strategy increases the complexity and can also slow down performance unless you perform some sophisticated tuning.
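The trade-off can be made concrete with a small Java sketch (illustrative names, not from the chapter): a coarse strategy guards two unrelated counters with one lock, so updates to either block updates to the other, while a fine-grained strategy gives each counter its own lock at the cost of more locks to reason about.

```java
public class Stats {
    // Coarse strategy: one lock guards both counters, so a thread recording
    // a hit also blocks a thread recording a miss. Simple, low concurrency.
    private final Object coarseLock = new Object();
    private long hits, misses;

    public void hit()    { synchronized (coarseLock) { hits++; } }
    public void miss()   { synchronized (coarseLock) { misses++; } }
    public long hits()   { synchronized (coarseLock) { return hits; } }
    public long misses() { synchronized (coarseLock) { return misses; } }

    // Fine-grained strategy: one lock per counter lets a hit and a miss be
    // recorded concurrently -- but there are now more locks to reason about,
    // and a deadlock risk appears if any code ever needs both at once.
    private final Object hitLock = new Object();
    private final Object missLock = new Object();
    private long fineHits, fineMisses;

    public void fineHit()    { synchronized (hitLock)  { fineHits++; } }
    public void fineMiss()   { synchronized (missLock) { fineMisses++; } }
    public long fineHits()   { synchronized (hitLock)  { return fineHits; } }
    public long fineMisses() { synchronized (missLock) { return fineMisses; } }
}
```

Note that the reads are locked too: a consistent locking discipline on both reads and writes is what makes either strategy correct, and it is exactly this discipline that the "sophisticated tuning" mentioned above has to preserve.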
Using multiple threads to exploit a machine with multiple processors sounds like a good idea in theory, but in practice you need to be careful. To gain any significant performance benefits, you need to learn about thread balancing. For instance, imagine an application that receives incoming price information from the network, aggregates and sorts that information, and then displays the results on the screen for the end user. With a dual-processor machine, it makes sense to split the task into, say, three threads. The first thread deals with storing the incoming price information, the second thread processes the prices, and the final thread handles the display of the results. After implementing this solution, you find that the price processing is by far the longest stage, so you decide to rewrite that thread's code to improve its performance by a factor of three. Unfortunately, this performance benefit in a single thread may not be reflected across your whole application. This is because the other two threads may not be able to keep pace with the improved thread. If the user interface thread is unable to keep up with the faster flow of processed information, the other threads now have to wait around for the new bottleneck in the system.
When you have a bug in multithreading code, it's really easy to blame the multiple threads and immediately start looking for data races and deadlocking. You should remember not to overlook the possibility of bugs in the single-threaded sequential code.
As I've already discussed, controlling code execution with multiple threads can be complex and is likely to result in hard-to-find software defects. To avoid these bugs by the use of good design, you need to understand them in some detail. The next section looks at typical bug types related to multithreading.