To be sure you know what multithreading is, let me give you a quick rundown.
Multithreading is the ability of a computer to run more than one thread of execution at the same time. (These threads can be part of the same program or separate programs entirely.) Back in the bad old days, only one program could be running at any given time. You loaded the program and had to wait until it completed before you went on to anything else. Computers back then were expensive investments; people would pay millions of dollars for a computer that ran as fast as your average graphing calculator. To increase the return on the investment, people were allowed to use the computers for specified tasks, and users were charged for the amount of time they used.
Pocket PCs and Palm PCs are already faster than some of the first supercomputers. Isn't that nice to know?
Unfortunately, this system was inadequate. Imagine this situation: You've got a brand new computer capable of doing a million calculations per second, and someone loads a program and tries to run it. During the program's execution, it needs to stop and ask the operator for input. How fast can the operator respond? How many millions of CPU cycles are wasted waiting for the operator to type an entry? Obviously, when computing power was rare, this was seen as an obscene waste, so something had to be done.
The first time-sharing systems were invented to solve this problem; they were referred to as round-robin. In case you don't already know, every computer program consists of many instructions that the processor goes through and executes one by one. This process is called serial execution, because it executes instructions in a series, as shown in Figure 3.1.
For round-robin execution, however, the CPU switches to the first program, executes a few instructions, switches to the next one, executes a few instructions, and continues on through all open programs before returning to the first program. Whenever the CPU switches from one thread to another, the action is known as a context switch. Round-robin execution is shown in Figure 3.2.
Multithreading did wonders for enhancing the efficiency of computers.
Modern systems no longer use simple round-robin processing because more efficient methods for handling tasks have been discovered. These days, most multithreaded systems assign priority values to threads and use priority queues to spend more cycles on the higher-priority threads. Additionally, programs can be interrupted by important events, and a context switch can be forced out of order; this is called pre-emption.
As computer chips became cheaper, the Personal Computer (PC) concept became popular. Instead of logging into a public supercomputer to execute your programs, you could now use a PC any time you wanted. Since PCs were designed mainly for single users, they once again devolved into the world of serial execution. If you remember old operating systems such as DOS, you can remember being able to run only a single program at any given time. If you were running a word processor and needed to perform a quick math calculation, you needed to save your work, quit, open a calculator program, then quit that, and go back to your document. An incredible pain in the butt!
It is important to note that a single processor does not execute multiple threads at the same time. It works on one thread for a bit, then stops working on that thread completely when it switches to another. This gives the appearance that the computer is running many programs at the same time, but it's really just switching between them very quickly.
Fortunately, within the past few years , multithreading has come back into favor, and our modern operating systems have supremely efficient threading capabilities. For example, as I write this, Windows XP reports that I have a total of 329 threads running at the same time. Every single thread is doing something different, such as running my calculator, my word processor, my paint program, Internet Explorer, downloads, and so on. It's just amazing what the systems can accomplish now. You probably think nothing of being able to just press Alt-Tab and move instantly to another program. I barely remember what it was like before we had systems like this, and I shudder when I think about the possibility of ever going back.
Multithreading is here to stay.
There are some pretty cool processor technologies on the road ahead. Intel recently released its hyperthreading processor, which can actually run more than one thread at the same time on a single processor. AMD has its own solution in the works for the future, which actually has more than one execution core on the same chip.
A thread is basically just a small independent section of code that the processor runs concurrently with other threads. A thread isn't necessarily a whole program (programs are commonly referred to as processes when dealing with multithreading), since a process can create as many threads as necessary. Figure 3.3 shows the relationship between a process and its threads.
When you begin a process, it spawns one thread, which is the main thread. This main thread is really just the main() function in a program. You can elect not to spawn more threads (for single-threaded applications), or you can spawn as many threads as memory allows.
This is where things get sticky, however. At the level of the operating system (which manages all the threads), threading is rather easy. If every program is single-threaded, you can pretty much assume that these programs will not be sharing memory, so you can set them off on their own threads and execute them as you see fit.
However, it's not so easy once you have multiple threads within one program. For example, say you have a program with two threads: the main game loop of your program, and a thread that receives data from an open socket.
Most of the time, the socket thread is going to be sitting there waiting for data to arrive, not wasting CPU cycles by being constantly polled by the main thread. However, how does the main thread ever find out when data arrives? You need some way to tell the main thread about data arrival, at which time the main thread can do something with the data. This is usually accomplished using a shared variable or even a global variable. (This isn't recommended, though.) I'll get into the specifics later.
The main problem with multithreading, however, is synchronization. For example, in the system just described, whenever you receive data, you shove it into a 128-byte buffer and then tell the main thread that data has arrived. What happens if you start receiving more data while the main thread is still reading the old data? You might accidentally overwrite the data while it's being read! Areas in your code where these kinds of problems can occur are usually referred to as critical sections.
This is the biggest frustration when dealing with multithreaded applications. It is almost impossible to keep track of when data is going to be modified by each thread, so you need to enlist the help of objects that synchronize your threads for you, ensuring that you never read data that is being written or write data that is being read.
A mutex (which stands for mutual exclusion) is the simplest synchronization structure available. It's also the most useful. Essentially, a mutex is a Boolean variable associated with an object, and it determines if the object is locked. If an object is locked, it is being used by a thread, and any other threads that try to use the object must wait until it is unlocked.
In Figure 3.4, I show five different states relating to the mutex: check, wait, lock, process, and unlock. In reality, almost every mutex implementation out there implements the first three states (check, wait, lock) in one function and calls that "lock." I separated them in the figure to show you exactly what is happening behind the scenes.
Figure 3.4 shows an example of two threads using a mutex to prevent stepping on each other's toes when trying to process a single object. The first thread locks the mutex, and then goes on to process the object. After the mutex is locked, the second thread gets to a point where it wants to process the object as well, so it checks to see if the mutex is locked. Seeing that the lock is in place, the second thread decides to wait for the mutex to unlock. Meanwhile, the first thread completes its processing and unlocks the mutex. Then the second thread locks the mutex and starts processing the object. Finally, when the second thread is done, it unlocks the mutex as well.
Essentially, a mutex allows only one thread to lock it, so that whenever other threads try to lock the same mutex, they must wait until it is unlocked. In an efficient system, the thread that is waiting for the mutex to unlock should usually be sleeping, which means that the operating system doesn't waste time processing that thread until the mutex is unlocked. This makes multithreading efficient, because you don't have to keep polling a variable to see if it is available for use.
Unfortunately, mutexes have some drawbacks. For example, what happens if you forget to unlock a mutex? Any threads that attempt to lock it afterward wait forever. They become zombie threads (yes, that's a technical term): threads that exist, but cannot be used for anything. Obviously, you don't want this to happen.
Always unlock your mutexes when you're done with them.
There's also another problem. What happens if two threads depend on a single object, and one of them needs to access the object constantly (say, the state of a player in a game, read once per frame, which is about 30 to 60 times a second in a normal game), while the other thread needs to do a lot of processing on that state? That processing could take up to 30 seconds if the function is programmed inefficiently and assumes it can keep the mutex locked as long as it wants, even when it isn't actually using the object. What would happen? The thread that needs frequent updates would essentially be halted for a long time, waiting for the hog thread to finish, and your game would appear laggy.
Don't lock your mutexes for a long time. Lock them only when you are directly accessing the object, and unlock them the moment you are finished accessing the object.
A semaphore object is pretty easy to understand as well. Basically, a semaphore is an object that allows a certain number of threads to access an object before it starts blocking.
If you think about it, a mutex is really a semaphore that allows just one thread to access an object. A semaphore with a value of two would allow the first two threads to access the object, but any after that would have to wait until one of the first two threads unlocks itself.
Semaphores take their name from the flags that sailors waved to communicate with other ships at sea, before wireless radio communication became popular. In essence, semaphore objects in computers do the same thing: They act as signals to other threads.
Figure 3.5 shows a thread diagram for three threads that want to access a semaphore that only allows two threads to access it.
Semaphores, while more flexible than mutexes, really aren't as useful. Semaphores are most useful when used in conjunction with mutexes, so that you can create a system that allows many threads to read an object but only one to write to it, and only when nothing is reading it.
This situation doesn't come up too often; indeed, some threading libraries don't even have semaphores, so I think that's about as much as I want to say about them.
The third and final synchronization object I want to show you (don't worry, there are many more if you're interested) is called a condition object. These things are pretty cool and give you lots of control over the timing of thread execution.
Basically, a condition object has two main commands: wait and signal.
Say you have a bunch of threads that control the Artificial Intelligence (AI) of a group of bad guys. They're all lounging in the Bad Guy Headquarters Recreation Room, waiting for someone to break in. There's no need to continuously process their AIs, so you want to put them to sleep. But, you want an easy way to wake them up when the alarm is pulled, right?
So, you have a condition object representing the alarm, and then tell all the threads of the bad guy to wait on the alarm condition. When the alarm is tripped, the alarm condition is signaled , and all of the threads waiting on the condition are woken up.
Figure 3.6 shows this process with four threads.
Multithreading is wonderful! Multithreading is great! Multithreading is a pain in the a**. When you first start using it, multithreading seems like a Holy Grail to programmers. Who wouldn't want to have hundreds of threads running at the same time? Who wouldn't want to have programs seamlessly scale upward in performance when moved onto two-, four-, or even eight-processor servers?
Then you start experimenting with threads. You create some test programs and run them and then slam your keyboard against your monitor when half of your threads stop working for no reason at all. Ah yes, threads can be the bane of your existence. Let me show you a few ways in which threads can turn you into a frustrated person.
If you know how modern computers work, you are aware that each program has a data structure called a stack that it uses to keep track of local data for every function it is executing. If you don't know much about stacks, it's no big deal; all you need to know is that your program has a stack. Depending on how many functions are called, program stacks usually take up a fair amount of memory. Some systems use resizable stacks, and some use fixed size. Either way, the operating system tries to make absolutely certain that the stack is big enough to prevent overflowing when many layers of functions are being called at the same time.
Each thread keeps track of other things as well, such as the exact state of the CPU. However, in relation to everything else, the stack is by far the largest piece of a thread's overhead.
When you get into multithreading and many threads are acting like little programs of their own, you suddenly realize that every thread needs its own stack. This can be a significant hindrance if you planned on having threads for every little task in the entire program, because even the smallest of threads needs its own stack.
Depending on the implementation, stack sizes can range from a few kilobytes to multiple megabytes. So you always need to keep that in mind before you go crazy creating thousands of threads.
On a single-processor system, threads require additional processing overhead, because the operating system performs calculations to figure out which thread should run when and how long each thread should run. If you have thousands of threads, this takes a fair amount of processing power and may slow down your program in the long run. However, for systems that use blocking I/O calls, threading is an absolute necessity, so there is really no way around it.
An important consolation exists, however. The great thing about multithreading is that theoretically, if you run your program on a two-processor machine, most (not all) of the processing overhead disappears, and your program runs faster than it would on a one-processor machine.
It is a common mistake to think that if you run a program on a two-processor machine, it will be twice as fast. However, this is definitely not the case. The same goes for four- or eight-processor machines as well. This is caused by the law of diminishing returns. Every time you introduce a new processor on a machine, you introduce more overhead as well. By far, the largest problem for n-processor machines is memory bandwidth. Only a certain amount of memory can be transmitted from main memory to each of the processors in a given amount of time, so when you start adding processors, they often want more memory than the memory bus can handle. In those cases, you'll end up with a system in which most of the processors will be waiting to read from or write to memory most of the time, instead of actually performing processing tasks.
Deadlock is an extremely nasty issue, and just mentioning it is enough to cause most programmers to run away screaming hysterically. If you do not think about every single way your threads will interact with other threads, I guarantee you will run into deadlock.
So what is deadlock? Imagine you have two resources, A and B. You also have two threads, 1 and 2. Thread 1, using a mutex, locks resource A and uses it. At the same time, thread 2 needs resource B, so B is locked by thread 2. A little bit later, thread 1 decides it needs to use resource B, so it tries to lock B as well. Because a mutex waits until the resource is available, thread 1 starts to wait. Then, a little bit later, thread 2 decides it needs resource A, so it tries to lock resource A and enters a mutex-waiting loop as well.
So what happens? Thread 1 is waiting for B, which thread 2 owns; but since thread 2 is waiting for A, which thread 1 owns, thread 2 can never unlock B. Therefore, both threads wait forever for the other to release its resource. This is deadlock.
Figure 3.7 shows this.
There are many ways deadlock can occur, and they may not all be immediately apparent. This is one of the most difficult problems you will face when developing multithreaded programs.
The programs in this book, however, won't be incredibly complex (in regard to threading), so deadlock shouldn't be a serious concern.
I have mentioned data corruption before, but it is such a serious concern with multithreading that I feel I must mention it again. Data corruption is a huge problem with multithreading if you do not use proper synchronization structures such as mutexes to control exclusive access to your data. You must be absolutely certain at all times that data is modified only when the thread modifying it knows that nothing else is trying to read from or write to it at the same time. Data contention can be the source of many unexplainable bugs.
Debugging a multithreaded program is hell. It is extremely difficult to step through a multithreaded program because most debuggers don't support multithreading.
To make things worse, some debuggers that do support multithreading let other threads execute normally while the thread you are debugging is essentially stopped. This makes it extremely difficult to time things correctly in a normal program.
Finally, multithreaded programs are nondeterministic. This means that the operating system controls when threads are started and stopped, and you have absolutely no control over it. In a single-threaded program, you can simulate the circumstances that led to a crash in your game by repeating the same inputs and using the same seed for the random number generator, but you can't simulate the execution order of threads. Because of this, your program may crash one time out of a hundred, yet you cannot easily track down the cause, because the circumstances that led to the crash will rarely repeat themselves!
Feel free to whack your head against the table when things like this happen.
It's always amusing to watch a game programmer write his first nongame application. You see, as game programmers, we're accustomed to having (actually, we demand having) complete control of the computer. We need to squeeze every last bit of processing power out of the machine, and if the player is dumb enough to be running Word or Excel in the background, well, that's his fault!
So when a game programmer starts working on nongame applications, he usually takes the same approach to the program and assumes that he has complete control over the machine. Usually we have a loop-based program that endlessly runs, waiting for something to happen or updating the screen at a blazing 60 frames per second.
Well, regular applications don't need this kind of power, and more often than not, we're wasting cycles when we program like this. Nothing is funnier than seeing a game programmer's first Telnet client application eat up 100% of the CPU cycles because it was programmed like a game.
To see an example of what I am talking about, go back to Chapter 2, "Winsock/Berkeley Sockets Programming," for a minute. I want you to run Demo 2.1. Once it is running, go about your regular computing tasks, such as opening up a word processor, or something like that. Nice and fast, eh? Now close the demo, and run Demo 2.3 instead. Now try doing your regular computing tasks. Not so fast, is it? In fact, even on a top-of-the-line computer system, your entire computer will chug along as if it were 10 years old! This, ladies and gentlemen, is not a good thing.
What was going on with these demos? Demo 2.1 uses blocking functions, which I've explained earlier. When a blocking function is called, the operating system puts that thread to sleep until something important happens, and then wakes it up.
Demo 2.3, however, doesn't play nicely with the operating system. Instead, it just loops endlessly, asking the Socket Library, "Did anything happen yet? Did anything happen yet?" until something actually happens. The application doesn't take into consideration that it's just running along, wasting everyone's time.
Sometimes, you've got to play nicely with the operating system, even if you don't want to. Enter the concept of manual sleeping: almost every threading library comes with the ability to sleep manually. At the end of your game loop, you should tell the operating system that you want the thread to sleep. You're almost always allowed to specify the minimum amount of time you want a thread to sleep, which guarantees that the thread will sleep for at least that long, but possibly longer. Whenever you manually tell the operating system to put a thread to sleep, you're handing control over to the OS.