Chapter 6. Concurrency


Multithread Programming

First, a little background on processes and threads.

A thread is the basic unit to which the Windows operating system allocates processor time. A process, that is, a single instance of a running application, consists of at least one thread of execution known as the primary thread. A thread can execute just one task at a time. To perform multiple tasks concurrently, a process can create multiple threads. Even though only one thread can execute at any time, [1] the Windows OS preemptively switches execution from one thread to another. The switching is so fast that all the threads appear to run at the same time.

[1] More specifically, thread execution is processor-based. A multiprocessor machine can have multiple threads executing simultaneously.

All the threads within a process share the virtual address space and global variables of that process. However, each thread has its own stack. Therefore, when a thread is executing, any program variables that are created on the stack are local to the thread.

Often, it is necessary to maintain thread-specific data. However, a static or global variable cannot be used for this purpose because all the threads share a single copy of it and therefore see the same value. To address this problem, the OS provides a feature called thread local storage (TLS). With TLS, you can create a unique copy of a variable for each thread.
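As a portable illustration of the idea, the sketch below uses the standard C++11 `thread_local` keyword rather than the Win32 TLS calls. The names `t_callCount`, `bumpCallCount`, and `countCallsOnNewThread` are made up for this sketch. Each thread gets its own copy of the counter, so counts made on one thread never show up on another.

```cpp
#include <thread>

// Hypothetical per-thread counter: every thread sees its own copy,
// initialized to 0 when that thread first touches it.
thread_local int t_callCount = 0;

// Bump the calling thread's private copy and return its new value.
int bumpCallCount() {
    return ++t_callCount;
}

// Call bumpCallCount() 'times' times on a brand-new thread and report
// the final value of that thread's private counter.
int countCallsOnNewThread(int times) {
    int result = 0;
    std::thread worker([&result, times] {
        for (int i = 0; i < times; ++i) {
            result = bumpCallCount();
        }
    });
    worker.join();   // after join(), 'result' is safe to read here
    return result;
}
```

Calling `bumpCallCount()` on the main thread before and after `countCallsOnNewThread` shows that the main thread's count is unaffected by the worker's.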

The Win32 SDK provides APIs that deal with thread creation, manipulation, and synchronization. The SDK also provides comprehensive documentation on their usage.

With this brief background information, let's develop a simple application that demonstrates the use of Win32 APIs in creating and manipulating threads. As we go along in the chapter, we will pick up other thread-related information that we need to know.

A Simple Example

The objective of this example is to illustrate the technique of creating and using threads in a program. In this simple example, the main thread will create a secondary thread that computes the product of two numbers. The main thread will just wait until the secondary thread completes the computation, whereupon the main thread will display the computed value.

The code for this example can be found on the CD.

The SDK provides two APIs, CreateThread and ExitThread, to create a thread or terminate the thread, respectively. Following are their prototypes:

    HANDLE CreateThread(
        LPSECURITY_ATTRIBUTES lpThreadAttributes,  // pointer to security attributes
        DWORD dwStackSize,                         // initial thread stack size
        LPTHREAD_START_ROUTINE lpStartAddress,     // pointer to thread function
        LPVOID lpParameter,                        // argument for new thread
        DWORD dwCreationFlags,                     // creation flags
        LPDWORD lpThreadId                         // pointer to return thread ID
    );

    VOID ExitThread(DWORD dwExitCode);             // Exit a thread.

Parameter lpStartAddress is an application-defined function that serves as the starting address for a thread. When the application-defined function returns, or explicitly calls ExitThread, the thread terminates.

The thread function takes a single argument. Parameter lpParameter specifies the value for this argument.

Note that only one argument can be passed to the thread function. The current example, however, requires three logical arguments: the two numbers whose product needs to be computed and a return variable that holds the computed value. The easiest way to solve this problem is to create a data structure containing the three variables and pass a pointer to an instance of the structure as the argument to the function, as shown in the following code snippet:

    struct MYPARAM {
        int nVal1;
        int nVal2;
        int nProduct;
    };

    int APIENTRY WinMain(HINSTANCE hInstance,
        HINSTANCE hPrevInstance,
        LPSTR lpCmdLine,
        int nCmdShow)
    {
        // Step 1: Initialize the data to be passed to the thread
        MYPARAM data;
        data.nVal1 = 5;
        data.nVal2 = 7;
        data.nProduct = 0;

        // Step 2: Create and run the thread
        DWORD dwThreadID;
        HANDLE hThread = ::CreateThread(NULL, 0, MyThreadProc,
            &data, 0, &dwThreadID);
        if (NULL == hThread) {
            MessageBox(NULL, _T("Cannot create thread"), NULL, MB_OK);
            return 1;
        }
        ...
    }

In this code snippet, the function that will be executed by the thread is MyThreadProc. The implementation for this function follows:

    DWORD WINAPI MyThreadProc(LPVOID pData)
    {
        ::Sleep(30 * 1000); // Sleep for 30 seconds (for our test)
        MYPARAM* pParam = reinterpret_cast<MYPARAM*>(pData);
        pParam->nProduct = pParam->nVal1 * pParam->nVal2;
        return 0;
    }

Note that the thread function definition requires the argument to be of type LPVOID. Consequently, the argument has to be reinterpreted to whatever it represents within the function scope. In the example, the parameter is reinterpreted as a pointer to the MYPARAM data structure.

The WinMain thread now waits for the secondary thread to quit. This can be done by calling the Win32 API WaitForSingleObject on the handle that was returned by the earlier call to CreateThread, as shown below:

    // Step 3: Wait for the thread to quit
    ::WaitForSingleObject(hThread, INFINITE); // wait infinitely

Once the thread quits, the parent thread has to clean up the thread resource by calling a Win32 API, CloseHandle:

    // Step 4: Release the thread handle
    ::CloseHandle(hThread);
    hThread = NULL;

The computed value is ready to be displayed:

    // Step 5: Display the product
    TCHAR buf[100];
    _stprintf(buf, _T("The product is %d"), data.nProduct);
    ::MessageBox(NULL, buf, _T("Compute Product"), MB_OK);

Tada! We just finished writing our first multithreaded application.
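For readers without a Windows build environment, the same flow can be sketched with standard C++ `std::thread` standing in for CreateThread, WaitForSingleObject, and CloseHandle. This is an analogy under that assumption, not the code from the CD. `runProductExample` is a hypothetical helper, and the 30-second sleep is omitted.

```cpp
#include <thread>

struct MYPARAM {
    int nVal1;
    int nVal2;
    int nProduct;
};

// Same shape as a Win32 thread function: one untyped argument in,
// an exit code out.
unsigned long MyThreadProc(void* pData) {
    MYPARAM* pParam = static_cast<MYPARAM*>(pData);
    pParam->nProduct = pParam->nVal1 * pParam->nVal2;
    return 0;
}

// Hypothetical helper mirroring Steps 1-4 of the WinMain snippet.
int runProductExample(int a, int b) {
    MYPARAM data = {a, b, 0};                 // Step 1: initialize the data
    std::thread worker(MyThreadProc, &data);  // Step 2: create and run the thread
    worker.join();                            // Steps 3-4: wait, then release
    return data.nProduct;                     // ready to display (Step 5)
}
```

The structure pointer plays the same role here as in the Win32 version: it carries both inputs and the output through the single argument that the thread function accepts.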

Although one can use the SDK APIs directly to create and manipulate threads, it is more convenient to wrap the APIs in a C++ class. As part of the CPL toolkit, I have developed a C++ abstract base class, CCPLWinThread, to simplify dealing with threads. Following is its definition. For clarity, I have removed some methods that are not relevant to the current discussion.

    class CCPLWinThread
    {
    public:
        CCPLWinThread();                // Constructor
        HRESULT Init();                 // Initializer
        virtual ~CCPLWinThread();       // Destructor

        HRESULT StartThread();          // Start the thread
        bool IsThreadActive();          // Check if the thread is running
        void StopThread();              // Stop the thread

        // Wait for the thread to quit
        // (within the specified timeout interval)
        bool WaitForCompletion(DWORD dwTimeOut = INFINITE);

    protected:
        virtual void Proc() = 0;        // The thread entry point
        ...
    };

To use this class, you have to create a derived class, add your methods and data members to the class, and implement the virtual method Proc, the entry point for the thread.

The following class definition captures the same essence of the secondary thread as our earlier example:

    class CMyProductThread : public CCPLWinThread
    {
    public:
        int m_nVal1;
        int m_nVal2;
        int m_nProduct;

        void Proc()
        {
            m_nProduct = m_nVal1 * m_nVal2;
        }
    };

During run time, the primary thread would create an instance of this class, set the appropriate data members, and call the StartThread method, as shown below:

    int APIENTRY WinMain(HINSTANCE hInstance,
        HINSTANCE hPrevInstance,
        LPSTR lpCmdLine,
        int nCmdShow)
    {
        // Step 1: Create an instance and initialize it
        CMyProductThread myThread;
        myThread.Init();

        // Step 2: Initialize the data to be passed to the thread
        myThread.m_nVal1 = 5;
        myThread.m_nVal2 = 7;

        // Step 3: Start the thread
        myThread.StartThread();

        // Step 4: Wait for the thread to quit
        myThread.WaitForCompletion();

        // Step 5: Display the product
        TCHAR buf[100];
        _stprintf(buf, _T("The product is %d"), myThread.m_nProduct);
        ::MessageBox(NULL, buf, _T("Compute Product"), MB_OK);
        return 0;
    }

As can be seen, the steps are similar to those used in the earlier code snippet, except that the thread creation, destruction, and cleanup (closing the thread handle, etc.) have been abstracted in the CCPLWinThread class.

Abstracting thread creation and destruction buys us one more thing. It turns out that if a thread intends to use any C run-time (CRT) library functions, it is better to use the thread manipulation routines provided by the run-time library, such as _beginthreadex and _endthreadex. This eliminates some memory leaks [Ric-96]. Depending on whether the project intends to use the run-time library, the developers can modify the class implementation. The code changes are isolated to just one class.

Get familiar with class CCPLWinThread. As we go along in this chapter, we will continue to add more functionality to this class.
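The toolkit's implementation is not shown here. Purely as an assumption about how such a wrapper could be structured, the sketch below builds a similar abstract base class on standard C++ `std::thread` instead of the Win32 calls. `SketchThread`, `ProductThread`, and `computeProduct` are invented names, and only StartThread, WaitForCompletion (without the timeout), and Proc are imitated.

```cpp
#include <thread>

// Hypothetical stand-in for a thread wrapper class.
class SketchThread {
public:
    SketchThread() = default;
    SketchThread(const SketchThread&) = delete;
    SketchThread& operator=(const SketchThread&) = delete;

    // Callers should invoke WaitForCompletion() before a derived object
    // is destroyed; otherwise Proc() could run on a half-destroyed object.
    virtual ~SketchThread() { WaitForCompletion(); }

    // Start the thread; it runs the derived class's Proc().
    void StartThread() {
        if (!m_thread.joinable()) {
            m_thread = std::thread([this] { Proc(); });
        }
    }

    // Block until Proc() has returned, then release the thread.
    void WaitForCompletion() {
        if (m_thread.joinable()) {
            m_thread.join();
        }
    }

protected:
    virtual void Proc() = 0;   // the thread entry point

private:
    std::thread m_thread;
};

class ProductThread : public SketchThread {
public:
    int m_nVal1 = 0;
    int m_nVal2 = 0;
    int m_nProduct = 0;

protected:
    void Proc() override { m_nProduct = m_nVal1 * m_nVal2; }
};

// Hypothetical helper that follows the same steps as the WinMain above.
int computeProduct(int a, int b) {
    ProductThread t;
    t.m_nVal1 = a;
    t.m_nVal2 = b;
    t.StartThread();
    t.WaitForCompletion();
    return t.m_nProduct;
}
```

Because all thread handling lives in the base class, swapping the underlying mechanism touches only that one class, which is the isolation benefit described above.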

Now let's look at some of the problems associated with multithreaded programming.

Multithreading Issues

Multithreading is a powerful technique that can improve the performance and responsiveness of your application. At the same time, multithreading introduces some complexities into your code that, if not properly attended to during the design and development cycle, may lead to a disaster.

Shared Data Conflicts

If two threads have access to the same variable (more precisely, the same memory location), updating the variable from both the threads may leave the variable's value in an inconsistent state. Consider, for example, the following code:

    extern int g_nCount;     // a global variable
    ...
    g_nCount = g_nCount + 10;

On most processors, this addition is not an atomic instruction, that is, the compiler would generate more than one instruction of machine code for the above statement. On the Intel x86 architecture, the following lines of machine language instructions were generated: [2]

[2] The code was compiled without any optimization flags turned on.

    mov eax, DWORD PTR ?g_nCount@@3HA       ; g_nCount
    add eax, 10                             ; 0000000aH
    mov DWORD PTR ?g_nCount@@3HA, eax       ; g_nCount

If both the threads executed this sequence, we would expect that the value of g_nCount would be 20 more than it was before any thread executed it. However, if one thread gets preempted by the OS after having executed the first mov instruction, the other thread will pick up the same value of g_nCount as the first thread, both will add 10 onto that value, and when the result is stored back, the final value of g_nCount is higher by just 10.
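The lost update can be replayed deterministically on a single thread by writing out the three machine steps by hand and interleaving them in the unlucky order. In this illustrative sketch, the hypothetical variables `regA` and `regB` stand in for each thread's copy of the eax register.

```cpp
// Replay the unlucky interleaving of two "threads", A and B, each
// executing: load g_nCount, add 10, store g_nCount.
int replayLostUpdate(int start) {
    int g_nCount = start;

    int regA = g_nCount;   // A: mov eax, g_nCount
    // -- A is preempted here; B runs its whole sequence --
    int regB = g_nCount;   // B: mov eax, g_nCount (same value A saw)
    regB += 10;            // B: add eax, 10
    g_nCount = regB;       // B: mov g_nCount, eax
    // -- A resumes with its stale register value --
    regA += 10;            // A: add eax, 10
    g_nCount = regA;       // A: mov g_nCount, eax (overwrites B's update)

    return g_nCount;
}
```

Starting from 0, the function returns 10 rather than 20: thread B's update is overwritten by thread A's stale value.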

Shared data conflicts may manifest themselves in a number of ways. In fact, even if only one thread updates the data and the other threads just read it, the data may still get into an inconsistent state. For example, let's say a C structure is being shared between multiple threads. The structure contains a list of names (as string pointers) and a variable, count, that indicates the total number of items in the list. Let's say a thread updates the structure by removing a name from the list and then adjusts the variable count to reflect the new total. If the thread is preempted before count is updated, a different thread will pick up the wrong value of count and will try to use a string pointer that has already been deleted. This, in all likelihood, will result in an access violation.

If a memory location (or any other resource) will be accessed concurrently from more than one thread, the developer has to provide some explicit mechanism to synchronize access to such a shared resource.

Fortunately, the Win32 SDK provides many primitives such as critical sections, mutexes, semaphores, and events to achieve synchronization between threads. The first three primitives are generally used to provide mutual exclusion to some shared resource, that is, allow only one thread to access the resource at any given time. The fourth primitive is typically used to send a signal to a thread. The thread can then act upon this signal and take action.
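A portable sketch of mutual exclusion follows. It uses standard C++ `std::mutex` in place of a Win32 critical section or mutex, an assumption made so that the example builds anywhere. `addTenFromTwoThreads` is a hypothetical name.

```cpp
#include <mutex>
#include <thread>

// Two threads each add 10 to a shared counter 'repeats' times.
// Returns the final value of the counter.
int addTenFromTwoThreads(int start, int repeats) {
    int count = start;          // shared by both threads
    std::mutex countLock;       // guards 'count'

    auto worker = [&count, &countLock, repeats] {
        for (int i = 0; i < repeats; ++i) {
            // Only one thread at a time gets past this line; the guard
            // unlocks automatically at the end of each iteration.
            std::lock_guard<std::mutex> guard(countLock);
            count = count + 10;
        }
    };

    std::thread t1(worker);
    std::thread t2(worker);
    t1.join();
    t2.join();
    return count;
}
```

With the lock in place, the load, add, and store of one thread can no longer interleave with those of the other, so no update is lost.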

Thread Affinity

Under the Win32 system, certain resources have thread affinity, that is, such resources can only be used by a specific thread. Some examples follow:

Critical sections and mutexes have thread affinity. For example, you cannot enter a critical section on one thread and leave the critical section from another thread.

A TLS by definition has thread affinity. A TLS from one thread is not available in any other thread.

All Windows user-interface (UI) related objects, such as window handles, Windows messages, etc., have thread affinity. A Windows message is a structure of type MSG. A message can be posted to a window handle using the PostMessage API. The message goes in an MSG queue associated with the thread that created the window handle. In order to receive and process a message from the message queue, a developer has to set up a window procedure (WndProc) and a message pump. [3] A message pump uses some variant of the following code:

[3] The interaction of a thread and its message queue is described in the Win32 SDK documentation and in any standard Windows programming book.

    MSG msg;
    // GetMessage returns 0 on WM_QUIT and -1 on error, so test for > 0
    while (GetMessage(&msg, 0, 0, 0) > 0) {
        DispatchMessage(&msg);
    }

One of the fields in the MSG structure is the window handle for which the message was intended. The call to DispatchMessage dispatches the received message to the WndProc associated with this window handle.

Because of the thread affinity of Windows messages, the thread that created the window handle should also be the thread that implements the message pump. As a matter of fact, GetMessage can receive messages for the current thread only (the thread that called the function).


Deadlocks

If two threads wait on each other to release a shared resource before resuming their execution, a deadlock condition occurs. As all the threads participating in a deadlock are suspended and cannot, therefore, release the resources they own, no thread can continue execution. As a result, the application hangs.
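One common way to avoid this situation when a thread needs two locks is to acquire both in a single all-or-nothing step. The sketch below assumes standard C++ (`std::lock`) rather than Win32 primitives, and the `Account` type and the function names are invented for illustration. Two threads move amounts in opposite directions, which would risk a deadlock if each thread simply locked its source account first.

```cpp
#include <mutex>
#include <thread>

struct Account {
    explicit Account(int initial) : balance(initial) {}
    int balance;
    std::mutex lock;   // guards 'balance'
};

// Move 'amount' from one account to the other while holding both locks.
void transfer(Account& from, Account& to, int amount) {
    // std::lock acquires both mutexes without deadlocking, whatever
    // order other threads name them in.
    std::lock(from.lock, to.lock);
    std::lock_guard<std::mutex> g1(from.lock, std::adopt_lock);
    std::lock_guard<std::mutex> g2(to.lock, std::adopt_lock);
    from.balance -= amount;
    to.balance += amount;
}

// Run opposing transfers concurrently; returns account a's final balance.
int runOpposingTransfers(int rounds) {
    Account a(1000);
    Account b(1000);
    std::thread t1([&] {
        for (int i = 0; i < rounds; ++i) transfer(a, b, 1);   // a -> b
    });
    std::thread t2([&] {
        for (int i = 0; i < rounds; ++i) transfer(b, a, 2);   // b -> a
    });
    t1.join();
    t2.join();
    return a.balance;   // 1000 - rounds + 2 * rounds
}
```

An alternative with the same effect is to agree on a fixed global order in which locks are always taken.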

Incorrect Use of a Synchronization Object

When a synchronization object is used to guard a shared resource, a typical sequence of operations is as follows:

  1. Lock the synchronization object

  2. Use the shared resource

  3. Unlock the synchronization object

If the developer forgets to unlock the synchronization object after locking it, the resource would become inaccessible to any other thread.
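In C++, the unlock step is commonly tied to a scope so that it cannot be forgotten, even on an early return or an exception. The following is a minimal sketch, assuming standard C++ `std::mutex` and `std::lock_guard` in place of a Win32 synchronization object. `g_total`, `g_totalLock`, and `addIfPositive` are hypothetical names.

```cpp
#include <mutex>
#include <stdexcept>

std::mutex g_totalLock;   // guards g_total
int g_total = 0;          // the shared resource

// Add 'value' to the shared total and return the new total.
// Throws std::invalid_argument for non-positive input.
int addIfPositive(int value) {
    std::lock_guard<std::mutex> guard(g_totalLock);   // 1. lock
    if (value <= 0) {
        // The guard's destructor still unlocks on this path.
        throw std::invalid_argument("value must be positive");
    }
    g_total += value;                                 // 2. use the resource
    return g_total;
}                                                     // 3. unlock (automatic)
```

If the exception path leaked the lock, the next call to `addIfPositive` would block forever. With the guard, it proceeds normally.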

A more serious problem occurs when the thread is terminated abnormally. The SDK provides an API, TerminateThread, to kill a thread. When this API is called, the target thread has no chance to execute any user-mode code and its initial stack is not deallocated. This can result in the following problems:

  • If the target thread has entered a critical section, the critical section will not be released. Thus, some resource may become inaccessible to other threads.

  • If the target thread is executing certain kernel level calls when it is terminated, the kernel state for the thread's process could be inconsistent.

  • If the target thread is manipulating the global state of a shared DLL, the state of the DLL could be destroyed, affecting other users of the DLL.

In general, threads should not be killed abnormally. A better approach is to send a signal (the event synchronization primitive can be used here) to the thread. Of course, the thread procedure should periodically check for this event and return gracefully from the procedure if the event has been fired.

The StopThread method on the CCPLWinThread class does just this. It triggers an event represented by the member variable, m_hStopEvent. The implementor of the thread procedure has to check for this event periodically by calling a method on the class, Wait, as shown below:

    void Proc()
    {
        for (;;) {
            // check for stop request for 1000 milliseconds
            HANDLE hWait = this->Wait(1000);
            if (hWait == m_hStopEvent) {
                return;
            }
            ... // do something
        }
    }

Method Wait need not wait for just the stop-event synchronization handle. You can add other synchronization handles to wait for by calling CCPLWinThread's method, AddHandleToWaitGroup.
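The same cooperative-stop pattern can be sketched in portable C++, with a condition variable standing in for the Win32 event. This is not the CCPLWinThread implementation. `StoppableWorker`, its members, and `runWorkerBriefly` are names assumed for this sketch, and the wait interval is shortened to 10 milliseconds.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

class StoppableWorker {
public:
    StoppableWorker() : m_stop(false), m_iterations(0) {}
    ~StoppableWorker() { Stop(); }

    void Start() {
        if (!m_thread.joinable()) {
            m_thread = std::thread([this] { Proc(); });
        }
    }

    // Signal the stop request and wait for Proc() to return.
    void Stop() {
        {
            std::lock_guard<std::mutex> guard(m_lock);
            m_stop = true;
        }
        m_wake.notify_all();
        if (m_thread.joinable()) {
            m_thread.join();
        }
    }

    int Iterations() {
        std::lock_guard<std::mutex> guard(m_lock);
        return m_iterations;
    }

private:
    void Proc() {
        std::unique_lock<std::mutex> guard(m_lock);
        for (;;) {
            // Wait up to 10 ms for a stop request.
            if (m_wake.wait_for(guard, std::chrono::milliseconds(10),
                                [this] { return m_stop; })) {
                return;       // stop was requested: leave gracefully
            }
            ++m_iterations;   // ... do something
        }
    }

    std::mutex m_lock;                // guards m_stop and m_iterations
    std::condition_variable m_wake;   // plays the role of the stop event
    bool m_stop;
    int m_iterations;
    std::thread m_thread;
};

// Hypothetical helper: run the worker for a while, stop it, and
// return how many iterations it completed.
int runWorkerBriefly(int milliseconds) {
    StoppableWorker worker;
    worker.Start();
    std::this_thread::sleep_for(std::chrono::milliseconds(milliseconds));
    worker.Stop();
    return worker.Iterations();
}
```

The thread is never killed from outside. It notices the request at its next wait and returns from its procedure on its own, so any locks it holds are released normally.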


Performance

Using more threads doesn't always translate into greater performance for a couple of reasons:

  • Each thread consumes some system resources. If the resources available to the OS decrease, the overall performance degrades.

  • Thread-switching is a very expensive operation. The OS has to save the thread-context (the register values, etc.) of the executing thread and load the thread-context of the new thread.
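One practical consequence is to size a set of worker threads to the hardware rather than to the number of tasks. The following is a rough sketch in standard C++. `pickWorkerCount` is a hypothetical helper, and capping at the processor count is a rule of thumb for CPU-bound work, not a guarantee of best performance.

```cpp
#include <algorithm>
#include <cstddef>
#include <thread>

// Choose how many worker threads to start for 'taskCount' independent,
// CPU-bound tasks on a machine with 'processors' logical processors.
std::size_t pickWorkerCount(std::size_t taskCount, unsigned processors) {
    if (taskCount == 0) {
        return 0;
    }
    // A processor count of 0 means "unknown"; fall back to one worker.
    std::size_t cap = (processors == 0) ? 1 : processors;
    return std::min(taskCount, cap);
}

// Convenience overload that asks the runtime for the processor count.
// hardware_concurrency() may return 0 when the count cannot be determined.
std::size_t pickWorkerCount(std::size_t taskCount) {
    return pickWorkerCount(taskCount, std::thread::hardware_concurrency());
}
```

For example, 100 tasks on a 4-processor machine would be served by 4 workers, avoiding the resource and context-switching cost of 100 threads.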

Now that we understand the basics of multithread programming and the problems associated with it, let's see what COM has to offer to help us simplify developing a thread-safe component.


COM+ Programming. A Practical Guide Using Visual C++ and ATL
ISBN: 130886742
Year: 2000