18.5. threading Module

We will now introduce the higher-level threading module, which gives you not only a Thread class but also a wide variety of synchronization mechanisms to use to your heart's content. Table 18.2 lists all the objects available in the threading module.
In this section, we will examine how to use the Thread class to implement threading. Since we have already covered the basics of locking, we will not cover the locking primitives here. The Thread class also contains a form of synchronization, so explicit use of locking primitives is not necessary.

Core Tip: Daemon threads
18.5.1. Thread Class

The Thread class of the threading module is your primary executive object. It has a variety of functions not available in the thread module; these are outlined in Table 18.3.
There are a variety of ways you can create threads using the Thread class. We cover three of them here, all quite similar. Pick the one you feel most comfortable with, not to mention the most appropriate for your application and future scalability (we like the final choice the best):
Create Thread Instance, Passing in Function

In our first example, we will just instantiate Thread, passing in our function (and its arguments) in a manner similar to our previous examples. This function is what will be executed when we direct the thread to begin execution. Taking our mtsleep2.py script and tweaking it to add the use of Thread objects, we have mtsleep3.py, shown in Example 18.4.

Example 18.4. Using the threading Module (mtsleep3.py)
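The listing for Example 18.4 is not reproduced in this excerpt, so here is a minimal reconstruction of mtsleep3.py. It is a sketch using modern print-function syntax rather than the book's original Python 2 print statement; the loop durations of 4 and 2 seconds are assumed to match the timings in the output shown in the text.

```python
import threading
from time import sleep, ctime

loops = [4, 2]   # seconds each loop sleeps

def loop(nloop, nsec):
    print('start loop', nloop, 'at:', ctime())
    sleep(nsec)
    print('loop', nloop, 'done at:', ctime())

def main():
    print('starting at:', ctime())
    threads = []
    for i, nsec in enumerate(loops):
        # Instantiating Thread does NOT start execution yet
        t = threading.Thread(target=loop, args=(i, nsec))
        threads.append(t)
    for t in threads:   # now let them go off to the races
        t.start()
    for t in threads:   # wait for each thread to terminate
        t.join()
    print('all DONE at:', ctime())

if __name__ == '__main__':
    main()
```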
When we run it, we see output similar to its predecessors' output:

$ mtsleep3.py
starting at: Sun Aug 13 18:16:38 2006
start loop 0 at: Sun Aug 13 18:16:38 2006
start loop 1 at: Sun Aug 13 18:16:38 2006
loop 1 done at: Sun Aug 13 18:16:40 2006
loop 0 done at: Sun Aug 13 18:16:42 2006
all DONE at: Sun Aug 13 18:16:42 2006

So what did change? Gone are the locks that we had to implement when using the thread module. Instead, we create a set of Thread objects. When each Thread is instantiated, we dutifully pass in the function (target) and arguments (args) and receive a Thread instance in return. The biggest difference between instantiating Thread [calling Thread()] and invoking thread.start_new_thread() is that the new thread does not begin execution right away. This is a useful synchronization feature, especially when you don't want the threads to start immediately.

Once all the threads have been allocated, we let them go off to the races by invoking each thread's start() method, but not a moment before that. And rather than having to manage a set of locks (allocating, acquiring, releasing, checking lock state, etc.), we simply call the join() method for each thread. join() will wait until a thread terminates or, if provided, until a timeout occurs. Use of join() appears much cleaner than an infinite loop waiting for locks to be released (which is why such locks are sometimes known as "spin locks").

One other important aspect of join() is that it does not need to be called at all. Once threads are started, they will execute until their given function completes, whereupon they will exit. If your main thread has things to do other than wait for threads to complete (such as other processing or waiting for new client requests), it should by all means do so. join() is useful only when you want to wait for thread completion.
Create Thread Instance, Passing in Callable Class Instance

A similar offshoot to passing in a function when creating a thread is to pass in an instance of a callable class for execution; this is the more OO approach to MT programming. Such a callable class embodies an execution environment that is much more flexible than a function or choosing from a set of functions. You now have the power of a class object behind you, as opposed to a single function or a list/tuple of functions. Adding our new class ThreadFunc to the code and making other slight modifications to mtsleep3.py, we get mtsleep4.py, given in Example 18.5.

Example 18.5. Using Callable Classes (mtsleep4.py)
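Example 18.5's listing is likewise absent from this excerpt; the sketch below shows one plausible shape for mtsleep4.py in Python 3 syntax, where the extended call syntax self.func(*self.args) replaces the book's apply().

```python
import threading
from time import sleep, ctime

loops = [4, 2]

class ThreadFunc(object):
    """Callable wrapper holding the function, its args, and a name string."""
    def __init__(self, func, args, name=''):
        self.name = name
        self.func = func
        self.args = args

    def __call__(self):
        # Invoked by the Thread machinery when the new thread runs;
        # equivalent to Python 2's apply(self.func, self.args)
        self.func(*self.args)

def loop(nloop, nsec):
    print('start loop', nloop, 'at:', ctime())
    sleep(nsec)
    print('loop', nloop, 'done at:', ctime())

def main():
    print('starting at:', ctime())
    # Double instantiation: Thread(...) wrapping a ThreadFunc(...) instance
    threads = [threading.Thread(target=ThreadFunc(loop, (i, nsec), loop.__name__))
               for i, nsec in enumerate(loops)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print('all DONE at:', ctime())

if __name__ == '__main__':
    main()
```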
If we run mtsleep4.py, we get the expected output:

$ mtsleep4.py
starting at: Sun Aug 13 18:49:17 2006
start loop 0 at: Sun Aug 13 18:49:17 2006
start loop 1 at: Sun Aug 13 18:49:17 2006
loop 1 done at: Sun Aug 13 18:49:19 2006
loop 0 done at: Sun Aug 13 18:49:21 2006
all DONE at: Sun Aug 13 18:49:21 2006

So what are the changes this time? The addition of the ThreadFunc class and a minor change to instantiate the Thread object, which also instantiates ThreadFunc, our callable class. In effect, we have a double instantiation going on here. Let's take a closer look at our ThreadFunc class.

We want to make this class general enough to use with functions other than our loop() function, so we added some new infrastructure, such as having this class hold the arguments for the function, the function itself, and a function name string. The constructor __init__() just sets all the values. When the Thread code calls our ThreadFunc object as a new thread is created, it will invoke the __call__() special method. Because we already have our set of arguments, we do not need to pass them to the Thread() constructor, but we do have to use apply() in our code now because we have an argument tuple. Those of you who have Python 1.6 or higher can use the new function invocation syntax described in Section 11.6.3 instead of using apply() on line 16:

self.res = self.func(*self.args)

Subclass Thread and Create Subclass Instance

The final introductory example involves subclassing Thread(), which turns out to be extremely similar to creating a callable class as in the previous example. Subclassing is a bit easier to read when you are creating your threads (lines 29-30). We will present the code for mtsleep5.py in Example 18.6 as well as the output obtained from its execution, and leave it as an exercise for the reader to compare mtsleep5.py to mtsleep4.py.

Example 18.6. Subclassing Thread (mtsleep5.py)
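Example 18.6's listing is also missing from this excerpt. Here is a Python 3 flavored reconstruction of mtsleep5.py illustrating the two changes called out in the text: the subclass constructor invokes the base class constructor first, and the former __call__() method becomes run().

```python
import threading
from time import sleep, ctime

loops = [4, 2]

class MyThread(threading.Thread):
    def __init__(self, func, args, name=''):
        threading.Thread.__init__(self)   # must invoke base class constructor first
        self.name = name
        self.func = func
        self.args = args

    def run(self):                        # was __call__() in the callable-class version
        self.func(*self.args)

def loop(nloop, nsec):
    print('start loop', nloop, 'at:', ctime())
    sleep(nsec)
    print('loop', nloop, 'done at:', ctime())

def main():
    print('starting at:', ctime())
    threads = [MyThread(loop, (i, nsec), loop.__name__)
               for i, nsec in enumerate(loops)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print('all DONE at:', ctime())

if __name__ == '__main__':
    main()
```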
Here is the output for mtsleep5.py; again, it is just what we expected:

$ mtsleep5.py
starting at: Sun Aug 13 19:14:26 2006
start loop 0 at: Sun Aug 13 19:14:26 2006
start loop 1 at: Sun Aug 13 19:14:26 2006
loop 1 done at: Sun Aug 13 19:14:28 2006
loop 0 done at: Sun Aug 13 19:14:30 2006
all DONE at: Sun Aug 13 19:14:30 2006

While the reader compares the source between the mtsleep4 and mtsleep5 modules, we want to point out the most significant changes: (1) our MyThread subclass constructor must first invoke the base class constructor (line 9), and (2) the former special method __call__() must be called run() in the subclass.

We now modify our MyThread class with some diagnostic output and store it in a separate module called myThread (see Example 18.7), importing this class for the upcoming examples. Rather than simply calling apply() to run our functions, we also save the result to the instance attribute self.res and create a new method, getResult(), to retrieve that value.

Example 18.7. MyThread Subclass of Thread (myThread.py)
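Example 18.7's code is not included in this excerpt either; the following is a sketch of what myThread.py looks like, assuming Python 3 syntax. The key additions are the diagnostic output in run(), saving the callable's return value in self.res, and exposing it via getResult().

```python
import threading
from time import ctime

class MyThread(threading.Thread):
    """Thread subclass that reports start/finish and saves the return value."""
    def __init__(self, func, args, name=''):
        threading.Thread.__init__(self)
        self.name = name
        self.func = func
        self.args = args
        self.res = None

    def getResult(self):
        return self.res

    def run(self):
        print('starting', self.name, 'at:', ctime())
        self.res = self.func(*self.args)   # save result instead of discarding it
        print(self.name, 'finished at:', ctime())
```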
18.5.4. Fibonacci and Factorial ... Take Two, Plus Summation

The mtfacfib.py script, given in Example 18.8, compares execution of the recursive Fibonacci, factorial, and summation functions. This script runs all three functions in a single-threaded manner, then performs the same task using threads to illustrate one of the advantages of having a threading environment.

Example 18.8. Fibonacci, Factorial, Summation (mtfacfib.py)
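The mtfacfib.py listing is not reproduced in this excerpt; the reconstruction below is a sketch with two stated deviations from the book: the sleep() delays are much shorter so the demonstration runs quickly, and the summation function is named summ() to avoid shadowing Python's built-in sum().

```python
import threading
from time import sleep

class MyThread(threading.Thread):
    """Minimal result-saving Thread subclass, in the spirit of Example 18.7."""
    def __init__(self, func, args, name=''):
        threading.Thread.__init__(self)
        self.name = name
        self.func = func
        self.args = args
        self.res = None
    def getResult(self):
        return self.res
    def run(self):
        self.res = self.func(*self.args)

# Artificial delays slow the work down so threading has a visible effect
def fib(x):
    sleep(0.001)
    if x < 2: return 1
    return fib(x - 2) + fib(x - 1)

def fac(x):
    sleep(0.01)
    if x < 2: return 1
    return x * fac(x - 1)

def summ(x):
    sleep(0.01)
    if x < 2: return 1
    return x + summ(x - 1)

funcs = [fib, fac, summ]
n = 12

print('*** SINGLE THREAD')
for func in funcs:
    print(func.__name__, '=>', func(n))   # display each result immediately

print('*** MULTIPLE THREADS')
threads = [MyThread(func, (n,), func.__name__) for func in funcs]
for t in threads:
    t.start()
for t in threads:
    t.join()
for t in threads:                          # results were saved; show them at the end
    print(t.name, '=>', t.getResult())
```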
Running in single-threaded mode simply involves calling the functions one at a time and displaying the corresponding results right after the function call. When running in multithreaded mode, we do not display the result right away. Because we want to keep our MyThread class as general as possible (able to execute callables that do and do not produce output), we wait until the end to call the getResult() method to finally show you the return values of each function call.

Because these functions execute so quickly (well, except perhaps for the Fibonacci function), we had to add calls to sleep() to each function to slow things down so that we can see how threading may improve performance. If the actual work had varying execution times, you certainly wouldn't pad your work with calls to sleep(). Anyway, here is the output:

$ mtfacfib.py
*** SINGLE THREAD
starting fib at: Sun Jun 18 19:52:20 2006
233
fib finished at: Sun Jun 18 19:52:24 2006
starting fac at: Sun Jun 18 19:52:24 2006
479001600
fac finished at: Sun Jun 18 19:52:26 2006
starting sum at: Sun Jun 18 19:52:26 2006
78
sum finished at: Sun Jun 18 19:52:27 2006
*** MULTIPLE THREADS
starting fib at: Sun Jun 18 19:52:27 2006
starting fac at: Sun Jun 18 19:52:27 2006
starting sum at: Sun Jun 18 19:52:27 2006
fac finished at: Sun Jun 18 19:52:28 2006
sum finished at: Sun Jun 18 19:52:28 2006
fib finished at: Sun Jun 18 19:52:31 2006
233
479001600
78
all DONE

18.5.5. Other Threading Module Functions

In addition to the various synchronization and threading objects, the threading module also has some supporting functions, detailed in Table 18.4.
18.5.6. Producer-Consumer Problem and the Queue Module

The final example illustrates the producer-consumer scenario, in which a producer of goods or services creates goods and places them in a data structure such as a queue. The amount of time between producing goods is non-deterministic, as is the amount of time the consumer takes to consume the goods produced by the producer. We use the Queue module to provide an interthread communication mechanism that allows threads to share data with each other. In particular, we create a queue into which the producer (thread) places new goods and from which the consumer (thread) consumes them. To do this, we will use the following attributes of the Queue module (see Table 18.5).
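As a quick illustration of the core Queue attributes referenced in Table 18.5, here is a short sketch; note that Python 3 renamed the Queue module to queue, and these calls behave the same in both versions.

```python
from queue import Queue   # the module is named Queue in Python 2

q = Queue(maxsize=32)     # a queue holding at most 32 items
print(q.empty())          # True: nothing queued yet
q.put('xxx')              # place an item in the queue (blocks if full)
print(q.qsize())          # 1
print(q.get())            # remove and return an item (blocks if empty): xxx
print(q.empty())          # True again
```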
Without further ado, we present the code for prodcons.py, shown in Example 18.9.

Example 18.9. Producer-Consumer Problem (prodcons.py)
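Since the Example 18.9 listing is not reproduced here, the following is a reconstruction of prodcons.py under a few assumptions: Python 3 syntax, a plain Thread in place of the book's imported MyThread class, and sleep intervals scaled down to tenths of a second (the book uses whole seconds) so the demonstration finishes quickly.

```python
import threading
from queue import Queue          # the 'Queue' module in Python 2
from random import randint
from time import sleep, ctime

def writeQ(queue):
    print('producing object for Q...', end=' ')
    queue.put('xxx', True)       # block until a slot is free
    print('size now', queue.qsize())

def readQ(queue):
    queue.get(True)              # block until an item is available
    print('consumed object from Q... size now', queue.qsize())

def writer(queue, loops):
    for _ in range(loops):
        writeQ(queue)
        sleep(randint(1, 3) * 0.1)   # generally shorter waits than the reader's

def reader(queue, loops):
    for _ in range(loops):
        readQ(queue)
        sleep(randint(2, 5) * 0.1)

def main():
    print('starting at:', ctime())
    nloops = randint(2, 5)       # total items, chosen randomly per run
    q = Queue(32)
    t_w = threading.Thread(target=writer, args=(q, nloops))
    t_r = threading.Thread(target=reader, args=(q, nloops))
    t_w.start()
    t_r.start()
    t_w.join()
    t_r.join()
    print('all DONE at:', ctime())

if __name__ == '__main__':
    main()
```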
Here is the output from one execution of this script:

$ prodcons.py
starting writer at: Sun Jun 18 20:27:07 2006
producing object for Q... size now 1
starting reader at: Sun Jun 18 20:27:07 2006
consumed object from Q... size now 0
producing object for Q... size now 1
consumed object from Q... size now 0
producing object for Q... size now 1
producing object for Q... size now 2
producing object for Q... size now 3
consumed object from Q... size now 2
consumed object from Q... size now 1
writer finished at: Sun Jun 18 20:27:17 2006
consumed object from Q... size now 0
reader finished at: Sun Jun 18 20:27:25 2006
all DONE

As you can see, the producer and consumer do not necessarily alternate in execution. (Thank goodness for random numbers!) Seriously, though, real life is generally random and non-deterministic.

Line-by-Line Explanation

Lines 1-6
In this module, we will use the Queue.Queue object as well as our thread class myThread.MyThread, which we gave in Example 18.7. We will use random.randint() to make production and consumption somewhat varied, and also grab the usual suspects from the time module.

Lines 8-16
The writeQ() and readQ() functions each have a specific purpose: to place an object in the queue (we are using the string 'xxx', for example) and to consume a queued object, respectively. Notice that we are producing one object and reading one object each time.

Lines 18-26
The writer() is going to run as a single thread whose sole purpose is to produce an item for the queue, wait for a bit, then do it again, up to the specified number of times, chosen randomly per script execution. The reader() will do likewise, with the exception of consuming an item, of course. You will notice that the random number of seconds that the writer sleeps is in general shorter than the amount of time the reader sleeps. This is to discourage the reader from trying to take items from an empty queue.
By giving the writer a shorter waiting period, it is more likely that there will already be an object for the reader to consume by the time the reader's turn rolls around again.

Lines 28-29
These are just setup lines that set the total number of threads to be spawned and executed.

Lines 31-47
Finally, we have our main() function, which should look quite similar to the main() in all of the other scripts in this chapter. We create the appropriate threads and send them on their way, finishing up when both threads have concluded execution.

We infer from this example that a program with multiple tasks to perform can be organized to use a separate thread for each task. This can result in a much cleaner program design than a single-threaded program that attempts to do all of the tasks.

In this chapter, we illustrated how a single-threaded process may limit an application's performance. In particular, programs with independent, non-deterministic, and non-causal tasks that execute sequentially can be improved by dividing them into separate tasks executed by individual threads. Not all applications benefit from multithreading, due to overhead and the fact that the Python interpreter is a single-threaded application, but now you are more cognizant of Python's threading capabilities and can use this tool to your advantage when appropriate.