Section 18.5. threading Module



We will now introduce the higher-level threading module, which gives you not only a Thread class but also a wide variety of synchronization mechanisms to use to your heart's content. Table 18.2 lists all the objects available in the threading module.

Table 18.2. threading Module Objects

Thread: Object that represents a single thread of execution

Lock: Primitive lock object (the same lock object as in the thread module)

RLock: Re-entrant lock object; allows a single thread to (re)acquire an already-held lock (recursive locking)

Condition: Condition variable object; causes one thread to wait until a certain "condition" has been satisfied by another thread, such as a change of state or of some data value

Event: General version of condition variables, whereby any number of threads wait for some event to occur and all awaken when the event happens

Semaphore: Provides a "waiting area"-like structure for threads waiting on a lock

BoundedSemaphore: Similar to a Semaphore but ensures that it never exceeds its initial value

Timer: Similar to Thread, except that it waits for an allotted period of time before running


In this section, we will examine how to use the Thread class to implement threading. Since we have already covered the basics of locking, we will not cover the locking primitives here. The Thread class also contains a form of synchronization, so explicit use of locking primitives is not necessary.

Core Tip: Daemon threads

Another reason to avoid using the thread module is that it does not support the concept of daemon (or daemonic) threads. When the main thread exits, all child threads will be killed regardless of whether they are doing work. The concept of daemon threads comes into play here if you do not want this behavior.

Support for daemon threads is available in the threading module, and here is how they work: a daemon is typically a server that waits for client requests to service. If there is no client work to be done, the daemon sits idle. If you set the daemon flag for a thread, you are basically saying that it is non-critical, and it is okay for the process to exit without waiting for it to "finish." As you saw in Chapter 16, "Network Programming," server threads run in an infinite loop and do not exit in normal situations.

If your main thread is ready to exit and you do not care to wait for the child threads to finish, then set their daemon flag. Think of setting this flag as denoting a thread to be "not important." You do this by calling each thread's setDaemon() method, e.g., thread.setDaemon(True), before it begins running (thread.start()).

If you want to wait for child threads to finish, just leave them as-is, or ensure that their daemon flags are off by explicitly calling thread.setDaemon(False) before starting them. You can check a thread's daemonic status with thread.isDaemon(). A new child thread inherits its daemonic flag from its parent. The entire Python program will stay alive until all non-daemonic threads have exited; in other words, it exits when no active non-daemonic threads are left.
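In current versions of Python, setDaemon() and isDaemon() are spelled as a plain daemon attribute, but the semantics are the same. A minimal sketch (in modern Python 3 syntax, not the book's Python 2) of setting the flag and of a child thread inheriting it from its parent:

```python
import threading

def worker():
    pass

t = threading.Thread(target=worker)
t.daemon = True          # modern spelling of t.setDaemon(True);
                         # must still be set before t.start()
print(t.daemon)          # True

# A new child thread inherits its daemonic flag from its parent:
flags = []
def parent():
    child = threading.Thread(target=worker)
    flags.append(child.daemon)

p = threading.Thread(target=parent)
p.daemon = True
p.start()
p.join()
print(flags[0])          # True: inherited from the daemonic parent
```

Because p is joined before the program ends, its daemonic status never comes into play here; the flag matters only when the main thread exits while the daemon is still running.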


18.5.1. Thread Class

The Thread class of the threading module is your primary executive object. It has a variety of functions not available in the thread module; they are outlined in Table 18.3.

Table 18.3. Thread Object Methods

start(): Begin thread execution

run(): Method defining thread functionality (usually overridden by the application writer in a subclass)

join(timeout=None): Suspend until the started thread terminates; blocks indefinitely unless a timeout (in seconds) is given

getName(): Return the name of the thread

setName(name): Set the name of the thread

isAlive(): Boolean flag indicating whether the thread is still running

isDaemon(): Return the daemon flag of the thread

setDaemon(daemonic): Set the daemon flag of the thread to the Boolean daemonic (must be called before the thread is start()ed)
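A quick illustration of the methods in Table 18.3, shown here with the attribute-style spellings that current Python versions use in place of getName() and isAlive() (the camelCase names behave the same way in Python 2):

```python
import threading
from time import sleep

def task():
    sleep(0.2)      # stand-in for real work

t = threading.Thread(target=task, name='worker-1')
print(t.name)          # 'worker-1' (getName() in the older API)
print(t.is_alive())    # False: the thread has not been start()ed yet
t.start()
print(t.is_alive())    # True: started and running
t.join(5.0)            # wait up to 5 seconds for completion
print(t.is_alive())    # False: the thread has terminated
```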


There are a variety of ways you can create threads using the Thread class. We cover three of them here, all quite similar. Pick the one you feel most comfortable with, not to mention the most appropriate for your application and future scalability (we like the final choice the best):

  • Create Thread instance, passing in function

  • Create Thread instance, passing in callable class instance

  • Subclass Thread and create subclass instance

Create Thread Instance, Passing in Function

In our first example, we will just instantiate Thread, passing in our function (and its arguments) in a manner similar to our previous examples. This function is what will be executed when we direct the thread to begin execution. Taking our mtsleep2.py script and tweaking it, adding the use of Thread objects, we have mtsleep3.py, shown in Example 18.4.

Example 18.4. Using the threading Module (mtsleep3.py)

The Thread class from the threading module has a join() method that lets the main thread wait for thread completion.

1     #!/usr/bin/env python
2
3     import threading
4     from time import sleep, ctime
5
6     loops = [4,2]
7
8     def loop(nloop, nsec):
9         print 'start loop', nloop, 'at:', ctime()
10        sleep(nsec)
11        print 'loop', nloop, 'done at:', ctime()
12
13    def main():
14        print 'starting at:', ctime()
15        threads = []
16        nloops = range(len(loops))
17
18        for i in nloops:
19            t = threading.Thread(target=loop,
20                args=(i, loops[i]))
21            threads.append(t)
22
23        for i in nloops:           # start threads
24            threads[i].start()
25
26        for i in nloops:           # wait for all
27            threads[i].join()      # threads to finish
28
29        print 'all DONE at:', ctime()
30
31    if __name__ == '__main__':
32        main()

When we run it, we see output similar to its predecessors' output:

$ mtsleep3.py
starting at: Sun Aug 13 18:16:38 2006
start loop 0 at: Sun Aug 13 18:16:38 2006
start loop 1 at: Sun Aug 13 18:16:38 2006
loop 1 done at: Sun Aug 13 18:16:40 2006
loop 0 done at: Sun Aug 13 18:16:42 2006
all DONE at: Sun Aug 13 18:16:42 2006


So what did change? Gone are the locks that we had to implement when using the thread module. Instead, we create a set of Thread objects. When each Thread is instantiated, we dutifully pass in the function (target) and arguments (args) and receive a Thread instance in return. The biggest difference between instantiating Thread [calling Thread()] and invoking thread.start_new_thread() is that the new thread does not begin execution right away. This is a useful synchronization feature, especially when you don't want the threads to start immediately.

Once all the threads have been allocated, we let them go off to the races by invoking each thread's start() method, but not a moment before that. And rather than having to manage a set of locks (allocating, acquiring, releasing, checking lock state, etc.), we simply call the join() method for each thread. join() will wait until a thread terminates, or, if provided, a timeout occurs. Use of join() appears much cleaner than an infinite loop waiting for locks to be released (causing these locks to sometimes be known as "spin locks").

One other important aspect of join() is that it does not need to be called at all. Once threads are started, they will execute until their given function completes, whereby they will exit. If your main thread has things to do other than wait for threads to complete (such as other processing or waiting for new client requests), it should by all means do so. join() is useful only when you want to wait for thread completion.
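The timeout behavior of join() can be demonstrated in a few lines. A sketch in modern Python 3 syntax (the timing values here are arbitrary):

```python
import threading
from time import sleep

def slow():
    sleep(0.5)      # a task that takes half a second

t = threading.Thread(target=slow)
t.start()
t.join(0.1)            # give up after 0.1 second
print(t.is_alive())    # True: the timeout expired before the thread finished
t.join()               # no timeout: block until the thread terminates
print(t.is_alive())    # False: now it really is done
```

Note that join() returns silently in both cases; checking is_alive() afterward is how you tell whether the thread actually finished or the timeout expired.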

Create Thread Instance, Passing in Callable Class Instance

A similar offshoot of passing in a function when creating a thread is to use a callable class, passing in an instance for execution; this is the more OO approach to MT programming. Such a callable class embodies an execution environment that is much more flexible than a function or choosing from a set of functions. You now have the power of a class object behind you, as opposed to a single function or a list/tuple of functions.

Adding our new class ThreadFunc to the code and making other slight modifications to mtsleep3.py, we get mtsleep4.py, given in Example 18.5.

Example 18.5. Using Callable classes (mtsleep4.py)

In this example we pass in a callable class (instance) as opposed to just a function. It presents more of an OO approach than mtsleep3.py.

1     #!/usr/bin/env python
2
3     import threading
4     from time import sleep, ctime
5
6     loops = [4,2]
7
8     class ThreadFunc(object):
9
10        def __init__(self, func, args, name=''):
11            self.name = name
12            self.func = func
13            self.args = args
14
15        def __call__(self):
16            apply(self.func, self.args)
17
18    def loop(nloop, nsec):
19        print 'start loop', nloop, 'at:', ctime()
20        sleep(nsec)
21        print 'loop', nloop, 'done at:', ctime()
22
23    def main():
24        print 'starting at:', ctime()
25        threads = []
26        nloops = range(len(loops))
27
28        for i in nloops: # create all threads
29            t = threading.Thread(
30                target=ThreadFunc(loop, (i, loops[i]),
31                loop.__name__))
32            threads.append(t)
33
34        for i in nloops: # start all threads
35            threads[i].start()
36
37        for i in nloops: # wait for completion
38            threads[i].join()
39
40        print 'all DONE at:', ctime()
41
42    if __name__ == '__main__':
43        main()

If we run mtsleep4.py, we get the expected output:

$ mtsleep4.py
starting at: Sun Aug 13 18:49:17 2006
start loop 0 at: Sun Aug 13 18:49:17 2006
start loop 1 at: Sun Aug 13 18:49:17 2006
loop 1 done at: Sun Aug 13 18:49:19 2006
loop 0 done at: Sun Aug 13 18:49:21 2006
all DONE at: Sun Aug 13 18:49:21 2006


So what are the changes this time? The addition of the ThreadFunc class and a minor change to instantiate the Thread object, which also instantiates ThreadFunc, our callable class. In effect, we have a double instantiation going on here. Let's take a closer look at our ThreadFunc class.

We want to make this class general enough to use with functions other than our loop() function, so we added some new infrastructure, such as having this class hold the arguments for the function, the function itself, and also a function name string. The constructor __init__() just sets all the values.

When the new thread is created, the Thread code invokes our ThreadFunc object, calling its __call__() special method. Because we already have our set of arguments, we do not need to pass them to the Thread() constructor, but we do have to use apply() in our code now because we have an argument tuple. Those of you using Python 1.6 or newer can use the newer function invocation syntax described in Section 11.6.3 instead of using apply() on line 16:

self.res = self.func(*self.args)
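To see this equivalence outside of a thread, here is a minimal callable class that stores its result via the star-syntax call (the ThreadFuncDemo name is ours, for illustration only):

```python
class ThreadFuncDemo(object):
    def __init__(self, func, args):
        self.func = func
        self.args = args

    def __call__(self):
        # equivalent to apply(self.func, self.args) in old code
        self.res = self.func(*self.args)

tf = ThreadFuncDemo(pow, (2, 10))
tf()              # a Thread would invoke the instance as its target()
print(tf.res)     # 1024
```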


Subclass Thread and Create Subclass Instance

The final introductory example involves subclassing Thread, which turns out to be extremely similar to creating a callable class as in the previous example. Subclassing is a bit easier to read when you are creating your threads (lines 29-30). We will present the code for mtsleep5.py in Example 18.6 as well as the output obtained from its execution, and leave it as an exercise for the reader to compare mtsleep5.py to mtsleep4.py.

Example 18.6. Subclassing Thread (mtsleep5.py)

Rather than instantiating the Thread class, we subclass it. This gives us more flexibility in customizing our threading objects and simplifies the thread creation call.

1     #!/usr/bin/env python
2
3     import threading
4     from time import sleep, ctime
5
6     loops = (4, 2)
7
8     class MyThread(threading.Thread):
9         def __init__(self, func, args, name=''):
10            threading.Thread.__init__(self)
11            self.name = name
12            self.func = func
13            self.args = args
14
15        def run(self):
16            apply(self.func, self.args)
17
18    def loop(nloop, nsec):
19        print 'start loop', nloop, 'at:', ctime()
20        sleep(nsec)
21        print 'loop', nloop, 'done at:', ctime()
22
23    def main():
24        print 'starting at:', ctime()
25        threads = []
26        nloops = range(len(loops))
27
28        for i in nloops:
29            t = MyThread(loop, (i, loops[i]),
30                loop.__name__)
31            threads.append(t)
32
33        for i in nloops:
34            threads[i].start()
35
36        for i in nloops:
37            threads[i].join()
38
39        print 'all DONE at:', ctime()
40
41    if __name__ == '__main__':
42        main()

Here is the output for mtsleep5.py, again, just what we expected:

$ mtsleep5.py
starting at: Sun Aug 13 19:14:26 2006
start loop 0 at: Sun Aug 13 19:14:26 2006
start loop 1 at: Sun Aug 13 19:14:26 2006
loop 1 done at: Sun Aug 13 19:14:28 2006
loop 0 done at: Sun Aug 13 19:14:30 2006
all DONE at: Sun Aug 13 19:14:30 2006


While the reader compares the source between the mtsleep4 and mtsleep5 modules, we want to point out the most significant changes: (1) our MyThread subclass constructor must first invoke the base class constructor (line 9), and (2) the former special method __call__() must be called run() in the subclass.

We now modify our MyThread class with some diagnostic output and store it in a separate module called myThread (see Example 18.7), importing this class for the upcoming examples. Rather than simply calling apply() to run our functions, we also save the result in the instance attribute self.res and create a new method, getResult(), to retrieve that value.

Example 18.7. MyThread Subclass of Thread (myThread.py)

To generalize our subclass of Thread from mtsleep5.py, we move the subclass to a separate module and add a getResult() method for callables that produce return values.

1   #!/usr/bin/env python
2
3   import threading
4   from time import ctime
5
6   class MyThread(threading.Thread):
7       def __init__(self, func, args, name=''):
8           threading.Thread.__init__(self)
9           self.name = name
10          self.func = func
11          self.args = args
12
13      def getResult(self):
14          return self.res
15
16      def run(self):
17          print 'starting', self.name, 'at:', \
18              ctime()
19          self.res = apply(self.func, self.args)
20          print self.name, 'finished at:', \
21              ctime()

18.5.4. Fibonacci and Factorial ... Take Two, Plus Summation

The mtfacfib.py script, given in Example 18.8, compares execution of the recursive Fibonacci, factorial, and summation functions. This script runs all three functions in a single-threaded manner, then performs the same task using threads to illustrate one of the advantages of having a threading environment.

Example 18.8. Fibonacci, Factorial, Summation (mtfacfib.py)

In this MT application, we execute three separate recursive functions, first in a single-threaded fashion and then with multiple threads.

1      #!/usr/bin/env python
2
3      from myThread import MyThread
4      from time import ctime, sleep
5
6      def fib(x):
7          sleep(0.005)
8          if x < 2: return 1
9          return (fib(x-2) + fib(x-1))
10
11     def fac(x):
12         sleep(0.1)
13         if x < 2: return 1
14         return (x * fac(x-1))
15
16     def sum(x):
17         sleep(0.1)
18         if x < 2: return 1
19         return (x + sum(x-1))
20
21     funcs = [fib, fac, sum]
22     n = 12
23
24     def main():
25         nfuncs = range(len(funcs))
26
27         print '*** SINGLE THREAD'
28         for i in nfuncs:
29             print 'starting', funcs[i].__name__, 'at:', \
30                 ctime()
31             print funcs[i](n)
32             print funcs[i].__name__, 'finished at:', \
33                 ctime()
34
35         print '\n*** MULTIPLE THREADS'
36         threads = []
37         for i in nfuncs:
38             t = MyThread(funcs[i], (n,),
39                 funcs[i].__name__)
40             threads.append(t)
41
42         for i in nfuncs:
43             threads[i].start()
44
45         for i in nfuncs:
46             threads[i].join()
47             print threads[i].getResult()
48
49         print 'all DONE'
50
51     if __name__ == '__main__':
52         main()

Running in single-threaded mode simply involves calling the functions one at a time and displaying the corresponding results right after the function call.

When running in multithreaded mode, we do not display the result right away. Because we want to keep our MyThread class as general as possible (able to execute callables that do and do not produce output), we wait until the end to call the getResult() method to finally show you the return values of each function call.

Because these functions execute so quickly (well, maybe except for the Fibonacci function), we had to add calls to sleep() to each function to slow things down enough to see how threading can improve performance. If the actual work had varying execution times, you certainly would not pad it with calls to sleep(). Anyway, here is the output:

$ mtfacfib.py
*** SINGLE THREAD
starting fib at: Sun Jun 18 19:52:20 2006
233
fib finished at: Sun Jun 18 19:52:24 2006
starting fac at: Sun Jun 18 19:52:24 2006
479001600
fac finished at: Sun Jun 18 19:52:26 2006
starting sum at: Sun Jun 18 19:52:26 2006
78
sum finished at: Sun Jun 18 19:52:27 2006

*** MULTIPLE THREADS
starting fib at: Sun Jun 18 19:52:27 2006
starting fac at: Sun Jun 18 19:52:27 2006
starting sum at: Sun Jun 18 19:52:27 2006
fac finished at: Sun Jun 18 19:52:28 2006
sum finished at: Sun Jun 18 19:52:28 2006
fib finished at: Sun Jun 18 19:52:31 2006
233
479001600
78
all DONE
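The improvement the output demonstrates can be reduced to a minimal timing sketch (modern Python 3 syntax; the 0.2-second delay stands in for I/O-bound work, which is where Python threads help most):

```python
import threading
from time import sleep, time

def work():
    sleep(0.2)          # stand-in for an I/O-bound task

start = time()
for _ in range(3):      # serial: roughly 3 x 0.2 seconds
    work()
serial = time() - start

start = time()
threads = [threading.Thread(target=work) for _ in range(3)]
for t in threads:
    t.start()
for t in threads:       # concurrent: the three sleeps overlap
    t.join()
threaded = time() - start

print(serial > threaded)    # True: ~0.6s serial vs. ~0.2s threaded
```

Because sleep() releases control while waiting, the three threads wait simultaneously rather than one after another; CPU-bound work would not show the same gain.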


18.5.5. Other Threading Module Functions

In addition to the various synchronization and threading objects, the threading module also provides some supporting functions, detailed in Table 18.4.

Table 18.4. threading Module Functions

activeCount(): Number of currently active Thread objects

currentThread(): Returns the current Thread object

enumerate(): Returns a list of all currently active Thread objects

settrace(func)[a]: Sets a trace function for all threads

setprofile(func)[a]: Sets a profile function for all threads


[a] New in Python 2.3.
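In current Python these functions are spelled active_count(), current_thread(), and enumerate(); a brief sketch of the first three from Table 18.4:

```python
import threading

main = threading.current_thread()       # currentThread() in older code
print(main.name)                        # 'MainThread'
print(threading.active_count() >= 1)    # True: at least the main thread is active
print(main in threading.enumerate())    # True: main is in the active-thread list
```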

18.5.6. Producer-Consumer Problem and the Queue Module

The final example illustrates the producer-consumer scenario, in which a producer of goods or services creates goods and places them in a data structure such as a queue. The amount of time between producing goods is non-deterministic, as is the amount of time the consumer takes to consume the goods produced by the producer.

We use the Queue module to provide an interthread communication mechanism that allows threads to share data with each other. In particular, we create a queue into which the producer (thread) places new goods and the consumer (thread) consumes them. To do this, we will use the following attributes from the Queue module (see Table 18.5).

Table 18.5. Common Queue Module Attributes

Queue module function:

Queue(size): Creates a Queue object of the given size

Queue object methods:

qsize(): Returns the queue size (approximate, since the queue may be being updated by other threads)

empty(): Returns True if the queue is empty, False otherwise

full(): Returns True if the queue is full, False otherwise

put(item, block=0): Puts item in the queue; if block is given (not 0), blocks until room is available

get(block=0): Gets an item from the queue; if block is given (not 0), blocks until an item is available
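The same attributes can be exercised directly. A sketch in Python 3, where the Queue module has been renamed queue and put()/get() block by default, unlike the block=0 defaults shown in Table 18.5:

```python
from queue import Queue    # the 'Queue' module in Python 2

q = Queue(4)               # a queue with room for four items
print(q.empty())           # True
q.put('xxx')               # would block only if the queue were full
print(q.qsize())           # 1
print(q.full())            # False
item = q.get()             # would block only if the queue were empty
print(item)                # xxx
print(q.empty())           # True
```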


Without further ado, we present the code for prodcons.py, shown in Example 18.9.

Example 18.9. Producer-Consumer Problem (prodcons.py)

We feature an implementation of the Producer-Consumer problem using Queue objects and a random number of goods produced (and consumed). The producer and consumer are individually, and concurrently, executing threads.

1      #!/usr/bin/env python
2
3      from random import randint
4      from time import sleep
5      from Queue import Queue
6      from myThread import MyThread
7
8      def writeQ(queue):
9          print 'producing object for Q...',
10         queue.put('xxx', 1)
11         print "size now", queue.qsize()
12
13     def readQ(queue):
14         val = queue.get(1)
15         print 'consumed object from Q... size now', \
16             queue.qsize()
17
18     def writer(queue, loops):
19         for i in range(loops):
20             writeQ(queue)
21             sleep(randint(1, 3))
22
23     def reader(queue, loops):
24         for i in range(loops):
25             readQ(queue)
26             sleep(randint(2, 5))
27
28     funcs = [writer, reader]
29     nfuncs = range(len(funcs))
30
31     def main():
32         nloops = randint(2, 5)
33         q = Queue(32)
34
35         threads = []
36         for i in nfuncs:
37             t = MyThread(funcs[i], (q, nloops),
38                 funcs[i].__name__)
39             threads.append(t)
40
41         for i in nfuncs:
42             threads[i].start()
43
44         for i in nfuncs:
45             threads[i].join()
46
47         print 'all DONE'
48
49     if __name__ == '__main__':
50         main()

Here is the output from one execution of this script:

$ prodcons.py
starting writer at: Sun Jun 18 20:27:07 2006
producing object for Q... size now 1
starting reader at: Sun Jun 18 20:27:07 2006
consumed object from Q... size now 0
producing object for Q... size now 1
consumed object from Q... size now 0
producing object for Q... size now 1
producing object for Q... size now 2
producing object for Q... size now 3
consumed object from Q... size now 2
consumed object from Q... size now 1
writer finished at: Sun Jun 18 20:27:17 2006
consumed object from Q... size now 0
reader finished at: Sun Jun 18 20:27:25 2006
all DONE


As you can see, the producer and consumer do not necessarily alternate in execution. (Thank goodness for random numbers!) Seriously, though, real life is generally random and non-deterministic.

Line-by-Line Explanation
Lines 1-6

In this module, we will use the Queue.Queue object as well as our thread class myThread.MyThread, which we gave in Example 18.7. We will use random.randint() to make production and consumption somewhat varied, and also grab the usual suspects from the time module.

Lines 8-16

The writeQ() and readQ() functions each have a specific purpose: to place an object in the queue (we are using the string 'xxx', for example) and to consume a queued object, respectively. Notice that we are producing one object and reading one object each time.

Lines 18-26

The writer() is going to run as a single thread whose sole purpose is to produce an item for the queue, wait for a bit, then do it again, up to the specified number of times, chosen randomly per script execution. The reader() will do likewise, with the exception of consuming an item, of course.

You will notice that the random number of seconds that the writer sleeps is in general shorter than the amount of time the reader sleeps. This is to discourage the reader from trying to take items from an empty queue. By giving the writer a shorter time period of waiting, it is more likely that there will already be an object for the reader to consume by the time their turn rolls around again.
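Relying on sleep() timing in this way is only a heuristic; since get() can block, a sentinel value removes the guesswork entirely. A sketch of that alternative design (modern Python 3; the None sentinel and the simplified writer/reader here are ours, not the book's):

```python
import threading
from queue import Queue

def writer(queue, loops):
    for i in range(loops):
        queue.put(i)
    queue.put(None)            # sentinel: tells the reader to stop

results = []
def reader(queue):
    while True:
        item = queue.get()     # blocks until an item is available
        if item is None:
            break
        results.append(item)

q = Queue()
threads = [threading.Thread(target=writer, args=(q, 5)),
           threading.Thread(target=reader, args=(q,))]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(results)                 # [0, 1, 2, 3, 4]
```

With a blocking get(), the reader can never find the queue "too empty": it simply waits, so neither thread needs to guess the other's pace.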

Lines 28-29

These are just setup lines to set the total number of threads that are to be spawned and executed.

Lines 31-47

Finally, we have our main() function, which should look quite similar to the main() in all of the other scripts in this chapter. We create the appropriate threads and send them on their way, finishing up when both threads have concluded execution.

We infer from this example that a program that has multiple tasks to perform can be organized to use separate threads for each of the tasks. This can result in a much cleaner program design than a single threaded program that attempts to do all of the tasks.

In this chapter, we illustrated how a single-threaded process may limit an application's performance. In particular, programs with independent, non-deterministic, and non-causal tasks that execute sequentially can be improved by division into separate tasks executed by individual threads. Not all applications may benefit from multithreading due to overhead and the fact that the Python interpreter is a single-threaded application, but now you are more cognizant of Python's threading capabilities and can use this tool to your advantage when appropriate.



Core Python Programming (2nd Edition)
ISBN: 0132269937
Year: 2004
Pages: 334
Authors: Wesley J Chun
