Section 18.4. thread Module


18.4. thread Module

Let's take a look at what the thread module has to offer. In addition to being able to spawn threads, the thread module also provides a basic synchronization data structure called a lock object (a.k.a. primitive lock, simple lock, mutual exclusion lock, mutex, binary semaphore). As we mentioned earlier, such synchronization primitives go hand in hand with thread management.

Listed in Table 18.1 are the more commonly used thread functions and LockType lock object methods.

Table 18.1. thread Module and Lock Objects

Function/Method                                Description
-----------------------------------------------------------------------------
thread Module Functions
start_new_thread(function, args,               Spawns a new thread and executes
  kwargs=None)                                 function with the given args and
                                               optional kwargs
allocate_lock()                                Allocates a LockType lock object
exit()                                         Instructs a thread to exit

LockType Lock Object Methods
acquire(wait=None)                             Attempts to acquire the lock
                                               object
locked()                                       Returns True if the lock has
                                               been acquired, False otherwise
release()                                      Releases the lock
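The lock object methods in the table can be exercised directly. The following sketch uses _thread, the Python 3 spelling of the thread module (the examples in this chapter use the Python 2 name):

```python
import _thread  # Python 3 renamed the thread module to _thread

lock = _thread.allocate_lock()
print(lock.locked())   # False: a freshly allocated lock starts out unlocked

lock.acquire()         # blocks until the lock is available (immediate here)
print(lock.locked())   # True: the lock is now held

lock.release()
print(lock.locked())   # False: the lock is free again
```

Calling release() on an unlocked lock raises an error, so acquire() and release() calls must always be paired.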


The key function of the thread module is start_new_thread(). Its syntax is exactly that of the apply() built-in function, taking a function along with arguments and optional keyword arguments. The difference is that instead of the main thread executing the function, a new thread is spawned to invoke the function.

Let's take our onethr.py example and integrate threading into it. By slightly changing the call to the loop*() functions, we now present mtsleep1.py in Example 18.2.

Example 18.2. Using the thread Module (mtsleep1.py)

The same loops from onethr.py are executed, but this time using the simple multithreaded mechanism provided by the thread module. The two loops are executed concurrently (with the shorter one finishing first, obviously), and the total elapsed time is only as long as the slowest thread rather than the total time for each separately.

#!/usr/bin/env python

import thread
from time import sleep, ctime

def loop0():
    print 'start loop 0 at:', ctime()
    sleep(4)
    print 'loop 0 done at:', ctime()

def loop1():
    print 'start loop 1 at:', ctime()
    sleep(2)
    print 'loop 1 done at:', ctime()

def main():
    print 'starting at:', ctime()
    thread.start_new_thread(loop0, ())
    thread.start_new_thread(loop1, ())
    sleep(6)
    print 'all DONE at:', ctime()

if __name__ == '__main__':
    main()

start_new_thread() requires its first two arguments, so an empty tuple must be passed even when the function to execute takes no arguments.
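The calling convention can be seen in a small sketch. It uses _thread (the Python 3 name for the module); the worker function, its tag values, and the timings are made up for illustration:

```python
import _thread
from time import sleep

results = []

def worker(tag, nsec=0.1):
    sleep(nsec)
    results.append(tag)

# The args tuple is required, even if empty; the kwargs dict is optional.
_thread.start_new_thread(worker, ('a',))                  # default nsec
_thread.start_new_thread(worker, ('b',), {'nsec': 0.05})  # keyword argument
sleep(1)  # crude pause so both threads finish before the program exits
print(sorted(results))
```

Note the same weakness this section goes on to discuss: the final sleep() is only a guess at how long the threads need.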

Upon execution of this program, our output changes drastically. The two loops now run concurrently, completing in 4 seconds, the length of the longest loop, plus any overhead, rather than the 6 or 7 seconds their sum would take. (The script itself still runs for 6 seconds because of the sleep(6) call in main(), a problem we return to below.)

$ mtsleep1.py
starting at: Sun Aug 13 05:04:50 2006
start loop 0 at: Sun Aug 13 05:04:50 2006
start loop 1 at: Sun Aug 13 05:04:50 2006
loop 1 done at: Sun Aug 13 05:04:52 2006
loop 0 done at: Sun Aug 13 05:04:54 2006
all DONE at: Sun Aug 13 05:04:56 2006


The pieces of code that sleep for 4 and 2 seconds now occur concurrently, contributing to the lower overall runtime. You can even see how loop 1 finishes before loop 0.

The only other major change to our application is the addition of the sleep(6) call. Why is this necessary? If we did not stop the main thread from continuing, it would proceed to the next statement, display "all DONE," and exit, killing both threads running loop0() and loop1().

We did not have any code that told the main thread to wait for the child threads to complete before continuing. This is what we mean by threads requiring some sort of synchronization. In our case, we used another sleep() call as our synchronization mechanism. We used a value of 6 seconds because we know that both threads (which take 4 and 2 seconds, as you know) should have completed by the time the main thread has counted to 6.

You are probably thinking that there should be a better way of managing threads than creating that extra delay of 6 seconds in the main thread. Because of this delay, the overall runtime is no better than in our single-threaded version. Using sleep() for thread synchronization as we did is not reliable. What if our loops had independent and varying execution times? We may be exiting the main thread too early or too late. This is where locks come in.

Making yet another update to our code to include locks as well as getting rid of separate loop functions, we get mtsleep2.py, presented in Example 18.3. Running it, we see that the output is similar to mtsleep1.py. The only difference is that we did not have to wait the extra time for mtsleep1.py to conclude. By using locks, we were able to exit as soon as both threads had completed execution.

$ mtsleep2.py
starting at: Sun Aug 13 16:34:41 2006
start loop 0 at: Sun Aug 13 16:34:41 2006
start loop 1 at: Sun Aug 13 16:34:41 2006
loop 1 done at: Sun Aug 13 16:34:43 2006
loop 0 done at: Sun Aug 13 16:34:45 2006
all DONE at: Sun Aug 13 16:34:45 2006


Example 18.3. Using thread and Locks (mtsleep2.py)

Rather than using a call to sleep() to hold up the main thread as in mtsleep1.py, the use of locks makes more sense.

1     #!/usr/bin/env python
2
3     import thread
4     from time import sleep, ctime
5
6     loops = [4, 2]
7
8     def loop(nloop, nsec, lock):
9         print 'start loop', nloop, 'at:', ctime()
10        sleep(nsec)
11        print 'loop', nloop, 'done at:', ctime()
12        lock.release()
13
14    def main():
15        print 'starting at:', ctime()
16        locks = []
17        nloops = range(len(loops))
18
19        for i in nloops:
20            lock = thread.allocate_lock()
21            lock.acquire()
22            locks.append(lock)
23
24        for i in nloops:
25            thread.start_new_thread(loop,
26                (i, loops[i], locks[i]))
27
28        for i in nloops:
29            while locks[i].locked(): pass
30
31        print 'all DONE at:', ctime()
32
33    if __name__ == '__main__':
34        main()

So how did we accomplish our task with locks? Let us take a look at the source code.

Line-by-Line Explanation

Lines 1-6

After the Unix startup line, we import the thread module and a few familiar attributes of the time module. Rather than hardcoding separate functions to count to 4 and 2 seconds, we will use a single loop() function and place these constants in a list, loops.

Lines 8-12

The loop() function will proxy for the now-removed loop*() functions from our earlier examples. We had to make some cosmetic changes to loop() so that it can now perform its duties using locks. The obvious changes are that we need to be told which loop number we are as well as how long to sleep for. The last piece of new information is the lock itself. Each thread will be allocated an acquired lock. When the sleep() time has concluded, we will release the corresponding lock, indicating to the main thread that this thread has completed.

Lines 14-34

The bulk of the work is done here in main(), using three separate for loops. We first create a list of locks, obtaining each with the thread.allocate_lock() function and acquiring it with its acquire() method. Acquiring a lock has the effect of "locking the lock." Once it is locked, we add it to the lock list, locks. The next loop actually spawns the threads, invoking the loop() function in each and passing each thread its loop number, its sleep time, and its acquired lock. So why didn't we start the threads in the lock acquisition loop? There are two reasons: (1) we wanted to synchronize the threads, so that "all the horses started out the gate" around the same time, and (2) locks take a little bit of time to be acquired. If a thread executes "too fast," it is possible for it to complete before its lock has been acquired.

It is up to each thread to unlock its lock object when it has completed execution. The final loop just sits and spins (pausing the main thread) until both locks have been released before continuing execution. Since we are checking each lock sequentially, we may be at the mercy of all the slower loops if they are more toward the beginning of the set of loops. In such cases, the majority of the wait time may be for the first loop(s). When that lock is released, remaining locks may have already been unlocked (meaning that corresponding threads have completed execution). The result is that the main thread will fly through those lock checks without pause. Finally, you should be well aware that the final pair of lines will execute main() only if we are invoking this script directly.

As hinted in the earlier Core Note, we presented the thread module only to introduce the reader to threaded programming. Your multithreaded applications should use higher-level modules such as threading, which we will now discuss.
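As a preview, here is a sketch of how Thread objects from the threading module make both the extra sleep() of mtsleep1.py and the lock polling of mtsleep2.py unnecessary. It is Python 3 syntax, and the sleep times are scaled down from the chapter's 4 and 2 seconds so the demo runs quickly:

```python
import threading
from time import sleep

done = []

def loop(nloop, nsec):
    sleep(nsec)
    done.append(nloop)

# Scaled-down versions of the two loops (0.4s and 0.2s instead of 4s and 2s).
threads = [threading.Thread(target=loop, args=(i, nsec))
           for i, nsec in enumerate([0.4, 0.2])]
for t in threads:
    t.start()
for t in threads:
    t.join()   # blocks until that thread finishes; no spinning on locks

print(done)  # the shorter loop (loop 1) finishes first
```

join() gives the main thread exactly the "wait for the child threads to complete" behavior we had to simulate with sleep() and locks above.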



Core Python Programming (2nd Edition)
ISBN: 0132269937
Year: 2004
Pages: 334
Authors: Wesley J Chun
