
Recipe 9.7. Storing Per-Thread Information

Credit: John E. Barham, Sami Hangaslammi, Anthony Baxter

Problem

You need to allocate to each thread some storage that only that thread can use.

Solution

Thread-specific storage is a useful design pattern, but Python 2.3 did not yet support it directly. However, even in 2.3, we could code it up in terms of a dictionary protected by a lock. For once, it's slightly more general, and not significantly harder, to program to the lower-level thread module, rather than to the more commonly useful, higher-level threading module that Python offers on top of it:

    _tss = {}
    try:
        import thread
    except ImportError:
        # We're running on a single-threaded platform (or, at least, the Python
        # interpreter has not been compiled to support threads), so we just return
        # the same dict for every call -- there's only one thread around anyway!
        def get_thread_storage():
            return _tss
    else:
        # We do have threads; so, to work:
        _tss_lock = thread.allocate_lock()
        def get_thread_storage():
            """ Return a thread-specific storage dictionary. """
            thread_id = thread.get_ident()
            _tss_lock.acquire()
            try:
                # setdefault creates this thread's dictionary on first use
                return _tss.setdefault(thread_id, {})
            finally:
                _tss_lock.release()

Python 2.4 offers a much simpler and faster implementation, of course, thanks to the new threading.local class:

    try:
        import threading
    except ImportError:
        import dummy_threading as threading
    _tss = threading.local()
    def get_thread_storage():
        return _tss.__dict__

Discussion

The main benefit of multithreaded programs is that all of the threads can share global objects when they need to do so. Often, however, each thread also needs some storage of its own, for example, to store a network or database connection unique to itself. Indeed, each such externally oriented object is generally best kept under the control of a single thread, to avoid multiple possibilities of highly peculiar behavior, race conditions, and so on. The get_thread_storage function in this recipe solves this problem by implementing the "thread-specific storage" design pattern, and specifically by returning a thread-specific storage dictionary. The calling thread can then use the returned dictionary to store any kind of data that is private to the thread. This recipe is, in a sense, a generalization of the get_transaction function from ZODB, the object-oriented database underlying Zope.
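As a rough illustration of how calling threads might use the returned dictionary, here is a minimal sketch. It is written for current Python 3 (where threading.get_ident and the with statement on locks are available) rather than for the Python 2.3 code in the Solution, and the worker function, the "connection" key, and the results dictionary are hypothetical names chosen only for this example.

```python
import threading

_tss = {}
_tss_lock = threading.Lock()

def get_thread_storage():
    """Return a dictionary private to the calling thread."""
    thread_id = threading.get_ident()
    with _tss_lock:
        return _tss.setdefault(thread_id, {})

def worker(name, results):
    # Each worker keeps its own (hypothetical) connection label privately.
    get_thread_storage()["connection"] = "conn-for-" + name
    # A later call from the same thread returns the same dictionary.
    results[name] = get_thread_storage()["connection"]

results = {}
threads = [threading.Thread(target=worker, args=(n, results))
           for n in ("a", "b", "c")]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(results)
```

Each thread reads back only the value it stored itself; the main thread, which never stored anything, gets an empty dictionary of its own.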

One possible extension to this recipe is to add a delete_thread_storage function. Such a function would be useful, particularly if a way could be found to automate its being called upon thread termination. Python's threading architecture does not make this task particularly easy. You could spawn a watcher thread to do the deletion after a join with the calling thread, but that's a rather heavyweight approach. The recipe as presented, without deletion, is quite appropriate for the common and recommended architecture in which you have a pool of (typically daemonic) worker threads (perhaps some of them general workers, with others dedicated to interfacing to specific external resources) that are spawned at the start of the program and do not go away until the end of the whole process.

When multithreading is involved, an implementation must always be particularly careful to detect and prevent race conditions, deadlocks, and other such conflicts. In this recipe, I have decided not to assume that a dictionary's setdefault method is atomic (meaning that no thread switch can occur while setdefault executes); adding a key can potentially change the dictionary's whole structure, after all. If I were willing to make such an assumption, I could do away with the lock and vastly increase performance, but I suspect that such an assumption might make the code too fragile and dependent on specific versions of Python. (It seems to me that the assumption holds for Python 2.3, but, even if that is the case, I want my applications to survive subtle future changes to Python's internals.) Another risk is that, if a thread terminates and a new one starts, the new thread might end up with the same thread ID as the just-terminated one, and therefore accidentally share the "thread-specific storage" dictionary left behind by the just-terminated thread. This risk might be mitigated (though not eliminated) by providing the delete_thread_storage function mentioned in the previous paragraph. Again, this specific problem does not apply to me, given the kind of multithreading architecture that I use in my applications. If your architecture differs, you may want to modify this recipe's solution accordingly.
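The thread-ID reuse risk can be probed with a small sketch like the following, written in current Python 3. Whether an ID is actually reused depends on the platform and interpreter, so the sketch only reports what it observes; the first and second functions and the "owner" key are hypothetical names for this example.

```python
import threading

_tss = {}
_tss_lock = threading.Lock()

def get_thread_storage():
    with _tss_lock:
        return _tss.setdefault(threading.get_ident(), {})

idents = []
leaked = []

def first():
    idents.append(threading.get_ident())
    get_thread_storage()["owner"] = "first"

def second():
    idents.append(threading.get_ident())
    # True only if this thread inherited the first thread's dictionary.
    leaked.append(get_thread_storage().get("owner") == "first")

for target in (first, second):
    t = threading.Thread(target=target)
    t.start()
    t.join()   # the first thread has finished before the second starts

reused = idents[0] == idents[1]
print("ident reused:", reused, "stale data visible:", leaked[0])
```

Whenever the second thread happens to receive the first thread's ID, it also sees the leftover "owner" entry, which is exactly the accidental sharing described above.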

If the performance of this recipe's version is insufficient for your application's needs, due to excessive overhead in acquiring and releasing the lock, then, rather than just removing the lock at the risk of making your application fragile, you might consider an alternative:

    import thread
    _creating_threads = True
    _tss_lock = thread.allocate_lock()
    _tss = {}
    class TssSequencingError(RuntimeError):
        pass
    def done_creating_threads():
        """ switch from thread-creation to no-more-threads-created state """
        global _creating_threads
        if not _creating_threads:
            raise TssSequencingError('done_creating_threads called twice')
        _creating_threads = False
    def get_thread_storage():
        """ Return a thread-specific storage dictionary. """
        thread_id = thread.get_ident()
        # fast approach if thread-creation phase is finished
        if not _creating_threads:
            return _tss[thread_id]
        # careful approach if we're still creating threads; acquire the lock
        # before entering try, so that finally never releases an unheld lock
        _tss_lock.acquire()
        try:
            return _tss.setdefault(thread_id, {})
        finally:
            _tss_lock.release()

This variant adds a boolean switch _creating_threads, initially true. As long as the switch is true, the variant uses a careful locking-based approach, quite similar to the one presented in this recipe's Solution. At some point in time, when all threads that will ever exist (or at least all that will ever require access to get_thread_storage) have been started, and each of them has obtained its thread-local storage dictionary, your application calls done_creating_threads. This sets _creating_threads to False, and every future call to get_thread_storage then takes a fast path where it simply indexes into the global dictionary _tss: no more acquiring and releasing the lock, no more creating a thread's thread-local storage dictionary if it didn't yet exist.

As long as your application can determine a moment in which it can truthfully call done_creating_threads, the variant in this subsection should definitely afford a substantial increase in speed compared to this recipe's Solution. Note that it is particularly likely that you can use this variant if your application follows the popular and recommended architecture mentioned previously: a bounded set of daemonic, long-lived worker threads, all created early in your program. This is fortunate, because, if your application is performance-sensitive enough to worry about the locking overhead of this recipe's solution, then no doubt you will want to structure your application that way. The alternative approach of having many short-lived threads is generally quite damaging to performance.

If your application needs to run only under Python 2.4, you can get a much simpler, faster, and more solid implementation by relying on the new threading.local class. Calling threading.local returns a new object on which any thread can get and set arbitrary attributes, independently of whatever getting and setting other threads may be doing on the same object. This recipe, in the 2.4 variant, returns the per-thread __dict__ of such an object, for uniformity with the 2.3 variant. This way, your applications can be made to run on both Python 2.3 and 2.4, using the best version in each case:

    import sys
    if sys.version_info >= (2, 4):
        # insert 2.4 definition of get_thread_storage here
        pass
    else:
        # insert 2.3 definition of get_thread_storage here
        pass

The 2.4 variant of this recipe also shows off the intended use of module dummy_threading, which, like its sibling dummy_thread, is also available in Python 2.3. By conditionally using these dummy modules, which are available on all platforms, whether or not Python was compiled with thread support, you may sometimes, with due care, be able to write applications that can run on any platform, taking advantage of threading where it's available but running anyway even where threading is not available. In the 2.3 variant, we did not use the similar approach based on dummy_thread, because the overhead would be too high to pay on nonthreaded platforms; in the 2.4 variant, overhead is pretty low anyway, so we went for the simplicity that dummy_threading affords.
