Using a Plain Old String: Multithreaded | More Exceptional C++: 40 New Engineering Puzzles, Programming Problems, and Solutions


Even if you're already familiar with thread-safety issues and know why and how to share resources between threads safely by serializing access using mutexes, skim this section anyway. It forms the basis for the more-detailed examples later on.

In a nutshell, things are a bit different if our program is multithreaded. Functions like GetError() and SetError() might well be called at the same time on different threads. In that case, calls to these functions could be interlaced, and they will no longer work properly as originally written above. For example, consider what might happen if the following two pieces of code could be executed at the same time:

 // thread A
 String newerr = SetError( "A" );
 newerr += " (set by A)";
 cout << newerr << endl;

 // thread B
 String newerr = SetError( "B" );
 newerr += " (set by B)";
 cout << newerr << endl;

There are many ways in which this can go wrong. Here's one: Say thread A is running and, inside the SetError( "A" ) call, gets as far as setting err to "1: " before the operating system decides to preempt it and switch to thread B. Thread B executes completely; then thread A is reactivated and runs to completion. The output would be

 2: B (12:01:09.125) (set by B)
 2: B (12:01:09.125)A (12:01:09.195) (set by A)
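The interleaving just described can be replayed deterministically on a single thread. The following is only a sketch: it uses std::string in place of String, hard-codes the two timestamps from the output above, and keeps err and count as locals. The names Replay and ReplayInterleaving are hypothetical, introduced here for illustration.

```cpp
#include <string>

// Hypothetical holder for the two lines that would be printed.
struct Replay
{
    std::string fromB;   // line printed by thread B
    std::string fromA;   // line printed by thread A
};

// Replays, on one thread, the statement order described in the text:
// A runs one statement of SetError(), B runs to completion, A resumes.
// err and count stand in for the Error module's shared state.
Replay ReplayInterleaving()
{
    std::string err;
    int count = 0;

    // Thread A, inside SetError( "A" ): only the first statement runs.
    err = std::to_string( ++count ) + ": ";      // err is now "1: "

    // Thread A is preempted; thread B runs SetError( "B" ) completely.
    err = std::to_string( ++count ) + ": ";      // err is now "2: "
    err += "B";
    err += " (12:01:09.125)";                    // assumed fixed timestamp
    std::string newerrB = err;
    newerrB += " (set by B)";

    // Thread A resumes where it left off.
    err += "A";
    err += " (12:01:09.195)";                    // assumed fixed timestamp
    std::string newerrA = err;
    newerrA += " (set by A)";

    return Replay{ newerrB, newerrA };
}
```

Printing fromB and then fromA reproduces the two lines shown above. Real threads would not be this polite: this replay only switches between whole statements, which is the best case.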

It's easy to invent situations that produce even stranger output, ^[1] but in reality you'd be lucky to get anything this sensible. Because a thread's execution can be preempted anywhere, including right in the middle of a String operation, such as String::operator+=() itself, you're far more likely to see just intermittent crashes, as String member functions on different threads attempt to update the same String object at the same time. (If you don't see this problem right away, try writing those String member functions and see what happens if their execution gets interleaved.)

^[1] Just interrupting a thread between its SetError() call and the following cout statement could affect the ordering of the output (though not the contents). Exercise for the reader: What are the thread-safety issues relating to cout itself, both within the standard iostreams subsystem and within the calling code? Consider first the ordering of partial output.

The way to correct this is to make sure that only one thread can be working with the shared resources at a time. We prevent the functions from getting interlaced by "serializing" them with a mutex or similar device. But who should be responsible for doing the serialization? There are two levels at which we could do it. The main trade-off is that the lower the level at which the work is done, the more locking needs to be done, often needlessly. That's because the lower levels don't know whether acquiring a lock is necessary for a given operation, so they have to do it every time, just in case. Excessive locking is a major concern because acquiring a mutex lock is an expensive operation on most systems, approaching or surpassing the cost of a general-purpose dynamic memory allocation.

1. [WRONG] Do locking within String member functions. This way, the String class itself assumes responsibility for the thread safety of all its objects. This is a bad choice for two (probably obvious) reasons. First, it doesn't solve the problem because it's at the wrong granularity. Option 1 only ensures that String objects won't get corrupted. (That is, they're still valid String objects as far as String is concerned.) But it can't do anything about the SetError() function's interleaving. (The String objects can still end up holding unexpected values exactly as just illustrated, which means they're not valid error messages as far as the Error module is concerned.) Second, it can seriously degrade performance because the locking would be done far more frequently than necessary: at least once for every mutating String operation, and possibly even for nonmutating operations!

2. [RIGHT] Do locking in the code that owns/manipulates a String object. This is always the correct choice. Not only is locking done only when it's really needed, but it's done at the right granularity. We lock "an operation" where an operation is, not a low-level String member function, but a high-level Error module message-formatting function. Further, the extra code to do the serialization of the error string is isolated in the Error module.

3. [PROBLEMATIC] Do both. In the example, so far, Option 2 alone is sufficient. Option 1 is so obviously a bad choice that you might be wondering why I even mention it. The reason is simple: Later in this discussion, we'll see why copy-on-write "optimizations" force us to make that performance-degrading choice and do all locking inside the class. But because (as noted) Option 1 by itself doesn't really solve the whole problem, we can end up having to do Option 1, not instead of, but in addition to Option 2. As you might expect, that's just plain bad news.
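To see why locking inside the string class is the wrong granularity, consider this sketch. LockedString is a hypothetical wrapper around std::string, not the String class under discussion; every member function takes the object's own lock. The function replays, on one thread, an ordering of calls that such per-call locking still permits.

```cpp
#include <mutex>
#include <string>

// Hypothetical string wrapper in the style of Option 1: each member
// function locks the object for the duration of that one call.
class LockedString
{
public:
    void Assign( const std::string& s )
    {
        std::lock_guard<std::mutex> guard( m_ );
        s_ = s;
    }
    void Append( const std::string& s )
    {
        std::lock_guard<std::mutex> guard( m_ );
        s_ += s;
    }
    std::string Get() const
    {
        std::lock_guard<std::mutex> guard( m_ );
        return s_;
    }
private:
    mutable std::mutex m_;   // protects s_
    std::string        s_;
};

// Replays an ordering of calls from two logical threads. Every call is
// individually atomic, yet the calls from A and B are still interleaved.
std::string InterleaveWithPerCallLocks()
{
    LockedString err;
    err.Assign( "1: " );   // thread A: first step of its SetError()
    err.Assign( "2: " );   // thread B: first step of its SetError()
    err.Append( "B" );     // thread B: appends its message
    err.Append( "A" );     // thread A resumes and appends its message
    return err.Get();      // a valid string, but not a valid error message
}
```

The object is never corrupted, but the result mixes the two messages, and four locks were taken to get there.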

In our example, implementing Option 2 means that the Error subsystem should take responsibility for serializing access to the String object that it owns. Here's a typical way to do it:

 // Example A-2: A thread-safe error recording subsystem
 //
 String err;
 int count = 0;
 Mutex m;  // to protect the err and count values

 String GetError()
 {
   Lock<Mutex> l(m); //--enter mutual exclusion block------
   String ret = err;
   l.Unlock();       //--exit mutual exclusion block-------
   return ret;
 }

 String SetError( const String& msg )
 {
   Lock<Mutex> l(m); //--enter mutual exclusion block------
   err = AsString( ++count ) + ": ";
   err += msg;
   err += " (" + TimeAsString() + ")";
   String ret = err;
   l.Unlock();       //--exit mutual exclusion block-------
   return ret;
 }

All is well, because each function body is now atomic as far as err and count are concerned and no interlacing will occur. SetError() calls are automatically serialized so that their bodies do not overlap at all.
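For readers with a C++11 or later compiler, the same idea can be sketched with the standard library. This is an assumption-laden translation, not the code from Example A-2: std::string replaces String, std::mutex and std::lock_guard replace Mutex and Lock, the timestamp is omitted, and the namespace error_module and the checking function AllMessagesWellFormed are hypothetical names added for illustration.

```cpp
#include <mutex>
#include <string>
#include <thread>
#include <vector>

namespace error_module {     // hypothetical namespace

std::mutex  m;               // protects err and count
std::string err;
int         count = 0;

std::string GetError()
{
    std::lock_guard<std::mutex> guard( m );
    return err;              // the copy is made while the lock is held
}

std::string SetError( const std::string& msg )
{
    std::lock_guard<std::mutex> guard( m );
    err = std::to_string( ++count ) + ": ";
    err += msg;              // timestamp omitted in this sketch
    return err;              // the copy is made while the lock is held
}

// Calls SetError( "msg" ) from several threads and reports whether every
// returned string has the shape "<digits>: msg".
bool AllMessagesWellFormed( int threads, int callsPerThread )
{
    std::vector<char> ok( threads, 1 );   // one slot per thread, never shared
    std::vector<std::thread> pool;
    for( int t = 0; t < threads; ++t )
    {
        pool.emplace_back( [t, callsPerThread, &ok]
        {
            for( int i = 0; i < callsPerThread; ++i )
            {
                const std::string s = SetError( "msg" );
                const std::string::size_type colon = s.find( ": " );
                if( colon == std::string::npos || colon == 0 ||
                    s.find_first_not_of( "0123456789" ) != colon ||
                    s.substr( colon + 2 ) != "msg" )
                {
                    ok[t] = 0;
                }
            }
        } );
    }
    for( std::thread& th : pool ) th.join();
    for( char c : ok ) if( !c ) return false;
    return true;
}

} // namespace error_module
```

Because the whole body of SetError() runs under one lock, each caller gets back exactly the message it formatted, and the counter advances once per call.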

For more details about threads, critical sections, mutexes, semaphores, race conditions, the Dining Philosophers Problem, and lots of interesting related topics, see any good text on operating systems or multithreaded programming. From here on, I'll assume you know the basics.

A Helper Lock Manager

Assume that Example A-2 and similar code shown here makes use of a helper lock manager class. The helper ensures that a lock is acquired promptly, and that it's subsequently released exactly once. Wrapping this knowledge in a manager class lets us make code like Example A-2 more exception-safe by avoiding embarrassments such as leaving acquired-and-never-to-be-released locks hanging around if a String operation happens to throw an exception.

The Lock helper looks something like this:

 template<typename T>
 class Lock
 {
 public:
   Lock( T& t )
     : t_(t)
     , locked_(true)
   {
     t_.Lock();
   }

   ~Lock()
   {
     Unlock();
   }

   void Unlock()
   {
     if( locked_ )
     {
       t_.Unlock();
       locked_ = false;
     }
   }

 private:
   T&   t_;
   bool locked_;
 };
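The two guarantees claimed for the helper, release on exception and release exactly once, can be checked with a small sketch. The Mutex class here is an assumption: the text does not show one, so this version wraps std::mutex and adds call counters purely for the demonstration. The function names are likewise hypothetical.

```cpp
#include <mutex>
#include <stdexcept>

// Hypothetical Mutex with the Lock()/Unlock() interface that the helper
// expects. The counters exist only so the sketch can observe what happened.
class Mutex
{
public:
    void Lock()   { m_.lock();    ++lockCalls; }
    void Unlock() { ++unlockCalls; m_.unlock(); }
    int lockCalls   = 0;
    int unlockCalls = 0;
private:
    std::mutex m_;
};

// The Lock helper, reproduced from the text.
template<typename T>
class Lock
{
public:
    Lock( T& t ) : t_(t), locked_(true) { t_.Lock(); }
    ~Lock() { Unlock(); }
    void Unlock()
    {
        if( locked_ )
        {
            t_.Unlock();
            locked_ = false;
        }
    }
private:
    T&   t_;
    bool locked_;
};

// An exception leaves the scope while the lock is held; the destructor
// of l must release it.
bool ReleasedOnceAfterThrow()
{
    Mutex m;
    try
    {
        Lock<Mutex> l( m );
        throw std::runtime_error( "String operation failed" );
    }
    catch( const std::runtime_error& ) {}
    return m.lockCalls == 1 && m.unlockCalls == 1;
}

// The lock is released explicitly; the destructor must not release again.
bool ReleasedOnceAfterEarlyUnlock()
{
    Mutex m;
    {
        Lock<Mutex> l( m );
        l.Unlock();
    }
    return m.lockCalls == 1 && m.unlockCalls == 1;
}
```

Both functions return true with the helper as written. The locked_ flag is what keeps the explicit Unlock() followed by the destructor from unlocking twice.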
