Solution

I l @ ve RuBoard

graphics/bulb.gif

Consider the copy-on-write Optimized::String class from Item 14, but with two new functions: Length() and operator[]() .

The point of this Item is to demonstrate why adding functionality like operator[] changes Optimized::String 's semantics enough to affect other parts of the class. But first things first:

Implement the new members of Optimized::String .

The Length() function is simple:

 namespace Optimized {   size_t String::Length() const   {     return data_->used;     } 

There's more to operator[] , however, than meets the casual eye. In the following, what I want you to notice is that what operator[] does (the non- const version returns a reference into the string) is really no different from what begin() and end() do for standard string s (they return iterators that "point into" the string ). Any copy-on-write implementation of std::basic_string will run into the same considerations that we do below.

Writing operator[] for Shareable Strings

Here's a na ve first attempt:

 // BAD: Nave attempt #1 at operator[] // char& String::operator[](size_t n) {   return data_->buf[n]; } const char String::operator[](size_t n) const {   return data_->buf[n]; } 

This isn't good enough by a long shot. Consider:

 // Example 15-1: Why attempt #1 doesn't work // void f(const Optimized::String& s) {   Optimized::String s2(s); // take a copy of the string   s2[0] = 'x';               // oops: also modifies s! } 

You might be thinking that the poor programmer of Example 15-1 could be a little unhappy about this side effect. You would be right. After all, not only is he changing two strings at once when he only meant to change one, but that other invisibly changed string was actually supposed to be const ! Now the poor author of f() has been turned into something of a liar in that he ended up by inadvertently changing something he promised would be const . But he didn't mean to lie, for there's not the slightest bit of casting or other subversion/perversion going on here. That things went badly wrong anyway is just not a good scene, except perhaps to a few strange people who also enjoy watching car wrecks.

So let's go back to String::operator[]() and try again. At the very least, we'd better make sure the string isn't shared, or else the caller might inadvertently be modifying what looks to him like two unrelated String s. "Aha," thinks the once-na ve programmer, "I'll call AboutToModify() to ensure I'm using an unshared representation." And so he does the following:

 // BAD: Inadequate attempt #2 at operator[] // char& String::operator[](size_t n) {   AboutToModify(data_->len);   return data_->buf[n]; } const char String::operator[](size_t n) const {   // no need to check sharing status this time   return data_->buf[n]; } 

This looks good, but it's still not enough. The problem is that we only need to rearrange the Example 15-1 problem code slightly to get back into the same situation as before:

 // Example 15-2: Why attempt #2 doesn't work either // void f(Optimized::String& s) {   char& rc = s[0];  // take a reference to the first char   Optimized::String s2(s); // take a copy of the string   rc = 'x';                  // oops: also modifies s2! } 

You might be thinking that the poor programmer of Example 15-2 could be a little perturbed about this surprise, too. You would again be right, but as of this writing certain popular implementations of basic_string have precisely this copy-on-write- related bug.

The problem is that the reference was taken while the original string was unshared, but then the string became shared and the single update through the reference modified both String objects' visible state.

Key Notion: An "Unshareable" String

When the non- const operator[]() is called, besides taking an unshared copy of the StringBuf , we also need to mark the string "unshareable" just in case the user remembers the reference (and later tries to use it).

Now, marking the string "unshareable for all time" will work, but that's actually a little excessive. It turns out that all we really need to do is mark the string "unshareable for a while." To see what I mean, consider that it's already true that references returned by operator[]() into the string must be considered invalid after the next mutating operation. That is, code like this:

 // Example 15-3: Why references are // invalidated by mutating operations // void f(Optimized::String& s) {   char& rc = s[0];   s.Append('i');   rc = 'x';   // error: oops, buffer may have }             // moved if s did a reallocation 

should already be documented as invalid, whether or not the string uses copy-on-write. In short, calling a mutator clearly invalidates references into the string because you never know if the underlying memory may move ”and move invisibly, from the point of view of the calling code.

Given that fact, in Example 15-2, rc would be invalidated anyway by the next mutating operation on s . So instead of marking s as "unshareable for all time" just because someone might have remembered a reference into it, we could just mark it "unshareable until after the next mutating operation," when any such remembered reference would be invalidated anyway. To the user, the documented behavior is the same.

Do any of the other members need to be changed because of the new member functions? Explain .

As we can now see, the answer is yes.

First, we need to be able to remember whether a given string is currently unshareable so that we won't use reference counting when copying it. We could just throw in a Boolean flag. To avoid even that overhead, though, let's just encode the "unshareable" state directly in the refs count by agreeing that " refs == the biggest unsigned int there can possibly be" means "unshareable." We'll also add a flag to AboutToModify() that advises whether to mark the string unshareable.

 // GOOD: Correct attempt #3 at operator[] // // Add a new static member for convenience, and // change AboutToModify() appropriately. Because // now we'll need to clone a StringBuf in more than // one function (see the String copy constructor, // below), we'll also move that logic into a // single function... it was time for StringBuf to // have its own copy constructor, anyway. // const size_t String::Unshareable =numeric_limits<size_t>::max(); StringBuf::StringBuf(const StringBuf& other, size_t n)   : buf(0), len(0), used(0), refs(1) {   Reserve(max(other.len, n));   copy(other.buf, other.buf+other.used, buf);   used = other.used; } void String::AboutToModify(size_t n,   bool   markUnshareable /* = false */) {   if(data_->refs > 1 && data_->refs != Unshareable)   {     StringBuf* newdata = new StringBuf(*data_, n);     --data_->refs;   // now all the real work is     data_ = newdata; //  done, so take ownership   }   else   {     data_->Reserve(n);   }   data_->refs = markUnshareable ? Unshareable : 1; } char& String::operator[](size_t n) {   AboutToModify(data_->len, true);   return data_->buf[n]; } const char String::operator[](size_t n) const {   return data_->buf[n]; } 

Note that all the other calls to AboutToModify() continue to work as originally written.

Now all we need to do is make String 's copy constructor respect the unshareable state, if it's set:

 String::String(const String& other) {   // If possible, use copy-on-write.   // Otherwise, take a deep copy immediately.   //   if(other.data_->refs != Unshareable)   {     data_ = other.data_;     ++data_->refs;   }   else   {     data_ = new StringBuf(*other.data_);   } } 

String 's destructor needs a small tweak, too:

 String::~String() {   if(data_->refs == Unshareable  --data_->refs < 1)   {     delete data_;   } } 

The other String functions work as originally written:

 String::String() : data_(new StringBuf) { }   void String::Append(char c)   {     AboutToModify(data_->used+1);     data_->buf[used++">data_->used++] = c;   } } 

That's about it in a single-threaded environment, that is. In the final Item of this lazy-optimization miniseries, we'll consider how multithreading affects our copy-on-write string. See Item 16 for the juicy details.

Summary

Here's all the code together. Note that I've also taken this opportunity to implement a slight change to StringBuf::Reserve() ; it now rounds up the chosen "new buffer size" to the next multiple of four, in order to ensure that the size of the memory buffer's size is always a multiple of 4 bytes. This is in the name of efficiency, because many popular operating systems don't allocate memory in chunks smaller than this, anyway. For small strings, this code is also a bit faster than the original version. (The original code would allocate a 1-byte buffer, then a 2-byte buffer, then a 3-byte buffer, then a 4-byte buffer, and then a 6-byte buffer before the exponential-growth strategy would really kick in. The code below goes straight to a 4-byte buffer, then an 8-byte buffer, and so on.)

 namespace Optimized {   class StringBuf   {   public:       StringBuf();              // start off empty      ~StringBuf();              // delete the buffer       StringBuf(const StringBuf& other, size_t n = 0);                                 // initialize to copy of other,                                 //  and ensure len >= n       void Reserve(size_t n); // ensure len >= n       char*    buf;             // allocated buffer       size_t   len;             // length of buffer       size_t   used;            // # chars actually used       unsigned refs;            // reference count     private:       // No copying...       //       StringBuf(const StringBuf&);       StringBuf& operator=(const StringBuf&);     };     class String     {     public:       String();                 // start off empty      ~String();                 // decrement reference count                                 //  (delete buffer if refs==0)       String(const String&);  // point at same buffer and                                 //  increment reference count       void   Append(char);    // append one character       size_t Length() const;    // string length       char&  operator[](size_t);// element access       const char operator[](size_t) const;       // ... operator=() etc. omitted ...     private:       void AboutToModify(size_t n, bool bUnshareable = false);                                 // lazy copy, ensure len>=n                                 //  and mark if unshareable       static size_t Unshareable;// ref-count flag for "unshareable"       StringBuf* data_;     };     StringBuf::StringBuf()       : buf(0), len(0), used(0), refs(1) { }     StringBuf::~StringBuf() { delete[] buf; }     StringBuf::StringBuf(const StringBuf& other, size_t n)       : buf(0), len(0), used(0), refs(1)     {         Reserve(max(other.len, n));         copy(other.buf, other.buf+other.used, buf);         used = other.used;     } void StringBuf::Reserve(size_t n) {   if(len < n)   {     // Same growth code as in Item 14, except now we round     // the new size up to the nearest multiple of 4 bytes.     size_t needed = max<size_t>(len*1.5, n);     size_t newlen = needed ? 4 * ((needed-1)/4 + 1) : 0;     char*  newbuf = newlen ? new char[ newlen ] : 0;     if(buf)     {       copy(buf, buf+used, newbuf);     }     delete[] buf;   // now all the real work is     buf = newbuf;   //  done, so take ownership     len = newlen;   } } const size_t String::Unshareable = numeric_limits<size_t>::max(); String::String() : data_(new StringBuf) { } String::~String() {   if(data_->refs == Unshareable  --data_->refs < 1)   {     delete data_;   } } String::String(const String& other) {   // If possible, use copy-on-write.   // Otherwise, take a deep copy immediately.   //   if(other.data_->refs != Unshareable)   {     data_ = other.data_;     ++data_->refs;   }   else   {     data_ = new StringBuf(*other.data_);   } } void String::AboutToModify(size_t n,   bool   markUnshareable /* = false */)     {       if(data_->refs > 1 && data_->refs != Unshareable)       {         StringBuf* newdata = new StringBuf(*data_, n);         --data_->refs;   // now all the real work is         data_ = newdata; //  done, so take ownership       }       else       {         data_->Reserve(n);       }       data_->refs = markUnshareable ? Unshareable : 1;     }     void String::Append(char c)     {       AboutToModify(data_->used+1);       data_->buf[used++">data_->used++] = c;     }     size_t String::Length() const     {       return data_->used;     }     char& String::operator[](size_t n)     {       AboutToModify(data_->len, true);       return data_->buf[n];     }     const char String::operator[](size_t n) const     {       return data_->buf[n];     } } 
I l @ ve RuBoard


More Exceptional C++
More Exceptional C++: 40 New Engineering Puzzles, Programming Problems, and Solutions
ISBN: 020170434X
EAN: 2147483647
Year: 2001
Pages: 118
Authors: Herb Sutter

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net