12.6 A CASE STUDY IN OPERATOR OVERLOADING

We will now consider a user-defined class and take the reader through the various operators we would want to define for this class. This user-defined type will be our own MyString class, to be thought of as a poor man's substitute for the highly optimized system-supplied string class in the string header file. A string class of some kind is a frequently used pedagogical tool for teaching operator overloading in C++, because it is straightforward to specify the meaning that one would want to associate with the various operators for strings.

An important property of our MyString class will be that it will check for any violations of array bounds if a program tries to access a character outside of those included specifically in a MyString. In the event of such a violation, our MyString will throw an exception. Here is a partial definition of MyString, with an ancillary class Err included inside to serve as the exception type:^[1]

      class MyString {
          char* charArr;
          int length;
          class Err {};
      public:
          //...
      };

The string will be stored in the form of a null-terminated array of characters starting at the address charArr. The actual number of characters in the string will be stored in the data member length. The fact that MyString will store the characters in the form of a C-style string will allow us to use C's well-known string processing functions in the overload definitions of the various operators for MyString.

In the rest of this section, we will add incrementally to the above partial definition of the MyString class, starting with the main constructor below:

THE MAIN CONSTRUCTOR:

This constructor will help us construct a MyString object from a string literal. The constructor should first appropriate the required amount of memory and then copy over the string literal into this memory:

      MyString( const char* ch ) {
          length = strlen( ch );
          charArr = new char[ length + 1 ];
          strcpy( charArr, ch );
      }

where strlen returns the actual number of characters in the string literal (not including the null terminator) and strcpy copies over the characters from the literal into the freshly acquired memory for the charArr data member of the MyString object under construction.^[2] With this constructor in place, we can construct a new MyString from a string literal by

      MyString str("hello");

NO-ARG CONSTRUCTOR:

We'll need a no-arg constructor for declarations like

      MyString s;
      MyString words[100];
      vector<MyString> vec(100);

Recall that C++ arrays and various container types (if they need preallocated memory at the time of declaration) cannot be declared unless there exists, by specification or by default, a no-arg constructor for the element type. The following definition could serve as a no-arg constructor for our string class:

      MyString() {charArr = 0; length = 0;}

THE DESTRUCTOR:

In order to avoid memory leaks, we also need a destructor that would free up the memory when a MyString variable goes out of scope:

      ~MyString() { delete [] charArr; }

THE COPY CONSTRUCTOR:

Every user-defined type appropriating system resources needs a copy constructor so that declarations of the following kind can be made:

      MyString s1( "hello" );
      MyString s2 = s1;

A copy constructor will also help us pass a MyString argument by value during a function call and help with copy-on-return if a function returns a MyString object by value. Here is a copy constructor that follows the format discussed in Chapter 11:

      MyString( const MyString& str ) {
          length = str.length;                                  //(A)
          charArr = new char[length + 1];                       //(B)
          strcpy( charArr, str.charArr );
      }

For a declaration such as

      MyString s2 = s1;

the members length and charArr that are accessed directly in lines (A) and (B) belong to the invoking MyString, which in this case is s2. The copy constructor first declares the length of s2 to be equal to the length of s1. Next it allocates memory for the array of characters in the new string s2. Finally, it copies over the contents of the charArr member of s1 into the charArr member of s2.
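The effect of these three steps is that the copy owns its own character array. The following is a minimal sketch of that behavior, assuming C++11; the class name CopyDemoString and the helpers setFirst(), data(), and copyIsIndependent() are placeholders introduced only for this illustration and are not part of MyString:

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for MyString, reduced to what copying needs.
class CopyDemoString {
    char* charArr;
    int length;
public:
    CopyDemoString(const char* ch) {
        assert(ch != nullptr);                       // strlen needs a real string
        length = static_cast<int>(std::strlen(ch));
        charArr = new char[length + 1];
        std::strcpy(charArr, ch);
    }
    // Deep copy: the new object gets a buffer of its own.
    CopyDemoString(const CopyDemoString& str) {
        length = str.length;
        charArr = new char[length + 1];
        std::strcpy(charArr, str.charArr);
    }
    CopyDemoString& operator=(const CopyDemoString&) = delete;  // not needed here
    ~CopyDemoString() { delete[] charArr; }

    void setFirst(char c) { if (length > 0) charArr[0] = c; }
    const char* data() const { return charArr; }
};

// Returns true if changing the original leaves the copy untouched.
bool copyIsIndependent() {
    CopyDemoString s1("hello");
    CopyDemoString s2 = s1;        // invokes the copy constructor
    s1.setFirst('j');              // s1 now holds "jello"
    return std::strcmp(s1.data(), "jello") == 0
        && std::strcmp(s2.data(), "hello") == 0
        && s1.data() != s2.data(); // two distinct buffers
}
```

Had the copy constructor merely copied the pointer, both objects would share one array and the change to s1 would show up in s2 as well.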

THE COPY ASSIGNMENT OPERATOR:

The copy constructor is usually followed by the closely related copy assignment operator needed for statements like

      MyString s1 = "hello";
      MyString s2 = "jello";
      s1 = s2;

The assignment of s2 to s1 here requires an appropriate definition for the function operator=:

      MyString& operator=( const MyString& str ) {
          if (str.charArr == 0) {                               //(A)
              delete[] charArr;
              charArr = 0;
              length = 0;
              return *this;
          }
          if (this != &str) {                                   //(B)
              delete[] charArr;
              charArr = new char[str.length + 1];
              strcpy( charArr, str.charArr );
              length = str.length;
          }
          return *this;                                         //(C)
      }

where the first if block in line (A) is supposed to take care of assignments like

      MyString s1("hello");
      MyString s2;
      s1 = s2;

where the string s2 gets initialized by the no-arg constructor of MyString. Evidently, the assignment s1=s2 above should cause s1's members to be the same as if s1 were initialized by the no-arg constructor also.

The reason for the test

      if (this != &str) {         ....

in line (B) is to protect the assignment operator from self-assignments of the kind

      MyString s("hello");
      s = s;

Barring self-assignment, in the second if block starting in line (B) we first free up the memory occupied by the characters in the invoking string. The charArr pointer of the invoking MyString is then made to point to freshly appropriated memory whose size is determined by the size of the argument string. The characters from the argument string are then copied over into the fresh memory just appropriated for the string on the left side of the assignment operator.

Regarding the return statement in line (C), one could say that after we have set the length and the charArr members of the invoking MyString, no purpose is served by returning anything. But, as mentioned in Chapter 11 in the section on self-reference, the purpose served by returning *this is that we can now chain the assignment operator as in

      MyString s1("hello");
      MyString s2("othello");
      MyString s3("byebye");
      s1 = s2 = s3;

Since ‘=’ is right associative, the last statement would be interpreted by the compiler as

      s1 = ( s2 = s3 );

Therefore, whatever is returned by the assignment s2=s3 will be assigned to s1.
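Both properties, chaining and protection against self-assignment, can be exercised in a few lines. The sketch below assumes a cut-down class named AssignDemoString (a placeholder, not MyString itself) whose assignment operator follows the same order of steps as block (B) above; the helper chainAndSelfAssign() is likewise hypothetical:

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for MyString with just enough members to assign.
class AssignDemoString {
    char* charArr;
    int length;
public:
    AssignDemoString(const char* ch) {
        assert(ch != nullptr);
        length = static_cast<int>(std::strlen(ch));
        charArr = new char[length + 1];
        std::strcpy(charArr, ch);
    }
    AssignDemoString(const AssignDemoString& str) {
        length = str.length;
        charArr = new char[length + 1];
        std::strcpy(charArr, str.charArr);
    }
    AssignDemoString& operator=(const AssignDemoString& str) {
        if (this != &str) {                 // skip the work on self-assignment
            delete[] charArr;
            charArr = new char[str.length + 1];
            std::strcpy(charArr, str.charArr);
            length = str.length;
        }
        return *this;                       // makes s1 = s2 = s3 possible
    }
    ~AssignDemoString() { delete[] charArr; }
    const char* data() const { return charArr; }
};

// Chains two assignments, then assigns an object to itself through an alias.
bool chainAndSelfAssign() {
    AssignDemoString s1("hello"), s2("othello"), s3("byebye");
    s1 = s2 = s3;                           // parsed as s1 = (s2 = s3)
    AssignDemoString& alias = s3;
    s3 = alias;                             // self-assignment must leave s3 intact
    return std::strcmp(s1.data(), "byebye") == 0
        && std::strcmp(s2.data(), "byebye") == 0
        && std::strcmp(s3.data(), "byebye") == 0;
}
```

Without the test against &str, the self-assignment would delete the array and then try to copy from the memory it had just released.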

The reader is probably wondering whether it is really necessary for the header of the assignment operator function to be

      MyString& MyString::operator=( const MyString& str ) {    //(D)
          // see previous implementation
      }

More specifically, could we have defined the function without the const preceding the typename of the parameter? Also, how important is it for the parameter to be of reference type? Let's consider what happens when the header is

      MyString& MyString::operator=( MyString& str ) {          //(E)
          // same as before
      }

Let's say we invoke this operator function with the following call:

      MyString s1("hello");
      MyString s2("mello");
      s1 = s2;                                                  //(F)

For this case the header in line (E) would work. However, now you'll run into a problem if you try to assign a const string, as in

      MyString s3("jello");
      const MyString s4("cello");
      s3 = s4;                                                  //(G)

Since a reference to a non-const object cannot be bound to a const object, the compiler will refuse to initialize the parameter str in line (E) with the string object s4. A second problem with the header in (E) arises when the string to be assigned is the result of an operator action, as in

      MyString s1("hello");
      MyString s2("mello");
      MyString s3("jello");
      s3 = s1 + s2;                                             //(H)

Now if we use the assignment operator with the header as in line (E) above, the compiler will complain because it cannot bind the MyString& parameter in line (E) to the MyString object returned by the ‘+’ operator. That's because the ‘+’ operator for the MyString class returns a temporary MyString object whose address cannot be ascertained. You will recall from our discussion in Chapter 8 that an object reference, unless it is a reference to const, can only be initialized with an object whose address can be ascertained.
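The binding rules at work here do not depend on MyString. A minimal sketch, assuming C++11 and using a hypothetical Box type with illustrative function names, shows which arguments each kind of reference parameter accepts:

```cpp
#include <cassert>

// Hypothetical type used only to illustrate reference binding.
struct Box {
    int value;
};

Box makeBox(int v) { return Box{v}; }             // returns a temporary

int readConst(const Box& b) { return b.value; }   // accepts any Box
int readNonConst(Box& b)    { return b.value; }   // accepts only non-const, named objects

int bindingDemo() {
    Box plain{1};
    const Box fixed{2};

    int sum = 0;
    sum += readConst(plain);        // OK: non-const object
    sum += readConst(fixed);        // OK: const object
    sum += readConst(makeBox(3));   // OK: a temporary may bind to a const reference
    sum += readNonConst(plain);     // OK: non-const object

    // Neither of the following two calls compiles:
    // readNonConst(fixed);         // would discard the const qualifier, as in line (G)
    // readNonConst(makeBox(3));    // a temporary cannot bind to Box&, as in line (H)
    return sum;                     // 1 + 2 + 3 + 1
}
```

Declaring the parameter as a reference to const, as in line (D), is what lets a single operator= serve all three kinds of argument.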

THE ‘[]’ OPERATOR:

Let's now discuss how we may provide access to the individual characters in a MyString string. If we create a MyString object str by

      MyString str("hello");

we want to access each character in str by str[i], i = 0, 1, …. This we can do by providing MyString with the ‘[]’ operator. If we do not provide this operator, the compiler simply would not know how to interpret a construct such as str[i]. However, when we do provide such an operator, it is incumbent upon us to make sure that the index does not violate the subscript bounds for the array. In fact, it is this safety feature that makes a MyString string an attractive alternative to a C-style char* string. Here is an overload definition for the ‘[]’ operator, along with a definition for check(), the function for range-checking the index used for accessing the individual characters in a string:

      bool MyString::check(int i) const {                       //(A)
          return i >= 0 && i < length;      // valid indices are 0 through length-1
      }

      char MyString::operator[]( int i ) const {                //(B)
          if ( check(i) )                                       //(C)
              return charArr[ i ];                              //(D)
          else throw Err();                                     //(E)
      }

The subscript function operator[] in line (B) first makes sure that the index i does not violate any array bounds by invoking in line (C) the check() function defined in line (A). If there is not a violation, then the requested character is returned in line (D). Otherwise, an exception is thrown in line (E). This definition assumes that the class Err was previously defined for throwing exceptions.
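A caller would typically wrap an access in a try block. The sketch below is one way this might look, assuming C++11, with a cut-down class named CheckedString standing in for MyString. Two choices in it are assumptions of the sketch: Err is declared public so that a caller can name it in a catch clause (with a private Err, only catch(...) is available outside the class), and only the indices 0 through length-1 are treated as valid:

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for MyString with a bounds-checked subscript.
class CheckedString {
    char* charArr;
    int length;
public:
    class Err {};   // exception type, public here so callers can catch it by name

    CheckedString(const char* ch) {
        assert(ch != nullptr);
        length = static_cast<int>(std::strlen(ch));
        charArr = new char[length + 1];
        std::strcpy(charArr, ch);
    }
    CheckedString(const CheckedString&) = delete;             // copying not needed here
    CheckedString& operator=(const CheckedString&) = delete;
    ~CheckedString() { delete[] charArr; }

    bool check(int i) const { return i >= 0 && i < length; }

    char operator[](int i) const {
        if (check(i)) return charArr[i];
        throw Err();
    }
};

// Returns true if reading index i raises CheckedString::Err.
bool throwsAt(const CheckedString& s, int i) {
    try {
        (void)s[i];
        return false;
    } catch (const CheckedString::Err&) {
        return true;
    }
}
```

For the string "hello", throwsAt() reports no exception for indices 0 through 4 and an exception for index 5 or any negative index.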

THE WRITE FUNCTION:

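Now that we have a subscript operator for our MyString, what about a write function that would permit us to alter an individual character? Because the subscript operator defined above returns a char by value, it cannot itself be used to change a character stored in the string. This is how we can include this functionality in the MyString class: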

      void MyString::write( int k, char ch ) {
          if ( check(k) )
              charArr[k] = ch;
          else throw Err();
      }

The write() function makes sure that the index corresponding to the array element falls within the array. The check() function is the same as defined earlier for the overloading of the ‘[]’ operator.

THE ‘+’ OPERATOR:

This operator will be invoked when the compiler is processing statements like s3 = s1 + s2 in the following program segment:

      MyString s1("hello");
      MyString s2("there");
      MyString s3("hi");
      s3 = s1 + s2;

Here is a member-function overload definition for the ‘+’ operator:

      MyString operator+(const MyString str) const {
          int temp = length + str.length + 1;                   //(A)
          char* ptr = new char[temp];                           //(B)
          strcpy(ptr, charArr);                                 //(C)
          strcat(ptr, str.charArr);                             //(D)
          MyString s(ptr);                                      //(E)
          delete[] ptr;                                         //(F)
          return s;                                             //(G)
      }

The statement in line (A) calculates the size of the memory that is needed to hold the MyString resulting from the joining of the two operand strings. The extra ’1’ is for the null terminator. In line (B), an appropriate amount of memory to hold the new string is procured. In line (C), the character array of the invoking MyString is copied into the new memory, followed in line (D) by the copying over of the character array of the argument MyString. The strcat() function automatically places the terminating null at the end of the copying in (D). In line (E), a MyString object is formed from the char* string resulting from the step in line (D). In line (F), the memory occupied by the char* string is freed since it is no longer needed. Finally, in line (G), we return the new MyString.
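Steps (A) through (D) can be tried out on plain C strings, independently of the class. The helper below, joinCStrings(), is a hypothetical name introduced for this sketch (assuming C++11); it returns the joined buffer instead of wrapping it in a MyString:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper mirroring steps (A) through (D) on plain C strings.
// The caller owns the returned buffer and must release it with delete[].
char* joinCStrings(const char* left, const char* right) {
    assert(left != nullptr && right != nullptr);
    std::size_t total = std::strlen(left) + std::strlen(right) + 1;  // +1 for '\0'
    char* ptr = new char[total];
    std::strcpy(ptr, left);     // copies left together with its terminator
    std::strcat(ptr, right);    // overwrites that terminator, appends right and a new '\0'
    return ptr;
}
```

Calling joinCStrings("hello", "there") yields a buffer holding "hellothere", which is 10 characters long and occupies 11 bytes. Neither argument is modified, which is the behavior one expects of the ‘+’ operator.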

THE ‘+=’ OPERATOR:

This compound assignment operator will be invoked for calls as in

      MyString s1("hello");
      MyString s2("kitty");
      s1 += s2;                                         // hellokitty

An overload definition of the compound assignment operator ‘+=’ as a member function of the MyString class could be:

      MyString& operator+=(const MyString str) {
          *this = *this + str;
          return *this;
      }

It is important to realize that we could not have simplified the definition of this function to

      MyString& MyString::operator+=(MyString str) {
          return *this + str;                // WRONG
      }

because the ‘+’ operator by its very meaning, does not modify the first operand. In a compound assignment implied by s1 += s2, we want the first operand to get modified by the second operand.

Also note that the overload definition returns a reference to suppress copy on return. We obviously want to return the same object to which the argument is appended.

EQUALITY OPERATORS:

To test whether or not two MyString strings have the same content, meaning that they are made up of the same characters, we can use the following overload definitions for the equality operators ‘==’ and ‘!=’:

      bool operator==(const MyString str) const {
          return strcmp(charArr, str.charArr) == 0;
      }

      bool operator!=(const MyString str) const {
          return !(*this == str);
      }

RELATIONAL OPERATORS:

If we wish to know whether the character string in one MyString object is "greater" than the character string in another MyString object on the basis of lexicographic ordering implied by ASCII codes, we can use the following overload definition for the ‘>’ operator. The other relational operators are defined similarly.

      bool operator>(const MyString str) const {
          return strcmp(charArr, str.charArr) > 0;
      }

      bool operator<(const MyString str) const {
          return strcmp(charArr, str.charArr) < 0;
      }

      bool operator<=(const MyString str) const {
          return strcmp(charArr, str.charArr) <= 0;
      }

      bool operator>=(const MyString str) const {
          return strcmp(charArr, str.charArr) >= 0;
      }
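To see what an ordering based on character codes means in practice, here is a small sketch built directly on strcmp. The helper order() is a hypothetical name introduced for illustration (assuming C++11); it reduces the result of strcmp to -1, 0, or +1:

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: collapses strcmp's result to -1, 0, or +1.
int order(const char* a, const char* b) {
    assert(a != nullptr && b != nullptr);
    int r = std::strcmp(a, b);
    return (r > 0) - (r < 0);
}
```

With ASCII codes, order("abba", "green") is -1, order("jolly", "jello") is +1, and order("hi", "hi") is 0. Two consequences are worth keeping in mind: order("Hello", "hello") is -1 because uppercase letters have smaller codes than lowercase ones, and order("jello", "jellohello") is -1 because a string that is a proper prefix of another sorts first.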

OUTPUT and INPUT OPERATORS:

Even for a minimally functional MyString type, we would need to display the strings we create on a terminal or output them into a file, or read strings from a terminal or from a file. The following global overload definitions for the output and input operators do those jobs:

      ostream& operator<<( ostream& os, const MyString& str ) {
          if (str.charArr != 0)             // an empty MyString has no array to print
              os << str.charArr;
          return os;
      }

      istream& operator>>( istream& is, MyString& str ) {
          char* ptr = new char[100];                            //(A)
          is >> ptr;                                            //(B)
          str = MyString(ptr);                                  //(C)
          delete[] ptr;                     // array form, to match new char[100]
          return is;
      }

The implementation for the input operator assumes that the number of characters in an input string will not exceed 99.^[3] Also note that the system-supplied overload definition of the input operator for the char* argument in line (B) writes out a null terminator after the characters fetched from the input stream. Thus, the size of the MyString object constructed in line (C) corresponds to the string that is actually input and not to the size of the memory allocated in line (A).
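One way the 99-character limit could be lifted is to read the input one character at a time into a buffer that is doubled whenever it fills up. The following is only a sketch of that idea under stated assumptions (C++11; the function names readToken() and firstToken() are hypothetical, and the result is returned as a C-style string rather than stored in a MyString):

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstring>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: reads one whitespace-delimited token of any length.
// Returns a null-terminated buffer that the caller must release with delete[].
char* readToken(std::istream& is) {
    int capacity = 16;
    int count = 0;
    char* buf = new char[capacity];

    is >> std::ws;                                    // skip leading whitespace
    int c;
    while ((c = is.peek()) != std::istream::traits_type::eof()
           && !std::isspace(static_cast<unsigned char>(c))) {
        if (count + 1 == capacity) {                  // keep one slot for '\0'
            capacity *= 2;
            char* bigger = new char[capacity];
            std::memcpy(bigger, buf, static_cast<std::size_t>(count));
            delete[] buf;
            buf = bigger;
        }
        buf[count++] = static_cast<char>(is.get());
    }
    buf[count] = '\0';
    return buf;
}

// Usage: extract the first token from an in-memory line of text.
char* firstToken(const std::string& line) {
    std::istringstream in(line);
    return readToken(in);
}
```

An input operator built along these lines would pass the returned buffer to the MyString constructor, as in line (C), and then release it with delete[].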

Since the two overload definitions shown above are global, we need to also include the following "friend" declarations inside the definition of the MyString class:

      friend ostream& operator<<( ostream&, const MyString& );
      friend istream& operator>>( istream&, MyString& );

The following listing brings all of these overload definitions together in one place. It also includes the getSize() and getCharArray() functions that we will find useful in a later section.

 
      //MyString.cc

      #include <cstring>
      #include <vector>
      #include <iostream>
      using namespace std;

      class MyString {
          char* charArr;
          int length;
          class Err {};
      public:
          MyString() { charArr = 0; length = 0; }

          MyString(const char* ch) {
              length = strlen(ch);
              charArr = new char[length + 1];
              strcpy(charArr, ch);
          }

          MyString(const char ch) {
              length = 1;
              charArr = new char[2];
              *charArr = ch;
              *(charArr + 1) = '\0';
          }

          ~MyString() { delete[] charArr; }

          MyString(const MyString& str) {
              length = str.length;
              charArr = new char[length + 1];
              strcpy(charArr, str.charArr);
          }

          MyString& operator=(const MyString& str) {
              if (str.charArr == 0) {
                  delete[] charArr;
                  charArr = 0;
                  length = 0;
                  return *this;
              }
              if (this != &str) {
                  delete[] charArr;
                  charArr = new char[str.length + 1];
                  strcpy(charArr, str.charArr);
                  length = str.length;
              }
              return *this;
          }

          // valid indices are 0 through length-1
          bool check(int i) const {
              return i >= 0 && i < length;
          }

          char operator[](int i) const {
              if (check(i))
                  return charArr[i];
              else throw Err();
          }

          void write(int k, char ch) {
              if (check(k))
                  charArr[k] = ch;
              else throw Err();
          }

          MyString operator+(const MyString str) const {
              int temp = length + str.length + 1;
              char* ptr = new char[temp];
              strcpy(ptr, charArr);
              strcat(ptr, str.charArr);
              MyString s(ptr);
              delete[] ptr;
              return s;
          }

          MyString& operator+=(const MyString str) {
              *this = *this + str;
              return *this;
          }

          MyString& operator+=(const char ch) {
              *this = *this + MyString(ch);
              return *this;
          }

          bool operator==(const MyString str) const {
              return strcmp(charArr, str.charArr) == 0;
          }

          bool operator!=(const MyString str) const {
              return !(*this == str);
          }

          bool operator>(const MyString str) const {
              return strcmp(charArr, str.charArr) > 0;
          }

          bool operator<(const MyString str) const {
              return strcmp(charArr, str.charArr) < 0;
          }

          bool operator<=(const MyString str) const {
              return strcmp(charArr, str.charArr) <= 0;
          }

          bool operator>=(const MyString str) const {
              return strcmp(charArr, str.charArr) >= 0;
          }

          int getSize() const { return length; }

          int size() const { return length; }

          char* getCharArray() const { return charArr; }

          char* c_str() { return charArr; }

          int find(const char* substring) {
              char* p = strstr(charArr, substring);
              if (p == 0) return -1;               // substring not present
              return p - charArr;
          }

          int compare(const MyString& str) {
              char* p = getCharArray();
              char* q = str.getCharArray();
              if (p == 0 && q == 0) return 0;
              else if (p != 0 && q == 0) return 1;
              else if (p == 0 && q != 0) return -1;
              return strcmp(p, q);
          }

          friend ostream& operator<<(ostream&, const MyString&);
          friend istream& operator>>(istream&, MyString&);
      };

      // declared after the class so that MyString is a complete type here
      typedef vector<MyString>::iterator Iter;
      int split(Iter, int low, int high);
      void quicksort(Iter, int low, int high);

      ostream& operator<<(ostream& os, const MyString& str) {
          if (str.charArr != 0)                    // nothing to print for an empty MyString
              os << str.charArr;
          return os;
      }

      istream& operator>>(istream& is, MyString& str) {
          char* ptr = new char[100];
          is >> ptr;
          str = MyString(ptr);
          delete[] ptr;
          return is;
      }

      void sort(Iter first, Iter last) {
          quicksort(first, 0, last - first - 1);
      }

      void quicksort(Iter first, int low, int high) {
          int middle;
          if (low >= high) return;
          middle = split(first, low, high);
          quicksort(first, low, middle - 1);
          quicksort(first, middle + 1, high);
      }

      int split(Iter first, int low, int high) {
          MyString partition_str = *(first + low);
          for (;;) {
              // skip elements on the right that are not smaller than the pivot
              while (low < high && partition_str <= *(first + high))
                  high--;
              if (low >= high) break;
              *(first + low++) = *(first + high);
              // skip elements on the left that are not larger than the pivot
              while (low < high && *(first + low) <= partition_str)
                  low++;
              if (low >= high) break;
              *(first + high--) = *(first + low);
          }
          *(first + high) = partition_str;
          return high;
      }

      int main()
      {
          MyString s0;
          MyString s1("hello");
          cout << s1.getSize() << endl;        // 5
          cout << s1 << endl;                  // hello

          MyString s2 = s1;
          cout << s2.getSize() << endl;        // 5
          cout << s2 << endl;                  // hello

          s1.write(0, 'j');
          cout << s1 << endl;                  // jello

          MyString s3 = s1 + s2;
          cout << s3 << endl;                  // jellohello

          s3 += s3;
          cout << s3 << endl;                  // jellohellojellohello

          if ( s2.compare(s1) < 0 )
              cout << "s2 is \"less than\" s1"
                   << endl;                    // s2 is "less than" s1

          MyString s4 = "Hello";
          if (s1 == s4)                        // false here: s1 is "jello", s4 is "Hello"
              cout << "the operator == works" << endl;

          if (s3 > s1)                         // true here, so the message is printed
              cout << "the operator > works" << endl;

          MyString s5("yellow");
          s1 = s2 = s5;
          cout << s1 << endl;                  // yellow
          cout << s2 << endl;                  // yellow

          MyString s6;
          s1 = s6;
          cout << s1 << endl;                  // (empty line)

          MyString str[] = { "jello", "green", "jolly", "trolley", "abba" };
          int size = sizeof(str)/sizeof(str[0]);
          vector<MyString> vec(str, str + size);

          cout << "Initial list:";
          for (Iter p = vec.begin(); p != vec.end(); p++)
              cout << *p << " ";
                                    // jello green jolly trolley abba
          cout << endl;

          sort(vec.begin(), vec.end());

          cout << "Sorted list: ";
          for (Iter p = vec.begin(); p != vec.end(); p++)
              cout << *p << " ";
                                    // abba green jello jolly trolley
          cout << endl << endl;
          return 0;
      }

The code shown includes a quicksort() function that allows a vector of MyString strings to be sorted with the following call:

      vector<MyString> vec;      sort(vec.begin(), vec.end() );
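As a side note, any type that supplies a ‘<’ operator can also be handed to the standard library's std::sort algorithm from the algorithm header. The sketch below illustrates this with a hypothetical Word type that merely points at a string literal, so that no memory management gets in the way; Word and sortedWords() are names introduced only for this illustration, and C++11 is assumed:

```cpp
#include <algorithm>
#include <cassert>
#include <cstring>
#include <vector>

// Hypothetical lightweight stand-in for MyString: it only points at a
// string literal, so this sketch needs no constructors or destructor.
struct Word {
    const char* text;
    bool operator<(const Word& other) const {
        return std::strcmp(text, other.text) < 0;
    }
};

// Returns a sorted copy of the given words.
std::vector<Word> sortedWords(std::vector<Word> words) {
    std::sort(words.begin(), words.end());   // relies on Word::operator<
    return words;
}
```

Applied to the five words used in main() above, sortedWords() returns them in the order abba, green, jello, jolly, trolley, the same order produced by the hand-written quicksort().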

^[1]See Chapter 3 for how nested classes behave vis-à-vis enclosing classes.

^[2]See Chapter 4 for a brief review of C's string functions used in this section.

^[3]A more sophisticated overloading for the input operator will not have this limitation. It will use dynamic memory allocation to read in a string no matter how long.