4.2 SOME COMMON SHORTCOMINGS OF C-STYLE STRINGS


C-style strings can be painful to use, especially after you have seen the more modern representations of strings in other languages. For starters, when invoking some of the most commonly used string library functions in C, such as strcpy() and strcat(), you have to ensure that sufficient memory is allocated for the output string. This requirement, seemingly natural to those who do most of their programming in C, appears onerous after you have experienced the convenience of the modern string types.

Consider this sample code for the string type from the C++ Standard Library:

     string str1 = "hi";
     string str2 = "there";
     string str3;
     str3 = str1 + str2;

We are joining the strings str1 and str2 together and copying the resulting string into the string object str3. Using the operator + for joining two strings together seems very natural. More particularly, note that we do not worry about whether or not we have allocated sufficient memory for the new longer string. The system automatically ensures that the string object str3 has sufficient memory available to it for storing the new string, regardless of its length.
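The automatic memory management described above can be sketched as follows. The helper name join is an assumption made for illustration; it is not part of the Standard Library or of the original example.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (name chosen for illustration only):
// joins two strings with operator+ and returns the result.
std::string join(const std::string& a, const std::string& b) {
    // operator+ builds a new string large enough for both operands;
    // no explicit allocation or size arithmetic is needed here.
    std::string result = a + b;
    assert(result.size() == a.size() + b.size());
    return result;
}
```

A call such as join("hi", "there") yields "hithere", and the same call works unchanged for operands of any length, because the string object manages its own storage.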

Now compare the above code fragment with the following fragment that tries to do the same thing but with C-style strings using commonly used functions for string processing in C:

     char* str1 = "hi";
     char* str2 = "there";
     char* str3 = (char*) malloc( strlen( str1 ) + strlen( str2 ) + 1 );
     strcpy( str3, str1 );
     strcat( str3, str2 );

The syntax here is definitely more tortured. A visual examination of the code, if too hasty, can be confusing with regard to the purpose of the code. You have to remind yourself about the roles of the functions strcpy and strcat to comprehend what's going on. You also have to remember to allocate memory for str3—forgetting to do so is not as uncommon as one might like to think. What's worse, for proper memory allocation for str3 you have to remember to add 1 for the null terminator to the byte count obtained by adding the values returned by strlen for the strings str1 and str2. (Just imagine the disastrous consequences if you should forget!)
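One way to keep the size arithmetic in a single place is to wrap it in a helper function. The following is a minimal sketch under stated assumptions: the name concat_cstr and its contract (the caller frees the result) are hypothetical, and a failed allocation is reported by returning a null pointer.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Hypothetical helper (not from the text): returns a newly allocated
// concatenation of a and b, or nullptr if the allocation fails.
// The caller owns the result and must release it with std::free().
char* concat_cstr(const char* a, const char* b) {
    assert(a != nullptr && b != nullptr);
    const std::size_t len_a = std::strlen(a);
    const std::size_t len_b = std::strlen(b);
    // The +1 reserves room for the null terminator.
    char* out = static_cast<char*>(std::malloc(len_a + len_b + 1));
    if (out == nullptr)
        return nullptr;
    std::memcpy(out, a, len_a);              // copy a without its terminator
    std::memcpy(out + len_a, b, len_b + 1);  // copy b including its terminator
    return out;
}
```

Even with such a helper, the caller still has to check for a null result and remember to free the memory, which is exactly the kind of bookkeeping the C++ string type takes over.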

For another example of the low-level tedium involved and the potential for introducing bugs when using C-style strings, consider the following function:

     void strip( char* q ) {
         char* p = q + strlen( q ) - 1;          //(A)
         while ( p >= q && *p == ' ' )           //(B)
             *p-- = '\0';                        //(C)
     }

which could be used to strip off blank space at the trailing end of a string. So in a call such as

     char* str = (char*) malloc( 11 );     // 5 letters + 5 blanks + 1 null terminator
     strcpy( str, "hello     " );
     strip( str );

the function strip would erase the five blank space characters after "hello" in the string str. Going back to the definition of strip, in line (A) we first set the local pointer p to point to the last character in the string. In line (B), we dereference this pointer to make sure that a blank space is stored there and that we have not yet traversed all the way back to the beginning of the string. If both these conditions are satisfied, in line (C) we dereference the pointer again, overwriting the character it points to with the null character, and subsequently decrement the pointer.[4] If someone were to write the implementation code for strip in a hurry, it is not inconceivable that they'd write it in the following form:

     void strip( char* q ) {
         char* p = q + strlen( q ) - 1;
         while ( *p == ' ' )                     //(D)
             *p-- = '\0';
     }

where in line (D) we have forgotten to make sure that the local pointer p does not get decremented to a value before the start of the argument string. While this program would compile fine and would probably also give correct results much of the time, it could also exhibit unpredictable behavior: for a string that is empty or consists entirely of blanks, the loop reads, and possibly overwrites, memory in front of the string. In programs such as this, one could also forget to dereference a string pointer, resulting in programs that compile but crash when run.
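A variant that avoids the out-of-bounds hazard is to work with an index rather than a pointer that can step in front of the string. The sketch below is one possible formulation, not the author's; the names strip_trailing_cstr and strip_trailing_str are hypothetical. The second function shows the same operation on a C++ string, using the Standard Library member functions find_last_not_of and erase.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical variant of strip for C-style strings: removes trailing
// blanks in place. The index n never drops below zero, so nothing
// in front of the string is ever read or written.
void strip_trailing_cstr(char* q) {
    assert(q != nullptr);
    std::size_t n = std::strlen(q);
    while (n > 0 && q[n - 1] == ' ')
        q[--n] = '\0';
}

// Hypothetical counterpart for std::string.
void strip_trailing_str(std::string& s) {
    const std::string::size_type last = s.find_last_not_of(' ');
    if (last == std::string::npos)
        s.clear();            // empty or all blanks: nothing to keep
    else
        s.erase(last + 1);    // drop everything after the last non-blank
}
```

Both functions treat the empty string and the all-blank string as ordinary cases, which are precisely the inputs on which the hurried version of strip misbehaves.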

[4]Recall from C programming that the postfix decrement operator, '--', has a higher precedence than the indirection operator '*'. So the expression *p-- in line (C) is parsed as *(p--). But because the decrement operator is postfix, the expression p-- evaluates to the original value of p. Therefore, what gets dereferenced is the location that p pointed to before the decrement. The decrement itself is a side effect of the postfix operator, so p points one character earlier once the statement has completed.
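The parsing rule in this footnote can be checked with a small experiment. The function name store_then_step_back and its parameters are assumptions made for illustration only.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: applies `*p-- = value` with p pointing at
// base[index], then reports where p ends up relative to base.
std::ptrdiff_t store_then_step_back(char* base, std::size_t index, char value) {
    // index > 0 keeps p inside the buffer after the decrement.
    assert(base != nullptr && index > 0);
    char* p = base + index;
    *p-- = value;     // parsed as *(p--) = value: store at the old p, then step back
    return p - base;  // one less than index
}
```

Called on a buffer holding "abc" with index 2, the function overwrites the character at position 2 and returns 1, showing that the store went through the old value of the pointer and that only the pointer, not the stored character, was decremented.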




Programming with Objects: A Comparative Presentation of Object Oriented Programming with C++ and Java
ISBN: 0471268526
EAN: 2147483647
Year: 2005
Pages: 273
Authors: Avinash Kak
