5.6 C-Style Strings


C++ lets you use not only the C++ std::string class but also the older C-style strings. You may wonder why we would want to study a second type of string when the first one does just fine. The answer is that there are a lot of old C programs out there that have been converted to C++, and the use of C-style strings in them is quite common.

C-style strings are arrays of characters. The special character '\0' (NUL) is used to indicate the end of a string. For example:

    char name[4];

    int main()
    {
        name[0] = 'S';
        name[1] = 'a';
        name[2] = 'm';
        name[3] = '\0';
        return (0);
    }

This creates a character array four elements long. Note that we had to allocate one character for the end-of-string marker.

String constants consist of text enclosed in double quotes ("). You may have already noticed that we've used string constants extensively for output with the std::cout standard class. C++ does not allow one array to be assigned to another, so you can't write an assignment of the form:

 name = "Sam";    // Illegal 

Instead you must use the standard library function std::strcpy to copy the string constant into the variable. (std::strcpy copies the whole string, including the end-of-string character.) The definition of this function is in the header file cstring (note the lack of .h on the end).

To initialize the variable name to "Sam" you would write:

    #include <cstring>

    char name[4];

    int main()
    {
        std::strcpy(name, "Sam");    // Legal
        return (0);
    }

C++ uses variable-length strings. For example, the declaration:

    #include <cstring>

    char a_string[50];

    int main()
    {
        std::strcpy(a_string, "Sam");
        return (0);
    }

creates an array (a_string) that can contain up to 50 characters. The size of the array is 50, but the length of the string is 3. Any string up to 49 characters long can be stored in a_string. (One character is reserved for the NUL that indicates the end of the string.)

There are several standard routines that work on string variables. These are listed in Table 5-1.

Table 5-1. String functions

Function                                  Description

std::strcpy(string1, string2)             Copies string2 into string1

std::strncpy(string1, string2, length)    Copies at most length characters of string2 into
                                          string1 (if string2 has length or more characters,
                                          no end-of-string character is put on the result)

std::strcat(string1, string2)             Concatenates string2 onto the end of string1

std::strncat(string1, string2, length)    Concatenates at most length characters of string2
                                          onto the end of string1 (an end-of-string character
                                          is always put on the result, in addition to the
                                          length characters)

length = std::strlen(string)              Gets the length of a string (not counting the
                                          end-of-string character)

std::strcmp(string1, string2)             Returns 0 if string1 equals string2;
                                          a negative number if string1 < string2;
                                          a positive number if string1 > string2

Example 5-10 illustrates how std::strcpy is used.

Example 5-10. str/sam.cpp
    #include <iostream>
    #include <cstring>

    char name[30];  // First name of someone

    int main()
    {
        std::strcpy(name, "Sam");
        std::cout << "The name is " << name << '\n';
        return (0);
    }

Example 5-11 takes a first name and a last name and combines the two strings. The program works by initializing the variable first to the first name (Steve). The last name (Oualline) is put in the variable last. To construct the full name, the first name is copied into full_name. Then std::strcat is used to add a space. We call std::strcat again to tack on the last name.

The dimensions of the string variables are 100 because we know that no one we are going to encounter has a full name (first and last name combined) more than 98 characters long. (One character is reserved for the space and one for the NUL at the end of the string.) If we get a name more than 98 characters long, our program will overflow the array, corrupting memory.

Example 5-11. name2/name2.cpp
    #include <cstring>
    #include <iostream>

    char first[100];        // first name
    char last[100];         // last name
    char full_name[100];    // full version of first and last name

    int main()
    {
        std::strcpy(first, "Steve");     // Initialize first name
        std::strcpy(last, "Oualline");   // Initialize last name

        std::strcpy(full_name, first);   // full = "Steve"
        // Note: strcat not strcpy
        std::strcat(full_name, " ");     // full = "Steve "
        std::strcat(full_name, last);    // full = "Steve Oualline"

        std::cout << "The full name is " << full_name << '\n';
        return (0);
    }

The output of this program is:

 The full name is Steve Oualline 

C++ has a special shorthand for initializing strings, using double quotes (") to simplify the initialization. The earlier initialization of name to "Sam" could have been written:

 char name[] = "Sam"; 

The dimension of name is 4, because C++ allocates a place for the '\0' character that ends the string.

C++ uses variable-length strings. For example, the declaration:

 char long_name[50] = "Sam"; 

creates an array (long_name) that can contain up to 50 characters. The size of the array is 50, and the length of the string is 3. Any string up to 49 characters long can be stored in long_name. (One character is reserved for the NUL that indicates the end of the string.)

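Our statement explicitly gave values for only 4 of the 50 elements in long_name (the three letters and the '\0'). The other 46 elements are also set to '\0', because C++ fills the rest of an array with zeros when the string constant used to initialize it is shorter than the array.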

5.6.1 Safety and C Strings

The problem with strcpy is that it doesn't check to see if the string being changed is big enough to hold the data being copied into it. For example, the following will overwrite random memory:

    char name[5];
    //...
    std::strcpy(name, "Oualline");  // Corrupts memory

There are a number of ways around this problem:

  • Use C++ strings. They don't have this problem.

  • Check the size before you copy:

        assert(sizeof(name) >= sizeof("Oualline"));
        std::strcpy(name, "Oualline");

    Although this method prevents us from corrupting memory, it does cause the program to abort when the string does not fit. (The assert macro is defined in the header file cassert.)

  • Use the strncpy function to limit the number of characters copied. For example:

        std::strncpy(name, "Oualline", 4);
        name[4] = '\0';

    In this example, only the first four characters of "Oualline" (Oual) are copied into name. Because strncpy stops at the limit without adding an end-of-string character, we put the null character in ourselves, for a total of 5 characters, the size of name.

    A more reliable way of doing the same thing is to use the sizeof operator:

        std::strncpy(name, "Oualline", sizeof(name)-1);
        name[sizeof(name)-1] = '\0';

    In this case we've had to add an adjustment of -1 to leave room for the null at the end of the string.

    This method does not corrupt memory, but strings that are too long will be truncated.

The strcat function has a similar problem. Give it too much data and it will overflow memory. One way to be safe is to put in assert statements:

    char name[10];

    assert(sizeof(name) >= sizeof("Steve"));
    std::strcpy(name, "Steve");

    // Because we're doing a strcat we have to take into account
    // the number of characters already in name
    assert(sizeof(name) >= (std::strlen(name) + sizeof("Oualline")));
    std::strcat(name, "Oualline");

With a 10-character array, the second assert fails and the program aborts instead of overflowing name.

The other way of doing things safely is to use strncat. Unlike strncpy, strncat always puts the end-of-string null on the end of the result. The character limit counts only the characters added, so it must leave room for that null. Let's take a look at how to do this. First we set up the program, ending the string ourselves in case strncpy has to truncate:

    char name[10];
    std::strncpy(name, "Steve", sizeof(name)-1);
    name[sizeof(name)-1] = '\0';

Next we add the last name, with a proper character limit:

    std::strncat(name, "Oualline", sizeof(name)-std::strlen(name)-1);

The limit is the space left in the array: the size of the array, minus the number of characters already in it, minus one for the end-of-string character that strncat adds. If the last name is shorter than the space available, strncat copies all of it. If it is longer, strncat copies only as many characters as the limit allows. Either way the result ends with an end-of-string character.

Our complete code fragment looks like this:

    char name[10];
    std::strncpy(name, "Steve", sizeof(name)-1);
    name[sizeof(name)-1] = '\0';
    std::strncat(name, "Oualline", sizeof(name)-std::strlen(name)-1);

You may notice that there is a slight problem with the code presented here. It takes the first name and adds the last name to it. It does not put a space between the two. So the resulting string is "SteveOualline" instead of "Steve Oualline" or, more accurately, "SteveOual" because of space limitations.

There are a lot of rules concerning the use of C-style strings. Not following the rules can result in programs that crash or have security problems. Unfortunately, too many programmers don't follow the rules.

One nice thing about C++ strings is that the number of rules you have to follow to use them goes way down and the functionality goes way up. But there's still a lot of C code that has been converted to C++. As a result, you'll still see a lot of C-style strings.

5.6.2 Reading C-Style Strings

Reading a C-style string is accomplished in much the same way as it is with the C++ string class, through the use of a getline function. For C-style strings, it is the getline member function of std::cin:

    char name[50];
    // ....
    std::cin.getline(name, sizeof(name));

A new parameter has been introduced: sizeof(name). Because C-style strings have a maximum length, you must tell the getline function the size of the string you are reading. That way it won't get too many characters and overflow your array.

5.6.3 Converting Between C-Style and C++ Strings

To convert a C++ string to a C-style string, use the c_str() member function. For example:

    char c_style[100];
    std::string a_string("Something");
    ....
    std::strcpy(c_style, a_string.c_str());

Conversion from C-style to C++-style is normally done automatically. For example:

 a_string = c_style; 

or

 a_string = "C-style string constant"; 

However, sometimes you wish to make the conversion more explicit. This is done through a type change operator called a cast. The C++ operator static_cast converts one type to another. The general form of this construct is:

    static_cast<new-type>(expression)

For example:

 a_string = static_cast<std::string>(c_style); 

There are actually four flavors of C++-style casts: static_cast, const_cast, dynamic_cast, and reinterpret_cast. The other three are discussed later in the book.

5.6.4 The Differences Between C++ and C-Style Strings

C++ style strings are easier to use and are designed to prevent problems. For example, the size of C-style strings is limited by the size of the array you declare. There is no size limit when you use a C++ std::string (other than the amount of storage in your computer). That's because the C++ string automatically manages the storage for itself.

Size is a big problem in C-style strings. The std::strcpy and std::strcat functions do not check the size of the strings they are working with. This means that it is possible to copy a long string into a short variable and corrupt memory. It's next to impossible to corrupt memory using C++ strings because size checking and memory allocation are built into the class.

There is overhead associated with the C++ std::string class, however. Using it is not as fast as using C-style strings, but for almost all the programs you will probably write, the speed difference will be negligible. And since the risk associated with using C-style strings is significant, it's better to use the C++ std::string class.



Practical C++ Programming
Practical C Programming, 3rd Edition
ISBN: 1565923065
EAN: 2147483647
Year: 2003
Pages: 364
