Character Strings: An Introduction


A character string is a series of one or more characters. Here is an example of a string:

 "Zing went the strings of my heart!" 

The double quotation marks are not part of the string. They inform the compiler that they enclose a string, just as single quotation marks identify a character.

Type char Arrays and the Null Character

C has no special variable type for strings. Instead, strings are stored in an array of type char. Characters in a string are stored in adjacent memory cells, one character per cell, and an array consists of adjacent memory locations, so placing a string in an array is quite natural (see Figure 4.1).

Figure 4.1. A string is an array.
graphics/04fig01.jpg

Note that Figure 4.1 shows the character \0 in the last array position. This is the null character, and C uses it to mark the end of a string. The null character is not the digit zero; it is the nonprinting character whose ASCII code value (or equivalent) is 0. Strings in C are always stored with this terminating null character. The presence of the null character means that the array must have at least one more cell than the number of characters to be stored.

Now just what is an array? You can think of an array as several memory cells in a row. If you prefer more formal and exact language, an array is an ordered sequence of data elements of one type. This example creates an array of 40 memory cells, or elements, each of which can store one char-type value, by using this declaration:

 char name[40]; 

The brackets after name identify it as an array. The 40 within the brackets indicates the number of elements in the array. The char identifies the type of each element (see Figure 4.2).

Figure 4.2. Declaring an array name of type char.
graphics/04fig02.jpg

Using a character string is beginning to sound complicated! You have to create an array, place the characters of a string into an array one by one, and remember to add a \0 at the end. Fortunately, the computer can take care of most of the details itself.

Using Strings

Try the program in Listing 4.2 to see how easy it really is to use strings.

Listing 4.2 The praise1.c program.
 /* praise1.c -- uses an assortment of strings */
 #include <stdio.h>
 #define PRAISE "My sakes, that's a grand name!"

 int main(void)
 {
     char name[40];

     printf("What's your name?\n");
     scanf("%s", name);
     printf("Hello, %s. %s\n", name, PRAISE);
     return 0;
 }

The %s tells printf() to print a string. The %s appears twice because the program prints two strings: the one stored in the name array and the one represented by PRAISE. Running praise1.c should produce an output similar to this:

 What's your name?
 Snippy Coredump
 Hello, Snippy. My sakes, that's a grand name!

You do not have to put the null character into the array name yourself. That task is done for you by scanf() when it reads the input. Nor do you include a null character in the character string constant PRAISE. We'll explain the #define statement soon; for now, simply note that the double quotation marks that enclose the text following PRAISE identify the text as a string. The compiler takes care of putting in the null character.

Note (and this is important) that scanf() just reads Snippy Coredump's first name. After scanf() starts to read input, it stops reading at the first whitespace (blank, tab, or newline) it encounters. Therefore, it stops scanning for name when it reaches the blank between Snippy and Coredump. In general, scanf() is used with %s to read only a single word, not a whole phrase, as a string. C has other input-reading functions, such as fgets(), for handling general strings. (The older gets() function does not check the size of the array and was removed from the language in the C11 standard.) Later chapters will explore string functions more fully.

Strings Versus Characters

The string constant "x" is not the same as the character constant 'x'. One difference is that 'x' is a basic type (char), but "x" is a derived type, an array of char. A second difference is that "x" really consists of two characters, 'x' and '\0', the null character (see Figure 4.3).

Figure 4.3. The character 'x' and the string "x".
graphics/04fig03.jpg

The strlen() Function

The previous chapter unleashed the sizeof operator, which gives the size of things in bytes. The strlen() function gives the length of a string in characters. Because it takes 1 byte to hold 1 character, you might suppose that both would give the same result when applied to a string, but they don't. Add a few lines to the example, as shown in Listing 4.3, and see why.

Listing 4.3 The praise2.c program.
 /* praise2.c */
 #include <stdio.h>
 #include <string.h>      /* provides strlen() prototype */
 #define PRAISE "My sakes, that's a grand name!"

 int main(void)
 {
     char name[40];

     printf("What's your name?\n");
     scanf("%s", name);
     printf("Hello, %s. %s\n", name, PRAISE);
     /* strlen() and sizeof yield type size_t; %zu (C99) matches it */
     printf("Your name of %zu letters occupies %zu memory cells.\n",
            strlen(name), sizeof name);
     printf("The phrase of praise has %zu letters ",
            strlen(PRAISE));
     printf("and occupies %zu memory cells.\n", sizeof PRAISE);
     return 0;
 }

If you are using a pre-ANSI C compiler, you might have to remove the following line:

 #include <string.h> 

The string.h file contains function prototypes for several string-related functions. Although not necessary for this particular example, this addition makes the program more harmonious with the ANSI spirit. I'll discuss this file in Chapter 12, "File Input/Output." (By the way, some pre-ANSI UNIX systems use strings.h instead of string.h to contain declarations for string functions.)

More generally, ANSI C divides the C function library into families of related functions and provides a header file for each family. For example, printf() and scanf() belong to a family of standard input and output functions and use the stdio.h header file. The strlen() function joins several other string-related functions, such as functions to copy strings and to search through strings, in a family served by the string.h header.

Notice that Listing 4.3 uses two methods to handle long printf() statements. The first method spreads one printf() statement over two lines. (You can break a line between arguments but not in the middle of a string; for example, not between the quotation marks.) The second method uses two printf() statements to print just one line. The newline character ( \n ) appears only in the second statement. Running the program could produce this interchange:

 What's your name?
 Tuffy
 Hello, Tuffy. My sakes, that's a grand name!
 Your name of 5 letters occupies 40 memory cells.
 The phrase of praise has 30 letters and occupies 31 memory cells.

See what happens? The array name has 40 memory cells, and that is what the sizeof operator reports. Only the first five cells are needed to hold Tuffy, however, and that is what strlen() reports. The sixth cell in the array name contains the null character, and its presence tells strlen() when to stop counting. Figure 4.4 illustrates this concept.

Figure 4.4. The strlen() function knows when to stop.
graphics/04fig04.jpg

When you get to PRAISE, you find that strlen() again gives you the exact number of characters (including spaces and punctuation) in the string. The sizeof operator gives you a number one larger because it also counts the invisible null character used to end the string. You didn't tell the computer how much memory to set aside to store the phrase. It had to count the number of characters between the double quotes itself.

One other point: The preceding chapter used sizeof with parentheses, but this chapter doesn't. Whether you use parentheses depends on whether you want the size of a type or the size of a particular quantity. Parentheses are required for types, but are optional for particular quantities. That is, you would use sizeof(char) or sizeof(float), but can use sizeof name or sizeof 6.28.

The last example used strlen() and sizeof for the rather trivial purpose of satisfying a user's potential curiosity. Actually, however, strlen() and sizeof are important programming tools. For example, strlen() is useful in all sorts of character-string programs, as you'll see in Chapter 11, "Character Strings and String Functions."

Let's move on to the #define statement.



C Primer Plus (5th Edition)
ISBN: 0672326965
EAN: 2147483647
Year: 2000
Pages: 314
Authors: Stephen Prata
