Exploring and Exploiting printf() and scanf()


The functions printf() and scanf() enable you to communicate with a program. They are called input/output functions, or I/O functions for short. They are not the only I/O functions you can use with C, but they are the most versatile. Historically, these functions, like all other functions in the C library, were not part of the definition of C. C originally left the implementation of I/O up to the compiler writers; this made it possible to better match I/O to specific machines. In the interests of compatibility, various implementations all came with versions of scanf() and printf(). However, there were occasional discrepancies between implementations. The ANSI C standard describes standard versions of these functions, and we'll follow that standard. The discrepancies should disappear as the standard is implemented.

Although printf() is an output function and scanf() is an input function, both work much the same, each using a control string and a list of arguments. We will show you how these work, first with printf() and then with scanf().

The printf() Function

The instructions you give printf() when you ask it to print a variable depend on the variable type. For instance, we have used the %d notation when printing an integer and the %c notation when printing a character. These notations are called conversion specifications because they specify how the data is to be converted into displayable form. We'll list the conversion specifications that the ANSI C standard provides for printf(), and then show how to use the more common ones. Table 4.3 presents the conversion specifiers and the type of output they cause to be printed.

Table  4.3. Conversion specifiers and the resulting printed output.
Conversion Specification Output
%c Single character
%d Signed decimal integer
%e Floating-point number, e-notation
%E Floating-point number, E-notation
%f Floating-point number, decimal notation
%g Use %f or %e , depending on value. The %e style is used if the exponent is less than -4 or greater than or equal to the precision.
%G Use %f or %E , depending on value. The %E style is used if the exponent is less than -4 or greater than or equal to the precision.
%i Signed decimal integer
%o Unsigned octal integer
%p A pointer
%s Character string
%u Unsigned decimal integer
%x Unsigned hexadecimal integer, using hex digits 0-f
%X Unsigned hexadecimal integer, using hex digits 0-F
%% Print a percent sign

Note

The conversion specifiers %E , %G , %i , %p , and %X did not appear in the first edition of Kernighan and Ritchie, and they are not yet commonly found in all non-ANSI implementations.


Using printf()

Listing 4.6 contains a program that uses some of the conversion-specification examples I discuss.

Listing 4.6 The printout.c program.
 /* printout.c -- uses conversion specifiers */
 #include <stdio.h>
 #define PI 3.141593
 int main(void)
 {
   int number = 5;
   float ouzo = 13.5;
   int cost = 3100;
   printf("The %d women drank %f glasses of ouzo.\n", number,
          ouzo);
   printf("The value of pi is %f.\n", PI);
   printf("Farewell! thou art too dear for my possessing,\n");
   printf("%c%d\n", '$', 2 * cost);
   return 0;
 }

The output, of course, is this:

 The 5 women drank 13.500000 glasses of ouzo.
 The value of pi is 3.141593.
 Farewell! thou art too dear for my possessing,
 $6200

This is the format for using printf() :

 printf(control-string, item1, item2,...); 

Item1 , item2 , and so on, are the items to be printed. They can be variables or constants, or even expressions that are evaluated before the value is printed. control-string is a character string describing how the items are to be printed. As mentioned in Chapter 3, "Data and C," the control string should contain a conversion specifier for each item to be printed. For example, consider this statement:

 printf("The %d women drank %f glasses of ouzo.\n", number, ouzo); 

The control string is the phrase in double quotes. It contains two conversion specifiers corresponding to number and ouzo, the two items to be displayed. Figure 4.6 shows another example of a printf() statement.

Figure 4.6. Arguments for printf().

Here is another line from the example:

 printf("The value of pi is %f.\n", PI); 

This time, the list of items has just one member: the symbolic constant PI.

As you can see in Figure 4.7, a control string contains two distinct forms of information:

  • Characters that are actually printed.

  • Conversion specifications.

Figure 4.7. Anatomy of a control string.

Don't forget to use one conversion specification for each item in the list following a control string. Woe unto you should you forget this basic requirement! Don't do this:

 printf("The score was Squids %d, Slugs %d.\n", score1); 

Here, there is no value for the second %d. The result of this faux pas depends on your system, but at best you will get nonsense.

If you want to print only a phrase, you don't need any conversion specifications. If you just want to print data, you can dispense with the running commentary. Each of the following statements from Listing 4.6 is quite acceptable:

 printf("Farewell! thou art too dear for my possessing,\n");
 printf("%c%d\n", '$', 2 * cost);

In the second statement, note that the first item on the print list was a character constant rather than a variable and that the second item is a multiplication. This illustrates that printf() uses values, be they variables, constants, or expressions.

Because the printf() function uses the % symbol to identify the conversion specifications, there is a slight problem if you want to print the % sign itself. If you simply use a lone % sign, printf() assumes you have bungled a conversion specification. The way out is simple. Just use two % symbols.

 pc = 2*6;
 printf("Only %d%% of Sally's gribbles were edible.\n", pc);

The following output would result:

 Only 12% of Sally's gribbles were edible. 

Conversion Specification Modifiers for printf()

You can modify a basic conversion specification by inserting modifiers between the % and the defining conversion character. Tables 4.4 and 4.5 list the characters you can place there legally. If you use more than one modifier, they should be in the same order as they appear in Table 4.4. Not all combinations are possible. The tables reflect the ANSI C standard; your implementation may not yet support all the options shown here. The C9X committee proposes adding an ll modifier to indicate long long values and an hh modifier to indicate that an integer value is to be displayed as signed char or unsigned char , depending on whether the original value is signed or unsigned.

Table  4.4. The printf() modifiers.
Modifier Meaning
flag The five flags (-, +, space, #, and 0) are described in Table 4.5. Zero or more flags may be present. Example: "%-10d"
digit(s) The minimum field width. A wider field will be used if the printed number or string won't fit in the field. Example: %4d
.digit(s) Precision. For %e , %E , and %f conversions, the number of digits to be printed to the right of the decimal. For %g and %G conversions, the maximum number of significant digits. For %s conversions, the maximum number of characters to be printed. For integer conversions, the minimum number of digits to appear; leading zeros are used if necessary to meet this minimum. Using only . implies a following zero, so %.f is the same as %.0f . Example: "%5.2f" prints a float in a field five characters wide with two digits after the decimal point.
h Used with an integer conversion to indicate a short int or unsigned short int value. Examples: "%hu" , "%hx" , and "%6.4hd"
l Used with an integer conversion to indicate a long int or unsigned long int . Examples: "%ld" and "%8lu"
L Used with a floating-point conversion to indicate a long double value. Examples: "%Lf" and "%10.4Le"

Conversion of float Arguments

There are conversion specifiers to print the floating types double and long double. However, there is no specifier for float. The reason is that, under K&R C, float values were automatically converted to type double before being used in an expression or as an argument. ANSI C, in general, does not automatically convert float to double. To protect the enormous number of existing programs that assume float arguments are converted to double, however, all float arguments to printf() (as well as to any other C function not using an explicit prototype) are still automatically converted to double. Therefore, under either K&R C or ANSI C, no special conversion specifier is needed for displaying type float.

Table  4.5. The printf() flags.
Flag Meaning
- The item is left-justified; that is, it is printed beginning at the left of the field. Example: "%-20s"
+ Signed values are displayed with a plus sign, if positive, and with a minus sign, if negative. Example: "%+6.2f"
space Signed values are displayed with a leading space (but no sign) if positive and with a minus sign if negative. A + flag overrides a space. Example: "% 6.2f"
# Use an alternative form for the conversion specification. Produces an initial 0 for the %o form and an initial 0x or 0X for the %x and %X forms. For all floating-point forms, # guarantees that a decimal-point character is printed, even if no digits follow. For %g and %G forms, it prevents trailing zeros from being removed. Examples: "%#o", "%#8.0f", and "%+#10.3E"
0 For numeric forms, pad the field width with leading zeros instead of with spaces. This flag is ignored if a - flag is present or if, for an integer form, a precision is specified. Examples: "%010d" and "%08.3f"

Examples

Let's put these modifiers to work, beginning with looking at the effect of the field width modifier on printing an integer. Consider the program in Listing 4.7.

Listing 4.7 The width.c program.
 /* width.c -- field widths */
 #include <stdio.h>
 #define PAGES 732
 int main(void)
 {
   printf("*%d*\n", PAGES);
   printf("*%2d*\n", PAGES);
   printf("*%10d*\n", PAGES);
   printf("*%-10d*\n", PAGES);
   return 0;
 }

Listing 4.7 prints the same quantity four times, but using four different conversion specifications. I used an asterisk (*) to show you where each field begins and ends. The output looks like this:

 *732*
 *732*
 *       732*
 *732       *

The first conversion specification is %d with no modifiers. It produces a field with the same width as the integer being printed. This is the default option; that is, what's printed if you don't give further instructions. The second conversion specification is %2d. This should produce a field width of 2, but because the integer is three digits long, the field is expanded automatically to fit the number. The next conversion specification is %10d. This produces a field 10 spaces wide, and, indeed, there are seven blanks and three digits between the asterisks, with the number tucked into the right end of the field. The final specification is %-10d. It also produces a field 10 spaces wide, and the - puts the number at the left end, just as advertised. After you get used to it, this system is easy to use and gives you nice control over the appearance of your output. Try altering the value for PAGES to see how different numbers of digits are printed.

Now look at some floating-point formats. Enter, compile, and run the program in Listing 4.8.

Listing 4.8 The floats.c program.
 /* floats.c -- some floating-point combinations */
 #include <stdio.h>
 int main(void)
 {
   const double RENT = 2345.67;  /* const-style constant */
   printf("*%f*\n", RENT);
   printf("*%e*\n", RENT);
   printf("*%4.2f*\n", RENT);
   printf("*%3.1f*\n", RENT);
   printf("*%10.3f*\n", RENT);
   printf("*%10.3e*\n", RENT);
   printf("*%+4.2f*\n", RENT);
   printf("*%010.2f*\n", RENT);
   return 0;
 }

This time, the program uses the keyword const to create a symbolic constant. Here is the output:

 *2345.670000*
 *2.345670e+003*
 *2345.67*
 *2345.7*
 *  2345.670*
 *2.346e+003*
 *+2345.67*
 *0002345.67*

Again, begin with the default version, %f . In this case, there are two defaults: the field width and the number of digits to the right of the decimal. The second default is six digits, and the field width is whatever it takes to hold the number.

Next is the default for %e . It prints one digit to the left of the decimal point and six places to the right. You seem to be getting a lot of digits! The cure is to specify the number of decimal places to the right of the decimal, and the next four examples in this segment do that. Notice how the fourth and the sixth examples cause the output to be rounded off.

Finally, the + flag causes the result to be printed with its algebraic sign, which is a plus sign in this case, and the 0 flag produces leading zeros to pad the result to the full field width. Note that in the specifier %010 the first 0 is a flag, and the remaining digits (10) specify the field width.

You can modify the RENT value to see how variously sized values are printed. Listing 4.9 demonstrates a few more combinations.

Listing 4.9 The flags.c program.
 /* flags.c -- illustrates some formatting flags */
 #include <stdio.h>
 int main(void)
 {
   printf("%x %X %#x\n", 31, 31, 31);
   printf("**%d**% d**% d**\n", 42, 42, -42);
   printf("**%5d**%5.3d**%05d**%05.3d**\n", 6, 6, 6, 6);
   return 0;
 }

On a system that conforms to the ANSI C standard, the output looks like this:

 1f 1F 0x1f
 **42** 42**-42**
 **    6**  006**00006**  006**

First, 1f is the hex equivalent of 31. The x specifier yields a 1f , and the X specifier yields 1F . Using the # flag provides an initial 0x .

The second line illustrates how using a space in the specifier produces a leading space for positive values, but not for negative values. This can produce a pleasing output because positive and negative values with the same number of significant digits are printed with the same field widths.

The third line illustrates how using a precision specifier (%5.3d) with an integer form produces enough leading zeros to pad the number to the minimum number of digits (three, in this case). Using the 0 flag, however, pads the number with enough leading zeros to fill the whole field width. Finally, if you provide both the 0 flag and the precision specifier, the 0 flag is ignored.

Now let's examine some of the string options. Consider the example in Listing 4.10.

Listing 4.10 The strings.c program.
 /* strings.c -- string formatting */
 #include <stdio.h>
 #define BLURB "Outstanding acting!"
 int main(void)
 {
    printf("/%2s/\n", BLURB);
    printf("/%22s/\n", BLURB);
    printf("/%22.5s/\n", BLURB);
    printf("/%-22.5s/\n", BLURB);
    return 0;
 }

Here is the output:

 /Outstanding acting!/
 /   Outstanding acting!/
 /                 Outst/
 /Outst                 /

Notice how the field is expanded to contain all the specified characters. Also notice how the precision specification limits the number of characters printed. The .5 in the format specifier tells printf() to print just five characters. Again, the - modifier left-justifies the text.

Applying Your Knowledge

Okay, you've seen some examples. Now how would you set up a statement to print something having the following form?

 The NAME family just may be $XXX.XX dollars richer! 

Here, NAME and XXX.XX represent values that will be supplied by variables in the program, say, name[40] and cash .

Here is one solution:

 printf("The %s family just may be $%.2f dollars richer!\n", name, cash);

The Meaning of Conversion

Let's take a closer look at what a conversion specification converts. It converts a value stored in the computer in some binary format to a series of characters (a string) to be displayed. For example, the number 76 may be stored internally as binary 01001100. The %d conversion specifier converts this to the characters 7 and 6 , displaying 76 . The %x conversion converts the same value ( 01001100 ) to the hexadecimal representation 4c . The %c converts the same value to the character representation L .

The term conversion is probably somewhat misleading because it wrongly suggests that the original value is replaced with a converted value. Conversion specifications are really translation specifications; %d means "translate the given value to a decimal integer text representation and print the representation."

Mismatched Conversions

Naturally, you should match the conversion specification to the type of value being printed. Often, you have choices. For instance, if you want to print a type int value, you can use %d or %x or %o . All these specifiers assume that you are printing a type int value; they merely provide different representations of the value. Similarly, you can use %f , %e , or %g to represent a type double value.

What if you mismatch the conversion specification to the type? You've seen in the preceding chapter that mismatches can cause problems. This is a very important point to keep in mind, so Listing 4.11 shows some more examples of mismatches within the integer family.

Listing 4.11 The intconv.c program.
 /* intconv.c -- some mismatched integer conversions */
 #include <stdio.h>
 #define PAGES 336
 #define WORDS 65616
 int main(void)
 {
    short num = PAGES;
    short mnum = -PAGES;
    printf("%hd %hu %hd %hu\n", num, num, mnum, mnum);
    printf("%d %c\n", num, num);
    printf("%d %hd\n", WORDS, WORDS);
    return 0;
 }

Our system produces the following results:

 336 336 -336 65200
 336 P
 65616 80

Looking at the first line, you can see that both %hd and %hu produce 336 as output for the variable num; no problem there. The %hu (unsigned) version of mnum came out as 65200, however, not as the 336 you might have expected. This results from the way that signed short int values are represented on our reference system. First, they are 2 bytes in size. Second, the system uses a method called the two's complement to represent signed integers. In this method, the numbers 0 to 32767 represent themselves, and the numbers 32768 to 65535 represent negative numbers, with 65535 being -1, 65534 being -2, and so forth. Therefore, -336 is represented by 65536 - 336, or 65200. So 65200 represents -336 when interpreted as a signed short and represents 65200 when interpreted as an unsigned short. Be wary! One number can be interpreted as two different values. Not all systems use this method to represent negative integers. Nonetheless, there is a moral: Don't expect a %u conversion to simply strip the sign from a number.

The second line shows what happens if you try to convert a value greater than 255 to a character. On this system, a short int is 2 bytes and a char is 1 byte. When printf() prints 336 using %c, it looks at only 1 byte out of the 2 used to hold 336. This truncation (see Figure 4.8) amounts to dividing the integer by 256 and keeping just the remainder. In this case, the remainder is 80, which is the ASCII value for the character P. More technically, you can say that the number is interpreted modulo 256, which means using the remainder when the number is divided by 256.

Figure 4.8. Reading 336 as a character.

Finally, we tried printing an integer (65616) larger than the maximum short int (32767) allowed on our system. Again, the computer does its modulo thing. The number 65616, because of its size, is stored as a 4-byte int value on our system. When we print it using the %hd specification, printf() uses only the last 2 bytes. This corresponds to using the remainder after dividing by 65536. In this case, the remainder is 80. A remainder between 32767 and 65536 would be printed as a negative number because of the way negative numbers are stored. Systems with different integer sizes would have the same general behavior, but with different numerical values.

When you start mixing integer and floating types, the results are more bizarre. Consider, for example, Listing 4.12.

Listing 4.12 The floatcnv.c program.
 /* floatcnv.c -- mismatched floating-point conversions */
 #include <stdio.h>
 int main(void)
 {
    float n1 = 3.0;
    double n2 = 3.0;
    long n3 = 2000000000;
    long n4 = 1234567890;
    printf("%.1e %.1e %.1e %.1e\n", n1, n2, n3, n4);
    printf("%ld %ld\n", n3, n4);
    printf("%ld %ld %ld %ld\n", n1, n2, n3, n4);
    return 0;
 }

On this system, Listing 4.12 produces the following output:

 3.0e+000 3.0e+000 3.1e+046 3.1e+046
 2000000000 1234567890
 0 1074266112 0 1074266112

The first line of output shows that using a %e specifier does not convert an integer to a floating-point number. Consider, for example, what happens when you try to print n3 (type long ) using the %e specifier. First, the %e specifier causes printf() to expect a type double value, which is an 8-byte value on this system. When printf() looks at n3 , which is a 4-byte value on this system, it also looks at the adjacent 4 bytes. Therefore, it looks at an 8-byte unit in which the actual n3 is embedded. Second, it interprets the bits in this unit as a floating-point number. Some bits, for example, would be interpreted as an exponent. So even if n3 had the correct number of bits, they would be interpreted differently under %e than under %ld . The net result is nonsense.

The first line also illustrates what we mentioned earlier: that float is converted to double when used as arguments to printf() . On this system, float is 4 bytes, but n1 was expanded to 8 bytes so that printf() would display it correctly.

The second line of output shows that printf() can print n3 and n4 correctly if the correct specifier is used.

The third line of output shows that even the correct specifier can produce phony results if the printf() statement has mismatches elsewhere. As you might expect, trying to print a floating-point value with a %ld specifier fails, but here, trying to print a type long using %ld fails! The problem lies in how C passes information to a function. The exact details of this failure are implementation dependent, but the next box discusses a representative system.

Passing Arguments

The mechanics of argument passing depend on the implementation. This is how argument passing works on our system. The function call looks like this:

 printf("%ld %ld %ld %ld\n", n1, n2, n3, n4); 

This call tells the computer to hand over the values of the variables n1, n2, n3, and n4 to the printf() function. It does so by placing them in an area of memory called the stack. When the computer puts these values on the stack, it is guided by the types of the variables, not by the conversion specifiers, so for n1, it places 8 bytes on the stack (float is converted to double). Similarly, it places 8 more bytes for n2, followed by 4 bytes each for n3 and n4. Then control shifts to the printf() function. This function reads the values off the stack, but when it does so, it reads them according to the conversion specifiers. The %ld specifier indicates that printf() should read 4 bytes, so printf() reads the first 4 bytes in the stack as its first value. This is just the first half of n1, and it is interpreted as a long integer. The next %ld specifier reads 4 more bytes; this is just the second half of n1 and is interpreted as a second long integer (see Figure 4.9). Similarly, the third and fourth instances of %ld cause the first and second halves of n2 to be read and to be interpreted as two more long integers, so although we have the correct specifiers for n3 and n4, printf() is reading the wrong bytes.

Figure 4.9. Passing arguments.

The Return Value of printf()

As mentioned in Chapter 2, a C function generally has a return value. This is a value that the function computes and returns to the calling program. For example, the C library contains a sqrt() function that takes a number as an argument and returns its square root. The return value can be assigned to a variable, can be used in a computation, can be passed as an argument; in short, it can be used like any other value. The printf() function also has a return value: Under ANSI C, it returns the number of characters it printed. If there is an output error, printf() returns a negative value. (Some pre-ANSI versions of printf() have different return values.)

The return value for printf() is incidental to its main purpose of printing output, and it usually isn't used. One reason you might use the return value is to check for output errors. This is more commonly done when writing to a file rather than to a screen. If a full floppy disk prevented writing from taking place, you could then have the program take some appropriate action, such as beeping the terminal for 30 seconds. However, you have to know about the if statement before doing that sort of thing. The simple example in Listing 4.13 shows how you can determine the return value.

Listing 4.13 The prntval.c program.
 /* prntval.c -- finding printf()'s return value */
 #include <stdio.h>
 int main(void)
 {
   int n = 212;
   int rv;
   rv = printf("%d F is water's boiling point.\n", n);
   printf("The printf() function printed %d characters.\n",
              rv);
   return 0;
 }

The output is as follows:

 212 F is water's boiling point.
 The printf() function printed 32 characters.

First, the program used the form rv = printf(...); to assign the return value to rv. This statement thus performs two tasks: printing information and assigning a value to a variable. Second, note that the count includes all the printed characters, including the spaces and the unseen newline character.

Printing Long Strings

Occasionally, printf() statements are too long to put on one line. Because C ignores whitespace (spaces, tabs, newlines) except when used to separate elements, you can spread a statement over several lines, as long as you put your line breaks between elements. For instance, Listing 4.13 used two lines for a statement.

 printf("The printf() function printed %d characters.\n",
           rv);

The line is broken between the comma element and rv. To show a reader that the line was being continued, the example indents the rv. C ignores the extra spaces.

However, you cannot break a quoted string in the middle. Suppose you try something like this:

 printf("The printf() function printed %d
         characters.\n", rv);

C will complain that you have an illegal character in a string constant. You can use \n in a string to symbolize the newline character, but you can't have the actual newline character generated by the Enter (or Return) key in a string.

If you do have to split a string, you have three choices, as shown in Listing 4.14.

Listing 4.14 The longstrg.c program.
 /* longstrg.c -- printing long strings */
 #include <stdio.h>
 int main(void)
 {
     printf("Here's one way to print a ");
     printf("long string.\n");
     printf("Here's another way to print a \
 long string.\n");
     printf("Here's the newest way to print a "
           "long string.\n");      /* ANSI C */
     return 0;
 }

Here is the output:

 Here's one way to print a long string.
 Here's another way to print a long string.
 Here's the newest way to print a long string.

Method 1 is to use more than one printf() statement. Because the first string printed doesn't end with a \n character, the second string continues where the first ends.

Method 2 is to terminate the end of the first line with a backslash-return combination. This causes the text on the screen to start a new line, but without including a newline character in the string. The effect is to continue the string over to the next line. However, the next line has to start at the far left, as shown. If you indent that line, say, five spaces, then those five spaces become part of the string.

Method 3, new with ANSI C, is string concatenation. If you follow one quoted string constant with another, separated only by whitespace, C treats the combination as a single string, so the following three forms are equivalent:

 printf("Hello, young lovers, wherever you are.");
 printf("Hello, young "    "lovers" ", wherever you are.");
 printf("Hello, young lovers"
        ", wherever you are.");

With all these methods, you should include any required spaces in the strings: "Jim" "Smith" becomes "JimSmith", but the combination "Jim " "Smith" is "Jim Smith".

Using scanf()

Now let's go from output to input and examine the scanf() function. The C library contains several input functions, and scanf() is the most general of them, for it can read a variety of formats. Of course, input from the keyboard is text because the keys generate text characters: letters, digits, and punctuation. When you want to enter, say, the integer 2002, you type the characters 2, 0, 0, and 2. If you want to store that as a numerical value rather than a string, your program has to convert the string character-by-character to a numerical value, and that is what scanf() does! It converts string input into various forms: integers, floating-point numbers, characters, and C strings. It is the inverse of printf(), which converts integers, floating-point numbers, characters, and C strings to text that is to be displayed on the screen.

Like printf(), scanf() uses a control string followed by a list of arguments. The control string indicates into which formats the input is to be converted. The chief difference is in the argument list. The printf() function uses variable names, constants, and expressions. The scanf() function uses pointers to variables. Fortunately, you don't have to know anything about pointers to use the function. Just remember these two simple rules:

  • If you use scanf() to read a value for one of the basic variable types we've discussed, precede the variable name with an & .

  • If you use scanf() to read a string into a character array, don't use an & .

Listing 4.15 presents a short program illustrating these rules.

Listing 4.15 The input.c program.
 /* input.c -- when to use & */
 #include <stdio.h>
 int main(void)
 {
    int age;             /* variable */
    float assets;        /* variable */
    char pet[30];        /* string   */
    printf("Enter your age, assets, and favorite pet.\n");
    scanf("%d %f", &age, &assets); /* use the & here      */
    scanf("%s", pet);              /* no & for char array */
    printf("%d $%.0f %s\n", age, assets, pet);
    return 0;
 }

Here is a sample exchange:

 Enter your age, assets, and favorite pet.
 12
 144.15 hedgehog
 12 $144 hedgehog

The scanf() function uses whitespace (newlines, tabs, and spaces) to decide how to divide the input into separate fields. It matches up consecutive conversion specifications to consecutive fields, skipping over the whitespace in between. Note how the input is spread over two lines. You could just as well have used one or five lines, as long as you had at least one newline, space, or tab between each entry. The only exception to this is the %c specification, which reads the very next character, even if that character is whitespace. We'll return to this topic in a moment.

The scanf() function uses pretty much the same set of conversion-specification characters as printf() does. The main difference is that printf() uses %f , %e , %E , %g and %G for both type float and type double , whereas scanf() uses them just for type float , requiring the l modifier for double . Table 4.6 lists the main conversion specifiers as described in ANSI C.

Table  4.6. ANSI C conversion specifiers for scanf() .
Conversion Specifier Meaning
%c Interpret input as a character.
%d Interpret input as a signed decimal integer.
%e , %f , %g Interpret input as a floating-point number.
%E , %G Interpret input as a floating-point number.
%i Interpret input as a signed integer. The input is read as decimal unless it begins with 0 (octal) or with 0x or 0X (hexadecimal).
%o Interpret input as a signed octal integer.
%p Interpret input as a pointer (an address).
%s Interpret input as a string; input begins with the first non-whitespace character and includes everything up to the next whitespace character.
%u Interpret input as an unsigned decimal integer.
%x , %X Interpret input as a signed hexadecimal integer.

You also can use modifiers in the conversion specifiers shown in Table 4.6. The modifiers go between the percent sign and the conversion letter. If you use more than one in a specifier, they should appear in the same order as shown in Table 4.7. The C99 standard adds an hh modifier to indicate signed char or unsigned char and an ll modifier to indicate long long or unsigned long long.

Table  4.7. Conversion modifiers for scanf() .
Modifier Meaning
* Suppress assignment (see text). Example: "%*d"
digit(s) Maximum field width; input stops when the maximum field width is reached or when the first whitespace character is encountered, whichever comes first. Example: "%10s"
h, l, or L "%hd" and "%hi" indicate that the value will be stored in a short int. "%ho", "%hx", and "%hu" indicate that the value will be stored in an unsigned short int. "%ld" and "%li" indicate that the value will be stored in a long. "%lo", "%lx", and "%lu" indicate that the value will be stored in an unsigned long. "%le", "%lf", and "%lg" indicate that the value will be stored in type double. Using L instead of l with e, f, and g indicates that the value will be stored in type long double. In the absence of these modifiers, d and i indicate type int; o, x, and u indicate type unsigned int; and e, f, and g indicate type float.

As you can see, using conversion specifiers can be involved, and there are features that I have omitted. These omitted features primarily facilitate reading selected data from highly formatted sources, such as punched cards or other data records. Because you will be using scanf() primarily as a convenient means for feeding data to a program interactively, I won't discuss the more esoteric features.

The scanf() View of Input

Let's look in more detail at how scanf() reads input. Suppose you use a %d specifier to read an integer. The scanf() function begins reading input a character at a time. It skips over whitespace characters (spaces, tabs, and newlines) until it finds a non-whitespace character. Because it is attempting to read an integer, scanf() expects to find a digit character or, perhaps, a sign (+ or -). If it finds a digit or a sign, it saves that character and then reads the next character. If that is a digit, it saves the digit and reads the next character. The function continues reading and saving characters until it encounters a nondigit. It then concludes that it has reached the end of the integer, and it places the nondigit back in the input. This means that the next time the program goes to read input, it starts at the previously rejected, nondigit character. Finally, scanf() computes the numerical value corresponding to the digits it read and places that value in the specified variable.

If you use a field width, scanf() halts at the field end or at the first whitespace, whichever comes first.

What if the first non-whitespace character is, say, an A instead of a digit? Then scanf() stops right there and places the A (or whatever) back in the input. No value is assigned to the specified variable, and the next time the program reads input, it starts at the A again. If your program has only %d specifiers, scanf() will never get past that A . Also, if you use a scanf() statement with several specifiers, ANSI C requires the function to stop reading input at the first failure.

Reading input using the other numeric specifiers works much the same as the %d case. The main difference is that scanf() may recognize more characters as being part of the number. For instance, the %x specifier requires that scanf() recognize the hexadecimal digits a-f and A-F. Floating-point specifiers require scanf() to recognize decimal points and E-notation.

If you use a %s specifier, any character other than whitespace is acceptable, so scanf() skips whitespace to the first non-whitespace character and then saves up non-whitespace characters until hitting whitespace again. This means that %s results in scanf() reading a single word, that is, a string with no whitespace in it. If you use a field width, scanf() stops at the end of the field or at the first whitespace. You can't use the field width to make scanf() read more than one word for one %s specifier. A final point: When scanf() places the string in the designated array, it adds the terminating '\0' to make the array contents a C string.

If you use a %c specifier, all input characters are fair game. If the next input character is a space or a newline, then a space or a newline is assigned to the indicated variable; whitespace is not skipped.

Actually, scanf() is not the most commonly used input function in C. It is featured here because of its versatility (it can read all the different data types), but C has several other input functions, such as getchar() and gets(), that are better suited for specific tasks, such as reading single characters or reading strings containing spaces. We will cover some of these functions in Chapters 7, "C Control Statements: Branching and Jumps," 11, "Character Strings and String Functions," and 12, "File Input/Output." In the meantime, if you need an integer or decimal fraction or a character or a string, you can use scanf().

Regular Characters in the Format String

The scanf() function does enable you to place ordinary characters in the format string. Ordinary characters other than the space character must be matched exactly by the input string. For instance, suppose you accidentally place a comma between two specifiers:

 scanf("%d,%d", &n, &m); 

The scanf() function interprets this to mean that you will type a number, then type a comma, and then type a second number. That is, you would have to enter two integers this way:

 88,121 

Because the comma comes immediately after the %d in the format string, you would have to type it immediately after the 88. However, because scanf() skips over whitespace preceding an integer, you could type a space or newline after the comma when entering the input. That is,

 88, 121
and
 88,
 121
also would be accepted.

A space in the format string means to skip over any whitespace before the next input item. For instance, the statement

 scanf("%d ,%d", &n, &m); 
would accept any of the following input lines:
 88,121
 88 ,121
 88 , 121

Note that the concept of "any whitespace" includes the special case of no whitespace.

Except for %c , the specifiers automatically skip over whitespace preceding an input value, so scanf("%d%d", &n, &m) behaves the same as scanf("%d %d", &n, &m) . For %c , adding a space character to the format string does make a difference. For example, if %c is preceded by a space in the format string, scanf() does skip to the first non-whitespace character. That is, the command scanf("%c", &ch) reads the first character encountered in input, and scanf(" %c", &ch) reads the first non-whitespace character encountered.

The scanf() Return Value

The scanf() function returns the number of items that it successfully reads. If it reads no items, which happens if you type a non-numeric string when it expects a number, scanf() returns the value 0. It returns EOF when it detects the condition known as "end of file." (EOF is a special value defined in the stdio.h file. Typically, a #define directive gives EOF the value -1.) We'll discuss end of file in Chapter 6, "C Control Statements: Looping," and make use of scanf()'s return value later in the book. After you learn about if statements and while statements, you can use the scanf() return value to detect and handle mismatched input.

The * Modifier with printf() and scanf()

Both printf() and scanf() can use the * modifier to modify the meaning of a specifier, but they do so in dissimilar fashions. First, let's see what the * modifier can do for printf().

Suppose that you don't want to commit yourself to a field width in advance, but that you want the program to specify it. You can do this by using * instead of a number for the field width, but you also have to use an argument to tell what the field width should be. That is, if you have the conversion specifier %*d , the argument list should include a value for * and a value for d . The technique also can be used with floating-point values to specify the precision as well as the field width. Listing 4.16 is a short example showing how this works.

Listing 4.16 The varwid.c program.
 /* varwid.c -- uses variable-width output field */
 #include <stdio.h>
 int main(void)
 {
     int width, precision;   /* int matches both %d and the * arguments */
     int number = 256;
     double weight = 242.5;

     printf("What field width?\n");
     scanf("%d", &width);
     printf("The number is :%*d:\n", width, number);
     printf("Now enter a width and a precision:\n");
     scanf("%d %d", &width, &precision);
     printf("Weight = %*.*f\n", width, precision, weight);
     return 0;
 }

The variable width provides the field width, and number is the number to be printed. Because the * precedes the d in the specifier, width comes before number in printf()'s argument list. Similarly, width and precision provide the formatting information for printing weight. Here is a sample run:

 What field width?
 6
 The number is :   256:
 Now enter a width and a precision:
 8 3
 Weight =  242.500

Here the reply to the first question was 6 , so 6 was the field width used. Similarly, the second reply produced a width of 8 with 3 digits to the right of the decimal. More generally, a program could decide on values for these variables after looking at the value of weight .

The * serves quite a different purpose for scanf() . When placed between the % and the specifier letter, it causes that function to skip over corresponding input. Listing 4.17 provides an example.

Listing 4.17 The skip2.c program.
 /* skip2.c -- skips over first two integers of input */
 #include <stdio.h>
 int main(void)
 {
     int n;

     printf("Please enter three integers:\n");
     scanf("%*d %*d %d", &n);
     printf("The last integer was %d\n", n);
     return 0;
 }

The scanf() instruction in Listing 4.17 says, "Skip two integers and copy the third into n ." Here is a sample run:

 Please enter three integers:
 1976 1992 1996
 The last integer was 1996

This skipping facility is useful if, for example, a program needs to read a particular column of a file that has data arranged in uniform columns.

C++ Primer Plus
C Primer Plus (5th Edition)
ISBN: 0672326965
EAN: 2147483647
Year: 2000
Pages: 314
Authors: Stephen Prata
