What Is gcov? | GNU/Linux Application Programming (Programming Series)

Let's begin with an overview of what gcov can do for us. The gcov utility is a coverage testing tool. When built with an application, the gcov utility monitors an application under execution and identifies which source lines have been executed and which have not. Further, gcov can identify the number of times a particular line has been executed, making it useful for performance profiling (finding where an application is spending most of its time). Because gcov can tell which lines have not been executed, it is useful as a coverage testing tool. In concert with a test suite, gcov can identify whether all source lines have been adequately covered [FSF 2002].

We'll discuss the use of gcov bundled with version 3.2.2 of the GNU compiler tool chain.

Preparing the Image

Let's first look at how an image is prepared for use with gcov. We'll provide more detail on gcov options in the coming sections, so this will serve as an introduction. We'll use the simple bubblesort source file shown in Listing 7.1.

Listing 7.1: Sample Source File to Illustrate the gcov Utility (on the CD-ROM at ./source/ch7/bubblesort.c)

  1:  #include <stdio.h>
  2:
  3:  void bubbleSort(int list[], int size)
  4:  {
  5:    int i, j, temp, swap = 1;
  6:
  7:    while (swap) {
  8:
  9:      swap = 0;
 10:
 11:      for (i = (size-1) ; i >= 0 ; i--) {
 12:
 13:        for (j = 1 ; j <= i ; j++) {
 14:
 15:          if (list[j-1] > list[j]) {
 16:
 17:            temp = list[j-1];
 18:            list[j-1] = list[j];
 19:            list[j] = temp;
 20:            swap = 1;
 21:
 22:          }
 23:
 24:        }
 25:
 26:      }
 27:
 28:    }
 29:
 30:  }
 31:
 32:  int main()
 33:  {
 34:    int theList[10]={10, 9, 8, 7, 6, 5, 4, 3, 2, 1};
 35:    int i;
 36:
 37:    /* Invoke the bubble sort algorithm */
 38:    bubbleSort(theList, 10);
 39:
 40:    /* Print out the final list */
 41:    for (i = 0 ; i < 10 ; i++) {
 42:      printf("%d\n", theList[i]);
 43:    }
 44:
 45:  }

The gcov utility is used in conjunction with the compiler tool chain. This means that the image on which we're to do coverage testing must be compiled with a special set of options. These are illustrated below for compiling the source file bubblesort.c:

 gcc bubblesort.c -o bubblesort  -ftest-coverage -fprofile-arcs

The resulting image, when executed, produces a number of files containing statistics about the application (along with statistics emitted to standard-out). These files are then used by the gcov utility to report statistics and coverage information to the developer. When the -ftest-coverage option is specified, two files are generated for each source file. These files use the extensions .bb (basic block) and .bbg (basic block graph) and are used to reconstruct the program flow graph of the executed application. For the option -fprofile-arcs, a .da file is generated that contains the execution count for each instrumented branch. These files are used after execution, along with the original source file, to identify the execution behavior of the source.

Using the gcov Utility

Now that we have our image, let's continue to walk through the rest of the process. Executing our new application yields the set of statistics files discussed previously (.bb, .bbg, and .da). We then execute the gcov application with the source file that we wish to examine, as:

 $ ./bubblesort
 ...
 $ gcov bubblesort.c
 100.00% of 17 source lines executed in file bubblesort.c
 Creating bubblesort.c.gcov.

This tells us that all source lines within our sample application were executed at least once. We can see the actual counts for each source line by reviewing the generated file bubblesort.c.gcov (see Listing 7.2).

Listing 7.2: File bubblesort.c.gcov Resulting from Invocation of gcov Utility

  1:          #include <stdio.h>
  2:
  3:          void bubbleSort(int list[], int size)
  4:       1  {
  5:       1    int i, j, temp, swap = 1;
  6:
  7:       3    while (swap) {
  8:
  9:       2      swap = 0;
 10:
 11:      22      for (i = (size-1) ; i >= 0 ; i--) {
 12:
 13:     110        for (j = 1 ; j <= i ; j++) {
 14:
 15:      90          if (list[j-1] > list[j]) {
 16:
 17:      45            temp = list[j-1];
 18:      45            list[j-1] = list[j];
 19:      45            list[j] = temp;
 20:      45            swap = 1;
 21:
 22:                  }
 23:
 24:                }
 25:
 26:              }
 27:
 28:            }
 29:
 30:          }
 31:
 32:          int main()
 33:       1  {
 34:       1    int theList[10]={10, 9, 8, 7, 6, 5, 4, 3, 2, 1};
 35:       1    int i;
 36:
 37:            /* Invoke the bubble sort algorithm */
 38:       1    bubbleSort(theList, 10);
 39:
 40:            /* Print out the final list */
 41:      11    for (i = 0 ; i < 10 ; i++) {
 42:      10      printf("%d\n", theList[i]);
 43:            }
 44:
 45:          }

Let's now walk through some of the major points of Listing 7.2 to see what's provided. The first column shows the execution count for each line of source (line 4 shows a count of one execution, the call of the bubbleSort function). In some cases execution counts aren't provided. These are simply C source elements that don't result in code (for example, lines 22 through 30).

The counts can provide some information about the execution of the application. For example, the test at line 15 was executed 90 times, but the code executed within the test (lines 17 to 20) was executed only 45 times. This tells you that while the test was invoked 90 times, it succeeded only 45 times. In other words, half of the tests resulted in a swap of two elements. This behavior is due to the ordering of the test data at line 34.
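The counts in Listing 7.2 can be cross-checked by hand. The following is a minimal sketch, not part of the book's sources: it repeats the sort from Listing 7.1 with two explicit counters standing in for the bookkeeping that gcov performs automatically. The names counted_bubble_sort and struct sort_counts are hypothetical and exist only for this illustration.

```c
#include <assert.h>

/* Hypothetical tally of the two events discussed above. */
struct sort_counts {
    long tests;  /* times the comparison at line 15 was evaluated */
    long swaps;  /* times the body at lines 17 to 20 was executed */
};

/* Same algorithm as bubbleSort in Listing 7.1, with manual counters. */
struct sort_counts counted_bubble_sort(int list[], int size)
{
    struct sort_counts counts = {0, 0};
    int i, j, temp, swap = 1;

    assert(list != 0 && size >= 0);

    while (swap) {
        swap = 0;
        for (i = size - 1; i >= 0; i--) {
            for (j = 1; j <= i; j++) {
                counts.tests++;
                if (list[j - 1] > list[j]) {
                    temp = list[j - 1];
                    list[j - 1] = list[j];
                    list[j] = temp;
                    swap = 1;
                    counts.swaps++;
                }
            }
        }
    }
    return counts;
}
```

Called with the reversed ten-element list from line 34, this reports 90 tests and 45 swaps, in line with Listing 7.2. Called with a list that is already sorted, it reports 45 tests and no swaps, which shows how strongly the counts depend on the test data.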

Note: The gcov files (.bb, .bbg, and .da) should be removed before running the application again. If the .da file isn't removed, the statistics will simply accumulate rather than start over. This can be useful but, if unexpected, problematic.
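Resetting the counts can also be done from a test harness. The sketch below is one possible helper, under two assumptions that the text does not spell out: the .da file shares the base name of the source file, and it sits in the current directory. The function names da_file_name and reset_counts are hypothetical. Only the .da file is removed here, since it is the file that the Note describes as accumulating.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Build the assumed name of the .da file for a source file, for example
 * "bubblesort.c" -> "bubblesort.da". Returns 0 on success, or -1 if the
 * name has no extension or the result does not fit in the buffer. */
int da_file_name(const char *source, char *out, size_t out_size)
{
    const char *dot;
    size_t stem;

    assert(source != NULL && out != NULL);

    dot = strrchr(source, '.');
    if (dot == NULL || dot == source) {
        return -1;
    }
    stem = (size_t)(dot - source);
    if (stem + sizeof(".da") > out_size) {
        return -1;
    }
    memcpy(out, source, stem);
    memcpy(out + stem, ".da", sizeof(".da"));  /* copies the NUL as well */
    return 0;
}

/* Remove the .da file so that the next run starts from zero. A file
 * that does not exist is not treated as an error (relies on remove()
 * setting errno to ENOENT, as it does on GNU/Linux). */
int reset_counts(const char *source)
{
    char name[256];

    if (da_file_name(source, name, sizeof(name)) != 0) {
        return -1;
    }
    if (remove(name) != 0 && errno != ENOENT) {
        return -1;
    }
    return 0;
}
```

A harness would call reset_counts("bubblesort.c") before each run whose statistics should stand alone, and skip the call when accumulated counts across several runs are wanted.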

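The code executed most often, not surprisingly, is the inner loop of the sort algorithm (lines 13 and 15). Line 13 shows a higher count than line 15 because its exit test is evaluated one extra time whenever the loop completes: the inner loop is entered 20 times, so line 13 is counted 110 times against 90 for line 15.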
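The gap between the two counts follows from how a for loop is evaluated: the condition runs once per iteration, plus one final time when it fails. As a rough sketch under that reading, the hypothetical function below walks the same loop structure as Listing 7.1 and counts how often each loop condition is evaluated. No sorting is done; the passes parameter stands for the number of times the while body runs (2 in Listing 7.2).

```c
#include <assert.h>

struct loop_tests {
    long outer;  /* evaluations of "i >= 0" (line 11) */
    long inner;  /* evaluations of "j <= i" (line 13) */
    long body;   /* executions of the inner loop body (line 15) */
};

struct loop_tests count_loop_tests(int size, int passes)
{
    struct loop_tests t = {0, 0, 0};
    int p, i, j;

    assert(size >= 0 && passes >= 0);

    for (p = 0; p < passes; p++) {
        /* The comma operator bumps the counter, then yields the test. */
        for (i = size - 1; (t.outer++, i >= 0); i--) {
            for (j = 1; (t.inner++, j <= i); j++) {
                t.body++;
            }
        }
    }
    return t;
}
```

For a size of 10 and two passes this yields 22, 110, and 90, the counts shown on lines 11, 13, and 15 of Listing 7.2. The difference of 20 between the last two is one failed exit test for each of the 20 times the inner loop is entered.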

Looking at Branch Probabilities

We can also see the branch statistics for the application using the -b option. This option writes branch frequencies and summaries for each branch in the instrumented application. For example, when we invoke gcov with the -b option, we now get the following:

 $ gcov -b bubblesort.c
 100.00% of 17 source lines executed in file bubblesort.c
 100.00% of 12 branches executed in file bubblesort.c
 100.00% of 12 branches taken at least once in file bubblesort.c
 100.00% of 2 calls executed in file bubblesort.c
 Creating bubblesort.c.gcov.
 $

The resulting bubblesort.c.gcov file is shown in Listing 7.3. Here we see a listing similar to Listing 7.2, but this time the branch points have been labeled with their frequencies.

Listing 7.3: File bubblesort.c.gcov Resulting from Invocation of gcov Utility with -b

  1:          #include <stdio.h>
  2:
  3:          void bubbleSort(int list[], int size)
  4:       1  {
  5:       1    int i, j, temp, swap = 1;
  6:
  7:       3    while (swap) {
  8:          branch 0 taken = 67%
  9:          branch 1 taken = 100%
 10:
 11:       2      swap = 0;
 12:
 13:      22      for (i = (size-1) ; i >= 0 ; i--) {
 14:          branch 0 taken = 91%
 15:          branch 1 taken = 100%
 16:          branch 2 taken = 100%
 17:
 18:     110        for (j = 1 ; j <= i ; j++) {
 19:          branch 0 taken = 82%
 20:          branch 1 taken = 100%
 21:          branch 2 taken = 100%
 22:
 23:      90          if (list[j-1] > list[j]) {
 24:          branch 0 taken = 50%
 25:
 26:      45            temp = list[j-1];
 27:      45            list[j-1] = list[j];
 28:      45            list[j] = temp;
 29:      45            swap = 1;
 30:
 31:                  }
 32:
 33:                }
 34:
 35:              }
 36:
 37:            }
 38:
 39:          }
 40:
 41:          int main()
 42:       1  {
 43:       1    int theList[10]={10, 9, 8, 7, 6, 5, 4, 3, 2, 1};
 44:       1    int i;
 45:
 46:            /* Invoke the bubble sort algorithm */
 47:       1    bubbleSort(theList, 10);
 48:          call 0 returns = 100%
 49:
 50:            /* Print out the final list */
 51:      11    for (i = 0 ; i < 10 ; i++) {
 52:          branch 0 taken = 91%
 53:          branch 1 taken = 100%
 54:          branch 2 taken = 100%
 55:      10      printf("%d\n", theList[i]);
 56:          call 0 returns = 100%
 57:            }
 58:
 59:          }

The branch points are very dependent upon the target architecture's instruction set. Line 23 is a simple if statement and therefore has one branch point represented. Note that this is 50%, which cross-checks with our earlier observation of the line execution counts. Other branch points are a little more difficult to parse. For example, line 7 represents a while statement and has two branch points. In x86 assembly, this line compiles to what you see in Listing 7.4.

Listing 7.4: x86 Assembly for the First Branch Point of bubblesort.c.gcov

  1:  cmpl   $0, -20(%ebp)
  2:  jne    .L4
  3:  jmp    .L1

The swap variable is compared at line 1 to the value 0 in Listing 7.4. If it's not equal to zero, the jump at line 2 is taken (jne, jump if not equal) to .L4 (line 11 from Listing 7.3). Otherwise, the jump at line 3 is taken to .L1. The branch probabilities show that line 2 (branch 0) was taken 67% of the time. This is because the line was executed three times, but the jne (line 2 of Listing 7.4) was taken only twice (2/3 or 67%). When the jne at line 2 is not taken, we do the absolute jump (jmp) at line 3. This is executed once, and once executed the application ends. Therefore, branch 1 (line 9 of Listing 7.3) is taken 100% of the time.
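The percentages in Listing 7.3 are simple ratios of times taken to times executed. The helper below is a small sketch of that arithmetic; the name branch_percent is hypothetical, and rounding to the nearest whole percent is an assumption that happens to reproduce the figures in the listing, not a statement about how gcov computes them.

```c
#include <assert.h>

/* Percentage of executions in which a branch was taken, rounded to the
 * nearest whole number (the rounding rule is an assumption). */
int branch_percent(long taken, long executed)
{
    assert(executed > 0 && taken >= 0 && taken <= executed);
    return (int)((taken * 100 + executed / 2) / executed);
}
```

For the while statement on line 7, branch_percent(2, 3) gives 67. Reading the other pairs off the line counts in Listing 7.2 (an interpretation, not gcov output), 20 of 22 gives 91 for the outer for loop, 90 of 110 gives 82 for the inner for loop, and 45 of 90 gives 50 for the if statement.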

So the branch probabilities are useful in understanding program flow, but consulting the assembly can be required to understand what the branch points represent.

Incomplete Execution Coverage

When gcov encounters an application whose test coverage is not 100%, the lines that are not executed are labeled with ###### rather than an execution count. Listing 7.5 shows a source file created by gcov that illustrates less than 100% coverage.

Listing 7.5: A Sample Program with Incomplete Test Coverage (on the CD-ROM at ./source/ch7/incomptest.c)

  1:          #include <stdio.h>
  2:
  3:          int main()
  4:       1  {
  5:       1    int a=1, b=2;
  6:
  7:       1    if (a == 1) {
  8:       1      printf("a = 1\n");
  9:            } else {
 10:  ######      printf("a != 1\n");
 11:            }
 12:
 13:       1    if (b == 1) {
 14:  ######      printf("b = 1\n");
 15:            } else {
 16:       1      printf("b != 1\n");
 17:            }
 18:
 19:       1    return 0;
 20:          }

The gcov utility also reports this information to standard-out when it is run. It emits the number of source lines possible to execute (in this case 9) and the percentage that were actually executed (here, 78%):

 $ gcov incomptest.c
 77.78% of 9 source lines executed in file incomptest.c
 Creating incomptest.c.gcov.
 $
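The same figure can be recovered from the annotated file itself by counting the ###### markers. The following sketch assumes the layout produced by the gcov release discussed here, where the marker is the first token on an unexecuted line (the line numbers in Listing 7.5 are added by the book and are not in the file). Both function names are hypothetical, and later GCC releases lay the file out differently, so the check would need adjusting there.

```c
#include <assert.h>
#include <string.h>

/* Count lines whose first non-blank token is the "######" marker. */
int count_unexecuted(const char *text)
{
    int count = 0;
    const char *line = text;

    assert(text != NULL);

    while (*line != '\0') {
        const char *p = line;
        const char *end = strchr(line, '\n');

        /* Skip leading blanks; this never crosses a newline. */
        while (*p == ' ' || *p == '\t') {
            p++;
        }
        if (strncmp(p, "######", 6) == 0 && (p[6] == ' ' || p[6] == '\t')) {
            count++;
        }
        if (end == NULL) {
            break;
        }
        line = end + 1;
    }
    return count;
}

/* Percentage of executable lines that were executed at least once. */
double percent_executed(int executable, int unexecuted)
{
    assert(executable > 0 && unexecuted >= 0 && unexecuted <= executable);
    return 100.0 * (executable - unexecuted) / executable;
}
```

For incomptest.c, two marked lines out of nine executable lines leave seven executed, and percent_executed(9, 2) works out to the 77.78% that gcov reports.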

If our sample application had multiple functions, we could see the breakdown per function through the use of the -f option (or --function-summaries). This is illustrated using our previous bubblesort application:

 $ gcov -f bubblesort.c
 100.00% of 11 source lines executed in function bubbleSort
 100.00% of 6 source lines executed in function main
 100.00% of 17 source lines executed in file bubblesort.c
 Creating bubblesort.c.gcov.
 $

Options Available for gcov

Now that we've seen gcov in action in a few scenarios, let's look at gcov's full list of options (see Table 7.1). The gcov utility is invoked with the source file to be annotated, as:

 gcov [options] sourcefile

Table 7.1: gcov Utility Options
Option	Purpose
-v, --version	Emit version information (no further processing).
-h, --help	Emit help information (no further processing).
-b, --branch-probabilities	Emit branch frequencies to the output file (with summary).
-c, --branch-counts	Emit branch counts rather than frequencies.
-n, --no-output	Do not create the gcov output file.
-l, --long-file-names	Create long filenames.
-f, --function-summaries	Emit summaries for each function.
-o, --object-directory	Directory where .bb, .bbg, and .da files are stored.

From Table 7.1, we can see that each option has a short single-letter form and a longer form. The short form is useful when using gcov from the command line, but when gcov is part of a Makefile, the longer options should be used as they're more descriptive.

To retrieve version information about the gcov utility, the -v option is used. Since gcov is tied to a given compiler tool chain (it's actually built from the gcc tool chain source), the versions for gcc and gcov will be identical.

An introduction to gcov and option help for gcov can be displayed using the -h option.

The branch probabilities can be emitted to the annotated source file using the -b option (see the section "Looking at Branch Probabilities," earlier in this chapter). Rather than producing branch percentages, branch counts can be emitted using the -c option.

If the annotated source file is not important, the -n option can be used. This can be useful if all that's important is to understand the test coverage of the source. This information is emitted to standard-out.

When including source in header files, it can be useful to use the -l option to produce long filenames. This helps make filenames unambiguous if multiple source files include headers containing source (each getting its own gcov annotated header file).

Coverage information can be emitted to standard-out for each function rather than the entire application using the -f option. This is discussed in the section "Incomplete Execution Coverage," earlier in this chapter.

The final option, -o, tells gcov where the gcov object files are stored. By default, gcov will look for the files in the current directory. If they're stored elsewhere, this option specifies where gcov can find them.

Considerations

Certain practices should be avoided when using gcov for test coverage. Optimization should be disabled: because optimization can result in source lines being moved or removed, the resulting coverage is less meaningful. Coverage testing is also less meaningful for code produced by macro expansion. gcov annotates the source as written, not the output of the preprocessor, so the statements inside a macro aren't shown individually and gaps in their coverage can go unidentified.
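To make the macro caveat concrete, here is a small sketch with hypothetical names (CLAMP_LOW and clamp_low are not from the book). The macro hides a two-way decision inside a single source line, so a line count alone cannot show whether both outcomes were exercised. The function version spreads the same decision over separate lines that can each receive their own count.

```c
#include <assert.h>

/* Macro form: the decision is expanded in place at every use, so it is
 * attributed to the single source line where the macro appears. */
#define CLAMP_LOW(x, lo) ((x) < (lo) ? (lo) : (x))

/* Function form: each outcome sits on its own source line. */
int clamp_low(int x, int lo)
{
    if (x < lo) {
        return lo;
    }
    return x;
}

/* A test that exercises both outcomes of the function form. */
void test_clamp_low(void)
{
    assert(clamp_low(5, 0) == 5);   /* x >= lo: value passes through */
    assert(clamp_low(-3, 0) == 0);  /* x < lo: value is raised to lo */
}
```

If the second assertion were dropped, the line containing "return lo;" would be the one left uncounted in the function form, whereas a line using CLAMP_LOW would still show as executed.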

For GNU/Linux kernel developers, gcov can be used for certain architectures within the kernel. A patch is available from IBM to allow gcov use in the kernel. Its availability is provided in the Resources section.