10.2 Identifying and Handling Bottlenecks | Managing Projects with GNU Make (Nutshell Handbooks)

Unnecessary delays in makefiles come from several sources: poor structuring of the makefile, poor dependency analysis, and poor use of make functions and variables. These problems can be masked by make functions such as shell that invoke commands without echoing them, making it difficult to find the source of the delay.

Dependency analysis is a two-edged sword. On the one hand, if complete dependency analysis is performed, the analysis itself may incur significant delays. Without special compiler support, such as that supplied by gcc or jikes, creating a dependency file requires running another program, nearly doubling compilation time.^[2] The advantage of complete dependency analysis is that it allows make to perform fewer compiles. Unfortunately, developers may not believe this benefit is realized and write makefiles with less complete dependency information. This compromise almost always leads to an increase in development problems, leading other developers to overcompensate by compiling more code than would be required with the original, complete dependency information.

^[2] In practice, compilation time grows linearly with the size of the input text and this time is almost always dominated by disk I/O. Similarly, the time to compute dependencies using the simple -M option is linear and bound by disk I/O.

To formulate a dependency analysis strategy, begin by understanding the dependencies inherent in the project. Once complete dependency information is understood, you can choose how much to represent in the makefile (computed or hardcoded) and what shortcuts can be taken during the build. Although none of this is exactly simple, it is straightforward.

Once you've determined your makefile structure and necessary dependencies, implementing an efficient makefile is usually a matter of avoiding some simple pitfalls.

10.2.1 Simple Variables Versus Recursive

One of the most common performance-related problems is using recursive variables instead of simple variables. For example, because the following code uses the = operator instead of :=, it will execute the date command every time the DATE variable is used:

 DATE = $(shell date +%F)

The +%F option instructs date to return the date in "yyyy-mm-dd" format, so for most users the repeated execution of date would never be noticed. Of course, developers working around midnight might get a surprise!
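A minimal sketch of the fix, assuming the value only needs to be computed once per run: defining DATE with := makes make run date a single time, while it reads the makefile, and every later reference reuses the stored string. The LOG_FILE variable and the all target are hypothetical, added only to show two uses of the value.

```make
# Simple variable: the shell function runs once, when make reads this line.
DATE := $(shell date +%F)

# Hypothetical second variable; expanding DATE here does not rerun date.
LOG_FILE := build-$(DATE).log

all:
	@echo "build date: $(DATE)"
	@echo "log file:   $(LOG_FILE)"
```

Both lines of the recipe see the same date string, even if the build happens to straddle midnight.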

Because make doesn't echo commands executed from the shell function, it can be difficult to determine what is actually being run. By resetting the SHELL variable to /bin/sh -x , you can trick make into revealing all the commands it executes.

This makefile creates its output directory before performing any actions. The name of the output directory is composed of the word "out" and the date:

 DATE = $(shell date +%F)
 OUTPUT_DIR = out-$(DATE)
 make-directories := $(shell [ -d $(OUTPUT_DIR) ] || mkdir -p $(OUTPUT_DIR))

 all: ;

When run with a debugging shell, we can see:

 $ make SHELL='/bin/sh -x'
 + date +%F
 + date +%F
 + '[' -d out-2004-03-30 ']'
 + mkdir -p out-2004-03-30
 make: all is up to date.

This clearly shows us that the date command was executed twice, once for each expansion of OUTPUT_DIR. If you need to perform this kind of shell trace often, you can make it easier to access with:

 ifdef DEBUG_SHELL
   SHELL = /bin/sh -x
 endif

10.2.2 Disabling @

Another way commands are hidden is through the use of the silent command modifier, @. It can be useful at times to be able to disable this feature. You can make this easy by defining a variable, QUIET, to hold the @ sign and using the variable in commands:

 ifndef VERBOSE
   QUIET := @
 endif
 ...
 target:
 	$(QUIET) echo Building target...

When it becomes necessary to see commands hidden by the silent modifier, simply define VERBOSE on the command line:

 $ make VERBOSE=1
 echo Building target...
 Building target...

10.2.3 Lazy Initialization

When simple variables are used in conjunction with the shell function, make evaluates all the shell function calls as it reads the makefile. If there are many of these, or if they perform expensive computations, make can feel sluggish. The responsiveness of make can be measured by timing make when invoked with a nonexistent target:

 $ time make no-such-target
 make: *** No rule to make target no-such-target.  Stop.
 real    0m0.058s
 user    0m0.062s
 sys     0m0.015s

This code times the overhead that make will add to any command executed, even trivial or erroneous commands.

Because recursive variables reevaluate their righthand side every time they are expanded, there is a tendency to express complex calculations as simple variables. However, this decreases the responsiveness of make for all targets, because the calculation runs as the makefile is read, whether or not the target being built uses the result. It seems that there is a need for another kind of variable, one whose righthand side is evaluated only once, the first time the variable is used, but not before.
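To make the cost concrete, here is a small sketch. The SOURCES variable, the find command, and both targets are hypothetical and not part of the project discussed here; the info call is only there to make the evaluation visible. Because SOURCES is a simple variable, the scan runs while make reads the makefile, so even make help, which never uses the list, should pay for it.

```make
# Simple variable: info and find run as the makefile is read,
# before make knows which target was requested.
SOURCES := $(info scanning sources...)$(shell find . -name '*.c')

# Never uses SOURCES, yet the scan above has already happened.
help:
	@echo "targets: help, list"

# The only target that needs the list.
list:
	@echo $(SOURCES)
```

Switching := to = would move the cost to each use of SOURCES instead: help would become fast, but any target expanding the variable several times would repeat the scan each time.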

An example illustrating the need for this type of initialization is the find-compilation-dirs function introduced in Section 9.3.1 of Chapter 9:

 # $(call find-compilation-dirs, root-directory)
 find-compilation-dirs =                       \
   $(patsubst %/,%,                            \
     $(sort                                    \
       $(dir                                   \
         $(shell $(FIND) $1 -name '*.java'))))
 PACKAGE_DIRS := $(call find-compilation-dirs, $(SOURCE_DIR))

Ideally, we would like to perform this find operation only once per execution, but only when the PACKAGE_DIRS variable is actually used. This might be called lazy initialization. We can build such a variable using eval like this:

 PACKAGE_DIRS = $(redefine-package-dirs) $(PACKAGE_DIRS)
 redefine-package-dirs = \
    $(eval PACKAGE_DIRS := $(call find-compilation-dirs, $(SOURCE_DIR)))

The basic approach is to define PACKAGE_DIRS first as a recursive variable. When expanded, the variable evaluates the expensive function, here find-compilation-dirs, and redefines itself as a simple variable. Finally, the (now simple) variable value is returned from the original recursive variable definition.

Let's go over this in detail:

1. When make reads these variables, it simply records their righthand sides because the variables are recursive.
2. The first time the PACKAGE_DIRS variable is used, make retrieves the righthand side and expands the first variable, redefine-package-dirs.
3. The value of redefine-package-dirs is a single function call, eval.
4. The body of the eval redefines the recursive variable, PACKAGE_DIRS, as a simple variable whose value is the set of directories returned by find-compilation-dirs. Now PACKAGE_DIRS has been initialized with the directory list.
5. The redefine-package-dirs variable is expanded to the empty string (because eval expands to the empty string).
6. Now make continues to expand the original righthand side of PACKAGE_DIRS. The only thing left to do is expand the variable PACKAGE_DIRS. make looks up the value of the variable, sees a simple variable, and returns its value.

The only really tricky part of this code is relying on make to evaluate the righthand side of a recursive variable from left to right. If, for instance, make decided to evaluate $(PACKAGE_DIRS) before $(redefine-package-dirs), the code would fail.
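The steps above can be observed with a self-contained sketch. The names expensive and LAZY_LIST are hypothetical stand-ins for find-compilation-dirs and PACKAGE_DIRS, the two targets are illustrative, and the info call is only there to make each evaluation visible.

```make
# Hypothetical stand-in for a costly computation; info announces each evaluation.
expensive = $(info computing list...)$(shell echo a b c)

# Recursive at first; the first expansion redefines LAZY_LIST as a simple variable.
LAZY_LIST = $(redefine-lazy-list) $(LAZY_LIST)
redefine-lazy-list = $(eval LAZY_LIST := $(expensive))

# Expands LAZY_LIST twice; the computation should run only for the first expansion.
show:
	@echo first use: $(LAZY_LIST)
	@echo second use: $(LAZY_LIST)

# Never expands LAZY_LIST, so the computation should not run at all.
other:
	@echo nothing to compute
```

Running make show should print the "computing list..." message a single time, and make other should not print it, which is the behavior neither a plain recursive variable nor a plain simple variable provides.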

The procedure I just described can be refactored into a function, lazy-init:

 # $(call lazy-init,variable-name,value)
 define lazy-init
   $1 = $$(redefine-$1) $$($1)
   redefine-$1 = $$(eval $1 := $2)
 endef

 # PACKAGE_DIRS - a lazy list of directories
 $(eval                           \
   $(call lazy-init,PACKAGE_DIRS, \
     $$(call find-compilation-dirs,$(SOURCE_DIRS))))
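As a usage sketch, the same function can initialize several variables. The definition is repeated here so the fragment is self-contained; C_SOURCES, HEADERS, the wildcard patterns, and the list-sources target are hypothetical. Note the doubled dollar signs on the value argument, which keep the value from being expanded while call and eval process the definition.

```make
# $(call lazy-init,variable-name,value)
define lazy-init
  $1 = $$(redefine-$1) $$($1)
  redefine-$1 = $$(eval $1 := $2)
endef

# The $$ defers info and wildcard until each variable is first expanded.
$(eval $(call lazy-init,C_SOURCES,$$(info scanning *.c)$$(wildcard *.c)))
$(eval $(call lazy-init,HEADERS,$$(info scanning *.h)$$(wildcard *.h)))

# Only C_SOURCES is expanded here, so only its scan should run.
list-sources:
	@echo $(C_SOURCES)
```

With this arrangement, each lazily initialized variable costs nothing until a target actually expands it, and at most one evaluation after that.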