2.5 The Implicit Rules Database | Managing Projects with GNU Make (Nutshell Handbooks)

GNU make 3.80 has about 90 built-in implicit rules. An implicit rule is either a pattern rule or a suffix rule (which we will discuss briefly later). There are built-in pattern rules for C, C++, Pascal, FORTRAN, ratfor, Modula, Texinfo, TeX (including Tangle and Weave), Emacs Lisp, RCS, and SCCS. In addition, there are rules for supporting programs for these languages, such as cpp, as, yacc, lex, tangle, weave, and dvi tools.

If you are using any of these tools, you'll probably find most of what you need in the built-in rules. If you're using an unsupported language such as Java or XML, you will have to write rules of your own. But don't worry: you typically need only a few rules to support a language, and they are quite easy to write.
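As a rough illustration of how small such a rule can be, here is a sketch of a pattern rule for compiling Java. The variable names JAVAC and JFLAGS are assumptions made for this example; they are not part of make's built-in database.

```make
# Hypothetical variables; make does not define these itself.
JAVAC  = javac
JFLAGS = -g

# Build a .class file from the .java file with the same stem.
%.class: %.java
	$(JAVAC) $(JFLAGS) $<
```

With this in a makefile, a request such as make Hello.class (Hello.java being a made-up file name) would run javac on Hello.java whenever the class file is missing or out of date.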

To examine the rules database built into make, use the --print-data-base command-line option (-p for short). This will generate about a thousand lines of output. After version and copyright information, make prints its variable definitions, each one preceded by a comment indicating the "origin" of the definition. For instance, variables can be environment variables, default values, automatic variables, etc. After the variables come the rules. The actual format used by GNU make is:

 %: %.C
 #  commands to execute (built-in):
         $(LINK.C) $^ $(LOADLIBES) $(LDLIBS) -o $@

For rules defined by the makefile, the comment will include the file and line where the rule was defined:

 %.html: %.xml
 #  commands to execute (from `Makefile', line 168):
         $(XMLTO) $(XMLTO_FLAGS) html-nochunks $<

2.5.1 Working with Implicit Rules

The built-in implicit rules are applied whenever a target is being considered and there is no explicit rule to update it. So using an implicit rule is easy: simply do not specify a command script when adding your target to the makefile. This causes make to search its built-in database to satisfy the target. Usually this does just what you want, but in rare cases your development environment can cause problems. For instance, suppose you have a mixed-language environment consisting of Lisp and C source code. If the files editor.l and editor.c both exist in the same directory (say one is a low-level implementation accessed by the other), make will believe that the Lisp file is really a flex file (recall that flex files use the .l suffix) and that the C source is the output of the flex command. If editor.o is a target and editor.l is newer than editor.c, make will attempt to "update" the C file with the output of flex, overwriting your source code. Gack.

To work around this particular problem, you can delete the two rules concerning flex from the built-in rule base like this:

 %.o: %.l
 %.c: %.l

A pattern rule with no command script will remove the rule from make's database. In practice, situations such as this are very rare. However, it is important to remember that the built-in rules database contains rules that will interact with your own makefiles in ways you may not have anticipated.

We have seen several examples of how make will "chain" rules together while trying to update a target. This can lead to some complexity, which we'll examine here. When make considers how to update a target, it searches the implicit rules for a target pattern that matches the target at hand. For each target pattern that matches the target file, make will look for an existing matching prerequisite. That is, after matching the target pattern, make immediately looks for the prerequisite "source" file. If the prerequisite is found, the rule is used. For some target patterns, there are many possible source files. For instance, a .o file can be made from .c, .cc, .cpp, .p, .f, .r, .s, and .mod files. But what if the source is not found after searching all possible rules? In this case, make will search the rules again, this time assuming that the matching source file should be considered as a new target for updating. By performing this search recursively, make can find a "chain" of rules that allows updating a target. We saw this in our lexer.o example. make was able to update the lexer.o target from lexer.l, even though the intermediate .c file was missing, by invoking the .l to .c rule, then the .c to .o rule.
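The same search applies to pattern rules you write yourself. The following sketch assumes two made-up file types, .upper and .sorted, built from a made-up input file report.in; the tr and sort commands merely stand in for real tools.

```make
# Step 1: derive a .upper file from a .in file (upper-case the contents).
%.upper: %.in
	tr a-z A-Z < $< > $@

# Step 2: derive a .sorted file from a .upper file (sort the lines).
%.sorted: %.upper
	sort $< > $@
```

If only report.in exists, asking for report.sorted finds no report.upper on disk, so make treats report.upper as a new target, matches it against the first rule, and then runs the two steps in order.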

One of the more impressive sequences that make can produce automatically from its database is shown here. First, we set up our experiment by creating an empty yacc source file and registering it with RCS using ci (that is, we want a version-controlled yacc source file):

 $ touch foo.y
 $ ci foo.y
 foo.y,v  <--  foo.y
 .
 initial revision: 1.1
 done

Now, we ask make how it would create the executable foo. The --just-print (or -n) option tells make to report what actions it would perform without actually running them. Notice that we have no makefile and no "source" code, only an RCS file:

 $ make -n foo
 co  foo.y,v foo.y
 foo.y,v  -->  foo.y
 revision 1.1
 done
 bison -y  foo.y
 mv -f y.tab.c foo.c
 gcc    -c -o foo.o foo.c
 gcc   foo.o   -o foo
 rm foo.c foo.o foo.y

Following the chain of implicit rules and prerequisites, make determined it could create the executable, foo, if it had the object file foo.o. It could create foo.o if it had the C source file foo.c. It could create foo.c if it had the yacc source file foo.y. Finally, it realized it could create foo.y by checking out the file from the RCS file foo.y,v, which it actually has. Once make has formulated this plan, it executes it by checking out foo.y with co, transforming it into foo.c with bison, compiling it into foo.o with gcc, and linking it to form foo, again with gcc. All this from the implicit rules database. Pretty cool.

The files generated by chaining rules are called intermediate files and are treated specially by make. First, since intermediate files do not occur in targets (otherwise they would not be intermediate), make will never simply update an intermediate file. Second, because make creates intermediate files itself as a side effect of updating a target, make will delete the intermediates before exiting. You can see this in the last line of the example.
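If you would rather keep an intermediate file, for example to inspect the generated C code, GNU make provides the .SECONDARY special target. The sketch below assumes the foo example above and a makefile created just to hold this one line; whether keeping the file is desirable depends on your project.

```make
# Treat foo.c as an intermediate file, but never delete it automatically.
.SECONDARY: foo.c
```

With this in place, foo.c should no longer be listed in the final rm command, while foo.o and foo.y are still cleaned up.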

2.5.2 Rule Structure

The built-in rules have a standard structure intended to make them easily customizable. Let's go over the structure briefly, then talk about customization. Here is the (by now familiar) rule for updating an object file from its C source:

 %.o: %.c
         $(COMPILE.c) $(OUTPUT_OPTION) $<

The customization of this rule is controlled entirely by the set of variables it uses. We see two variables here, but COMPILE.c in particular is defined in terms of several other variables:

 COMPILE.c = $(CC) $(CFLAGS) $(CPPFLAGS) $(TARGET_ARCH) -c
 CC = gcc
 OUTPUT_OPTION = -o $@

The C compiler itself can be changed by altering the value of the CC variable. The other variables are used for setting compilation options (CFLAGS), preprocessor options (CPPFLAGS), and architecture-specific options (TARGET_ARCH).

The variables in a built-in rule are intended to make customizing the rule as easy as possible. For that reason, it is important to be very careful when setting these variables in your makefile. If you set these variables in a naive way, you destroy the end user's ability to customize them. For instance, consider this assignment in a makefile:

 CPPFLAGS = -I project/include

If the user wanted to add a CPP define to the command line, they would normally invoke make like:

 $ make CPPFLAGS=-DDEBUG

But in so doing they would accidentally remove the -I option that is (presumably) required for compiling. Variables set on the command line override all other assignments to the variable. (See Section 3.6 in Chapter 3 for more details on command-line assignments.) So, setting CPPFLAGS inappropriately in the makefile "broke" a customization feature that most users would expect to work. Instead of using simple assignment, consider redefining the compilation variable to include your own variables:

 COMPILE.c = $(CC) $(CFLAGS) $(INCLUDES) $(CPPFLAGS) $(TARGET_ARCH) -c
 INCLUDES = -I project/include

Or you can use append-style assignment, which is discussed in Section 3.2.1 in Chapter 3.
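Here is a sketch of the append-style alternative, under the assumption that the goal is to keep the -I option even when the user sets CPPFLAGS on the command line. A plain += in the makefile is overridden by a command-line assignment just as = is, so this example relies on GNU make's override directive.

```make
# Append the required include path even if CPPFLAGS is given on the command line.
override CPPFLAGS += -I project/include
```

With this line, invoking make with CPPFLAGS=-DDEBUG should compile with both -DDEBUG and -I project/include. The trade-off is that the user can no longer remove the include path from the command line.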

2.5.3 Implicit Rules for Source Control

make knows about two source code control systems, RCS and SCCS, and supports their use with built-in implicit rules. Unfortunately, it seems the state of the art in source code control and modern software engineering have left make behind. I've never found a use for the source control support in make, nor have I seen it used in other production software. I do not recommend the use of this feature. There are a number of reasons for this.

First, the source control tools supported by make, RCS and SCCS, although valuable and venerable tools in their day, have largely been supplanted by CVS, the Concurrent Versions System, or proprietary tools. In fact, CVS uses RCS to manage individual files internally. However, using RCS directly proved to be a considerable problem when a project spanned more than one directory or more than one developer. CVS, in particular, was implemented to fill the gaps in RCS's functionality in precisely these areas. Support for CVS has never been added to make, which is probably a good thing. ^[2]

^[2] CVS is, in turn, being supplanted by newer tools. While it is currently the most ubiquitous source control system, Subversion (http://subversion.tigris.org) looks to be the new wave.

It is now well recognized that the life cycle of software becomes complex. Applications rarely move smoothly from one release to the next. More typically, one or more distinct releases of an application are being used in the field (and require bug fix support), while one or more versions are in active development. CVS provides powerful features to help manage these parallel versions of the software. But it also means that a developer must be very aware of the specific version of the code she is working on. Having the makefile automatically check out source during a compilation raises the question of what source is being checked out and whether the newly checked out source is compatible with the source already existing in the developer's working directories. In many production environments, developers are working on three or more distinct versions of the same application in a single day. Keeping this complexity in check is hard enough without having software quietly updating your source tree for you.

Also, one of the more powerful features of CVS is that it allows access to remote repositories. In most production environments, the CVS repository (the database of controlled files) is not located on the developer's own machine, but on a server. Although network access is now quite fast (particularly on a local area network) it is not a good idea to have make probing the network server in search of source files. The performance impact would be disastrous.

So, although it is possible to use the built-in implicit rules to interface more or less cleanly with RCS and SCCS, there are no rules to access CVS for gathering source files or makefiles. Nor do I think it makes much sense to add them. On the other hand, it is quite reasonable to use CVS in makefiles: for instance, to ensure that the current source is properly checked in, that the release number information is managed properly, or that test results are correct. These are uses of CVS by makefile authors rather than issues of CVS integration with make.

2.5.4 A Simple Help Command

Large makefiles can have many targets that are difficult for users to remember. One way to reduce this problem is to make the default target a brief help command. However, maintaining the help text by hand is always a problem. To avoid this, you can gather the available commands directly from make's rules database. The following target will present a sorted four-column listing of the available make targets:

 # help - The default goal
 .PHONY: help
 help:
         $(MAKE) --print-data-base --question |                \
         $(AWK) '/^[^.%][-A-Za-z0-9_]*:/                       \
                { print substr($$1, 1, length($$1)-1) }' |     \
         $(SORT) |                                             \
         $(PR) --omit-pagination --width=80 --columns=4

The command script consists of a single pipeline. The make rule database is dumped using the --print-data-base option. Using the --question option prevents make from running any actual commands. The database is then passed through a simple awk filter that grabs every line representing a target that does not begin with percent or period (pattern rules and suffix rules, respectively) and discards extra information on the line. Finally, the target list is sorted and printed in a simple four-column listing.
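The help target refers to AWK, SORT, and PR, none of which is among make's predefined variables. A minimal sketch of the definitions it needs, assuming the standard tools are on the PATH and that pr is the GNU version (the long options passed to pr are GNU extensions):

```make
# Assumed tool names; adjust for your system.
AWK  := awk
SORT := sort
PR   := pr
```

Placing these near the top of the makefile keeps the tool choices in one spot, so a user on a system with differently named tools can override them from the command line.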

Another approach to the same command (my first attempt) used the awk command on the makefile itself. This required special handling for included makefiles (covered in Section 3.7.1 in Chapter 3) and could not handle generated rules at all. The version presented here handles all that automatically by allowing make to process these elements and report the resulting rule set.