5.6 Command-Line Limits | Managing Projects with GNU Make (Nutshell Handbooks)

When working with large projects, you occasionally bump up against limitations in the length of commands make tries to execute. Command-line limits vary widely with the operating system. Red Hat 9 GNU/Linux appears to have a limit of about 128K characters, while Windows XP has a limit of 32K. The error message generated also varies. On Windows using the Cygwin port, the message is:

 C:\usr\cygwin\bin\bash: /usr/bin/ls: Invalid argument

when ls is given too long an argument list. On Red Hat 9 the message is:

 /bin/ls: argument list too long

Even 32K sounds like a lot of data for a command line, but when your project contains 3,000 files in 100 subdirectories and you want to manipulate them all, this limit can be constraining.
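To see where a particular machine stands, the limit can be queried rather than guessed. The following is a minimal sketch, not part of the original text: getconf ARG_MAX is the standard POSIX query for the total space allowed for arguments and environment, while the helper names arg_max and args_size are made up for illustration. The size estimate counts only the argument bytes, so it understates the real cost.

```shell
# Report the system limit, in bytes, on the combined size of the
# arguments and environment passed to a new program.
arg_max() {
    getconf ARG_MAX
}

# Rough size in bytes of an argument list: each argument plus its
# terminating NUL. Pointer overhead and the environment are not counted.
args_size() {
    printf '%s\0' "$@" | wc -c | tr -d ' '
}
```

Because args_size is a shell function, calling it with a very long glob does not itself start a new program, so it can be compared against arg_max before deciding whether a single command line is safe.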

There are two basic ways to get yourself into this mess: expand some basic value using shell tools, or use make itself to set a variable to a very long value. For example, suppose we want to compile all our source files in a single command line:

 compile_all:
         $(JAVAC) $(wildcard $(addsuffix /*.java,$(source_dirs)))

The make variable source_dirs may contain only a couple hundred words, but after appending the wildcard for Java files and expanding it using wildcard, this list can easily exceed the command-line limit of the system. By the way, make has no built-in limits to constrain us. So long as there is virtual memory available, make will allow any amount of data you care to create.

When you find yourself in this situation, it can feel like the old Adventure game, "You are in a twisty maze of passages all alike." For instance, you might try to solve the above using xargs, since xargs will manage long command lines by parceling out arguments up to the system-specific length:

 compile_all:
         echo $(wildcard $(addsuffix /*.java,$(source_dirs))) | \
         xargs $(JAVAC)

Unfortunately, we've just moved the command-line limit problem from the javac command line to the echo command line. Similarly, we cannot use echo or printf to write the data to a file (assuming the compiler can read the file list from a file), because that command line would be just as long.

No, the way to handle this situation is to avoid creating the file list all at once in the first place. Instead, use the shell to glob one directory at a time:

 compile_all:
         for d in $(source_dirs); \
         do                       \
             $(JAVAC) $$d/*.java; \
         done

We could also pipe the file list to xargs to perform the task with fewer executions:

 compile_all:
         for d in $(source_dirs); \
         do                       \
             echo $$d/*.java;     \
         done |                   \
         xargs $(JAVAC)

Sadly, neither of these command scripts handles errors during compilation properly. A better approach would be to save the full file list and feed it to the compiler, if the compiler supports reading its arguments from a file. Java compilers support this feature:

 compile_all: $(FILE_LIST)
         $(JAVAC) @$<

 .INTERMEDIATE: $(FILE_LIST)
 $(FILE_LIST):
         for d in $(source_dirs); \
         do                       \
             echo $$d/*.java;     \
         done > $@

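Notice the subtle error in the for loop. If any of the directories does not contain a Java file, the shell leaves the pattern unexpanded, so a literal string such as dir/*.java will be included in the file list and the Java compiler will generate a "File not found" error. We can make bash collapse empty globbing patterns by setting the nullglob option. Because shopt is a bash builtin, make must be running its commands with bash for this to work.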

 compile_all: $(FILE_LIST)
         $(JAVAC) @$<

 .INTERMEDIATE: $(FILE_LIST)
 $(FILE_LIST):
         shopt -s nullglob;       \
         for d in $(source_dirs); \
         do                       \
             echo $$d/*.java;     \
         done > $@
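When the command shell is a plain POSIX sh rather than bash, nullglob is not available. A portable alternative, sketched here as an assumption rather than as the book's method, is to test each match for existence and skip the unexpanded pattern. The function name list_java is hypothetical, and the code is written as an ordinary shell function; inside a makefile each $ would be doubled.

```shell
# list_java DIR...: print each existing DIR/*.java path on its own line.
list_java() {
    for d in "$@"; do
        for f in "$d"/*.java; do
            # With no match, the shell leaves the pattern as the literal
            # string "DIR/*.java"; skip it instead of passing it on.
            [ -e "$f" ] || continue
            printf '%s\n' "$f"
        done
    done
}
```

Printing one path per line also keeps a filename with embedded spaces on a single line, which the echo in the loop above does not distinguish from two separate names.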

Many projects have to make lists of files. Here is a macro containing a bash script producing file lists. The first argument is the root directory to change to. All the files in the list will be relative to this root directory. The second argument is a list of directories to search for matching files. The third and fourth arguments are optional and represent file suffixes.

 # $(call collect-names, root-dir, dir-list, suffix1-opt, suffix2-opt)
 define collect-names
   echo Making $@ from directory list...
   cd $1;                                                    \
   shopt -s nullglob;                                        \
   for f in $(foreach file,$2,'$(file)'); do                 \
     files=( $$f$(if $3,/*.{$3$(if $4,$(comma)$4)}) );       \
     if (( $${#files[@]} > 0 ));                             \
     then                                                    \
       printf '"%s"\n' "$${files[@]}";                       \
     else :; fi;                                             \
   done
 endef

Here is a pattern rule for creating a list of image files:

 %.images:
         @$(call collect-names,$(SOURCE_DIR),$^,gif,jpeg) > $@

The macro execution is hidden because the script is long and there is seldom a reason to cut and paste this code. The directory list is provided in the prerequisites. After changing to the root directory, the script enables null globbing. The rest is a for loop to process each directory we want to search. The file search expression is a list of words passed in parameter $2. The script protects words in the file list with single quotes because they may contain shell-special characters. In particular, filenames in languages like Java can contain dollar signs:

 for f in $(foreach file,$2,'$(file)'); do

We search a directory by filling the files array with the result of globbing. If the files array contains any elements, we use printf to write each word followed by a newline. Using the array allows the macro to properly handle paths with embedded spaces. This is also the reason printf surrounds the filename with double quotes.

The file list is produced with the line:

 files=( $$f$(if $3,/*.{$3$(if $4,$(comma)$4)}) );

The $$f is the directory or file argument to the macro. The following expression is a make if testing whether the third argument is nonempty. This is how you can implement optional arguments. If the third argument is empty, it is assumed the fourth is as well. In this case, the file passed by the user should be included in the file list as is. This allows the macro to build lists of arbitrary files for which wildcard patterns are inappropriate. If the third argument is provided, the if appends /*.{$3} to the root file. If the fourth argument is provided, it appends ,$4 after the $3. Notice the subterfuge we must use to insert a comma into the wildcard pattern. By placing a comma in a make variable we can sneak it past the parser; otherwise, the comma would be interpreted as separating the then part from the else part of the if. The definition of comma is straightforward:

 comma := ,
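To make the expansion concrete, here is roughly the script bash ends up running for the image-list rule once make has substituted the arguments, written as a standalone function. This is an illustrative sketch: the function name list_images and its calling convention are hypothetical, and the suffixes gif and jpeg are hard-coded just as they would be in the expanded text.

```shell
#!/usr/bin/env bash
# list_images ROOT DIR...: print the quoted .gif and .jpeg paths found in
# each DIR, relative to ROOT, one per line.
list_images() {
    local root=$1
    shift
    (
        # The subshell keeps the cd and the nullglob setting local.
        cd "$root" || exit 1
        shopt -s nullglob
        for f in "$@"; do
            # Brace expansion yields two patterns: $f/*.gif and $f/*.jpeg.
            files=( "$f"/*.{gif,jpeg} )
            if (( ${#files[@]} > 0 )); then
                # Quoting the array keeps names with spaces in one piece.
                printf '"%s"\n' "${files[@]}"
            fi
        done
    )
}
```

One caution when adapting this: bash performs brace expansion only when the braces contain a comma or a sequence expression, so a pattern such as *.{gif}, with a single suffix, is left with its braces intact and is unlikely to match anything.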

All the preceding for loops also suffer from the command-line length limit, since they use wildcard expansion. The difference is that the wildcard is expanded with the contents of a single directory, which is far less likely to exceed the limits.

What do we do if a make variable contains our long file list? Well, then we are in real trouble. There are only two ways I've found to pass a very long make variable to a subshell. The first approach is to pass only a subset of the variable contents to any one subshell invocation by filtering the contents.

 compile_all:
         $(JAVAC) $(wordlist 1, 499, $(all-source-files))
         $(JAVAC) $(wordlist 500, 999, $(all-source-files))
         $(JAVAC) $(wordlist 1000, 1499, $(all-source-files))

The filter function can be used as well, but that can be more uncertain since the number of files selected will depend on the distribution within the pattern space chosen. Here we choose a pattern based on the alphabet:

 compile_all:
         $(JAVAC) $(filter a%, $(all-source-files))
         $(JAVAC) $(filter b%, $(all-source-files))

Other patterns might use special characteristics of the filenames themselves.

Notice that it is difficult to automate this further. We could try to wrap the alphabet approach in a foreach loop:

 compile_all:
         $(foreach l,a b c d e ...,                        \
           $(if $(filter $l%, $(all-source-files)),        \
             $(JAVAC) $(filter $l%, $(all-source-files));))

but this doesn't work. make expands this into a single line of text, thus compounding the line-length problem. We can instead use eval:

 compile_all:
         $(foreach l,a b c d e ...,                 \
           $(if $(filter $l%, $(all-source-files)), \
             $(eval                                 \
               $(shell                              \
                 $(JAVAC) $(filter $l%, $(all-source-files));))))

This works because eval will execute the shell command immediately, expanding to nothing. So the foreach loop expands to nothing. The problem is that error reporting is meaningless in this context, so compilation errors will not be transmitted to make correctly.

The wordlist approach is worse. Due to make's limited numerical capabilities, there is no way to enclose the wordlist technique in a loop. In general, there are very few satisfying ways to deal with immense file lists.
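One partial workaround, offered here as an assumption rather than something from the original text: once a list exists in a file, one path per line (as with $(FILE_LIST) earlier), the fixed-size slicing that wordlist does by hand can be delegated to xargs -n. This does not solve the problem of getting a long make variable into that file in the first place. The function name run_in_chunks is hypothetical, and the sketch assumes paths without embedded whitespace or quotes.

```shell
# run_in_chunks SIZE LISTFILE CMD [ARG...]
# Run CMD repeatedly, each time with at most SIZE paths read from LISTFILE.
run_in_chunks() {
    size=$1
    list=$2
    shift 2
    # xargs exits non-zero if any invocation of CMD fails, so the
    # caller still sees the failure.
    xargs -n "$size" "$@" < "$list"
}
```

A call such as run_in_chunks 500 files.lst javac would play the role of the three wordlist lines, without the chunk boundaries having to be written out by hand.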