9.3 Compiling Java

     

Java can be compiled with make in two ways: the traditional approach, one javac execution per source file; or the fast approach outlined previously using the @filename syntax.

9.3.1 The Fast Approach: All-in-One Compile

Let's start with the fast approach. As you can see in the generic makefile:

 # all_javas - Temp file for holding source file list
 all_javas := $(OUTPUT_DIR)/all.javas

 # compile - Compile the source
 .PHONY: compile
 compile: $(all_javas)
         $(JAVAC) $(JFLAGS) @$<

 # all_javas - Gather source file list
 .INTERMEDIATE: $(all_javas)
 $(all_javas):
         $(FIND) $(SOURCE_DIR) -name '*.java' > $@

The phony target compile invokes javac once to compile all the source of the project.

The $(all_javas) prerequisite is a file, all.javas, containing a list of Java files, one filename per line. It is not necessary for each file to be on its own line, but this way it is much easier to filter files with grep -v if the need ever arises. The rule to create all.javas is marked .INTERMEDIATE so that make will remove the file after each run and thus create a new one before each compile. The command script to create the file is straightforward. For maximum maintainability we use the find command to retrieve all the Java files in the source tree. This command can be a bit slow, but is guaranteed to work correctly with virtually no modification as the source tree changes.
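As a minimal sketch of the grep -v filtering mentioned above, the rule could be extended as follows. The variable values and the 'generated' directory name are assumptions made for illustration; they are not part of the generic makefile.

```makefile
# Placeholder settings; the variable names come from the rule above,
# but these values are assumptions.
FIND       ?= find
SOURCE_DIR ?= src
OUTPUT_DIR ?= classes
all_javas  := $(OUTPUT_DIR)/all.javas

# Gather the source list, dropping anything under a hypothetical
# 'generated' directory. Because there is one filename per line,
# grep -v can remove unwanted entries cleanly.
# Note: grep -v exits nonzero if it prints no lines at all.
.INTERMEDIATE: $(all_javas)
$(all_javas):
	@mkdir -p $(@D)
	$(FIND) $(SOURCE_DIR) -name '*.java' | grep -v '/generated/' > $@
```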

If you have a list of source directories readily available in the makefile, you can use faster command scripts to build all.javas. If the list of source directories is of medium length so that the length of the command line does not exceed the operating system's limits, this simple script will do:

 $(all_javas):
         shopt -s nullglob; \
         printf "%s\n" $(addsuffix /*.java,$(PACKAGE_DIRS)) > $@

This script uses shell wildcards to determine the list of Java files in each directory. If, however, a directory contains no Java files, we want the wildcard to yield the empty string, not the original globbing pattern (the default behavior of many shells). To achieve this effect, we use the bash option shopt -s nullglob. Most other shells have similar options. Finally, we use globbing and printf rather than ls -1 because these are built into bash, so our command script executes only a single program regardless of the number of package directories.
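One practical detail: GNU make runs command scripts with /bin/sh by default, and shopt is a bash builtin, so the makefile has to select bash explicitly. The sketch below shows this. The location of bash and the directory names are assumptions.

```makefile
# shopt is a bash builtin, so the command script must run under bash.
# Assumption: bash is installed at /bin/bash.
SHELL := /bin/bash

# Hypothetical package directories; src/com/empty stands for a
# directory that contains no Java files.
PACKAGE_DIRS := src/com/example src/com/empty
all_javas    := all.javas

# With nullglob set, the pattern for src/com/empty expands to nothing
# instead of the literal string src/com/empty/*.java.
$(all_javas):
	shopt -s nullglob; \
	printf "%s\n" $(addsuffix /*.java,$(PACKAGE_DIRS)) > $@
```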

Alternatively, we can avoid shell globbing by using make's wildcard function:

 $(all_javas):
         printf "%s\n" $(wildcard \
                        $(addsuffix /*.java,$(PACKAGE_DIRS))) > $@

If you have very many source directories (or very long paths), the above script may exceed the command-line length limit of the operating system. In that case, the following script may be preferable:

 .INTERMEDIATE: $(all_javas)
 $(all_javas):
         shopt -s nullglob;            \
         for f in $(PACKAGE_DIRS);     \
         do                            \
           printf "%s\n" $$f/*.java;   \
         done > $@

Notice that the compile target and the supporting rule follow the nonrecursive make approach. No matter how many subdirectories there are, we still have one makefile and one execution of the compiler. If you want to compile all of the source, this is as fast as it gets.

Also, we completely discarded all dependency information. With these rules, make neither knows nor cares about which file is newer than which. It simply compiles everything on every invocation. As an added benefit, we can execute the makefile from the source tree, instead of the binary tree. This may seem like a silly way to organize the makefile considering make's abilities to manage dependencies, but consider this:

  • The alternative (which we will explore shortly) uses the standard dependency approach. This invokes a new javac process for each file, adding a lot of overhead. But, if the project is small, compiling all the source files will not take significantly longer than compiling a few files because the javac compiler is so fast and process creation is typically slow. Any build that takes less than 15 seconds is basically equivalent regardless of how much work it does. For instance, compiling approximately 500 source files (from the Ant distribution) takes 14 seconds on my 1.8-GHz Pentium 4 with 512 MB of RAM. Compiling one file takes five seconds.

  • Most developers will be using some kind of development environment that provides fast compilation for individual files. The makefile will most likely be used when changes are more extensive, complete rebuilds are required, or unattended builds are necessary.

  • As we shall see, the effort involved in implementing and maintaining dependencies is equal to the separate source and binary tree builds for C/C++ (described in Chapter 8). Not a task to be underestimated.

As we will see in later examples, the PACKAGE_DIRS variable has uses other than simply building the all.javas file. But maintaining this variable can be a labor-intensive and potentially difficult step. For smaller projects, the list of directories can be maintained by hand in the makefile, but when the number grows beyond a hundred directories, hand editing becomes error-prone and irksome. At this point, it might be prudent to use find to scan for these directories:

 # $(call find-compilation-dirs, root-directory)
 find-compilation-dirs =                       \
   $(patsubst %/,%,                            \
     $(sort                                    \
       $(dir                                   \
         $(shell $(FIND) $1 -name '*.java'))))

 PACKAGE_DIRS := $(call find-compilation-dirs, $(SOURCE_DIR))

The find command returns a list of files, dir discards the file leaving only the directory, sort removes duplicates from the list, and patsubst strips the trailing slash. Notice that find-compilation-dirs finds the list of files to compile, only to discard the filenames, then the all.javas rule uses wildcards to restore the filenames. This seems wasteful, but I have often found that a list of the packages containing source code is very useful in other parts of the build, for instance to scan for EJB configuration files. If your situation does not require a list of packages, then by all means use one of the simpler methods previously mentioned to build all.javas.
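To see what each function contributes, the same pipeline can be applied to a fixed list instead of the output of find. This is only a sketch: the file names and the java_files and package_dirs variables are made up for illustration.

```makefile
# Hypothetical file list standing in for the output of find.
java_files := src/com/a/Foo.java src/com/a/Bar.java src/com/b/Baz.java

# dir keeps the directory part (with a trailing slash),
# sort removes the duplicate src/com/a/ entry,
# patsubst strips the trailing slash from each word.
package_dirs := $(patsubst %/,%,$(sort $(dir $(java_files))))

# Prints: src/com/a src/com/b
$(info $(package_dirs))

# No-op target so that make has something to build.
.PHONY: all
all: ;
```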

9.3.2 Compiling with Dependencies

To compile with full dependency checking, you first need a tool to extract dependency information from the Java source files, something similar to cc -M. Jikes (http://www.ibm.com/developerworks/opensource/jikes) is an open source Java compiler that supports this feature with the -makefile or +M option. Jikes is not ideal for separate source and binary compilation because it always writes the dependency file in the same directory as the source file, but it is freely available and it works. On the plus side, it generates the dependency file while compiling, avoiding a separate pass.

Here is a dependency processing function and a rule to use it:

 %.class: %.java
         $(JAVAC) $(JFLAGS) +M $<
         $(call java-process-depend,$<,$@)

 # $(call java-process-depend, source-file, object-file)
 define java-process-depend
   $(SED) -e 's/^.*\.class *:/$2 $(subst .class,.d,$2):/'   \
          $(subst .java,.u,$1) > $(subst .class,.tmp,$2)
   $(SED) -e 's/#.*//'                                      \
          -e 's/^[^:]*: *//'                                \
          -e 's/ *\\$$$$//'                                 \
          -e '/^$$$$/ d'                                    \
          -e 's/$$$$/ :/' $(subst .class,.tmp,$2)           \
          >> $(subst .class,.tmp,$2)
   $(MV) $(subst .class,.tmp,$2) $(subst .class,.d,$2)
 endef

This requires that the makefile be executed from the binary tree and that the vpath be set to find the source. If you want to use the Jikes compiler only for dependency generation, resorting to a different compiler for actual code generation, you can use the +B option to prevent Jikes from generating bytecodes.
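A minimal sketch of the supporting setup might look like the following. The ../src location is an assumption, and the wildcard only picks up dependency files in the current directory; a real project with packages would need a list that covers the subdirectories as well.

```makefile
# Assumption: make runs in the binary tree and the sources live in ../src.
SOURCE_DIR ?= ../src

# Let the %.class: %.java pattern rule find its prerequisite
# in the source tree.
vpath %.java $(SOURCE_DIR)

# Read the dependency files left behind by earlier builds.
# The leading '-' keeps make quiet about files that do not exist yet.
-include $(wildcard *.d)
```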

In a simple timing test compiling 223 Java files, the single line compile described previously as the fast approach required 9.9 seconds on my machine. The same 223 files compiled with individual compilation lines required 411.6 seconds or 41.5 times longer. Furthermore, with separate compilation, any build that required compiling more than four files was slower than compiling all the source files with a single compile line. If the dependency generation and compilation were performed by separate programs, the discrepancy would increase.

Of course, development environments vary, but it is important to carefully consider your goals. Minimizing the number of files compiled will not always minimize the time it takes to build a system. For Java in particular, full dependency checking and minimizing the number of files compiled do not appear to be necessary for normal program development.

9.3.3 Setting CLASSPATH

One of the most important issues when developing software with Java is setting the CLASSPATH variable correctly. This variable determines which code is loaded when a class reference is resolved. To compile a Java application correctly, the makefile must include the proper CLASSPATH. The CLASSPATH can quickly become long and complex as Java packages, APIs, and support tools are added to a system. Because the CLASSPATH can be difficult to set properly, it makes sense to set it in one place.

A technique I've found useful is to use the makefile to set the CLASSPATH for itself and other programs. For instance, a target classpath can return the CLASSPATH to the shell invoking the makefile:

 .PHONY: classpath
 classpath:
         @echo "export CLASSPATH='$(CLASSPATH)'"

Developers can set their CLASSPATH with this (if they use bash):

 $ eval $(make classpath) 

The CLASSPATH in the Windows environment can be set with this invocation:

 .PHONY: windows_classpath
 windows_classpath:
         regtool set /user/Environment/CLASSPATH "$(subst /,\,$(CLASSPATH))"
         control sysdm.cpl,@1,3 &
         @echo "Now click Environment Variables, then OK, then OK again."

The program regtool is a utility in the Cygwin development system that manipulates the Windows Registry. Simply setting the Registry doesn't cause the new values to be read by Windows, however. One way to do this is to visit the Environment Variable dialog box and simply exit by clicking OK.

The second line of the command script causes Windows to display the System Properties dialog box with the Advanced tab active. Unfortunately, the command cannot display the Environment Variables dialog box or activate the OK button, so the last line prompts the user to complete the task.

Exporting the CLASSPATH to other programs, such as Emacs JDEE or JBuilder project files, is not difficult.

Setting the CLASSPATH itself can also be managed by make. It is certainly reasonable to set the CLASSPATH variable in the obvious way with:

 CLASSPATH = /third_party/toplink-2.5/TopLink.jar:/third_party/... 

For maintainability, using variables is preferred:

 CLASSPATH = $(TOPLINK_25_JAR):$(TOPLINKX_25_JAR):... 

But we can do better than this. As you can see in the generic makefile, we can build the CLASSPATH in two stages: first list the elements in the path as make variables, then transform those variables into the string value of the environment variable:

 # Set the Java classpath
 class_path := OUTPUT_DIR                \
               XERCES_JAR                \
               COMMONS_LOGGING_JAR       \
               LOG4J_JAR                 \
               JUNIT_JAR
 ...
 # Set the CLASSPATH
 export CLASSPATH := $(call build-classpath, $(class_path))

(The CLASSPATH in Example 9-1 is meant to be more illustrative than useful.) A well-written build-classpath function solves several irritating problems:

  • It is very easy to compose a CLASSPATH in pieces. For instance, if different application servers are used, the CLASSPATH might need to change. The different versions of the CLASSPATH could then be enclosed in ifdef sections and selected by setting a make variable.

  • Casual maintainers of the makefile do not have to worry about embedded blanks, newlines, or line continuation, because the build-classpath function handles them.

  • The path separator can be selected automatically by the build-classpath function. Thus, it is correct whether run on Unix or Windows.

  • The validity of path elements can be verified by the build-classpath function. In particular, one irritating problem with make is that undefined variables collapse to the empty string without an error. In most cases this is very useful, but occasionally it gets in the way. In this case, it quietly yields a bogus value for the CLASSPATH variable. [1] We can solve this problem by having the build-classpath function check for the empty valued elements and warn us. The function can also check that each file or directory exists.

    [1] We could try using the --warn-undefined-variables option to identify this situation, but this also flags many other empty variables that are desirable.

  • Finally, having a hook to process the CLASSPATH can be useful for more advanced features, such as accommodating embedded spaces in path names and search paths.
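The first point, composing the path in pieces, can be sketched as follows. All of the names here (APP_SERVER_A, SERVER_A_JAR, SERVER_B_JAR) are hypothetical and do not come from Example 9-1.

```makefile
# Elements common to every configuration.
class_path := OUTPUT_DIR

# Select the server-specific jar by defining APP_SERVER_A,
# for example on the command line: make APP_SERVER_A=1
ifdef APP_SERVER_A
  class_path += SERVER_A_JAR
else
  class_path += SERVER_B_JAR
endif

# The list holds variable names, ready to be passed to build-classpath.
$(info class_path is: $(class_path))

# No-op target so that make has something to build.
.PHONY: all
all: ;
```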

Here is an implementation of build-classpath that handles the first three issues:

 # $(call build-classpath, variable-list)
 define build-classpath
 $(strip                                          \
   $(patsubst %:,%,                               \
     $(subst : ,:,                                \
       $(strip                                    \
         $(foreach c,$1,$(call get-file,$c):)))))
 endef

 # $(call get-file, variable-name)
 define get-file
   $(strip                                       \
     $($1)                                       \
     $(if $(call file-exists-eval,$1),,          \
       $(warning The file referenced by variable \
                 '$1' ($($1)) cannot be found)))
 endef

 # $(call file-exists-eval, variable-name)
 define file-exists-eval
   $(strip                                        \
     $(if $($1),,$(warning '$1' has no value))    \
     $(wildcard $($1)))
 endef

The build-classpath function iterates through the words in its argument, verifying each element and concatenating them with the path separator (: in this case). Selecting the path separator automatically is easy now. The function then strips spaces added by the get-file function and foreach loop. Next, it strips the final separator added by the foreach loop. Finally, the whole thing is wrapped in a strip so errant spaces introduced by line continuation are removed.
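One way the separator could be chosen automatically is sketched below. It assumes that Windows (including Cygwin) sets OS=Windows_NT in the environment. The join-classpath helper, PATH_SEP, and the path values are illustrative names, not part of the book's makefile, and the helper joins words with subst rather than reusing the foreach-and-patsubst technique of build-classpath, so it does no validation and breaks on embedded spaces.

```makefile
# Assumption: Windows sets the environment variable OS to Windows_NT.
ifeq ($(OS),Windows_NT)
  PATH_SEP := ;
else
  PATH_SEP := :
endif

# A variable holding a single space, for use with subst.
empty :=
space := $(empty) $(empty)

# Hypothetical path elements for illustration.
OUTPUT_DIR := classes
JUNIT_JAR  := lib/junit.jar
class_path := OUTPUT_DIR JUNIT_JAR

# $(call join-classpath, variable-list)
# Expand each variable name, then replace the spaces between the
# resulting words with the platform's separator.
join-classpath = $(subst $(space),$(PATH_SEP),$(strip $(foreach c,$1,$($c))))

export CLASSPATH := $(call join-classpath,$(class_path))

.PHONY: show
show:
	@echo '$(CLASSPATH)'
```

On a Unix system, running make show with these values prints classes:lib/junit.jar.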

The get-file function returns the value of the variable named by its argument, then tests whether that value refers to an existing file. If it does not, it generates a warning. It returns the value of the variable regardless of the existence of the file because the value may be useful to the caller. On occasion, get-file may be used with a file that will be generated, but does not yet exist.

The last function, file-exists-eval, accepts a variable name containing a file reference. If the variable is empty, a warning is issued; otherwise, the wildcard function is used to resolve the value into a file (or a list of files for that matter).

When the build-classpath function is used with some suitable bogus values, we see these warnings:

 Makefile:37: The file referenced by variable 'TOPLINKX_25_JAR'
              (/usr/java/toplink-2.5/TopLinkX.jar) cannot be found
 ...
 Makefile:37: 'XERCES_142_JAR' has no value
 Makefile:37: The file referenced by variable
              'XERCES_142_JAR' ( ) cannot be found

This represents a great improvement over the silence we would get from the simple approach.

The existence of the get-file function suggests that we could generalize the search for input files.

 # $(call get-jar, variable-name)
 define get-jar
   $(strip                                                     \
     $(if $($1),,$(warning '$1' is empty))                     \
     $(if $(JAR_PATH),,$(warning JAR_PATH is empty))           \
     $(foreach d, $(dir $($1)) $(JAR_PATH),                    \
       $(if $(wildcard $d/$(notdir $($1))),                    \
         $(if $(get-jar-return),,                              \
           $(eval get-jar-return := $d/$(notdir $($1))))))     \
     $(if $(get-jar-return),                                   \
       $(get-jar-return)                                       \
       $(eval get-jar-return :=),                              \
       $($1)                                                   \
       $(warning get-jar: File not found '$1' in $(JAR_PATH))))
 endef

Here we define the variable JAR_PATH to contain a search path for files. The first file found is returned. The parameter to the function is a variable name containing the path to a jar. We want to look for the jar file first in the path given by the variable, then in the JAR_PATH. To accomplish this, the directory list in the foreach loop is composed of the directory from the variable, followed by the JAR_PATH. The two other uses of the parameter are enclosed in notdir calls so the jar name can be composed from a path from this list. Notice that we cannot exit from a foreach loop. Instead, we use eval to set a variable, get-jar-return, to remember the first file we found. After the loop, we return the value of our temporary variable or issue a warning if nothing was found. We must remember to reset our return value variable before terminating the macro.
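For comparison, the same search order can be approximated without eval by letting wildcard test every candidate and firstword pick the winner. This is a different technique from get-jar, offered only as a sketch: find-jar is a hypothetical name, the paths are placeholders, and unlike get-jar it issues no warnings and returns the empty string rather than the original value when nothing is found.

```makefile
# Hypothetical search path and jar variable, for illustration only.
JAR_PATH  := /usr/share/java /opt/java/lib
JUNIT_JAR := lib/junit.jar

# $(call find-jar, variable-name)
# Build one candidate per directory (the jar's own directory first,
# then each JAR_PATH entry), keep the candidates that exist, and
# return the first one.
find-jar = $(firstword                                   \
             $(wildcard                                  \
               $(addsuffix /$(notdir $($1)),             \
                 $(patsubst %/,%,$(dir $($1))) $(JAR_PATH))))

$(info JUNIT_JAR resolved to: '$(call find-jar,JUNIT_JAR)')

# No-op target so that make has something to build.
.PHONY: all
all: ;
```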

This is essentially reimplementing the vpath feature in the context of setting the CLASSPATH. To understand this, recall that the vpath is a search path used implicitly by make to find prerequisites that cannot be found from the current directory by a relative path. In these cases, make searches the vpath for the prerequisite file and inserts the completed path into the $^, $?, and $+ automatic variables. To set the CLASSPATH, we want make to search a path for each jar file and insert the completed path into the CLASSPATH variable. Since make has no built-in support for this, we've added our own. You could, of course, simply expand the jar path variable with the appropriate jar filenames and let Java do the searching, but CLASSPATHs already get long quickly. On some operating systems, environment variable space is limited and long CLASSPATHs are in danger of being truncated. On Windows XP, there is a limit of 1023 characters for a single environment variable. In addition, even if the CLASSPATH is not truncated, the Java virtual machine must search the CLASSPATH when loading classes, thus slowing down the application.



Managing Projects with GNU make
ISBN: 0596006101
Year: 2003
Pages: 131
