3.4. Automation Environments

An automation environment is a tool that performs a variety of automated tasks for you. An automated task is anything that has been modified so that you can run it, usually from a command line, without any manual interaction. An automation environment makes it easier to run each of these tasks in the correct order and at chosen intervals.

An automation environment is a key part of the XP concept of continuous integration (CI), which involves keeping a product integrated and working correctly all the time. For that reason, automation environments are sometimes referred to as "CI frameworks." If you find the phrase "automation environments" too unwieldy, then perhaps EFA for "environments for automation" might do.

Many different tasks are commonly controlled by automation environments. (Exactly what some of these tasks involve is explained in more detail in subsequent chapters.) Some of these tasks are:
This list of tasks is quite extensive and varies widely in practice. A good automation environment is able to integrate with all of the tools in your development environment, whether by CLI, email, or custom scripts (see Section 3.2, earlier in this chapter).

Some of your individual tools may be able to perform one or more of the tasks listed above by themselves, without the need for an automation environment. For instance, a build tool may be able to create release packages for your product, or you may want to describe your project's structure using a tool such as Maven (see Section 5.5.4). Great! Call those tools from the automation environment and have them do the work for you.

Some automation environments will allow you to define quite complex dependencies between projects; for instance, so that project X will build only after projects Y and Z have been successfully built and tested.

How the information in the reports generated by an automation environment is provided to people makes a critical difference in how useful the information is. To avoid annoying everyone in the project, the default policy should be that only the transition from success to failure of build and test results generates notifications, since developers quickly learn to ignore repeated emails. A good tool will let you specify the notification policy for different products and cases. Notification should be as rapid as possible, and the tool should support sending notifications by as many methods as possible. Email messages are still the most common kind of notification, but pagers, SMS messages, RSS feeds, and even X10 connections to control red and green lava lamps have their time and place.

For each series of builds, the generated reports should include a short name for the builds (e.g., "the Windows server build"), a build label (see Section 3.5, later in this chapter), and platform details such as processor type, operating system, and version.
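The default notification policy described above, in which only a transition from success to failure triggers a message, can be sketched in a few lines. The following Python is a minimal illustration under stated assumptions, not the behavior of any particular tool: the function names and the status strings "success" and "failure" are hypothetical.

```python
from typing import Iterable, Iterator, Optional, Tuple


def should_notify(previous: Optional[str], current: str) -> bool:
    """Return True only when a build goes from success to failure."""
    return previous == "success" and current == "failure"


def notifications(results: Iterable[str]) -> Iterator[Tuple[int, str]]:
    """Yield (build index, status) for each result that warrants a message."""
    previous = None  # no earlier build is known before the first result
    for index, status in enumerate(results):
        if should_notify(previous, status):
            yield index, status
        previous = status


# Hypothetical series of build results, oldest first.
history = ["success", "failure", "failure", "success", "failure"]
print(list(notifications(history)))  # [(1, 'failure'), (4, 'failure')]
```

Note that the repeated failure at index 2 produces no message, which is the point of the policy. Under this strict reading, a failure on the very first build also produces no message; a real tool might reasonably choose to treat that case differently.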
The reports should be available in multiple formats, including summaries and text-based versions. One common format for reports is a "waterfall," which has columns of builds with the most recent builds at the top, as shown in Figure 3-1. Build and test failures need to be shown in ways that help people identify their causes, so links to errors in logs and source files are good to have. Reports can also list all recent changes to the source code and who made the changes; this is sometimes known as the blame list. Displaying estimates of how much time each build will take, also called the ETA (estimated time of arrival), is a nice touch.

Figure 3-1. An automated build report

Another measure of how easy a particular automation environment is to use is how easy it is to extend. Your project might decide to use a different SCM tool or want to use make instead of Ant, so your automation environment should either support a wide range of tools already or have a clear and openly available API for adding support for new tools.
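The ETA mentioned above can be estimated from the durations of earlier builds. The sketch below is one simple approach, assuming that durations are recorded in seconds and that the median of the last few builds is a reasonable guess for the next one; the function name, the window size, and the sample numbers are all hypothetical.

```python
from statistics import median
from typing import Optional, Sequence


def estimate_remaining(past_durations: Sequence[float],
                       elapsed: float,
                       window: int = 5) -> Optional[float]:
    """Estimate the seconds remaining for a running build.

    Uses the median of the most recent `window` durations as the
    expected total, and never returns a negative value.
    """
    if not past_durations:
        return None  # no history yet, so no estimate is possible
    typical = median(past_durations[-window:])
    return max(0.0, typical - elapsed)


# Hypothetical history: two 40-minute builds and one 65-minute build.
print(estimate_remaining([2400, 2400, 3900], elapsed=600))
```

The median is used rather than the mean so that a single unusually slow build does not distort the estimate.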
Most automation environments store information about past tasks; sometimes this is in files and sometimes in a database. Either way, automation environments are also useful for monitoring the general historical health of your development environment. Other pieces of information that are useful to keep track of are:
Some useful tools for monitoring the health of a development environment include Orca (http://www.orcaware.com) and Argus (http://argus.tcp4me.com). Both of these tools can generate informative web pages, and Argus can be configured to send alerts about problems as they are detected. MRTG (http://people.ee.ethz.ch/~oetiker/webtools/mrtg) is another commonly used tool for monitoring the status of networks.

The rest of this section examines four automation environments. Two of these environments were inspired by Ant (the Java-based build tool discussed in Section 5.5.4) and are still most commonly used for building products written in Java. There are a number of other automation environments with future promise that are not discussed here, including DamageControl, Gump from the Apache Project, and BuildBot, a Python-based tool with good support for cross-platform builds. For a more extensive comparison of different automation environments, see the large but only partially complete comparison table at http://docs.codehaus.org/display/DAMAGECONTROL/Continuous+Integration+Server+Feature+Matrix.

3.4.1. Shell Scripts and Batch Files

The simplest way to automate a number of tasks is to execute them from within shell scripts or batch files. Such scripts can be fast to develop and they can be scheduled to be run regularly by the Unix tool cron or the Windows at command. However, there are good reasons to avoid using them as your entire automation environment.

Section 5.5.1 lists some of the disadvantages of shell scripts as build tools, and Section 6.4.1 discusses the drawbacks of using them as test environments. Many of those problems apply to automation environments as well. Things that are awkward with shell scripts or batch files include:
Running tasks regularly is easy to do using cron or the Windows at command (though how to make at run a task more than once a day is nonobvious). The biggest problem with both of these tools arises when some scheduled tasks start to make other tasks take longer. For instance, a build that used to take 40 minutes may sometimes take 65 minutes due to other scheduled tasks running at the same time. If running two builds at once on the same machine causes errors, then this problem may occur intermittently, appearing to break builds at random. Of course, one way to avoid this is to check at the start of a build whether another build is already running and, if so, to stop the current build.

If you do use cron to execute lots of tasks, one useful thing to do is to track how long each task takes so that you can see where tasks are overlapping. The crontab file shown in Example 3-1 does this.

Example 3-1. Measuring the duration of cron tasks

    # The logfile where the durations will be recorded
    DURATION_LOG=/tmp/crontab_task_durations.log

    # Set the format for the output from the 'time' command
    TIME="start:4 duration:%E name:myprogram"

    # This task is run at four minutes past every hour
    4 * * * * /usr/bin/time -a -o $DURATION_LOG myprogram

    # Change the format for the output from time to show
    # when the second task started
    TIME="start:8 duration:%E name:myprogram2"

    # This task is run at eight minutes past every hour
    8 * * * * /usr/bin/time -a -o $DURATION_LOG myprogram2

3.4.2. Tinderbox

One of the oldest public automation environments is the open source Tinderbox tool from the Mozilla organization (http://www.mozilla.org/tinderbox.html). Like other tools from Mozilla such as Bugzilla (see Section 7.2.3) and Bonsai, Tinderbox is written entirely in Perl, still contains some Mozilla-specific functionality, and is generally complicated to install. Tinderbox is not available as a single package for download.
Instead, you check the source code out from CVS by typing:

    cvs -d :pserver:anonymous@cvs-mirror.mozilla.org:/cvsroot login

Use the password anonymous when prompted and then type:

    cvs -d :pserver:anonymous@cvs-mirror.mozilla.org:/cvsroot co mozilla/webtools/tinderbox2

There have been no recent updates to Tinderbox. You can also browse the source code, which is reasonably well documented, at http://lxr.mozilla.org/mozilla/source/webtools/tinderbox2. Tinderbox2 is the current working version of Tinderbox.

Tinderbox uses formatted email messages from scripts running on the different build machines to update a central collection of HTML reports. You write the scripts to perform the particular task and schedule them as you wish, and then the scripts email the results to Tinderbox. In general, if you can configure the contents of email messages that are sent from a tool, then you can integrate that tool by using Tinderbox directly.

Tinderbox comes with a small set of states such as building, testing, and build_failed to describe your builds but allows you to add more states, change their names, or change the color used for each state in the HTML reports. Tinderbox integrates well with the other Mozilla tools, and you can arrange for build summaries to have URLs linking to browser-based views of your repository and change logs. Logfiles from builds are automatically parsed by Tinderbox, and links to errors are added to the reports. Since it uses scripts to execute tasks, Tinderbox can run make as easily as it can run Ant, and Tinderbox is suitable for large non-Java projects.

3.4.3. Anthill

Anthill (http://www.urbancode.com/projects/anthill) is a "build and release management tool" that runs as a Java servlet inside an application server such as Tomcat. An open source version is available at no cost, but there is also a commercial version named AnthillPro with more features that is available for $2,499 per year.
One of the strengths of Anthill is that it can be administered using a web-based interface, though the underlying configuration files are all in XML and so can also be edited with any other text editor. A wide range of SCM tools are supported by Anthill, including ClearCase, CVS, Perforce, Subversion, and Visual SourceSafe. The underlying build tool is usually Ant, but make is also supported. Multiple projects are supported, and multiple builds can occur at the same time.

3.4.4. CruiseControl

CruiseControl (http://cruisecontrol.sourceforge.net) is an open source "framework for a continuous build process." It was inspired directly by the XP concept of continuous integration and, being a Java-based tool, will run on most platforms. As of CruiseControl 2.0, multiple projects are supported. The different SCM tools supported by CruiseControl include ClearCase, CVS, Perforce, Subversion, and Visual SourceSafe. The underlying build tool is usually Ant, though the Ant exec task can be used to run other tools indirectly. There is also a project named CruiseControl.NET, which is a port of CruiseControl to the .NET platform.

Information about using CruiseControl is widely available. The best how-to document currently available about installing and configuring it is at http://www.javaranch.com/journal/200409/DrivingOnCruiseControl_Part1.html. There is also a book about project automation that uses CruiseControl as its core: Pragmatic Project Automation, by Mike Clark (Pragmatic Bookshelf). There are also two Wikis about CruiseControl, one at http://confluence.public.thoughtworks.org/display/CC/Home and another, more general one at http://c2.com/cgi/wiki?CruiseControl.