Section 11.1. An Introduction to Hooks | Subversion Version Control. Using The Subversion Version Control System in Development Projects

11.1. An Introduction to Hooks

Subversion can run scripts automatically on the server at various repository access points. These scripts can examine the data that flows into the repository and decide whether specific actions should be allowed, and they can trigger other side effects based on the data supplied (such as sending an e-mail).

Each hook script is a program that is invoked on a particular trigger, to perform the necessary processing for that hook. Normally, hook scripts are shell scripts, but that is not a restriction. Hook scripts can be written in any interpreted or compiled language (for the rest of this discussion, I will refer to all forms as scripts). The only requirement is that they be some sort of executable that the Subversion server can run. When run, each hook script can either perform all of the necessary processing internally, or it can call one or more external programs and use their output.

Hook scripts are specific to a repository, and are stored in a directory named hooks inside the directory that the svnadmin create command created to store the repository database. The trigger action for each script is defined by its name (in Windows, they should also end in .exe or .bat). The different trigger actions available are the following.

start-commit
pre-commit
post-commit
pre-revprop-change
post-revprop-change

When one of these actions occurs, Subversion invokes the appropriate script, and passes to it relevant information (which varies, depending on the action) via arguments. For all of the hook scripts except post-commit and post-revprop-change, if the script returns a non-zero value, Subversion will reject the data that was being processed (both post-actions occur after the processing, and can't cause an interruption).

11.1.1. Available Hook Scripts

Subversion supports several hook scripts, which are triggered on different events.

`start-commit`

The first thing Subversion does when it receives a repository commit is to invoke the start-commit script (if one exists), even before it creates a transaction for the commit. This gives the hook script a chance to examine the target repository, and the user making the commit, and make a decision on whether that user is authorized to access that repository. One possible use for this script would be to prevent denial of service attacks against the repository from huge unauthorized commits.

The arguments passed to start-commit are

The path to the repository used for this commit
The name of the user attempting to make the commit
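
As a minimal sketch of how a start-commit script might use these two arguments, the following script rejects commits from any user who is not on a short list. The ALLOWED_USERS list and the is_allowed function are names I have made up for this illustration; a real repository would more likely read its list from a file.

```shell
#!/bin/sh
# start-commit sketch: $1 = repository path, $2 = user name.
# ALLOWED_USERS and is_allowed are hypothetical, for illustration only.
ALLOWED_USERS="alice bob"

# Succeed (return 0) if user $1 appears in the space-separated list $2.
is_allowed() {
    for name in $2; do
        [ "$name" = "$1" ] && return 0
    done
    return 1
}

# Run the check only when Subversion supplies the hook arguments.
if [ "$#" -ge 2 ]; then
    if is_allowed "$2" "$ALLOWED_USERS"; then
        exit 0
    fi
    echo "User '$2' may not commit to $1" >&2
    exit 1
fi
```

Because the script exits with a non-zero value for an unlisted user, Subversion rejects the commit before any transaction is created.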

`pre-commit`

If a commit makes it past the start-commit script, Subversion creates a new transaction, which allows the repository to be returned to the state it was in before the commit attempt if the commit fails for any reason. After Subversion has created this transaction, it invokes pre-commit. The pre-commit script is able to examine the transaction and decide whether its data meets the requirements for a commit.

The arguments passed to pre-commit are

The path to the repository used for this commit
The name of the transaction for the commit

`post-commit`

This script is called after a commit has completed successfully. Because the commit has already happened, it has no power to affect the commit itself, but can perform logging or trigger other side effects.

The arguments passed to post-commit are

The path to the repository used for this commit
The revision number of the commit

`pre-revprop-change`

When a revision property is modified, Subversion invokes this script before actually performing the change. If you want to change revision properties, this script is required. If it isn't present, Subversion triggers a failure on every attempted revision property change. Because revision properties are unversioned, this script can be useful for disallowing dangerous (or otherwise undesired) revision property changes. It can also be useful for making a backup of a revision property before the change is actually performed, which can be very useful in avoiding accidental property data loss.

The arguments passed to pre-revprop-change are

The path to the repository used for this property change
The revision for which the property is being changed
The user attempting the property change
The name of the property being changed

Subversion does not pass the new property value itself as an argument. Instead, it supplies the value on the script's standard input stream (stdin).
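
To illustrate how the property-name argument can drive the decision, here is a minimal sketch that permits changes to the log message property (svn:log) and rejects everything else. Restricting changes to svn:log is a policy I have chosen for the example, and is_permitted_property is a hypothetical helper name. The sketch ignores the value on standard input.

```shell
#!/bin/sh
# pre-revprop-change sketch.
# $1 = repository path, $2 = revision, $3 = user, $4 = property name.
# The property value arrives on standard input (unused here).

# Succeed only for the log-message property, svn:log.
# (This restriction is a policy chosen for the example.)
is_permitted_property() {
    [ "$1" = "svn:log" ]
}

# Run the check only when Subversion supplies the hook arguments.
if [ "$#" -ge 4 ]; then
    if is_permitted_property "$4"; then
        exit 0
    fi
    echo "Changing revision property '$4' is not allowed" >&2
    exit 1
fi
```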

`post-revprop-change`

This script is called after a revision property change has successfully completed. Because the change has already occurred, the script can't affect the change itself. It can, however, be useful for logging or generating some other side effect.

The arguments passed to post-revprop-change are

The path to the repository used for this property change
The revision for which the property is being changed
The user attempting the property change
The name of the property being changed

11.1.2. What a Hook Script Can Do

Hook scripts have a lot of latitude in what they're allowed to do when they run. There are really only a few restrictions (which I discuss in the next section), and everything else is fair game. However, there are a few actions that you will find most of your scripts needing to do, which deserve a little extra attention.

Examining the Repository

In a very few cases, the information provided in the arguments to the hook script is sufficient for the script to perform all of its required processing. The rest of the time, though, the script needs to examine the repository to get the information it needs. There are no enforced restrictions on what tools the script is allowed to use for its examinations, but it is usually safest to use the svnlook program instead of svn, in order to avoid the potential of modifying the repository (which is strictly forbidden, but also not enforced). The specifics of what you can get out of svnlook are explained in detail in Section 11.3.1, "The Subversion Commands."

The svnlook program is a tool for examining the repository. It has many of the same commands as svn, but without the capability to modify the repository in any way. Unlike svn, svnlook doesn't operate on working copies, nor can it communicate with a remote repository. Instead, svnlook must be used to examine a repository located on the local system.
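
The following sketch shows the sort of read-only queries a hook script might make with svnlook, using its author, log, and changed subcommands against a committed revision. The describe_revision helper and the overridable SVNLOOK variable are my own inventions for the example, and the sketch assumes svnlook is on the PATH of the server.

```shell
#!/bin/sh
# Sketch: summarize a committed revision with svnlook.
# SVNLOOK may be overridden; describe_revision is a hypothetical helper.
SVNLOOK="${SVNLOOK:-svnlook}"

# $1 = repository path (on the local filesystem), $2 = revision number
describe_revision() {
    echo "Author:"
    "$SVNLOOK" author --revision "$2" "$1"
    echo "Log message:"
    "$SVNLOOK" log --revision "$2" "$1"
    echo "Changed paths:"
    "$SVNLOOK" changed --revision "$2" "$1"
}

# Run only when a repository path and revision are supplied.
if [ "$#" -ge 2 ]; then
    describe_revision "$1" "$2"
fi
```

Note that the repository is named by its path on the server's filesystem, not by a URL, because svnlook cannot talk to a remote repository.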

Examining Transactions

The pre-commit hook script is provided with a unique argument, the name of a transaction. With this transaction name, svnlook can examine a transaction that is in progress, before the changes therein become a permanent part of the repository. To refer to a transaction, you have to run svnlook with the --transaction parameter, which takes the transaction's name, just as --revision would take the revision number (the only practical way to get a transaction name is when it is a parameter to a pre-commit script).

As a simple example, the following script gets the log message for the commit that is currently being processed, and appends it to an external log file that keeps track of all the commits attempted, regardless of whether they succeed.

 #!/bin/sh

 # Get the pre-commit script arguments
 # $1 = The repository path
 # $2 = The transaction name
 RPS="$1"
 TXN="$2"

 # Append the log message to a log record.
 /usr/bin/svnlook log --transaction "$TXN" "$RPS" >> /var/log/txn.log

 # Exit with zero, to allow the transaction processing to continue.
 exit 0

Running External Programs

In addition to running Subversion commands from within hook scripts, you are free to run any other external programs that you need to in order to perform the desired processing in the script. This allows you to not only take advantage of already existing programs that perform the actions you need, but it also allows you to write your own external programs and scripts that can be shared among multiple hook scripts (either in the same repository or across multiple repositories). In fact, it is generally good practice to write all of your real hook script processing in one or more external scripts, and then call those from the actual hook script, in the correct order.
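
One way to arrange this is sketched below: the hook script itself does nothing but call a list of external check programs in order, stopping at the first one that fails. The directory and program names are hypothetical placeholders, as is the run_checks helper. The sketch also assumes that none of the program paths contain spaces.

```shell
#!/bin/sh
# pre-commit sketch that delegates to external check programs.
# The directory and program names below are hypothetical.
CHECK_DIR="/usr/local/svn-hooks"

# Run each named program with the remaining hook arguments, in order.
# Stop at the first failure and return its exit status.
# Usage: run_checks "prog1 prog2 ..." repos txn
run_checks() {
    checks=$1
    shift
    for check in $checks; do
        "$check" "$@" || return $?
    done
    return 0
}

# Run the checks only when Subversion supplies the hook arguments.
if [ "$#" -ge 2 ]; then
    run_checks "$CHECK_DIR/check-log-message $CHECK_DIR/check-file-names" "$@"
    exit $?
fi
```

Keeping the hook this thin means the same check programs can be listed in the hooks of several repositories without copying any logic.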

11.1.3. What a Hook Script Can't Do

Hook scripts are very flexible in their allowed actions, but there are a few restrictions that you need to be aware of. It is especially important for you to read this section carefully, because most of these restrictions are not enforced by Subversion. The results of trying to perform these forbidden actions can range from silent failure to potential repository corruption.

Don't Modify the Repository

You might be tempted to write a hook script that modifies a transaction. It would be handy to automatically modify code to fit a certain coding style, or encrypt certain files for added security. However, hook scripts should never modify the transaction they are operating on. Subversion has no mechanism for reporting back to the working copy if a transaction is modified, so the working copy and repository would become out of sync if you did modify the transaction.

Because none of the Subversion command-line tools have facilities that would allow you to modify the transaction, this is not a hard rule to follow. However, it is possible for a custom program using the Subversion libraries to modify a transaction, so consider yourself warned about not doing it.

Communication with Client Is Limited

Subversion will buffer anything that a hook script sends to standard error, and if the script fails, that output will be sent to the client and displayed along with the commit failure message. However, that is the extent of the direct capability a hook script has to communicate with the client. There is no way to give feedback to the user if the commit succeeds, nor is there a way to get any additional information from the user while the hook script is running.
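
As a small sketch of that one available channel, the following pre-commit script rejects a commit with a blank log message and explains why on standard error. The check_message helper is a hypothetical name, and the wording of the message is only an example.

```shell
#!/bin/sh
# pre-commit sketch: reject commits whose log message is blank.
# $1 = repository path, $2 = transaction name.
# check_message is a hypothetical helper name.

# Read a log message on standard input. If it contains no
# non-whitespace characters, explain why on standard error and fail.
check_message() {
    if grep -q '[^[:space:]]'; then
        return 0
    fi
    echo "Commit rejected: please supply a log message." >&2
    return 1
}

# Run the check only when Subversion supplies the hook arguments.
if [ "$#" -ge 2 ]; then
    svnlook log --transaction "$2" "$1" | check_message || exit 1
    exit 0
fi
```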

If you need to get extra information to a user after a successful commit, you could get around the communication limitation by writing the output of the script to a log file that the user could access (possibly via a Web server), or by sending an e-mail. Getting extra information from the user is a more difficult limitation to work around, but you may be able to do it if you're creative. For instance, you could have the user place information in the log message that would be parsed by the hook script. Or, if you need to get information in the middle of hook script execution, you could find some alternate means of communication that works outside of the Subversion framework, such as an automated message sent to an instant messaging account that waits for a response to be sent back.

11.1.4. Tips for a Good Hook Script

There are a few things that you want to keep in mind when you are writing hook scripts. If you are an experienced script writer (or programmer of another sort), many of these things will be second nature to you, but if not, this is a good place to learn; and even if you are an experienced programmer, it is probably worthwhile to think about some of the points in this section in the context of writing good Subversion hook scripts.

You can also find some template scripts, which are automatically generated in the hooks subdirectory for your repository when the repository is created. These examples will give you a good idea of what each hook script can do, and some ideas for what you might use that particular script for. You should note, though, that the scripts referred to in the templates are fictitious scripts that are merely there for illustration (although at least one, commit-email.pl, does exist in the utility scripts that Subversion supplies). Later on in the chapter, I will discuss how you might write some of the scripts alluded to in the templates.

Keep It Short

Hook scripts run whenever the action they are associated with is triggered. With the exception of the two revision property scripts, that means every time a commit is performed. Furthermore, every user has to wait for each relevant hook script (that's three on a successful commit) to execute in its entirety before being able to move on to other things. In an active development environment where commits are done frequently, that is a lot of time spent waiting for Subversion to finish running hook scripts.

Some time is expected, of course, but significant delays will quickly annoy your users, which will lead to fewer commits, which will make your repository less useful to you. Therefore, it is vital that all of your hook scripts run in as small an amount of time as possible. As a rule of thumb to figure out how short you should keep your hook script run times, I would suggest that you consider how often you think your average user will be committing changes to the repository. If you expect frequent commits (many per hour), you should keep the hook script runtimes under a few seconds each for the average case. If you expect less frequent commits (just a few per day, or less), it may be acceptable to have longer runtimes for your hook scripts; but remember that user feedback is minimal, so you want to make sure that things are kept short enough that users don't worry that the commit has locked up.

If you have a hook script with a long-running side effect, you might consider running it in the background, so that your hook script can finish and allow the commit to complete (thus returning control to the user) before the hook script itself has completed. Obviously, this is not practical for hook scripts that depend on the output of a program to decide whether the commit should succeed, but if the side effect is entirely independent (such as sending an e-mail or modifying a developer's Web site), it might be a good way to make commits to the repository feel faster without sacrificing functionality.
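
A minimal sketch of the background approach in a post-commit script follows. The run_detached helper and the notify-team program are hypothetical. The sketch also redirects the background command's input and output streams, on the assumption that the hook might otherwise not be considered finished while those streams are still open; you should verify that behavior on your own server.

```shell
#!/bin/sh
# post-commit sketch: $1 = repository path, $2 = revision number.
# run_detached is a hypothetical helper; the notification program
# named below is a placeholder for whatever slow task you need.

# Start a command in the background with its streams detached,
# so that the hook itself can return immediately.
run_detached() {
    "$@" </dev/null >/dev/null 2>&1 &
}

# Run only when Subversion supplies the hook arguments.
if [ "$#" -ge 2 ]; then
    run_detached /usr/local/svn-hooks/notify-team "$1" "$2"
    exit 0
fi
```

Because the output is discarded, a background task like this should write its own log if you need to know whether it succeeded.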

Do You Really Want It Every Time?

Subversion hook scripts will run every single time a commit is made, which for most people is a lot. Before you set up a hook script, put some thought into whether you really need it to run every time a commit is made. There are a lot of things that seem useful when you first think of them, but end up being nothing but annoying when they are put into practice. If a hook script has a side effect, like sending an e-mail or instant message that the user doesn't usually care about, it will quickly get ignored; it's just human nature to make repetitive actions a habit that doesn't require any conscious thought. Then, when the side effect is something important, it is likely to not be noticed.

For every hook, you need to carefully consider whether the script that you want to have run is really going to produce something that is of value to the receiver more often than not. If the answer is "no," you might want to consider adding a few checks to your hook script that will help determine when the information is actually useful, and refrain from sending it out at other times. Alternately, you can set up the hook script to perform the operation every n revisions (e.g., if revnum % 10 == 0). Not only will your users be happier, but they will also be more likely to notice and react to important side effects from the script.
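
The every-n-revisions idea can be sketched in a post-commit script as follows. The is_nth_revision helper, the interval of 10, and the log file path are all assumptions made for the illustration.

```shell
#!/bin/sh
# post-commit sketch: act only on every tenth revision.
# $1 = repository path, $2 = revision number.
# is_nth_revision is a hypothetical helper name.

# Succeed when revision $1 is an exact multiple of interval $2.
is_nth_revision() {
    [ $(( $1 % $2 )) -eq 0 ]
}

# Run only when Subversion supplies the hook arguments.
if [ "$#" -ge 2 ]; then
    if is_nth_revision "$2" 10; then
        # The log path is a placeholder for the real side effect.
        echo "Revision $2: time for the periodic summary" >> /var/log/svn-summary.log
    fi
    exit 0
fi
```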

Early On: Log What You Do

When you have developed a new hook script, it is a given that you will want to test it before making a deployment to the real repository. Pre-release testing can only go so far, though, and it is often the case that things that worked great in the lab will break under the pressure of real-world usage. In the case of Subversion hook scripts, this is especially ominous, due to the importance of a live repository, and the difficulty of noticing problems when there is no feedback to the user. Furthermore, if a commit is accepted that shouldn't have been, you can easily end up with broken data in your repository (not the kind that breaks the repository, but the kind that doesn't run as it's supposed to).

To avoid problems in the future, a new hook script that is introduced to a live repository should always (unless it's trivially simple) be run for a reasonable period with copious amounts of debugging output being sent to a log somewhere. This will save you many headaches and long hours in the future, not only by helping you pinpoint where any problems are occurring, but also by making it easier for you to correct any errors that occur, before they cause a chain reaction.
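
A debugging log of this kind does not need to be elaborate. The sketch below appends timestamped messages to a file; the log_debug helper and the HOOK_LOG path are hypothetical, and the file must be writable by whatever user the Subversion server runs as.

```shell
#!/bin/sh
# Debug-logging sketch for a new hook script.
# HOOK_LOG and log_debug are hypothetical names.
HOOK_LOG="${HOOK_LOG:-/var/log/svn-hooks-debug.log}"

# Append a timestamped message, tagged with the script name, to the log.
log_debug() {
    printf '%s %s: %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "${0##*/}" "$*" >> "$HOOK_LOG"
}

# Run only when Subversion supplies the hook arguments.
if [ "$#" -ge 2 ]; then
    log_debug "started: repository=$1 transaction=$2"
    # ... the real checks would go here ...
    log_debug "finished: allowing commit"
    exit 0
fi
```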

Remember the Edge Cases

Accounting for edge cases is an important tenet of software development, but one that is easy to forget in the context of writing small systems, such as hook scripts. When a hook script is run, you want to make sure that it can handle any sort of data that Subversion will allow the user to throw at it. For instance, if your pre-commit script expects a certain format of log message, make sure that it will properly handle not just wildly incorrect log messages, but also the log messages that are similar to the required format, but not quite correct (such as a keyword in the middle of data).
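
To make the log message case concrete, here is a sketch that checks a message against a hypothetical required format, "BUG-<number>: <description>" on the first line. The format and the valid_log_message helper are invented for the example. The pattern is anchored to the start of the line so that a keyword appearing in the middle of the message does not pass.

```shell
#!/bin/sh
# Sketch: validate a log message against a hypothetical required format,
# "BUG-<number>: <description>" on the first line.
# valid_log_message is a hypothetical helper name.

# Read the log message on standard input; succeed only if the first
# line starts with the keyword, a number, a colon, and some text.
valid_log_message() {
    head -n 1 | grep -Eq '^BUG-[0-9]+: [^[:space:]]'
}

# Run the check only when Subversion supplies the hook arguments.
if [ "$#" -ge 2 ]; then
    if svnlook log --transaction "$2" "$1" | valid_log_message; then
        exit 0
    fi
    echo "Log message must begin with 'BUG-<number>: <description>'" >&2
    exit 1
fi
```

With this pattern, a message such as "fix crash, see BUG-42: details" is rejected, as are "BUG-: missing number", a bare "BUG-42:", and an empty message.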

Reuse What You Can

Subversion comes from the culture of the UNIX world. In the UNIX world, there are many small, single-task programs that can be easily strung together to perform larger tasks. This makes development of complex scripts much easier, and allows programs to make use of well-tested components that can be shared among multiple applications. In other operating systems (Windows mostly), this sort of single-task application is not nearly so prevalent. This means that hook scripts on those systems don't have the same rich set of default tools available for processing the data they receive.

A first instinct may be to write all of the missing functionality into your hook script, but this can lead to bloated, slow scripts that are hard to debug. You are much better off if you take the UNIX approach and create individual component scripts that perform the individual tasks that you need performed. This makes it easier to share those tasks among multiple scripts and repositories, as well as making debugging of component functionality much easier.

Another option for Windows users, if you are looking for a rich set of UNIX-like tools for your hook scripts to take advantage of, is to install a Windows package (such as Cygwin, which provides a wealth of UNIX tools) that gives you a prebuilt set of component programs that can be integrated into your scripts. This not only saves you the time spent writing the functions themselves, but also the time spent debugging and testing the components.

11.1.5. The Pre-made Subversion Scripts

To aid you in your quest to generate the perfect Subversion hook script, the developers of Subversion provide you with a set of pre-made scripts that provide useful utilities commonly found in hook scripts. If you made an installation of Subversion from a binary package, you will likely find them in a directory such as

 /usr/share/svn/tools/hook-scripts

If they are not installed on your system, you can get them by downloading the Subversion source and looking in the tools/hook-scripts subdirectory. The scripts that you find should include (at the least) the following scripts.

commit-access-control.pl
svnperms.py
commit-email.pl
propchange-email.pl
mailer.py