Section 11.2. Making the Most of Hook Scripts | Subversion Version Control. Using The Subversion Version Control System in Development Projects

11.2. Making the Most of Hook Scripts

When working with your own Subversion repositories, you will invariably come up with innumerable ideas for possible places where automation will make your life easier. In fact, if you're anything like me, you'll find that the potential ways you can think of for automating your troubles away are far greater in number than your time available. To help you out a bit, in this section, I'll talk about some of the ways that you might add hook scripts to Subversion to help with automation, along with some ideas on implementing them.

11.2.1. Automatically Send E-mails

If you have a mailing list set up that is dedicated to Subversion commit reports, you can set up a commit script to e-mail that list every time someone makes a commit, either with just the log or with a full diff of all the changes applied to the repository. Users who want to keep track of development in the repository can then subscribe to the list to see when changes are made, rather than periodically checking the Subversion logs to see what has changed.

You can also check the section of the repository where the commit was made, and send e-mails to different mailing lists in response. There are a couple of reasons why you might want to do this.

If you have multiple projects in your repository, you may want commit e-mails for each project to go to that project's own mailing list. This helps you keep projects logically separated, even though they reside in the same physical repository (which gives you the advantage of allowing code to move between them).
You may find it useful to only send out notifications for changes made to the trunk of your project. This allows individual developers to perform a lot of small commits on a branch created for a given task without spamming the mailing list with huge numbers of e-mails. Then, when a change is merged into the trunk, all of those changes will be sent out in one compiled e-mail that shows the changes made during the merge.
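The routing idea behind both cases can be sketched in a few lines of Python. This is only an illustration, not part of any script shipped with Subversion: the ROUTES table, the function names, and the addresses are assumptions made up for the example. The input is assumed to be the text printed by svnlook changed, where the path starts in the fifth column.

```python
import re

# Hypothetical routing table: path pattern -> mailing list address.
# Only trunk paths are listed, so branch commits notify nobody.
ROUTES = [
    (re.compile(r"^project1/trunk/"), "project1-list@mydomain.com"),
    (re.compile(r"^project2/trunk/"), "project2-list@mydomain.com"),
]


def changed_paths(svnlook_output):
    """Extract the paths from 'svnlook changed' output (path starts at column 5)."""
    paths = []
    for line in svnlook_output.splitlines():
        if len(line) > 4:
            paths.append(line[4:])
    return paths


def recipients_for(paths):
    """Return the sorted, de-duplicated lists whose pattern matches any changed path."""
    return sorted({addr for p in paths for rx, addr in ROUTES if rx.search(p)})
```

A post-commit hook could pass the result of recipients_for() to whatever mailer it uses; a commit that touches only a branch produces an empty list and sends nothing.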

Redundant Archival E-mails

A mailing list that receives commit messages can also be used as an emergency archive for restoring a lost repository. It is, of course, no substitute for nightly backups of the repository itself, but it could save you the loss of a single day's changes if a crash were to occur. If you have an archival mailing list where all repository commits (with full diffs, log messages, and other metadata) are sent, and your repository is lost in the middle of the day, you could go back to those archival e-mails and restore all of the changes that had been applied to the repository since the last repository backup.

To make a potential restoration even easier, you could set up your post-commit script to automatically run svnadmin dump to create an incremental backup file that could then be e-mailed to an archival e-mail drop. The following script shows one way to do this. It is written in Python, but even if you don't know Python, it should be clear what is happening.

 #!/usr/bin/env python3
 # Subversion commit archival program.
 # Takes a repository, a revision number, and an email address
 import smtplib
 import subprocess
 import sys
 from email.mime.text import MIMEText

 # Some variables that may need to be set for a specific repository
 svnbinpath = '/usr/bin/'
 svnadmin = svnbinpath + 'svnadmin'
 fromaddr = 'svnrepos@mydomain.com'

 # Runs 'svnadmin dump' and gets its output
 def dumpRevision(repos, revision):
     # -q suppresses the progress messages, which getstatusoutput
     # would otherwise mix into the captured dump
     dump = subprocess.getstatusoutput('%s dump -q --incremental -r %s %s'
                                       % (svnadmin, revision, repos))
     if dump[0] != 0:
         # svnadmin failed
         return None
     else:
         # return the dump
         return dump[1]

 # Creates an email with the supplied revision dump
 def createDumpMessage(address, revision, dump):
     msg = MIMEText(dump)
     msg['Subject'] = 'Dump of revision %s' % revision
     msg['From'] = fromaddr
     msg['To'] = address
     return msg

 # Sends the supplied email message
 # Uses the local system's SMTP server
 def sendDumpMessage(address, msg):
     s = smtplib.SMTP('localhost')
     s.sendmail(fromaddr, [address], msg.as_string())
     s.quit()

 # Main execution point for the program
 # This will use the other three functions to email a repository dump
 if __name__ == '__main__':
     # Check to see that we have enough arguments
     if len(sys.argv) < 4:
         sys.exit(1)
     # Parse the arguments
     repository = sys.argv[1]
     revision = sys.argv[2]
     address = sys.argv[3]
     # Get the repository dump and email it
     dump = dumpRevision(repository, revision)
     if dump is None:
         sys.exit(2)
     msg = createDumpMessage(address, revision, dump)
     sendDumpMessage(address, msg)

Communicating with an Issue Tracker

Some issue-tracking systems can be controlled by sending them e-mails. For example, you may be able to create a new open issue, update an existing issue, or close an issue just by sending a properly formatted e-mail to the tracker. If you have your users format their log messages properly, you can parse them in a post-commit hook script and generate the messages to send to the issue tracker automatically.

Having the hook script automatically notify the issue tracker will remove one extra step from your users' commit process, which should reduce the chance for error. Of course, the downside is that the log message needs to be properly formatted for the hook script to be able to parse it. The best way to encourage proper formatting is to keep the format simple and distinct, so that it can be parsed even when mixed with unformatted text. For example, you might use a unique tag that identifies issue numbers or status changes. The parser could then search unformatted text for those tags and react appropriately. Additionally (or alternatively), you could use a pre-commit hook script to check log messages for incorrect formatting and return an error if a message doesn't parse correctly.
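As a rough sketch of the tag approach, the following Python picks issue references out of free-form log text. The tag syntax ("[issue 42]" or "[issue 42: closed]") is an assumption invented for this example; a real tracker will dictate its own format. The second function finds text that looks like a tag but does not parse, which is the kind of check a pre-commit hook could use to reject a commit.

```python
import re

# Hypothetical tag convention: "[issue 123]" or "[issue 123: closed]".
TAG = re.compile(r"\[issue\s+(\d+)(?:\s*:\s*(\w+))?\]", re.IGNORECASE)
# Anything that starts like a tag; used to catch malformed attempts.
TAG_START = re.compile(r"\[issue\b[^\]]*\]?", re.IGNORECASE)


def parse_issue_tags(log_message):
    """Return (issue_number, status_or_None) for every well-formed tag."""
    return [(int(n), s.lower() if s else None)
            for n, s in TAG.findall(log_message)]


def malformed_tags(log_message):
    """Return tag-like fragments that do not match the expected format."""
    return [m.group(0) for m in TAG_START.finditer(log_message)
            if not TAG.fullmatch(m.group(0))]
```

A post-commit hook would act on the output of parse_issue_tags(), while a pre-commit hook could exit with a nonzero status whenever malformed_tags() returns anything.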

Subversion Supplied Scripts

Because automatically sending e-mails is such a commonly desired action, Subversion even provides three different example scripts that you can use for sending e-mails. They are robust enough to use "as is" for many purposes, and reasonably easy to modify if they don't quite fit your needs. Two of the scripts (commit-email.pl and propchange-email.pl) are written in Perl, and the third (mailer.py) is written in Python, so you have a choice of language to attack if you need to make modifications. All three can easily be run in your post-commit hook script to send e-mails.

`commit-email.pl`

The commit-email.pl script can be run by providing it with the repository, revision number, and an e-mail address, and it will send an e-mail with the author of the commit, the date, the log, and a list of the changes made. You can also set up multiple invocations of commit-email.pl to each match a specific subdirectory in the repository and only send an e-mail if a file in that subdirectory changed during the commit. In this way, you can configure different mailing addresses for different projects in the same repository.

The following example post-commit script shows how you might set up a hook script to run commit-email.pl with two different mailing addresses for the trunks of two projects in the repository (stored in /project1/trunk and /project2/trunk).

 #!/bin/sh
 # Get the post-commit script arguments
 # $1 = The repository path
 # $2 = The revision number
 RPS="$1"
 REV="$2"

 # Send commit emails
 COMMIT_EMAIL=/usr/local/share/tools/hook-scripts/commit-email.pl
 $COMMIT_EMAIL "${RPS}" ${REV} -m "project1/trunk" "project1-list@mydomain.com"
 $COMMIT_EMAIL "${RPS}" ${REV} -m "project2/trunk" "project2-list@mydomain.com"

`propchange-email.pl`

If you want to send out notification e-mails when revision properties change, you can use the propchange-email.pl script instead. It works almost identically to commit-email.pl, except that it takes a user and a property name in addition to the repository and revision number.

In this example post-revprop-change hook script, you can see how you might use propchange-email.pl to send a notice to an administrator every time someone changes a log message.

 #!/bin/sh
 # Get the post-revprop-change script arguments
 # $1 = The repository path
 # $2 = The revision number
 # $3 = The user making the change
 # $4 = The property being changed
 RPS="$1"
 REV="$2"
 USER="$3"
 PROP="$4"

 # Send a property change email
 PRPCHG_EMAIL="/usr/local/share/tools/hook-scripts/propchange-email.pl"
 ADDRESS="repos-admin@mydomain.com"
 if [ "$PROP" = "svn:log" ]
 then
     $PRPCHG_EMAIL "$RPS" "$REV" "$USER" "$PROP" "$ADDRESS"
 fi

`mailer.py`

Subversion also provides an example Python script that performs the same function as commit-email.pl. Additionally, mailer.py lets you set up complex sets of groups that determine which addresses to e-mail based on regular expressions that match the files that were changed. To find out more about the specifics of how this script works, see the script itself, and the sample configuration file that is included with it.

11.2.2. Send Notifications via RSS

RSS (Really Simple Syndication) has become a very popular means of getting notifications on changes to news sites, blogs, and other Web sites with rapidly changing content. It's not just limited to Web sites, though. In fact, it can easily handle any sort of frequently modified serial data, such as the record of commits made to a Subversion repository.^[1]

^[1] The scripts and techniques in this section were graciously provided by Stuart Robinson and his employer, Absolute Systems (www.absolutesys.com).

RSS feeds are supplied to RSS readers as XML files, available through a Web browser. To get the latest RSS feed for a site, the reader simply redownloads the RSS feed XML file from a predetermined URL. So, if you set up a post-commit hook script to update the RSS feed every time a commit is made to the repository, you can have an up-to-date RSS feed of repository activity. This is especially handy in a rapid development environment, where short build/test cycles make it important for everyone to keep up with the activities of their coworkers.

Generating the RSS

To generate the RSS feed, you need a script, similar to the following one, that takes the repository and extracts the information about the last repository commit. The script itself is fairly long, but I'll go over what's happening shortly.

 #!/bin/bash

 # Main configuration variables
 ReposName=$1
 RepositoriesDir=/svnrepos/repositories
 MaxItems=100

 # Setup misc. variables
 Repos=$RepositoriesDir/$ReposName
 RssFile=/var/www/html/rss/$ReposName.xml
 RssFileTmp=/var/www/html/rss/$ReposName.xml.tmp
 RssHeader=$Repos/rss/header
 RssFooter=$Repos/rss/footer
 RssItemsDir=$Repos/rss/items

 #=================================================
 # Attempts to create a lock file so that this script isn't
 # affected by other instances that might be started at the same time.
 # If a lock-file already exists, this script is exited immediately.
 AcquireLockFile () {
   LockFile=/svnrepos/locks/$ReposName.rssGen.lock
   if [ -a $LockFile ]; then
     # Another process is currently updating the RSS feed, so exit.
     echo "Lock-file $LockFile exists. Exiting."
     echo ""
     exit 1
   fi
   # Create the lock file
   touch $LockFile
   echo "Lock-file $LockFile acquired."
 }

 #=================================================
 ReleaseLockFile () {
   rm -f $LockFile
   echo "Lock-file $LockFile released."
 }

 #=================================================
 ComputeFirstAndLastItemRevisions () {
   echo "Computing first and last item revisions:"
   SvnHeadRevision=`svnlook youngest $Repos`
   LastItemToInclude=$SvnHeadRevision
   FirstItemToInclude=$((LastItemToInclude-MaxItems+1))
   if [[ $FirstItemToInclude -lt 0 ]]; then
     FirstItemToInclude=1
   fi
   echo " First revision to include: $FirstItemToInclude"
   echo " Last revision to include : $LastItemToInclude"
   echo " Max items                : $MaxItems"
 }

 #=================================================
 DeleteOldItemFiles () {
   echo "Deleting old item files"
   FirstMissingItem=$((LastItemToInclude+1))
   # Move all items we want to keep to tmp, and then delete all others.
   if [[ ! -a $RssItemsDir/tmp ]]; then
     mkdir $RssItemsDir/tmp
   fi
   for ((Rev=LastItemToInclude; Rev >= FirstItemToInclude; Rev--))
   do
     if [[ -a $RssItemsDir/Item.$Rev ]]; then
       #echo " Moving $RssItemsDir/Item.$Rev to $RssItemsDir/tmp"
       mv $RssItemsDir/Item.$Rev $RssItemsDir/tmp
     else
       # We need items from FirstItemToInclude -> LastItemToInclude, but
       # the current item is missing, so record the revision number so
       # we can later access SVN to create the missing items.
       FirstMissingItem=$Rev
       #echo " DEBUG: FirstMissingItem=$FirstMissingItem"
     fi
   done
   #echo " Removing all other Item files in $RssItemsDir"
   rm -f $RssItemsDir/Item.*
   # Now move the items we want to keep back to $RssItemsDir
   #echo " Moving the files we're keeping from $RssItemsDir/tmp back to $RssItemsDir"
   mv -f $RssItemsDir/tmp/Item.* $RssItemsDir
   rmdir $RssItemsDir/tmp
 }

 #=================================================
 # Creates new Items to represent each of the SVN revisions for
 # which there are currently no item-files in $RssItemsDir (for revision
 # numbers >= FirstItemToInclude, and <= LastItemToInclude).
 CreateNewItemFilesFromSVN () {
   # Now access SVN to create RSS items for all revisions from $MaxRev+1 -> $LatestRevision
   echo "Accessing SVN to create missing items"
   echo " First missing item: Rev $FirstMissingItem"
   echo " Last missing item: Rev $LastItemToInclude"
   for ((Rev=FirstMissingItem; Rev <= LastItemToInclude; Rev++))
   do
     echo " Creating <Item> for SVN revision $Rev"
     RssItemFile=$RssItemsDir/Item.$Rev
     echo "    ItemFile=$RssItemFile"
     echo "    Computing vars..."
     AuthorId=`svnlook -r $Rev author $Repos 2>&1`
     Author=`getent passwd | grep $AuthorId | cut -d: -f 5`
     CommitMsg=`svnlook log -r $Rev $Repos 2>&1`
     CommitDate=`svnlook -r $Rev date $Repos 2>&1`
     CommitDateRss=`echo $CommitDate | sed -e "s/\([^ ]*\) \([^ ]*\) \([^ ]*\).*/\\1T\\2+02:00/"`
     # Sample valid date=<dc:date>2004-06-07T17:03:30+02:00</dc:date>
     URL="http://svnserver/viewcvs?rev=$Rev&root=$ReposName&view=rev"
     FirstModifiedPath=`svnlook -r $Rev changed $Repos | cut -b5-1000 | sed -e "s{\([^/]*/[^/]*\).*{\1{" | uniq`
     Category=`echo $FirstModifiedPath | sed -e "s&\(.*\)/\(.*\)&\\2 (\\1)&"`
     echo " Done computing vars"
     echo " <item>" > $RssItemFile
     echo "  <title><![CDATA[$CommitMsg]]></title>" >> $RssItemFile
     echo "  <link><![CDATA[$URL]]></link>" >> $RssItemFile
     echo "  <description><![CDATA[$CommitMsg]]></description>" >> $RssItemFile
     echo "  <category>$Category</category>" >> $RssItemFile
     echo "  <dc:creator>$Author</dc:creator>" >> $RssItemFile
     echo "  <dc:date>$CommitDateRss</dc:date>" >> $RssItemFile
     echo "  <pubDate>$CommitDateRss</pubDate>" >> $RssItemFile
     echo " </item>" >> $RssItemFile
     echo "CommitDateRss=$CommitDateRss"
   done
 }

 #=================================================
 # Echos the contents of RssHeader, followed by each of the Rss Item files
 # in revision-number-order, followed by RssFooter to the Rss file being generated.
 AssembleThePieces () {
   echo "Assembling the pieces"
   cat $RssHeader > $RssFileTmp
   local PubDate=`date +"%a, %d %b %Y %T %Z"`
   echo " <dc:date>$PubDate</dc:date>" >> $RssFileTmp
   echo " <pubDate>$PubDate</pubDate>" >> $RssFileTmp
   echo " <lastBuildDate>$PubDate</lastBuildDate>" >> $RssFileTmp
   # Add all RSS items to the RSS file
   for ((Rev=LastItemToInclude; Rev >= FirstItemToInclude; Rev--))
   do
     cat $RssItemsDir/Item.$Rev >> $RssFileTmp
   done
   # Add the RSS footer
   cat $RssFooter >> $RssFileTmp
   mv -f $RssFileTmp $RssFile
 }

 #=================================================
 # Generates a new RSS file for the repository.
 GenerateRssFile () {
   AcquireLockFile
   ComputeFirstAndLastItemRevisions
   DeleteOldItemFiles
   CreateNewItemFilesFromSVN
   AssembleThePieces
   ReleaseLockFile
   echo "Done."
 }

 GenerateRssFile

Setting Up Variables

Let's take a look at this script, section by section. The first section sets up a number of useful variables that will be used throughout the rest of the script.

 # Main configuration variables
 ReposName=$1
 RepositoriesDir=/svnrepos/repositories
 MaxItems=100

 # Setup misc. variables
 Repos=$RepositoriesDir/$ReposName
 RssFile=/var/www/html/rss/$ReposName.xml
 RssFileTmp=/var/www/html/rss/$ReposName.xml.tmp
 RssHeader=$Repos/rss/header
 RssFooter=$Repos/rss/footer
 RssItemsDir=$Repos/rss/items

The first three variables are the important configuration variables that need to be customized for the specific location of the script. The ReposName variable indicates the name of the repository associated with the feed. In this case, that variable is taken from the first argument sent to the script, which is supplied by the post-commit hook script that calls the genRSS script. The RepositoriesDir variable points to the directory where all of the Subversion repositories are stored. So, if you have two repositories, /var/svnrepos1 and /var/svnrepos2, you would set RepositoriesDir equal to /var. Finally, the MaxItems variable stores the maximum number of items that is included in the RSS feed.

Locking the Script

Because commits can occur very close together, this script could end up running concurrently with another instance of itself. To avoid that, the script creates a lock that is acquired when it starts and released when it finishes. If another instance tries to acquire the lock while it is held, that instance exits.

 #=================================================
 # Attempts to create a lock file so that this script isn't
 # affected by other instances that might be started at the same time.
 # If a lock-file already exists, this script is exited immediately.
 AcquireLockFile () {
   LockFile=/svnrepos/locks/$ReposName.rssGen.lock
   if [ -a $LockFile ]; then
     # Another process is currently updating the RSS feed, so exit.
     echo "Lock-file $LockFile exists. Exiting."
     echo ""
     exit 1
   fi
   # Create the lock file
   touch $LockFile
   echo "Lock-file $LockFile acquired."
 }

 #=================================================
 ReleaseLockFile () {
   rm -f $LockFile
   echo "Lock-file $LockFile released."
 }

This section of the script consists of two fairly simple functions. The first, AcquireLockFile(), checks whether the lock file exists. If it does, the script exits. If it does not, the script creates one by running touch. When the script has finished its work, it calls ReleaseLockFile(), which removes the lock file that AcquireLockFile() created, thus freeing the next instance of the script to run.
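One thing to be aware of is that checking for the file and then creating it are two separate steps, so two instances started at almost the same moment could both pass the check. The following Python sketch shows one common way to close that window, by asking the operating system to create the file only if it does not already exist. The function names are made up for this example, and it is offered as an alternative technique, not as what genRSS does.

```python
import os


def acquire_lock(path):
    """Atomically create the lock file; return True on success, False if it exists."""
    try:
        # O_CREAT | O_EXCL makes creation fail if the file is already there,
        # so the existence test and the creation happen as a single step.
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    os.write(fd, str(os.getpid()).encode())  # record the owner for debugging
    os.close(fd)
    return True


def release_lock(path):
    """Remove the lock file, ignoring the case where it is already gone."""
    try:
        os.remove(path)
    except FileNotFoundError:
        pass
```

A caller would exit early when acquire_lock() returns False, and would normally call release_lock() in a finally block so that the lock is removed even if feed generation fails.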

Computing Revision Range

Next, we need to compute the range of revisions that will be included in the RSS feed, which will be made up of a number of revisions equal to the value of MaxItems (as set at the beginning of the script), from the HEAD revision back. So, if MaxItems is equal to 100, and the repository is currently at revision 1400, the range is from revision 1301 through revision 1400.

 #=================================================
 ComputeFirstAndLastItemRevisions () {
   echo "Computing first and last item revisions:"
   SvnHeadRevision=`svnlook youngest $Repos`
   LastItemToInclude=$SvnHeadRevision
   FirstItemToInclude=$((LastItemToInclude-MaxItems+1))
   if [[ $FirstItemToInclude -lt 0 ]]; then
     FirstItemToInclude=1
   fi
   echo " First revision to include: $FirstItemToInclude"
   echo " Last revision to include : $LastItemToInclude"
   echo " Max items                : $MaxItems"
 }

The HEAD revision of the repository is found by running svnlook youngest, which returns the revision number of the youngest revision in the repository. Then, the beginning of the range is calculated by subtracting the MaxItems value from the HEAD revision and adding one, so that the range contains exactly MaxItems revisions. If the first revision happens to fall below zero (i.e., there aren't MaxItems revisions in the repository), the start of the range is set to the beginning of the repository.
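The arithmetic is easy to check with a small Python sketch. The function name is invented for this example. Note one deliberate difference: the sketch clamps every result below 1 to revision 1, whereas the shell test (-lt 0) lets a computed value of exactly 0 through.

```python
def revision_range(head, max_items):
    """Return (first, last) revisions for a feed of at most max_items entries."""
    # Subtract max_items and add one so the range holds exactly max_items
    # revisions; never start before revision 1.
    first = max(head - max_items + 1, 1)
    return first, head
```

With the numbers used in the text, revision_range(1400, 100) gives the range 1301 through 1400.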

Deleting Old Files

The genRSS script creates an item file for each revision contained in the current RSS feed. As new revisions are committed, old revisions fall off the back of the list, and their item files need to be deleted. The DeleteOldItemFiles() function shown next handles the cleanup of those files as they become obsolete.

 #=================================================
 DeleteOldItemFiles () {
   echo "Deleting old item files"
   FirstMissingItem=$((LastItemToInclude+1))
   # Move all items we want to keep to tmp, and then delete all others.
   if [[ ! -a $RssItemsDir/tmp ]]; then
     mkdir $RssItemsDir/tmp
   fi
   for ((Rev=LastItemToInclude; Rev >= FirstItemToInclude; Rev--))
   do
     if [[ -a $RssItemsDir/Item.$Rev ]]; then
       #echo " Moving $RssItemsDir/Item.$Rev to $RssItemsDir/tmp"
       mv $RssItemsDir/Item.$Rev $RssItemsDir/tmp
     else
       # We need items from FirstItemToInclude -> LastItemToInclude, but the current
       # item is missing, so record the revision number so we can later access SVN
       # to create the missing items.
       FirstMissingItem=$Rev
       #echo " DEBUG: FirstMissingItem=$FirstMissingItem"
     fi
   done
   #echo " Removing all other Item files in $RssItemsDir"
   rm -f $RssItemsDir/Item.*
   # Now move the items we want to keep back to $RssItemsDir
   #echo " Moving the files we're keeping from $RssItemsDir/tmp back to $RssItemsDir"
   mv -f $RssItemsDir/tmp/Item.* $RssItemsDir
   rmdir $RssItemsDir/tmp
 }

Because genRSS doesn't have any idea how many revisions have been added since the last time the script was run, figuring out which item files to remove would be a difficult task. So, instead, genRSS figures out which item files it wants to keep (the ones that correspond to revisions in the current range) and moves those into a temporary directory. Then, it removes all of the item files that remain in the RssItemsDir directory. After the obsolete files have been removed, it can then move the still-valid files back and remove the temporary directory.

Inside this function, genRSS also generates the variable FirstMissingItem. This indicates the first revision to be included in the RSS feed for which there is no existing item file. That way, genRSS only has to generate item files for new revisions, instead of wasting time generating files it already has.
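The bookkeeping in DeleteOldItemFiles() can be restated as a small Python function that works on revision numbers alone, which may make the logic easier to follow. The function name and its return shape are assumptions made for this sketch; it does not touch any files.

```python
def plan_item_files(existing, first, last):
    """Split revisions into files to keep, files to delete, and the first
    revision that still needs an item file.

    existing: set of revision numbers that already have an item file.
    """
    wanted = set(range(first, last + 1))
    keep = sorted(existing & wanted)
    delete = sorted(existing - wanted)
    # Default: nothing is missing, so generation would start past the end.
    first_missing = last + 1
    # Walk from newest to oldest, as the shell loop does; the last missing
    # revision seen is therefore the lowest missing one.
    for rev in range(last, first - 1, -1):
        if rev not in existing:
            first_missing = rev
    return keep, delete, first_missing
```

For example, if item files exist for revisions 1299 through 1302 and the range is 1301 through 1304, the function keeps 1301 and 1302, deletes 1299 and 1300, and reports 1303 as the first revision to generate.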

Creating the Feed

Now, we come to the heart of genRSS, the functions that actually generate the RSS feed data. There are two functions here. The first, CreateNewItemFilesFromSVN(), creates the item files that will contain information about each revision in the feed. Then, the AssembleThePieces() function takes those items and creates an RSS feed XML file that it then puts up on the Web server for all (or some, depending on your access controls) to see.

 #=================================================
 # Creates new Items to represent each of the SVN revisions for
 # which there are currently no item-files in $RssItemsDir (for revision
 # numbers >= FirstItemToInclude, and <= LastItemToInclude).
 CreateNewItemFilesFromSVN () {
   # Now access SVN to create RSS items for all revisions from $MaxRev+1 -> $LatestRevision
   echo "Accessing SVN to create missing items"
   echo " First missing item: Rev $FirstMissingItem"
   echo " Last missing item: Rev $LastItemToInclude"
   for ((Rev=FirstMissingItem; Rev <= LastItemToInclude; Rev++))
   do
     echo " Creating <Item> for SVN revision $Rev"
     RssItemFile=$RssItemsDir/Item.$Rev
     echo " ItemFile=$RssItemFile"
     echo " Computing vars..."
     AuthorId=`svnlook -r $Rev author $Repos 2>&1`
     Author=`getent passwd | grep $AuthorId | cut -d: -f 5`
     CommitMsg=`svnlook log -r $Rev $Repos 2>&1`
     CommitDate=`svnlook -r $Rev date $Repos 2>&1`
     CommitDateRss=`echo $CommitDate | sed -e "s/\([^ ]*\) \([^ ]*\) \([^ ]*\).*/\\1T\\2+02:00/"`
     # Sample valid date=<dc:date>2004-06-07T17:03:30+02:00</dc:date>
     URL="http://svnserver/viewcvs?rev=$Rev&root=$ReposName&view=rev"
     FirstModifiedPath=`svnlook -r $Rev changed $Repos | cut -b5-1000 | sed -e "s{\([^/]*/[^/]*\).*{\1{" | uniq`
     Category=`echo $FirstModifiedPath | sed -e "s&\(.*\)/\(.*\)&\\2 (\\1)&"`
     echo " Done computing vars"
     echo " <item>" > $RssItemFile
     echo "   <title><![CDATA[$CommitMsg]]></title>" >> $RssItemFile
     echo "   <link><![CDATA[$URL]]></link>" >> $RssItemFile
     echo "   <description><![CDATA[$CommitMsg]]></description>" >> $RssItemFile
     echo "   <category>$Category</category>" >> $RssItemFile
     echo "   <dc:creator>$Author</dc:creator>" >> $RssItemFile
     echo "   <dc:date>$CommitDateRss</dc:date>" >> $RssItemFile
     echo "   <pubDate>$CommitDateRss</pubDate>" >> $RssItemFile
     echo "  </item>" >> $RssItemFile
     echo "CommitDateRss=$CommitDateRss"
   done
 }

The CreateNewItemFilesFromSVN() function loops through all of the new revisions that are to be included in the RSS feed, and uses svnlook to get useful information about each revision, which is then parsed (into an RSS-friendly format) and fed into an RSS item file for later inclusion in the RSS feed.

The first bit of information parsed is the author of the revision.

 AuthorId=`svnlook -r $Rev author $Repos 2>&1`
 Author=`getent passwd | grep $AuthorId | cut -d: -f 5`

The svnlook author command is used to get the username of the revision's author, which is then stored in AuthorId. The username that's returned isn't really what we want, though. It would be much better to have the actual full name of the user who committed the revision. So, we instead call getent passwd, which prints the entries of the system's password database (normally the contents of the /etc/passwd file), and then search it for the username of the author. After that is found, cut is used to extract the user's real name, which should be stored in the fifth colon-separated field of the password entry.
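Because grep matches substrings, a username such as bob would also match the entry for bobby. A stricter lookup compares the first field exactly, as in this Python sketch. The function name is invented for the example, and the passwd text is passed in as a string so that the logic can be tried without touching the real password database.

```python
def full_name(passwd_text, username):
    """Return the full name for username, or the username itself if not found."""
    for line in passwd_text.splitlines():
        fields = line.split(":")
        # Field 1 is the login name; field 5 holds the real name.
        if len(fields) >= 5 and fields[0] == username:
            # The fifth field may carry comma-separated extras (room, phone).
            name = fields[4].split(",")[0]
            return name or username
    return username
```

Falling back to the username keeps the feed readable when the author has no local account, which can happen when commits arrive through Apache with its own user list.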

Next, genRSS uses svnlook log to retrieve the log message for the revision, and svnlook date to retrieve the time and date of the commit. RSS feeds, however, need a fairly specific date format, which differs slightly from the one that svnlook date returns. The code snippet that follows therefore uses sed to pull the date and time out of svnlook date's output and replace the whole string with a version in the required format. Note the +02:00 on the replacement side of the sed expression. That is the time-zone indicator, and shows that the time is two hours ahead of Coordinated Universal Time (abbreviated UTC, the successor to Greenwich Mean Time). It is hard-coded, so you need to change it to match the time zone of your own server.

 CommitDate=`svnlook -r $Rev date $Repos 2>&1`
 CommitDateRss=`echo $CommitDate | sed -e "s/\([^ ]*\) \([^ ]*\) \([^ ]*\).*/\\1T\\2+02:00/"`

If svnlook date were to output 2004-10-02 17:40:08 +0200 (Sat, 02 Oct 2004), the RSS format would look like 2004-10-02T17:40:08+02:00.
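The same conversion can be sketched in Python without hard-coding the offset, by reading the time zone from the third field of the svnlook date output. The function name is an assumption for this example, and the input is assumed to have the shape shown in the sample above.

```python
from datetime import datetime


def svn_date_to_iso(svn_date):
    """Convert 'YYYY-MM-DD HH:MM:SS +ZZZZ (...)' to 'YYYY-MM-DDTHH:MM:SS+ZZ:ZZ'."""
    # Keep the date, time, and numeric offset; drop the human-readable tail.
    date_part, time_part, offset = svn_date.split()[:3]
    parsed = datetime.strptime(
        "%s %s %s" % (date_part, time_part, offset), "%Y-%m-%d %H:%M:%S %z"
    )
    return parsed.isoformat()
```

Because the offset comes from the input, the result stays correct if the server's time zone changes or the script is moved to another machine.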

After getting the log and date, the script generates a URL where users will be taken if they click on the link provided for the RSS feed entry in their RSS reader. In this case, the URL generated takes the user to a page in a ViewCVS site that shows the changes for the particular revision. This, of course, assumes that there is in fact a ViewCVS site set up and running. For more information about ViewCVS, take a look at Chapter 8, "Integrating with Other Tools."

 URL="http://svnserver/viewcvs?rev=$Rev&root=$ReposName&view=rev"

Next, genRSS generates a category entry for the RSS item, based on the modified paths. As with the date modification, genRSS uses sed to massage the output garnered from svnlook changed to get a category name that identifies the section of the repository that was modified. This parsing is necessarily very repository specific, and you probably need to generate your own sed commands to parse the output in order to get a meaningful category. If parsing the changed files doesn't make sense as a means to get a category, you may instead want to have the user put a category line in her log message that can be extracted.

 FirstModifiedPath=`svnlook -r $Rev changed $Repos | cut -b5-1000 | sed -e "s{\([^/]*/[^/]*\).*{\1{" | uniq`
 Category=`echo $FirstModifiedPath | sed -e "s&\(.*\)/\(.*\)&\\2 (\\1)&"`
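For a repository laid out as project/branch/..., the two sed expressions keep the first two path components and rewrite them as "branch (project)". The Python sketch below expresses roughly the same idea. It is a simplification that looks only at the first changed path, and the function name is made up for the example, so treat it as a starting point for your own layout.

```python
def category_for(paths):
    """Build a 'second (first)' label from the first two components of the first path."""
    if not paths:
        return ""
    parts = paths[0].split("/")
    if len(parts) < 2 or not parts[1]:
        # Only one component, for example a file in the repository root.
        return parts[0]
    return "%s (%s)" % (parts[1], parts[0])
```

A commit whose first changed path is project1/trunk/src/main.c would be filed under "trunk (project1)".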

Finally, all of the data that has been gathered is output into an item file, in the appropriate XML format for the RSS feed.

 echo " <item>" > $RssItemFile
 echo "   <title><![CDATA[$CommitMsg]]></title>" >> $RssItemFile
 echo "   <link><![CDATA[$URL]]></link>" >> $RssItemFile
 echo "   <description><![CDATA[$CommitMsg]]></description>" >> $RssItemFile
 echo "   <category>$Category</category>" >> $RssItemFile
 echo "   <dc:creator>$Author</dc:creator>" >> $RssItemFile
 echo "   <dc:date>$CommitDateRss</dc:date>" >> $RssItemFile
 echo "   <pubDate>$CommitDateRss</pubDate>" >> $RssItemFile
 echo " </item>" >> $RssItemFile
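Writing values straight into XML works as long as they contain no markup characters. A log message that contains ]]> would end the CDATA section early, and an ampersand in a category or author name would make the file invalid. The Python sketch below shows one way to guard against both. The function names are assumptions for this example; the element names are the ones the script writes.

```python
from xml.sax.saxutils import escape


def cdata(text):
    """Wrap text in CDATA, splitting any ']]>' so it cannot end the section early."""
    return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>"


def render_item(msg, url, category, author, date):
    """Return one <item> element with the same fields the shell script writes."""
    lines = [
        " <item>",
        "   <title>%s</title>" % cdata(msg),
        "   <link>%s</link>" % cdata(url),
        "   <description>%s</description>" % cdata(msg),
        # Values outside CDATA need &, <, and > escaped.
        "   <category>%s</category>" % escape(category),
        "   <dc:creator>%s</dc:creator>" % escape(author),
        "   <dc:date>%s</dc:date>" % escape(date),
        "   <pubDate>%s</pubDate>" % escape(date),
        " </item>",
    ]
    return "\n".join(lines) + "\n"
```

The string returned by render_item() can be written to the item file in place of the sequence of echo commands.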

After the item files have all been generated, it's time to assemble them all into a full RSS feed XML file. This is accomplished by the AssembleThePieces() function, which is shown here.

 #=================================================
 # Echos the contents of RssHeader, followed by each of the Rss Item
 # files in revision-number-order, followed by RssFooter to the Rss file
 # being generated.
 AssembleThePieces () {
   echo "Assembling the pieces"
   cat $RssHeader > $RssFileTmp
   local PubDate=`date +"%a, %d %b %Y %T %Z"`
   echo "   <dc:date>$PubDate</dc:date>" >> $RssFileTmp
   echo "   <pubDate>$PubDate</pubDate>" >> $RssFileTmp
   echo "   <lastBuildDate>$PubDate</lastBuildDate>" >> $RssFileTmp
   # Add all RSS items to the RSS file
   for ((Rev=LastItemToInclude; Rev >= FirstItemToInclude; Rev--))
   do
     cat $RssItemsDir/Item.$Rev >> $RssFileTmp
   done
   # Add the RSS footer
   cat $RssFooter >> $RssFileTmp
   mv -f $RssFileTmp $RssFile
 }

As you can see in the preceding code, the RSS feed file is generated by successively inserting the RSS header, publication date, each item file, and RSS footer into a temporary RSS file. After the full file is created, that is then moved over to replace the old live RSS file.
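The reason for building the feed in a temporary file is that readers may download the live file at any moment, and they should never see a half-written feed. A minimal Python sketch of the same write-then-rename pattern follows. The function name and parameters are invented for the example, the publication date lines are left out for brevity, and the encoding default is an assumption chosen to match the sample header shown in this section.

```python
import os


def assemble_feed(header, items, footer, dest, encoding="iso-8859-1"):
    """Write header + items + footer to dest via a temporary file and a rename."""
    tmp = dest + ".tmp"
    with open(tmp, "w", encoding=encoding) as out:
        out.write(header)
        for item in items:  # items are expected newest first
            out.write(item)
        out.write(footer)
    # Renaming within one directory replaces the live file in a single step.
    os.replace(tmp, dest)
```

Keeping the temporary file in the same directory as the destination matters, because a rename is only a single step when both names are on the same file system.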

The RSS header and footer are stock pieces of XML, which are stored in their own files. The header file contains various pieces of information about the Subversion repository, and needs to be customized for your particular repository.

As an example, here is what the header might look like. Notice that you need to customize most of the tags under the <channel> tag to match your repository.

 <?xml version="1.0" encoding="iso-8859-1"?>
 <rss version="2.0"
      xmlns:dc="http://purl.org/dc/elements/1.1/"
      xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
      xmlns:admin="http://webns.net/mvcb/"
      xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
      xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
      xmlns:content="http://purl.org/rss/1.0/modules/content/">
   <channel>
     <title>InteractV1 Code Updates</title>
     <link>http://svnserver/viewcvs/?root=InteractV1</link>
     <description>News about recent code updates to InteractV1</description>
     <webMaster>nstrydom@absolutesys.com</webMaster>
     <managingEditor>tcl@absolutesys.com</managingEditor>
     <dc:language>en-us</dc:language>
     <sy:updatePeriod>hourly</sy:updatePeriod>
     <sy:updateFrequency>1</sy:updateFrequency>
     <sy:updateBase>2000-01-01T12:00+00:00</sy:updateBase>

The footer, then, is quite simple, and just closes off a couple of tags that are still open at the end of the RSS feed.

      </channel>
    </rss>

Tying It All Together

Finally, at the end of the genRSS script is the function that ties everything together. The GenerateRssFile() function calls each of the other functions in the proper order, and outputs "Done." when it is finished. After the function is declared, the script immediately calls it.

    #=================================================
    # Generates a new RSS file for the repository.
    GenerateRssFile ()
    {
      AcquireLockFile
      ComputeFirstAndLastItemRevisions
      DeleteOldItemFiles
      CreateNewItemFilesFromSVN
      AssembleThePieces
      ReleaseLockFile
      echo "Done."
    }

    GenerateRssFile

Taking Action on the Post-commit

Now that you have the genRSS script, you need to set up your post-commit script to run it, which will give you a script something like the following example.

    #!/bin/sh
    REPOS="$1"
    REV="$2"

    # Strip the leading directories, leaving only the repository name
    REPOSNAME=`/bin/basename "$REPOS"`
    /svnrepos/scripts/genRSS "$REPOSNAME"

For the most part, this is a very straightforward script. The one "gotcha" that you might notice, though, is the REPOSNAME variable, which is constructed using the basename command to strip off everything but the trailing repository name. This is because the genRSS script takes the name of the repository, not the full path to the repository.
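As a small illustration of what basename contributes here, the following sketch wraps it in a helper. The function name repos_name and the sample path are assumptions made for this example; they are not part of genRSS.

```shell
#!/bin/sh
# repos_name is a hypothetical helper, shown only to illustrate basename.
# It prints the last component of a repository path.
repos_name() {
    basename "$1"
}

# Sample path (an assumption): a repository stored at /svnrepos/InteractV1.
repos_name /svnrepos/InteractV1    # prints: InteractV1
```

Because basename also ignores a trailing slash, the same name comes back whether the hook receives /svnrepos/InteractV1 or /svnrepos/InteractV1/.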

11.2.3. Implement Fine-grain Access Controls

The Authz module for Apache and WebDAV allows you to restrict access to specific directories to individual users or groups of users. What if you use svnserve though? It doesn't have the fine-grained access controls of Authz built in. Or, what if you need to restrict write access to a specific file? Authz only allows restrictions to be placed on a per-directory basis. These cases are where pre-commit hook scripts can come in handy. With a hook script, you can check the permissions of the user against the directory (or file) where the commit is taking place, before allowing it to be applied (unfortunately, there is no way to run a hook script before a read takes place; as the French say, such is life).

Like e-mailing of commits, access controls are another very common use of hook scripts. As such, Subversion provides two example scripts, similar to the e-mail commit scripts. Each of these scripts allows you fine-grained access control over commits, on a per-file or per-directory basis. If you need to make modifications, there is a Perl script and a Python script, which you can use depending on your language of choice.
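Before turning to those scripts, the basic idea of a per-file write restriction can be sketched in plain shell. This is only a minimal sketch: the protected path, the user list, and the function name may_commit are all hypothetical, and it assumes the usual svnlook changed output, where the path starts in the fifth column.

```shell
#!/bin/sh
# Hypothetical settings, for illustration only.
PROTECTED_FILE="trunk/build/release.conf"
ALLOWED_USERS="fred ethel"

# may_commit AUTHOR
# Reads `svnlook changed` output on stdin. Returns 0 if the commit is
# allowed and 1 if it touches PROTECTED_FILE and AUTHOR is not listed.
may_commit() {
    author="$1"
    # Drop the status columns, then look for an exact path match.
    if ! cut -c5- | grep -qxF "$PROTECTED_FILE"; then
        return 0    # protected file untouched: anyone may commit
    fi
    for user in $ALLOWED_USERS; do
        [ "$user" = "$author" ] && return 0
    done
    return 1
}

# Possible use inside a pre-commit hook (REPOS="$1", TXN="$2"):
#   AUTHOR=`svnlook author -t "$TXN" "$REPOS"`
#   svnlook changed -t "$TXN" "$REPOS" | may_commit "$AUTHOR" || {
#       echo "You may not modify $PROTECTED_FILE" >&2
#       exit 1
#   }
```

Keeping the decision in a function that reads from standard input makes it easy to try out with sample text before wiring it into a live repository.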

`commit-access-control.pl`

You run the commit-access-control.pl script by passing it the repository in question, the transaction name, and a configuration file with the user permissions for the repository. If the script determines that the user performing the commit has the proper permission, it exits with a return status of 0; otherwise it exits with a 1.

The following example shows how you might write a pre-commit script that runs commit-access-control.pl and decides whether to allow the commit.

    #!/bin/sh
    # Get the pre-commit script arguments
    # $1 = The repository path
    # $2 = The transaction name
    RPS="$1"
    TXN="$2"

    # Check the repository permissions
    ACCESS_CONTROL=/usr/local/share/tools/hook-scripts/commit-access-control.pl
    CONFIG_FILE=/var/repos/svnrepos/access_control.conf
    ${ACCESS_CONTROL} "${RPS}" "${TXN}" "${CONFIG_FILE}" || exit 1

    # It passed everything appropriately
    exit 0

The configuration file that commit-access-control.pl uses is very similar to the Authz configuration. Each section name is enclosed in brackets ([sec name]) and contains entries for a pattern to match directories against (for determining which directories the group applies to), a list of users to grant the permission to, and an access option that determines whether the allowed access is read-only or read-write.

The following example shows a config file that sets up three permissions sections, giving read permission to everyone, write permission for the trunk to only two users, and write permission to branches to a select few users.

    [global]
    match = .*
    access = read-only

    [trunk permissions]
    match = /trunk
    users = fred ethel
    access = read-write

    [branches]
    match = /branches
    users = fred ethel joe linda betty
    access = read-write

The match entry uses Perl regular expression syntax to select the directories that the permissions apply to. The syntax itself is beyond the scope of this book, but you should have little trouble finding good Perl documentation if you look online or at your local bookstore.

If you know a little bit of Perl, you should feel free to examine the source code for commit-access-control.pl to see how it works. You should also feel free to experiment a little and modify the script to better fit your needs. It's not only in the spirit of open source, but it's also a great way to learn.

`svnperms.py`

If you are thinking that you would like to make a few of your own custom modifications to the commit-access-control.pl file, but Perl isn't your cup of tea, you might want to take a look at svnperms.py. This script performs almost exactly the same function as commit-access-control.pl, but is written in Python instead of Perl. Like commit-access-control.pl, svnperms.py takes a repository and transaction, and determines whether the user has write permissions based on a supplied configuration file. The syntax of svnperms.py is a little different than commit-access-control.pl, as is the syntax of its permissions configuration file. If you look at the svnperms.py source, you should quickly see how it differs though. You can also run svnperms.py by itself with no options to see the usage message.

    $ ./svnperms.py
    missing required option(s): repository, either transaction or a revision
    Usage: svnperms.py OPTIONS
    Options:
        -r PATH    Use repository at PATH to check transactions
        -t TXN     Query transaction TXN for commit information
        -f PATH    Use PATH as configuration file (default is repository
                   path + /conf/svnperms.conf)
        -s NAME    Use section NAME as permission section (default is
                   repository name, extracted from repository path)
        -R REV     Query revision REV for commit information (for tests)
        -A AUTHOR  Check commit as if AUTHOR had committed it (for tests)
        -h         Show this message

11.2.4. Enforce Policy

Any software development project is going to have a number of policies that are unique to that project (even though they may be similar to policies on other projects). One of the jobs of a project manager is to help ensure that those policies are correctly followed. Due to forgetfulness, laziness, stubbornness, and occasionally incompetence, ensuring that policies are followed can be a tough job, and allowing them to slip (even slightly, sometimes) can cause a lot of headaches down the road.

In many cases, though, the policies that need to be followed are well-defined enough that a script can be written to parse the source that is being committed to a repository and check it for compliance with project policies. If that is the case, the script can be run as part of a pre-commit hook script, which allows Subversion to reject any commits that don't comply with policy.

Some of the policies that you might want to consider testing in your pre-commit script are:

Check compliance with source code style rules. Many projects have style rules (indentation, bracket placement, variable naming, and so on) that all code committed to the project should follow. Many of these rules are easily tested by an automated checker (GNU Indent is a popular choice for C code), and either fixed or rejected with reasons for failure. If your project requires submitters to check their code against a standard before committing, you can have a script run the checker when it receives the commit and reject any code that doesn't fit the requirements.
Ensure that submitted source compiles. If a user commits code to the repository that doesn't compile, it can cause delays and headaches as other developers have to sort out why things no longer work and are potentially blocked in their own development until a fix is committed. By running a build of the source before allowing the commit, you can help prevent broken source trees. This tends to be a more useful hook if it only checks against the trunk (or other shared branches) and allows branches used only by individual developers to be committed broken.
Validate submitted changes with the project's test suite. Many projects have a suite of test programs to help ensure that features work (and continue to work after changes are made). If such tests exist, it is usually important for developers to run those tests before submitting changes to the Subversion repository. Unfortunately, that doesn't always happen, and submitted changes may introduce subtle problems in areas other than their main area of operation. By automatically running the project's test suite (or a subset, if it's too large to run on every commit), you can help reduce these instances. This tends to be a more useful hook if it only checks against the trunk (or other shared branches) and allows branches used only by individual developers to be committed broken.
Use properties to check status of outside processes. For example, you might require that all source code be validated in a peer review before it is placed into the main source trunk. To help ensure that those peer reviews have taken place, you could require that all changes submitted to the repository include a property change that adds the date of the peer review for those changes to a property showing the peer review history of the file.
Enforce repository modification policies. For instance, users should be able to create new tags in tags/, but you probably don't want them to modify anything in those tags (tags/*/*). Nor do you likely want to have users create files directly in branches/ or tags/. Instead, they should only create directories. Furthermore, you could limit those directories to directories that have history, thus preventing a tag or branch created from a fresh directory addition.
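Two of the checks above can be sketched as small shell filters over the output of svnlook changed. This is a sketch under stated assumptions rather than a complete policy hook: the function names misplaced_files and touches_trunk are hypothetical, the repository is assumed to use top-level trunk/, branches/, and tags/ directories, and directories are assumed to be listed with a trailing slash.

```shell
#!/bin/sh
# misplaced_files: reads `svnlook changed` output on stdin and prints every
# file added directly in branches/ or tags/. Directory paths end in "/",
# so a path with no trailing slash at that level is a file.
misplaced_files() {
    grep -E '^A...(branches|tags)/[^/]+$' | cut -c5-
}

# touches_trunk: reads `svnlook changed` output on stdin and succeeds if
# any changed path lies under trunk/. A hook could use this to run a
# build or test suite only for trunk commits.
touches_trunk() {
    grep -q '^....trunk/'
}

# Possible use inside a pre-commit hook (REPOS="$1", TXN="$2"):
#   CHANGED=`svnlook changed -t "$TXN" "$REPOS"`
#   BAD=`printf '%s\n' "$CHANGED" | misplaced_files`
#   if [ -n "$BAD" ]; then
#       echo "Only directories may be created in branches/ or tags/:" >&2
#       echo "$BAD" >&2
#       exit 1
#   fi
```

Capturing the svnlook output once in a variable lets several checks run against the same transaction without querying the repository repeatedly.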

11.2.5. Log Revision Property Changes

When revision properties are changed, the change is applied immediately and the old value is lost forever. This makes revision properties extremely volatile if you allow them to be changed. On the other hand, there are times when changing revision properties can be useful, especially if you add your own revision properties to support your development process. Therefore, the best solution for overcoming the shortcomings of the revision property, while allowing them to be changed for reasonable purposes, is to create a log of each revision property's history. Whenever that revision property changes, you log the previous value of the property into an unversioned file stored somewhere on disk. Then, later, if someone needs to retrieve the old value for a revision property, he can check that file and find the information he wants.

The following pre-revprop-change hook script shows how you might go about logging all of your revision property changes.

    #!/bin/sh
    REPOS="$1"
    REV="$2"
    USER="$3"
    PROPNAME="$4"
    LOG="${REPOS}/revprop.log"

    # Record what is changing, then the old value, all in the log file
    echo "Changing revision property ${PROPNAME} on revision ${REV} at `/bin/date`" >> "${LOG}"
    echo "========== Old Value ==========" >> "${LOG}"
    /usr/bin/svn propget "${PROPNAME}" --revprop --revision "${REV}" "file://${REPOS}" >> "${LOG}"
    echo "========== End Old Value ==========" >> "${LOG}"
    echo >> "${LOG}"
    exit 0

As you can see, this is a pretty simple script. First, it echoes some information about the property being changed (the property name, the revision number, and the date/time of the change). Then, it retrieves the old value by running svn propget, and echoes that value into the log file, too. Finally, it exits with status zero, so that Subversion will allow the property change to take place.

You may be asking why I use svn propget instead of svnlook propget. The answer is that svnlook propget doesn't allow you to retrieve revision properties. Because svn doesn't take raw repository paths, though, I have to add the file:// scheme onto the beginning of $REPOS when I put it on the command line.
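To retrieve an old value later, the log can be filtered by property name and revision. The following is a minimal sketch; the function name old_values is hypothetical, and it assumes every entry begins with the "Changing revision property NAME on revision REV at ..." header line that the hook script writes.

```shell
#!/bin/sh
# old_values LOGFILE PROPNAME REV
# Prints every log entry recorded for PROPNAME on revision REV. An entry
# runs from its header line up to (but not including) the next header.
old_values() {
    awk -v prop="$2" -v rev="$3" '
        /^Changing revision property / {
            # Header fields: $4 is the property name, $7 the revision.
            show = ($4 == prop && $7 == rev)
        }
        show { print }
    ' "$1"
}

# Possible use (the repository path is an assumption):
#   old_values /var/repos/svnrepos/revprop.log svn:log 42
```

If a property was changed several times on the same revision, each recorded value is printed in the order it was logged.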

11.2.6. Make Tags Immutable

One of Subversion's more controversial features is its lack of CVS-style tags (or VSS-style labels), where a particular revision can be "tagged" with an identifier that gives it special meaning. In Subversion, tags are made with cheap copies, and are technically identical to branches. The only thing that sets tags apart from branches is the convention that copies placed into "branches" directories are branches, and copies placed into "tags" directories are tags. The upside to this is flexibility (hierarchical or alternate branches and tags directories), but the downside is a lack of enforcement for the immutability that is generally desired for tags.

Generally, tags are meant to be static identifiers of the state of the repository at a given point in time. If you want people making changes and committing them to the tag, you would make it a branch, right? The problem with using Subversion copies for tags is that those tags are not immutable. In fact, you can check out a tag and freely commit changes to it, just as with any other directory, because Subversion doesn't have any concept of tags being anything special. Of course, your history isn't lost, because the tag will be fully versioned. But if someone accidentally commits a change to a tag, it may not be noticed by others who check out the tag, thinking they are getting a static snapshot of the repository at a specific point in time.

The easiest solution for keeping tags static is to simply make it project policy. Make sure everyone on the development team knows not to modify any files in the tags directory, and let the team police itself by occasionally checking histories and ensuring that no one has made any changes they weren't supposed to. Because everything is versioned, it will be relatively easy to undo any changes that are made, and everything should run smoothly.

Are you laughing yet? If you have much experience with development projects (and, more specifically, developers), you will know that relying on everyone to always do the correct thing is setting yourself up for problems. People make mistakes, and occasionally do malicious things (even on a small project). Therefore, if a policy can be enforced through technical means, without unduly harming developer productivity, that is almost always better than just stating the policy and hoping everyone follows it correctly.

One way that you can enforce the immutability of tags in Subversion is to use hook scripts that check data that is being committed, and ensure that nothing in the tags directory is being modified. You can do this, for instance, in a pre-commit script that checks which files are being modified, using svnlook changed, and rejects any commits with changes inside the tags directory. Of course, you still want to be able to add new tags, and probably want to be able to delete tags, too, so you'll want to check specifically for files that have been updated, while allowing adds and deletes. The following example script shows one way that you might implement this functionality using the svnperms.py script.

    #!/bin/sh
    # Grab the repository path and the transaction name from
    # the script's arguments.
    REPOS="$1"
    TXN="$2"

    # Run svnperms.py to check the permissions, and reject the
    # commit if the check fails
    /usr/bin/svnperms.py -r "${REPOS}" -t "${TXN}" -s SimpleAuth || exit 1
    exit 0

The matching svnperms.conf file should be created in $REPOS/conf/, and will look something like the following example. In this example, the trunk and branches directories are fully modifiable, but users can only create or delete directories at the top level of the tags directory. Any attempts to add, modify, or remove files or directories inside a tag will fail.

    [SimpleAuth]
    trunk/.* = *(add,remove,update)
    branches/.* = *(add,remove,update)
    tags/[^/]+/.+ = *()
    tags/[^/]+/ = *(add,remove)

It might be helpful to also be able to set properties on the tags themselves (i.e., the directory contained at the top level of the tags directory). If you'd like to allow properties to be set, you can add update to the list of actions that can be performed in the last entry of the svnperms.conf file, so that it looks like this:

 tags/[^/]+/ = *(add,remove,update)

Because the only modification you can do to a directory (other than move or delete it) is modify properties, this has the effect of just allowing properties to be set for the tag directories, without allowing the contents of the tag to be modified.
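The svnlook changed approach mentioned earlier can also be sketched directly in shell, without svnperms.py. This is a minimal sketch, not a replacement for the configuration above: the function name tag_updates is hypothetical, and it assumes the status letters occupy the first two columns of each svnlook changed line, with the path starting in the fifth column.

```shell
#!/bin/sh
# tag_updates: reads `svnlook changed` output on stdin and prints every
# path under tags/ whose text or properties were updated. Added ("A")
# and deleted ("D") paths pass through unreported, so tags can still be
# created and removed.
tag_updates() {
    # "U" in the first column is a content change; "U" in the second
    # column is a property change.
    grep -E '^(U.|.U)..tags/' | cut -c5-
}

# Possible use inside a pre-commit hook (REPOS="$1", TXN="$2"):
#   BAD=`svnlook changed -t "$TXN" "$REPOS" | tag_updates`
#   if [ -n "$BAD" ]; then
#       echo "Tags are read-only; rejected changes to:" >&2
#       echo "$BAD" >&2
#       exit 1
#   fi
```

As written, the filter also reports property changes on the tag directories themselves; matching only a "U" in the first column would let those through.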

Another common concern with tags: What happens when you do have to change a tag, but modifications have been disallowed? If someone accidentally commits a tag prematurely, or tags the wrong directory, you don't want to be stuck with an incorrect tag. Also, you might find it useful to regularly change some tags to point to a different part of the repository, such as with a "current development tree" or "last successful build" tag. In these cases, you don't want a hook script that disallows modifications to get in the way.

To allow certain users to modify tags, without opening up modification permissions to everybody, you can make use of svnperms.py's groups. By adding an admins group, you can give specific users permission to modify tags. The updated svnperms.conf file with this added in will look something like the following.

    [groups]
    admins = bill fred

    [SimpleAuth]
    trunk/.* = *(add,remove,update)
    branches/.* = *(add,remove,update)
    tags/[^/]+/.+ = *() @admins(add,remove,update)
    tags/[^/]+/ = *(add,remove)