13.2. Using SVN in Peer Reviews

Peer reviews can be a useful tool for ensuring code quality, but no one likes them. They tend to feel like a waste of time, even when they're not, and are extremely prone to devolving into chaos. Many developers feel their time could be better spent doing "real work," and few cherish losing a day, or even an afternoon, to reviewing others' work. Subversion doesn't hold some magical key to making peer reviews perfect, of course, but you can use it to help organize them so that they at least feel more productive. In distributed development environments (such as open source projects), you can even use Subversion to help support distributed peer review.

13.2.1. Tracking Peer Review Status

The most obvious use of Subversion in peer reviews is as a tool to track what has been peer reviewed already and what has not. By using Subversion properties and other metadata, you can easily track peer reviews on a per-file, per-directory, or even per-project basis. That way, when you're ready to do a peer review, it's already obvious what needs to be reviewed in that session. Developers can prepare ahead of time, and a checker script can ensure that no code accidentally slips through unreviewed.

The exact method you use for tracking peer reviews depends mainly on which code you want reviewed, how often you want reviews to occur, and at what granularity. To give you some ideas, here are several different ways to keep track of reviews; a sketch of the commands each approach might use follows the list.

  • You can track reviews per-revision, recording which revisions have been reviewed and which have not. When you perform a review, you would then be able to call up all of the changes made to the repository in a given block of unreviewed revisions.

    To track the revisions that have been reviewed, you can use a property set on the root directory, named something like review:last_rev. Whenever you perform a review, modify that property to hold the revision number of the last reviewed revision. If you would rather separate your reviews by project (or even "part of project"), you can use the base directories of each project to store the property, instead of the root directory of the repository.

  • Another method of tracking reviews is to do them based on revision dates. For instance, you could hold regular peer reviews twice a month, and always review the changes that have been committed to the repository since the last peer review.

    In this scenario, you may not even need to explicitly store something in the repository to track the reviews. Instead, you could just use Subversion's capability to retrieve revisions by date to always get the correct revisions. Of course, it could still be useful to store the dates in the repository (just in case), which can easily be done with properties, the same as when tracking by revision number.

  • If reviews aren't always done sequentially, storing the last reviewed revision or date doesn't do you much good. In this case, it might be more useful to use revision properties to store review status. When an individual revision is reviewed, set a revision property on that revision that indicates it has been reviewed.

    Alternately, you could use a single versioned property to store a list of nonsequential reviewed revisions. Using the single versioned property has the advantage of allowing you to easily view (or parse in a script) all of the reviewed revisions for a particular project. On the other hand, it decouples the indicator of review status from the revision itself, making it difficult to figure out when a revision was reviewed.

  • As an alternative to using properties to signify peer review status, you can also use tags. For instance, you could keep a sliding tag of the last reviewed revision for each of your projects. When you have a new review, you can use the tag to see what has changed since your last review, and then recreate the tag against the HEAD of your repository.
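To make those options concrete, here is a rough sketch of the commands each approach might use. The property names, URLs, dates, and revision numbers are all illustrative assumptions, not anything Subversion mandates; also note that setting revision properties requires the repository's pre-revprop-change hook to permit it.

 $ # Per-revision tracking: record the last reviewed revision on the
 $ # project's base directory.
 $ svn propset review:last_rev 780 .
 $ svn commit -m "Mark revisions through r780 as reviewed" .

 $ # Date-based tracking: pull up everything committed since the last review.
 $ svn log -r {2005-06-01}:HEAD https://svn.mydomain.com/repos/project

 $ # Nonsequential tracking: mark a single revision as reviewed with a
 $ # revision property.
 $ svn propset review:status --revprop -r 642 true https://svn.mydomain.com/repos

 $ # Sliding tag: compare against the last review, then move the tag forward.
 $ svn diff https://svn.mydomain.com/repos/tags/last-review \
            https://svn.mydomain.com/repos/project/trunk
 $ svn delete -m "Retire the old review tag" \
            https://svn.mydomain.com/repos/tags/last-review
 $ svn copy -m "Tag the newly reviewed state" \
            https://svn.mydomain.com/repos/project/trunk \
            https://svn.mydomain.com/repos/tags/last-review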

Finding Unreviewed Revisions

If you are using revision properties to mark which revisions have been reviewed and which haven't, finding the unreviewed ones by hand can be tedious. The following script automates the process. It takes the URL of a repository, along with a range of revisions to check, and walks through the revision properties in that range, printing the number of any revision that is marked as unreviewed (revisions that have no review property set at all are printed with an asterisk). Because it has to query each revision's properties in turn, the script can be slow to run (especially over large ranges), but presumably you won't need to run it often. In this version of the script, the revision range must be given as revision numbers, though it could be extended to parse dates or symbolic revision keywords.

#!/bin/bash
# review_status.sh
# Finds all revisions whose review:status revision property is 'false',
# or that have no review:status property at all.

# The URL of the repository
URL=$1

# The range of revisions to check
LOW_REV=$2
HIGH_REV=$3

# Cycle through the range of revisions
for i in $(seq ${LOW_REV} ${HIGH_REV})
do
    # Check the review status of this revision
    REVSTAT=$(svn propget review:status --revprop -r ${i} ${URL})

    # If the revision hasn't been reviewed, output the revision number
    if [ "${REVSTAT}" = "false" ]
    then
        echo ${i}
    # If the review:status property hasn't been set,
    # output the revision number with an asterisk
    elif [ "${REVSTAT}" = "" ]
    then
        echo "${i}*"
    fi
done

This script can be run from the command line, as in the following example.

 $ review_status.sh https://svn.mydomain.com/repos 425 780 

13.2.2. Distributing Material for Peer Reviews

When it comes time to do a peer review, it helps greatly to distribute the code to be reviewed to the reviewers a few days ahead of time. That way, when everyone comes together for the review, they are prepared for the discussion. The process of getting that code to the reviewers is where Subversion comes in. There are several ways you can go about distributing the code to be reviewed, each with its own advantages and disadvantages; a couple of sample commands follow the list.

  • You can inform each reviewer of the revisions that will be reviewed and allow her to examine those revisions however she wants. This puts the least burden on the organizer of the review and gives the reviewers maximum flexibility to examine the changes as they wish. Flexibility is not always a virtue, though: reviewers tend to be less likely to examine the changes if they have to work at getting to them, and in some cases a reviewer may not readily have a working copy to play around with. Note also that none of the other suggestions for distribution prevents a reviewer from using Subversion to get more information about the changes; nothing in the following examples is anything a reviewer couldn't normally generate personally.

  • You can distribute diffs of all the changes in the revisions under review. These can be easily obtained using Subversion, and they give you a tangible set of changes that can be sent to reviewers or posted on a mailing list. The downside is that many developers find diffs hard to read, and next to useless as part of a code review. On the other hand, they do put something heftier than a couple of revision numbers into the reviewer's hands, which may improve the chances that the code under review will at least be looked at before the review takes place.

  • Distribute svn blame output for the project to be reviewed. This has the advantage of putting the entire source of the project being reviewed in front of the reviewer. However, it can make it difficult to pinpoint exactly what changes need to be reviewed, because every line is annotated, not just the lines added in the desired revisions. A potential solution to that problem, though, is to process the output of svn blame to generate a reviewer's version that has all of the annotations for irrelevant revisions stripped out.

  • Instead of distributing the source directly to reviewers, you could generate a Web page to display the changes to be reviewed. The Web page could then be generated from the output of svn diff or svn blame, wrapped in markup text that could color-code sections or provide links to other parts of the project that are referenced. Every developer participating in the review would then be able to view this Web site to get up to speed on the changes. Or, if a Web site isn't practical, you could also use the output to generate an annotated and colored version of the source to be reviewed.
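As a quick example of the diff and blame options, the following commands would cover a review of revisions 425 through 780 (the URL is illustrative):

 $ # A single diff covering every change under review
 $ # (base r424, so that the change made in r425 is included).
 $ svn diff -r 424:780 https://svn.mydomain.com/repos/project/trunk > review.diff

 $ # Annotated source for one of the files under review.
 $ svn blame https://svn.mydomain.com/repos/project/trunk/parse.c > parse-annotated.c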

An svn blame Postprocessor

If you would like to use the output of svn blame to create a peer review copy of your source files, it can be useful to filter svn blame's output to only annotate the specific range of revisions that you are reviewing. The following example shows how you can create a Perl script that will do just that.

#!/usr/bin/perl
# blame_filter.pl
# Filters the output of svn blame, keeping annotations only for lines
# last changed in a specific range of revisions.

# Get the range of revisions to include in the output
$LOW_REV = $ARGV[0];
$HIGH_REV = $ARGV[1];

# Iterate through each line of svn blame output and filter the annotation
while ($LINE = <STDIN>) {
    # Get the revision number for this line (the first six columns)
    $CUR_REV = substr $LINE, 0, 6;

    # Compare the revision number to the revision range
    if (($LOW_REV <= $CUR_REV) && ($HIGH_REV >= $CUR_REV)) {
        # This revision is in our range; keep the annotation
        print $LINE;
    }
    else {
        # This revision is outside the range; strip the annotation
        # (18 spaces to match the width of the revision/author columns)
        print "                  " . substr($LINE, 18);
    }
}

This script can then be run as a filter for the output of svn blame from the command line or in a script, as in the following example (which includes the annotation for revisions 435 through 502).

 $ svn blame test.cpp | blame_filter.pl 435 502 > test-annotated.cpp 

13.2.3. Performing Peer Reviews

When it comes time to actually perform a peer review, there are a few areas where Subversion can be a useful tool. How it can best be used depends on what type of peer review you are looking to use. There is the traditional (and often dreaded) code review, which consists of several developers sitting around a table and critiquing sections of code, and the more one-on-one style of review, where each developer's work is sent directly to one or more other developers, who review it and comment back to the developer at their own convenience. Alternatively, peer review can be accomplished through a forum-style review (usually hosted on a message board or e-mail mailing list), where sections of a project to be reviewed are made available for comment by a wide array of interested parties. This sort of review is especially popular for open source projects, where the project's developers are widely scattered and diverse.

Group Reviews

For a group review, developers get together either as a group to critique each other's work or as a panel to critique the work of one or more reviewees. The code to be reviewed is usually either disseminated in paper or electronic form (e.g., to developers' laptops) or displayed on some sort of overhead projection for all to see and comment on.

The logistics of getting the code to the reviewers has already been discussed in some detail, so I won't bore you further. As for getting the code so that it can be displayed on an overhead projector, that is trivial and doesn't require any special tricks from Subversion: just check out the code and display it, possibly in svn blame format. There is, however, one other component of the review for which Subversion can be useful, and which does require more than a trivial implementation. During any review, useful commentary is provided by the reviewers involved; if it weren't, you wouldn't be having the review in the first place. Obviously, you want to keep a record of those comments as they are made, and what better place to keep that record than in Subversion properties, tied to the individual files and easily accessible after the review?
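For instance, the review's scribe might record the comments in a property directly on the file under discussion. The property name review:comments here is just a convention I'm assuming, not something Subversion defines:

 $ svn propset review:comments "2005-06-14: check the loop bounds in parse_header()" parse.c
 $ svn commit -m "Record peer review comments for parse.c" parse.c
 $ svn propget review:comments parse.c
 2005-06-14: check the loop bounds in parse_header()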

Individual Review

With individual reviews, the reviewee's code is sent to one or more reviewers, who examine the code and comment back to the developer in their own time. For the most part, the discussion of group reviews holds true for individual reviews. Distribution of the source for review can usually be handled the same way, and the suggestions for using properties to store reviews are equally useful. There is one catch with using properties in the context of individual reviews, though. One of the weaknesses of the Subversion property system is its lack of support for searching through properties. If another developer attaches reviews to your source files, it can be difficult to figure out exactly which files have had reviews attached. One solution is to use log messages to indicate where reviews have been added, but that is error prone and inelegant. A better solution is to set up a post-commit script that watches for changes to the peer review properties. When a change is made, the modified peer review can be sent via e-mail to the target developer. That way, you always have a complete record of all peer reviews, readily accessible alongside the file itself, but each developer also gets a much more immediate notification of new reviews. A sketch of such a hook follows.
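The following post-commit hook is a minimal sketch of that idea: it watches for property modifications in the committed revision and mails the contents of the review property to a developer alias. The property name review:comments and the address dev@mydomain.com are illustrative assumptions.

#!/bin/bash
# post-commit hook sketch: mail updated peer review comments to a
# developer alias. The property name (review:comments) and the mail
# address are placeholders.

REPOS="$1"
REV="$2"

# Property-only modifications appear with 'U' in the second status
# column of svnlook changed output (e.g., "_U  trunk/parse.c").
svnlook changed "${REPOS}" -r "${REV}" | while read -r STATUS FILEPATH
do
    case "${STATUS}" in
    _U|UU)
        # Fetch the review property as of this revision (ignore errors
        # for paths that don't carry the property).
        COMMENTS=$(svnlook propget -r "${REV}" "${REPOS}" review:comments "${FILEPATH}" 2>/dev/null)
        if [ -n "${COMMENTS}" ]
        then
            echo "${COMMENTS}" | mail -s "Peer review updated: ${FILEPATH}" dev@mydomain.com
        fi
        ;;
    esac
done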

Forum Review

The final sort of review is a forum review, which is the sort of review most likely to be used in an open source project. In this type of review, changes to a project are made available through some sort of public forum (although "public" could well be limited to within a company or project group) where developers are free to view the changes and comment as they feel necessary. An example of this sort of review would be the Linux Kernel Mailing List, where potential additions to the kernel are sent to the mailing list in the form of patches, to be put up for comment and discussion prior to possible inclusion in the repository.

With Subversion, you can support this sort of review by setting up the repository to automatically send diffs and log messages to a mailing list whenever commits to certain project branches are made. For instance, if you have branches for each developer authorized to commit to an open source project, you could set up a post-commit script that would use svnlook diff and svnlook log to retrieve the changes made for each commit to one of those branches and send it to a developer mailing list. Then, when the discussion on the commit is concluded, the appropriate developers can merge the changes into the main project trunk. It is even possible to have a script that receives the mails sent to the developer list and automatically archives peer review discussions into properties on the appropriate files (the downside being that thread tracking isn't always perfect, and you might end up missing some e-mails, or getting other unrelated ones thrown in).
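A minimal version of such a hook might look like the following. It assumes a branches/ directory for developer branches and a placeholder mailing list address; a production hook would want real error handling and probably a dedicated mailer script.

#!/bin/bash
# post-commit hook sketch: mail the log message and diff for commits to
# developer branches to a mailing list. The branch layout and list
# address are assumptions.

REPOS="$1"
REV="$2"
LIST="dev-list@mydomain.com"

# Only mail commits that touch the developer branches.
if svnlook dirs-changed "${REPOS}" -r "${REV}" | grep -q '^branches/'
then
    {
        svnlook log "${REPOS}" -r "${REV}"
        echo
        svnlook diff "${REPOS}" -r "${REV}"
    } | mail -s "[svn commit] r${REV}" "${LIST}"
fi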


