9.3. Migrating an Existing Repository

Sometimes, a Subversion repository will be created as part of a brand new project. In those cases, getting initial data into the repository is easy: there is none. More often than not, though (especially because Subversion is so new), a new Subversion repository will be part of a migration away from another version control system. As part of that migration, there is a whole history of data that most people aren't going to want to lose. Therefore, the ideal solution is to be able to take the entire repository history from the old system and migrate it over to Subversion.

The two most common version control systems that people migrate to Subversion from are almost certainly CVS and Microsoft's Visual SourceSafe. This has led to the creation of migration tools that allow you to take repositories from both systems and create a Subversion repository that preserves the history of all of the files in the old repository.

9.3.1. The Basic Migration Process

Whatever the system that you are migrating from, there are a few things that you should always remember. Failure to heed these warnings will not harm pets or small children, but could result in loss of data, or even loss of a job.

  • Always back up your existing repository before attempting any sort of migration. Just because the migration tool shouldn't mess with your old repository doesn't mean that, if something bad happens, it won't.

  • Always back up your existing repository before attempting any sort of migration. Just because the migration tool shouldn't mess with your old repository doesn't mean that, if something bad happens, it won't. (Yes, I meant to say that twice.)

  • Have a migration plan. Do you intend to move everyone over to Subversion immediately? If not, are some people going to continue using the previous system as their primary VCS while others migrate completely, or is everyone going to mirror all of their changes into both systems during a transition period?

  • Keep the old repository around, just in case. Until you are positive that your new Subversion repository is going to work out, make sure that you can go back to the old system.

  • Test everything in the new repository after the migration. Make certain that the HEAD revision of your repository is correct and working inside the Subversion repository. It might even be a good idea to run a diff on all of the files in a working copy of your Subversion repository, to make sure they match the files from your old VCS.

  • Know what you're losing. Because the VCS that you're migrating from is not Subversion, it doesn't store exactly the same things that Subversion does. Invariably, some data (however minor) will be lost in the transition. Make sure you know what you are losing, and store it somewhere else if it's important to keep (properties may be a good place to store information that you want to save).
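
The diff check suggested in the list above can be sketched as a small shell function. This is only a sketch, assuming you have a checkout from the old system and a working copy of the new Subversion repository side by side, and that your diff supports the -q and -x options (GNU and BSD diff do). The function name compare_trees and the paths in the usage comment are hypothetical.

```shell
# compare_trees OLD_DIR NEW_DIR
# Recursively compares two checkouts, skipping the bookkeeping directories
# (.svn, CVS) that each system keeps inside its working copies.
# Lists the files that differ and returns non-zero if the trees differ.
compare_trees() {
    diff -r -q -x .svn -x CVS "$1" "$2"
}

# Hypothetical usage:
#   compare_trees /tmp/old-checkout /tmp/svn-wc && echo "HEAD matches"
```

If keyword expansion is active in either system, expanded keywords may show up as differences even when the underlying files match.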

9.3.2. Migrating from CVS

If your existing project source is stored in a CVS repository, you are in luck. The cvs2svn utility provides excellent conversion tools for migrating a CVS repository to SVN, while preserving most (if not all) of your history data. You can even import data from a CVS repository into an existing Subversion repository that already contains other data, and can pick and choose exactly which data you want to import.

You can acquire cvs2svn from the project's Web site, cvs2svn.tigris.org. The program is a Python script, so it doesn't require any installation, and can run on either MS Windows or a UNIX-like system, as long as you have Python and a couple of other prerequisites installed. To find out exactly what you need to install, you can look at the official cvs2svn documentation on the project's Web site.

Full Repository Migration

A complete migration of an existing CVS repository to a brand new Subversion repository can be accomplished by running cvs2svn with the name of the Subversion repository and the CVS repository. If the Subversion repository referred to doesn't already exist, cvs2svn will create it for you (unless you pass --existing-svnrepos to tell it to only use a Subversion repository that already exists).

 $ cvs2svn -s /var/svnrepos /var/cvsroot 

If you would rather have cvs2svn create a Subversion dumpfile, instead of directly importing into a repository, you can pass --dump-only instead of -s repository.

 $ cvs2svn --dump-only /var/cvsroot 

Then, you can load the dumpfile into a Subversion repository later, using the svnadmin load command.

 $ cat cvs2svn-dump | svnadmin load /var/svnrepos 
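
Before loading, it can be worth confirming that the file really is a dumpfile. The sketch below relies on the fact that Subversion dumpfiles begin with an SVN-fs-dump-format-version header line; the function name is_svn_dumpfile is hypothetical.

```shell
# is_svn_dumpfile FILE
# Succeeds if the first line of FILE is the header that Subversion
# dumpfiles start with; fails otherwise.
is_svn_dumpfile() {
    head -n 1 "$1" | grep -q '^SVN-fs-dump-format-version: [0-9][0-9]*'
}

# Hypothetical usage, guarding the load step:
#   is_svn_dumpfile cvs2svn-dump && svnadmin load /var/svnrepos < cvs2svn-dump
```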

Partial Repository Migration

If you don't want to migrate an entire CVS repository, cvs2svn allows you to only migrate part of the repository. For example, you can migrate just the trunk of a repository by running the conversion with the --trunk-only option.

 $ cvs2svn --trunk-only -s /var/svnrepos /var/cvsroot 

Or, you can convert a custom selection of branches and tags by using the --exclude option to tell cvs2svn what parts of the repository you don't want to be converted. The exclude option allows you to pass regular expressions that cvs2svn will use to determine which branches/tags to ignore during the conversion. For instance, the following example will convert an entire repository, except for the branches that were used for fixing issues in the issue-tracking system.

 $ cvs2svn --exclude='issue-.*' -s /var/svnrepos /var/cvsroot 
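
Because the pattern is a regular expression rather than a shell glob, it can help to preview which branch and tag names it would catch before running a long conversion. The sketch below assumes the pattern has to match the whole symbol name (check the cvs2svn documentation for your version), and it uses grep's extended regular expressions, which are close to, but not identical to, the Python syntax that cvs2svn uses. The function name preview_exclude and the sample names are hypothetical.

```shell
# preview_exclude REGEXP
# Reads candidate branch/tag names on stdin, one per line, and prints
# those that REGEXP matches in full (-x), using extended syntax (-E).
preview_exclude() {
    grep -E -x -e "$1"
}

# Hypothetical usage:
#   printf '%s\n' issue-101 issue-202 release-1_0 | preview_exclude 'issue-.*'
```

With the sample names in the usage comment, only the two issue- names are printed.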

Handling Data Differences

CVS and Subversion are very similar, but they don't store data in exactly the same way. The most obvious difference, of course, is the way the two handle branches and tags. Instead of using copies, like Subversion does, CVS deals with tags and branches differently than it deals with the repository trunk. That means that when cvs2svn converts the repository, it needs to convert the CVS branches and tags into copied directories inside the Subversion repository.

By default, cvs2svn creates top-level branches, tags, and trunk directories and places branches and tags correctly into their respective directories. If you want to create a repository that places branches, tags, and the trunk somewhere other than the default top-level directories, you can do so by passing the --branches, --tags, and --trunk options, respectively. For instance, the following example shows a conversion that will place the converted repository into a subdirectory specific to the CVS repository's project.

 $ cvs2svn --trunk='myproject/trunk' --branches='myproject/branches' --tags='myproject/tags' -s /var/svnrepos /var/cvsroot 

Another fairly major difference between Subversion and CVS is the handling of revision numbers. CVS keeps revision numbers for each file individually, whereas Subversion keeps a global repository revision number. In most cases, this change isn't a big deal, but sometimes developers will remember the revision numbers to use later. If you don't want to lose the file-specific CVS revision numbers when you perform the migration, you can pass cvs2svn the --cvs-revnums option. This tells it to create a property to store the CVS revision numbers for each file that is converted.

Handling end-of-line markers can be another sticky area of conversion. CVS's standard mode of operation is to convert line endings to the native line-ending format for the local operating system of the working copy when a file is checked out. Subversion, on the other hand, never makes any modifications to the file, by default. If you have a CVS repository, though, it is likely that some of your developers have come to rely on the default CVS line-ending modifications. To make the conversion a little bit easier, cvs2svn automatically sets the svn:eol-style property to native for all files that CVS hasn't been explicitly told not to do line-ending conversions for. If you don't want cvs2svn to set all of the files from your CVS repository to do line-ending conversions when they're checked out, you can pass the --no-default-eol option when it converts the repository.

CVS doesn't know anything about MIME types for files. Subversion, however, can use MIME types constructively in a number of situations, so it would be useful if the repository conversion could automatically set the svn:mime-type property for all of the files in your CVS repository. Well, as you may have guessed already, it can do just that. If you pass cvs2svn the --mime-types=FILE option, with FILE pointing to a mime.types file, it will attempt to assign the MIME type for every file it converts.

The mime.types file tells cvs2svn what MIME types it should match to files with given file extensions. Each entry in the file will contain a MIME type, followed by a list of the file extensions that should be matched with it. For example, you might have an entry in your mime.types file that told cvs2svn to give all files that ended in .c, .cpp, or .h the MIME type text/x-c, which would look something like the following.

 text/x-c       c   cpp   h 

If you have Apache installed on your system, you probably have a default mime.types file somewhere. You may want to find that file and use it as a starting point for writing your own mime.types file to use when converting your repository.
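
To preview how a mime.types file would classify a given file before running the conversion, a lookup along the following lines may help. It is a minimal sketch that assumes the simple format described above (a MIME type followed by whitespace-separated extensions, with # starting a comment line) and an exact, case-sensitive extension match; cvs2svn's own matching may differ in detail. The function name mime_type_for is hypothetical.

```shell
# mime_type_for FILENAME MIME_TYPES_FILE
# Prints the MIME type whose extension list contains FILENAME's extension.
# Prints nothing if no entry matches or the name has no extension.
mime_type_for() {
    name=${1##*/}                       # strip any leading directories
    case $name in
        *.*) ext=${name##*.} ;;         # text after the last dot
        *)   return 0 ;;                # no extension, nothing to look up
    esac
    awk -v ext="$ext" '
        /^[ \t]*#/ { next }             # skip comment lines
        { for (i = 2; i <= NF; i++)     # fields 2..NF are extensions
              if ($i == ext) { print $1; exit } }
    ' "$2"
}

# Hypothetical usage:
#   mime_type_for src/main.cpp /etc/mime.types
```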

If you're using --mime-types, you may also want to have cvs2svn decide whether it should set the svn:eol-style based on the MIME type that it sets for each file. To do so, you need to pass the --eol-from-mime-type option to cvs2svn. However, this option will only have an effect if the --mime-types option is also used.

A final difference between CVS and Subversion that needs to be addressed is the way keywords are handled. CVS automatically performs keyword substitutions on all files that aren't explicitly identified as binary when the file is added to the repository. Conversely, Subversion doesn't perform keyword expansions on any files, unless it is explicitly told to. However, if you use a lot of keyword expansions in your CVS repository, the odds are that you would like to continue to use them in your new Subversion repository. Therefore, by default, cvs2svn will set the svn:keywords property on all of the files it converts to "author id date" (except ones marked as binary in CVS). If you don't want the property set, you can turn it off with --keywords-off.

9.3.3. Migrating from SourceSafe

Microsoft's Visual SourceSafe is not the darling of the version control market. In fact, it seems to be common wisdom within Subversion circles that there are two kinds of VSS users in the world: those who have lost data to a corrupted database, and those who will. With such a charmless reputation, it's no wonder that migrations away from VSS seem to be one of the most common types of migration performed. Fortunately, that means that if you find yourself clamoring to get away from VSS, there are a number of tools available to aid you in your plight.

The most full-featured tool available appears to be the vss2svn.pl conversion script, which is available from vss2svn.tigris.org. As I'm writing this book, the script is still listed as being in an alpha release, but it does have support for converting most of the information in a typical VSS repository. When you run the script, you will give it an existing VSS repository and an existing SVN repository (which can be a brand new one you just created). It then gets the data out of the VSS repository and inserts it into the Subversion repository.

The basic operation for the vss2svn.pl script is pretty simple. If you have VSS properly installed on your system, you can run the script from a Windows command prompt; just tell it which VSS project to migrate, and what Subversion repository to migrate it into. For example, if you want to migrate the project foo from your VSS repository into a newly created (empty) repository, you could run the following.

 C:\vss2svn>vss2svn.pl --vssproject $/foo --svnrepo http://svn.example.com/svnrepos/ 

The vss2svn.pl script also lets you do more complicated processing by allowing you to specify projects that should be excluded from the migration using --vssexclude, listing either absolute paths to exclude or paths relative to the project specified by the --vssproject option. You can also perform other processing on the migration, such as specifying messages that should be appended to every log message for migrated files (--comment), or you can tell vss2svn.pl to set the svn:date property for each migrated revision to reflect the original VSS commit date. If your VSS repository requires a login and password, you can specify that with the --vsslogin option, giving it a username and password, separated by a colon.

If vss2svn.pl turns out to be insufficient for converting your repository, you may want to do a bit of searching online, as there are a few other VSS conversion tools being passed around. I can't vouch for how well or poorly any of them work, but as long as you have your database backed up, no harm should come to your data.

9.3.4. Migrating from Other VCSs

There are a number of other version control systems for which people have generated conversion scripts. The scripts appear to be in varying degrees of completion, and look to have often been created with just enough power and flexibility to convert the scriptwriter's own repository. However, that may just be enough for your repository too, and if it isn't, the modifications necessary to make it work may very well be easier than writing your own conversion tool from scratch.

Some of the version control system converters that I was able to find include a converter for a Perforce repository (this converter is linked to from Subversion's Web site), and a number of converters for a ClearCase repository.

9.3.5. What If There's No Migration Tool?

So, what if you don't use CVS or SourceSafe, but instead your entire code repository is in Bob's Discount VCS (or more likely, Bob's Mind-Bogglingly Expensive VCS)? In that case, you have a couple of options. The first step, of course, is to search online to see if someone else needed to migrate from your VCS to Subversion and wrote a conversion tool. You may also want to search the Subversion users' mailing list to see if someone out there did a similar conversion and is willing to share any tools that she created (or just some good advice about the problems she ran into). You also have the option of writing your own conversion tool if there is no sufficient tool already written. The source for existing conversion tools may be invaluable here. Or, you can just keep the old repository running as a reference and go from there.

If no tool exists (and creating one is impractical), your best bet is to check out a working copy of your current repository's HEAD, and then svn import that into a new Subversion repository. Then, you may want to go through and recreate important tags or long-running branches by hand. Simply check out the appropriate tags/branches and use svn import to add them to the new repository in the appropriate place. The new branches and tags won't have any history link to the files in the main trunk, but nothing else will have a history either, and Subversion won't care that there's no link down the road when you want to merge files from a branch or tag into the main trunk. If, for some reason, you decide that it is important for the branches and tags to be linked to the trunk, you can achieve that by creating the branch or tag inside the Subversion repository (using the newly imported trunk HEAD), and then copying over the versions of the files in the tag/branch from a working copy of the old repository.
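
The manual approach can be sketched as a script that prints the svn import commands instead of running them, so that the plan can be reviewed first. It assumes you have already placed the old repository's HEAD and each tag you care about into separate directories, without the old system's bookkeeping files; the repository URL, directory layout, log messages, and the function name print_import_plan are all hypothetical.

```shell
# print_import_plan REPO_URL EXPORT_DIR TAG...
# Expects EXPORT_DIR/trunk and EXPORT_DIR/tags/TAG to hold files taken
# from the old VCS. Prints one svn import command per tree; runs nothing.
print_import_plan() {
    repo=$1
    src=$2
    shift 2
    echo "svn import $src/trunk $repo/trunk -m 'Import HEAD from old VCS'"
    for tag in "$@"; do
        echo "svn import $src/tags/$tag $repo/tags/$tag -m 'Import tag $tag from old VCS'"
    done
}

# Hypothetical usage:
#   print_import_plan file:///var/svnrepos /tmp/export release-1_0 release-1_1
```

Printing the commands first keeps the step reversible: nothing touches the new repository until you paste or pipe the reviewed lines into a shell.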

After you have the new repository created, it's a good idea to keep the old repository up and running, in case someone finds that he needs some of the older repository information (I suggest making it read-only). In most cases, though, you'll probably find that the old repository is rarely accessed. After a time, you may even find that you can mothball the old repository, and just keep a backup of the data that could be restored later if needed.

On the other hand, if your project's development process depends heavily on retrieving and comparing data from old revisions, separating your new and old data into different repositories may not seem like such a good idea. If there are no tools available (and creating them isn't an option), two separate repositories may be your only option. There are, however, a couple of things that you can do, which may make things a little bit easier to deal with.

  • Keep working copies of both repositories handy. That way, if you need some information about a revision from the old repository, you can just hop over to that repository real quick and check what you need.

  • Create wrapper scripts that will bind your new Subversion repository and your old repository together. If you have scripts for all of the common (read-only) repository history querying commands you perform (such as diffs and log checking), you will be able to compare information almost seamlessly between the two repositories. (I don't suggest trying to allow cross-VCS commands that write to both repositories, at least without an awful lot of testing, as this is likely to introduce subtle bugs that could result in data loss.)

    For instance, if you want to create a wrapper for the diff command that spanned two different repositories, you could write a script that took revision identifiers for either repository (perhaps with qualifiers, if the repository identifiers are ambiguous about the version control system they refer to). Then, if it got two revisions for the same repository, it could simply run the native diff command for that repository. If, instead, it was passed two revisions on different version control systems, it could retrieve those files from their respective system and run an external diff program to compare the two.

  • After you have the new repository established, you very well may find points where you want to merge data that is in a revision from the old repository into a new revision in the new repository. If this happens, create a new tag in the Subversion repository that identifies the revision from the old repository you'll be merging from. (If it doesn't make sense to use the directory's name to identify the revision, use a property.) Then, import the revision to be merged into the new repository directory from the working copy of the old repository, just as if it were a set of unversioned files. After you have the new tag directory with the old repository revision, you can merge that directory into your Subversion repository trunk, using Subversion's standard merging tools.

    The downside to this method is that it causes your Subversion repository to grow, and if these sorts of merges occur often, they may grow the repository too much. If that is a concern, you can use external merging tools to perform the merge directly from your old repository's working copy into your Subversion repository. If you use this approach, it's best to use a property to keep track of where the merge occurred from, or at the very least make sure it's documented in the log for the merge commit.

  • If you're really ambitious, you might want to modify a Web-based repository browsing tool (such as ViewCVS or WebSVN) to support both Subversion and your old version control system. Then, the Web site could serve as a frontend to both VCSs, making the transition between the two as seamless as possible.
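
The diff wrapper described in the list above might start from a dispatch step like the following sketch. It assumes revision identifiers carry a hypothetical qualifier, svn: for the new repository and old: for the previous one; the function name diff_strategy and the labels it prints are likewise made up for illustration. A full wrapper would act on the result by invoking the matching tool.

```shell
# diff_strategy REV1 REV2
# Each revision is qualified as svn:REV or old:REV. Prints which kind of
# diff a wrapper should run; returns non-zero for an unqualified revision.
diff_strategy() {
    case $1 in
        svn:*) a=svn ;;
        old:*) a=old ;;
        *) echo "unqualified revision: $1" >&2; return 2 ;;
    esac
    case $2 in
        svn:*) b=svn ;;
        old:*) b=old ;;
        *) echo "unqualified revision: $2" >&2; return 2 ;;
    esac
    if [ "$a" = "$b" ]; then
        echo "native-$a"    # same system: use that system's own diff
    else
        echo "external"     # mixed: fetch both files, run an external diff
    fi
}

# Hypothetical usage:
#   diff_strategy svn:120 old:1.4
```

Requiring an explicit qualifier on every revision avoids guessing which system a bare number belongs to, which matters once both repositories contain revisions with similar-looking identifiers.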



    Subversion Version Control. Using The Subversion Version Control System in Development Projects
    ISBN: 131855182
    EAN: N/A
    Year: 2005
    Pages: 132
