14.2. Real-world Studies

Hypothetical case studies based around common project archetypes have their uses, but it can also be helpful to look at some real-world projects and how they manage many of the issues associated with using Subversion. As with the archetypal studies, in this section I will examine the different choices made by these projects, and how those choices fit in with the topics discussed in previous chapters.

14.2.1. KeyGhost Ltd

KeyGhost, Ltd. is a developer of embedded and PC software and hardware that uses Subversion for storing not only software source code, but also documentation in the form of Open Office files, and hardware designs. It chose Subversion based on indications that it is poised to become the next open source version control standard.

Repository

KeyGhost arranges its projects into 29 separate repositories, one for each project, most of which are legacy projects that see little activity. Its more active projects have around 500 revisions, with a total repository size of 1GB. In total, the repositories are used by fewer than 10 developers, all of whom have rights to commit changes.

Each repository is organized into a top-level directory, named for the project, with trunk, branches, and tags directories in each project. Under the trunk directory, developers categorize project files into source code, documentation, and hardware designs (using source, docs, and pcb, respectively). Figure 14.5 shows an example of the standard layout for a KeyGhost repository. KeyGhost makes use of tags for storing project releases, and uses branches whenever it has an appropriate need for branching a project's development.

Figure 14.5. The standard KeyGhost repository layout.
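The layout described above can be written out as a short list of paths. The following Python sketch only builds the directory names; the project name is a placeholder, and the helper function is hypothetical rather than anything KeyGhost uses.

```python
from pathlib import PurePosixPath


def keyghost_layout(project):
    """Return the repository paths for one project, following the layout
    described in the text: a top-level project directory containing trunk,
    branches, and tags, with source, docs, and pcb under trunk."""
    root = PurePosixPath("/") / project
    trunk = root / "trunk"
    directories = [
        root,
        trunk,
        trunk / "source",   # software source code
        trunk / "docs",     # documentation (Open Office files)
        trunk / "pcb",      # hardware designs
        root / "branches",
        root / "tags",      # project releases
    ]
    return [str(directory) for directory in directories]


# "exampleproject" is a placeholder name, not a real KeyGhost project.
for path in keyghost_layout("exampleproject"):
    print(path)
```

Each of these paths could then be created in a new repository with svn mkdir, or imported from a matching directory skeleton.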
The repositories themselves are hosted on a Microsoft Windows 2000 server, using the Berkeley DB database backend. To share the repository, KeyGhost uses the Apache server, largely due to its ease of setup and administration. It also uses secure HTTP over SSL to secure the repository for remote access. The KeyGhost developers access Subversion from Windows 2000 client machines, and use TortoiseSVN as a GUI client.

Migrating to Subversion

The KeyGhost migration to Subversion involved a conceptual change: from a paradigm in which files were locked so that only a single developer could modify a particular file at a time, to the Subversion merge paradigm without locking. Additionally, KeyGhost had to overcome the hurdle of a user base without previous experience in version control. To overcome this issue, developers were given training in version control, and provided with the TortoiseSVN GUI client to make the learning curve significantly less steep.

Storing Binary Files

In addition to storing text-based source code files, KeyGhost also uses its Subversion repository for storing binary files from Open Office and its circuit board design package. Despite Subversion's use of a binary difference algorithm to store only changes to a binary file, developers found the storage requirements from one version of a file to the next to be hefty. To keep repository growth in check, KeyGhost has made a policy of limiting commits to its binary files.

Repository Migration

Subversion makes a valiant attempt to make restructuring of a repository a simple process. However, KeyGhost discovered that simple does not mean trouble free, and it can be prudent to put some long-term thought into structure. KeyGhost began with a single Subversion repository, using a single top-level /trunk directory with individual projects in subdirectories under that.
After using Subversion for a while, however, KeyGhost decided to migrate to its current structure of multiple repositories, with one project per repository. Because a number of files had been moved or deleted, KeyGhost found that svnadmin dump and svndumpfilter were unable to properly migrate all of the projects with their full histories. In the end, KeyGhost was forced to resort to checking out working copies and reimporting those into a new repository (which still caused a loss of history).

14.2.2. Error Free Software

Error Free Software (EFS) develops a proprietary trading system, which it stores in a Subversion repository. EFS examined a number of different version control systems and settled on Subversion due to its snug fit with the EFS environment. The developers found it to have a full feature set, without any undue complexity.

Repository

The EFS repository is arranged with a number of different top-level directories with a variety of purposes, as you can see in Figure 14.6.

Figure 14.6. The Error Free Software repository layout.

/branches: This directory stores project branches. EFS doesn't make very much use of this directory, and as of this writing only had a total of six branches.
/dailyLibraryBuild: Daily builds of the repository are stored here. Each daily build is placed in its own directory. The directories are named for the date of the build, using two-digit year, month, and day numbers (YYMMDD).
/releases: Project releases are stored here.
/doc: This directory is used to store documentation for individual projects.
/projects: The EFS trading system consists of a large number of application suites and libraries. This directory is used to store individual application suites, which are linked to the various libraries using svn:externals properties.
/src: The actual source code for the repository (which is linked via svn:externals in /projects) is stored in this directory. This acts as EFS's /trunk directory.
/spd: EFS stores its design documents for its software in this directory.

The repository itself is very large, totaling more than 35,000 revisions in a 2GB database. Much of the repository, however, was preexisting when EFS migrated to Subversion, and was carried over from SCCS. The repository is hosted on a machine running RedHat Linux, and uses Berkeley DB as its repository database backend. The repository is also accessed by about 30 people every day, most of whom perform regular commits. The developers access the repository from a mix of machines running Sun's Solaris and machines running Microsoft's Windows XP.

Remote access to the repository is done through Apache, which was chosen due to its ease of integration into the existing authentication infrastructure, previous familiarity with Apache, and general all-round good looks. It also made it easy to make the repository accessible from a Web browser.

14.2.3. Teledata Communications

Teledata Communications, Inc. (TCI) uses Subversion to store all of its source code, documentation, and build projects, as well as information from data providers, and development documentation. TCI began testing Subversion fairly early on in its development, at around version 0.24, and has been using it in a production setting since July of 2003, after putting it thoroughly through its paces.

One of the major reasons for switching to Subversion from TCI's previous (commercial) version control system was to save costs on per-seat developer licensing. As the company was growing in size, it came to the conclusion that its previous VCS solution wasn't worth the cost. So instead of shelling out more money to license new developers, TCI decided to make the jump to Subversion. Even though its developers had experience with the previous system, as did most of its new hires, the benefits of moving to Subversion outweighed the costs of training.
The other reason for TCI's switch to Subversion is best illustrated by Mark Bohlman, the Software Development Manager at TCI.
Repository

Teledata Communications' data is split into three separate repositories, each of which holds a different type of data.
Branches and Tags

The TCI developer repository uses tags in its automated build process. The Java applications that developers build run under WebLogic and have an Ant-based build process that involves creating a tag for each build provided to a development test, QA test, or production environment. To ensure consistency between the three builds, they are all done at the same time. Custom properties are used to indicate the configuration files that should be used for determining build environments.

TCI also makes use of branches for a variety of purposes. Changes in branches are periodically merged back into the trunk, as appropriate.
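The tag-per-build step described above can be sketched in Python as follows. This is not TCI's build script, which the text does not show. The repository URL, build identifier, tag naming scheme, and environment names are all assumptions made for illustration. The sketch only assembles svn copy command lines and does not run them.

```python
# Hypothetical environment names; the text mentions development test,
# QA test, and production environments but not how they are labeled.
ENVIRONMENTS = ("dev", "qa", "prod")


def build_tag_commands(repo_url, build_id, revision):
    """Assemble one 'svn copy' command per environment, all taken from the
    same trunk revision so that the three builds stay consistent."""
    commands = []
    for environment in ENVIRONMENTS:
        commands.append([
            "svn", "copy",
            "-r", str(revision),
            f"{repo_url}/trunk",
            f"{repo_url}/tags/{build_id}-{environment}",  # assumed naming scheme
            "-m", f"Tag build {build_id} for {environment}",
        ])
    return commands


# Placeholder URL and build number, for illustration only.
for command in build_tag_commands("https://svn.example.com/repos/app", "build-42", 1234):
    print(" ".join(command))
```

Copying from one pinned revision, rather than from whatever HEAD happens to be, is what keeps the three tags identical even if commits land while the build is running.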
Branches are also sometimes used for bug fixes. Whether to do bug fixes in a branch or in the trunk depends on the development state of an application (i.e., QA, beta, or production).

Hook Scripts

Hook scripts are used for
14.2.4. GladeSoft

This case study looks at GladeSoft, Inc. GladeSoft is the smallest company among the various case studies (it has three developers accessing the repository). GladeSoft migrated to Subversion under familiar circumstances, after finding CVS too painful to continue using. Within the company, Subversion repositories are used to store source code, corporate data, and the GladeSoft Web site. GladeSoft's choice to use Subversion came down to a variety of different requirements it had for a version control system.
Repository

GladeSoft uses three separate repositories for storing information.
The source code repository is arranged with a standard /trunk, /branches, and /tags. Inside /branches are several subdirectories that allow the developers to categorize their branches. Tags are created to mark feature freezes and release points in the source code, and branches are used mainly for making customer-specific changes. The repository is small, holding less than 20MB of data, in over 1500 revisions, and is accessed by three developers who commit changes.

The other two repositories don't make use of branching or tagging, and simply have their main file tree at the top level. These repositories are even smaller than the source code repository, clocking in at around 5MB, with even fewer revisions (GladeSoft doesn't like doing paperwork).

All three repositories are served via Apache, from an old 200MHz PowerPC running Gentoo Linux. HTTP was chosen for its capability to work without local shell accounts (for individual users), which provides extra security. It is also used for its source browsing capability, which makes it easy for GladeSoft to quickly check a source file or do an informal code review. Client connections are made from a menagerie of operating systems, including Windows, Linux, various BSDs, and OS X.

Hook Scripts

GladeSoft also uses two hook scripts for its source repository.
14.2.5. ExCo

In this case study, we will look at a company that uses Subversion to store its complete source code base, as well as its build tools. The company declined to have its real name mentioned, so to protect the innocent I will call it "ExCo" instead.

ExCo began using Subversion in 2003 after migrating from a CVS repository, to which it had in turn migrated from Microsoft Visual SourceSafe in 2000. The migration occurred because CVS was not meeting ExCo's needs (although it was still better than VSS). Subversion allowed ExCo to maintain a similar development paradigm (thus less training) while making almost everything easier to perform. Some of the other reasons for the migration include
Repository

ExCo's repositories are set up with a fairly standard arrangement. The top-level directories are made up of /trunk, /branches, and /tags, as well as a directory named /devbranches, where individual developers can create their own task branches (see Figure 14.7).

Figure 14.7. The ExCo repository layout.

ExCo has three repositories, which hold 8,000, 9,600, and 2,400 files, respectively, and are used to store different sets of projects. Access to the repositories is through an Apache server, chosen for its stability and security (via SSL). Because ExCo has developers overseas, the security was an important feature. All developers who have access, approximately 20 to 28 people, are allowed to commit to the repositories. The server is hosted on a Solaris machine, with users connecting from Windows 2000 clients.

Branches and Tags

Branches and tags are heavily used inside the ExCo Subversion repositories.
Because of ExCo's heavy use of branches, it has found it necessary to deal with merge tracking (which Subversion lacks in any real form). Instead of having an overall merge tracking plan, however, ExCo relies on individual developers to track their own changes and merges. To date, this has worked well and not caused any real problems.

The People Problem

One of the issues noted by ExCo as a problem to be dealt with was not technical at all. It is the problem of getting developers to integrate their work process with a version control system. Ron Bieber (of "ExCo") explained this issue.
14.2.6. Wye Corp

In this case study, we'll look at a company that does embedded development, and uses Subversion to store its firmware and hardware specifications, as well as source for device drivers and testing applications. The company in question declined to be identified by name for this book, so I'll refer to it as "Wye Corp."

Wye Corp switched to Subversion after hitting one too many walls while dealing with CVS's limitations. Many of its projects had started out for internal use only, but as time passed, its customers started using its tools, which inevitably led to requests from the customers to add features and expand the projects. Attempts to expand the projects, however, quickly hit a wall with CVS, as developers attempted to restructure file and directory layouts and found it impossible without breaking CVS's file history.

Repository

Instead of setting up a single repository, Wye Corp uses a different repository for each project, 16 in all. The repositories range in size up to 400MB, but have relatively low revision counts (under 500). Some of the repositories are as old as two years. Overall, the company has 14 developers accessing its various repositories, and limits access to only those people who have a need.

Repository layout for Wye Corp is a fairly standard /trunk, /branches, and /tags setup, with the addition of a dedicated /releases directory. The releases directory allows Wye Corp to separate internal tags used for marking milestones (such as points where support for a special feature was incorporated) from releases that were delivered to a customer, which is something Wye Corp needs to keep careful track of. Inside /releases, there is also a directory called info, which contains a file releases.txt. Wye Corp uses the releases.txt file to keep track of which customer received each released version of its software. Figure 14.8 shows how this layout is arranged.

Figure 14.8. The Wye Corp repository layout.

The repository itself is hosted on a Dell server with 400GB of RAID5 storage space and 1GB of memory, running RedHat Enterprise Linux. Remote access to the repository is served via Apache. Originally, HTTP was chosen because it was the only network-capable server available (Wye Corp was a very early adopter of Subversion, starting at version 0.17). Later on, however, it began to rely on the convenience of browsing the repository over the Web, as well as the ability to authenticate against its LDAP servers.

Branches and Tags

Wye Corp has a few different uses for branches and tags.
For the most part, Wye Corp has found little reason to do many merges from one branch to another, because most of its development occurs on the main trunk, which it uses to directly create release tags. Occasionally, however, circumstances have required merges to a release that used only certain revisions. In those cases, Wye Corp has used a separate file in the repository to keep track of all merges.

Hook Scripts

Wye Corp has several hook scripts in place to enforce policy and provide automatic notifications.
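The text does not say which policies Wye Corp's hooks enforce, so the following Python sketch is only a generic illustration of a policy-enforcing pre-commit hook, not a reconstruction of Wye Corp's scripts. The policy shown (a minimum log message length) and the threshold are assumptions. Subversion runs a pre-commit hook with the repository path and the transaction name as arguments, and svnlook log -t reads the pending log message.

```python
import subprocess
import sys

# Hypothetical policy threshold, chosen only for this example.
MIN_LOG_LENGTH = 10


def log_message_problem(message):
    """Return a description of the policy violation, or None if the
    log message is acceptable."""
    text = message.strip()
    if len(text) < MIN_LOG_LENGTH:
        return f"Log message must be at least {MIN_LOG_LENGTH} characters long."
    return None


def main(argv):
    repos, txn = argv[1], argv[2]
    # Ask svnlook for the log message of the uncommitted transaction.
    result = subprocess.run(
        ["svnlook", "log", "-t", txn, repos],
        capture_output=True, text=True, check=True,
    )
    problem = log_message_problem(result.stdout)
    if problem:
        # Text written to stderr is passed back to the committing client.
        print(problem, file=sys.stderr)
        return 1  # a nonzero exit status rejects the commit
    return 0


# Only run when invoked the way Subversion invokes it: REPOS and TXN arguments.
if __name__ == "__main__" and len(sys.argv) == 3:
    sys.exit(main(sys.argv))
```

Keeping the policy check in its own function, separate from the svnlook call, makes it possible to test the rule without a repository.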
14.2.7. ZedCom

For our final case study, let's look at one more company that decided not to have its name used. We'll call it "ZedCom." ZedCom uses its Subversion repository to store firmware for multimedia embedded systems. Like many of the previous case studies, ZedCom migrated to Subversion from CVS, due to the limitations that I've already exhaustively discussed.

Repository

The ZedCom repository is arranged with four top-level directories: /trunk, /users, /branches, and /tags. The familiar directories have the functions you would normally expect, although I should note that the /trunk directory contains subdirectories for individual projects. Additionally, the /users directory contains a subdirectory for each Subversion user (named after the username), where individual users can create their own private branches.

The repository has relatively few users, with 10 who can access the repository and six who perform commits. It contains about 600MB of data in 1,800 revisions. Access is performed via svnserve, due to its ease of setup (ZedCom has no need for secure authentication).

Branching is limited to user branching, and although ZedCom has a /branches directory, it has never actually used it. Merge tracking of the branches is done manually via commit logs, which works, but serves as a source of irritation for the developers.
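To make the idea of tracking merges through commit logs concrete, here is a small Python sketch. It assumes a purely hypothetical convention in which developers record each merge in the log message as "Merged rSTART:END from PATH". The text does not describe the wording ZedCom actually uses, and the sample messages are invented.

```python
import re

# Hypothetical log message convention: "Merged r120:135 from /users/alice/decoder"
MERGE_NOTE = re.compile(r"Merged r(\d+):(\d+) from (/\S+)")


def merged_ranges(log_messages):
    """Collect the revision ranges recorded as merged, keyed by source path."""
    merges = {}
    for message in log_messages:
        for start, end, source in MERGE_NOTE.findall(message):
            merges.setdefault(source, []).append((int(start), int(end)))
    return merges


# Invented sample log messages, for illustration only.
sample_logs = [
    "Merged r120:135 from /users/alice/decoder",
    "Fix buffer size in audio driver",
    "Merged r136:150 from /users/alice/decoder",
    "Merged r90:101 from /users/bob/bootloader",
]

for source, ranges in merged_ranges(sample_logs).items():
    print(source, ranges)
```

A summary like this shows at a glance which ranges of a private branch have already been merged, which is the question developers otherwise answer by reading through the logs by hand.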