Practice 14. Manage Versions

Bruce MacIsaac

Effective version management is critical to managing complex software and its changes over time.

Problem

Many software development efforts involve multiple people and multiple teams, often geographically separated, working in parallel on interdependent software, developed over multiple iterations and targeted at multiple products, releases, and platforms. It is easy to lose track of what has changed and why, and how the pieces fit together. The results can have a serious impact on cost, schedule, and quality.

This practice describes how to manage versions of files, components, and products while keeping chaos at bay.

Background

My wife's family loves to solve jigsaw puzzles, and I have spent many a Christmas puzzling with them. They are very good at it. However, whereas I, as a novice, may pick a piece out of the pile and spend ages trying to make it fit, they purposefully group similar pieces, connect them into meaningful chunks, and then deliver those chunks to the evolving puzzle.

Version management is similar. If you treat each change to a file separately, you can easily lose track of where it fits into the bigger picture. If instead you organize changes into meaningful sets and deliver them into stable environments, you can work much more effectively.

Organize changes into meaningful sets and deliver them into stable environments.

Software developers working in parallel need stable workspaces into which changes can be incorporated. Integrators need to pull together the right versions of software so that the pieces work with one another. Testers and those responsible for final product delivery need to make sure that they are testing and delivering builds composed of the right pieces.

Figure 5.5. Delivering Meaningful Sets of Changes.

To manage change effectively, identify meaningful sets of changes, and deliver them into stable environments.

Change management is becoming increasingly difficult for the following reasons:

Iterative development approaches mean constantly evolving components.
Sharing components across products means that a single component may have many variants, each evolving in parallel.
Increasingly dynamic component and service-oriented architectures mean more ways in which components, services, and legacy systems can be combined to serve new purposes.

These challenges demand that teams evolve from managing change at the individual file level to delivering meaningful sets of changes into stable workspace environments.

Applying the Practice

Effective version control requires secure storage, balance of control and access, and a logical approach to managing sets of changes at all levels.

Store Artifacts Securely

Most software projects today use some kind of version control system to track versions of individual files in a repository. This is a good fundamental practice, as it allows multiple developers to work on a file without losing each other's changes. It also allows developers to recover from data losses; easily back out changes to individual files; and track the history of changes, for example, to determine when a defect was introduced.

Version control is not limited to source code. Other artifacts, including documents, data, models, and executables, should be placed under version control whenever you wish to avoid loss, recover earlier versions, and track history.

The repository stores all versions of your files and so must be fault-tolerant, reliable, and backed up with appropriate disaster recovery procedures. In addition, you may need it to be scalable and distributed to support larger distributed teams.

Balance Control and Freedom of Access

In some development environments people fear that unauthorized persons will introduce defects, and so impose many controls to prevent such changes. In other environments developers are encouraged to make changes wherever they see something that can be improved. Which is the right approach?

I consider the following to be a balanced approach: allow anyone to make changes, but require such changes to be codeveloped or reviewed by someone intimately familiar with the element being changed. For large projects, permissions should be organized around the team structure, as described in Practice 13: Organize Around the Architecture.

Allow anyone to make changes, but require changes to be codeveloped or reviewed by an expert.

Freedom to make changes encourages collaboration and lets improvements be incorporated quickly, while involving an expert ensures that naïve developers don't introduce problems as they add "improvements."
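The policy above can be expressed as a simple delivery check. The following Python sketch is illustrative only: the `Change` record, the `EXPERTS` map, and the `may_deliver` function are hypothetical names, and the sketch assumes that a change authored by an expert needs no further review, which the text does not state either way.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Change:
    author: str
    element: str                                   # file or component being changed
    reviewers: set[str] = field(default_factory=set)
    coauthors: set[str] = field(default_factory=set)


# Hypothetical map from each element to the people intimately familiar with it.
EXPERTS = {
    "installer": {"mcarroll"},
    "billing": {"jsmith", "mcarroll"},
}


def may_deliver(change: Change, experts: dict[str, set[str]] = EXPERTS) -> bool:
    """Anyone may author a change, but an expert must be involved before delivery."""
    owners = experts.get(change.element, set())
    if change.author in owners:
        return True  # assumption: an expert author satisfies the rule
    involved = change.reviewers | change.coauthors
    return bool(involved & owners)  # at least one expert reviewed or codeveloped
```

Note that an element with no registered expert can never be delivered under this sketch; a real policy would need to decide how to handle that case.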

Use Version Control for Components and Component Configurations

Version control shouldn't be limited to individual files. Of increasing importance is version control of components and configurations:

It is easier to assemble consistent systems from consistent component baselines than from multitudes of individual files.
Components and configurations can be tested and documented as having reached a specified quality level, making it easier for clients to decide when to migrate to a newer version.

Whether your project is producing an entire system or a component in a larger system, you should create a baseline of that product at each major milestone. In an iterative development approach you should create a baseline at least at the end of each iteration. Baselines allow you to track progress between iterations and reproduce earlier versions. This can be helpful in identifying when and how defects were introduced, generating release notes, and gathering progress measures.

More than just the source code should be baselined. Instructions for how to build the executable software and dependencies such as compiler version, operating system version, and command line options should also be baselined, to ensure that builds can be reproduced. This step becomes particularly important when a baseline is released to customers, as the ability to recreate a customer's build is important to providing system support and maintenance.
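One way to picture a baseline that covers more than source code is a record stored alongside the sources. The Python sketch below is a minimal illustration under stated assumptions: the `BuildBaseline` fields, the label, and the file names are hypothetical, and the Python interpreter version stands in for the compiler version a real project would record.

```python
import json
import platform
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class BuildBaseline:
    label: str              # baseline name, e.g. one per iteration
    source_versions: dict   # file path -> version identifier
    compiler: str           # toolchain version used for the build
    os_version: str         # operating system the build ran on
    build_command: list     # exact command line, including options


def capture_baseline(label, source_versions, build_command):
    """Record the sources together with the environment needed to rebuild them."""
    return BuildBaseline(
        label=label,
        source_versions=dict(source_versions),
        # Stand-in: a real project would query its actual compiler here.
        compiler=f"python {platform.python_version()}",
        os_version=f"{platform.system()} {platform.release()}",
        build_command=list(build_command),
    )


def to_json(baseline):
    """Serialize the baseline so it can itself be placed under version control."""
    return json.dumps(asdict(baseline), indent=2, sort_keys=True)
```

Storing such a record with every released baseline is what makes it possible to recreate a customer's build later.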

Provide Controlled Workspaces

There is a practical joke that works as follows: first you buy several sizes of a hat and give the one that fits to a friend as a gift. Then you periodically switch that hat with a larger or smaller one. Your friend will be convinced that his or her head is changing size.

It is very difficult to reason in an unstable environment. A developer needs a workspace in which to make changes and test them without being impacted by unexpected changes other developers have made. The answer to the problem is to ensure that each developer has a private workspace where he or she can periodically recreate a baseline to pick up changes made by other developers in a controlled manner. It is important to pick up a consistent set of changes that completely implement a capability or fix a bug. And it helps to know what has changed (not just the details of what changed in each file, but what the new capabilities are and what bugs have been fixed) so that the impacts can be identified.

Workspaces let developers make changes without being impacted by others.

Similarly, testers and integrators need controlled workspaces that allow them to pick up complete sets of changes and know what has changed.

Organize Consistent Sets of Changes with Activities

When accepting changes into a controlled workspace, the first question you should ask is, "What changed?" The changes are easiest to understand if all those that serve a single purpose are identified and managed as a set. Let's call this set an "activity change set,"^[20] because it is the result of performing a complete activity such as fixing a defect or adding a feature.

^[20] IBM Rational ClearCase refers to this set of changes as an "activity." Other tools use other terms; for example, BitKeeper calls it "change set," while Perforce uses the term "change list."

The simplest way to define a change set is to work on one activity at a time and then deliver the completed changes to the integration environment. This method works well, provided you have a single code stream and have no need to back out or track changes.

An "activity change set" is the result of performing a complete activity, such as fixing a defect or adding a feature.

If you want to deliver activity change sets to multiple code streams, or track changes at the activity level, you want tool environment support. Ideally, your version control environment allows you to indicate the activity you are working on (such as a feature or a defect); the environment then tracks your changes until you finish or until you switch to another activity. Figure 5.6 shows a typical flow for this way of working.

Figure 5.6. Making Changes Associated with an Activity.

All the changes associated with the activity (reason for the change) are delivered as a consistent whole.

Since all changes are associated with an activity, they can be delivered and accepted into other workspaces as a consistent whole (see Figure 5.6). It thus becomes possible to generate reports of how two baselines differ, not just as a list of files but as a list of activities that have been performed. This approach aids in creating release notes, assists testers in determining appropriate tests, and so on.
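The idea of comparing two baselines by activity rather than by file can be sketched in a few lines of Python. This is an illustration only, not the behavior of any particular tool: the `Activity` and `Baseline` types and the function names are hypothetical, and a baseline is modeled simply as the ordered activities delivered up to that point.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Activity:
    id: str            # e.g. a change request identifier
    summary: str       # the reason for the change
    files: frozenset   # file versions touched by this activity


@dataclass(frozen=True)
class Baseline:
    label: str
    activities: tuple  # activities delivered up to this baseline, in order


def activities_between(old, new):
    """Return the activities present in `new` but not in `old`."""
    seen = {a.id for a in old.activities}
    return [a for a in new.activities if a.id not in seen]


def release_notes(old, new):
    """Describe the difference between two baselines as a list of activities."""
    lines = [f"Changes from {old.label} to {new.label}:"]
    for a in activities_between(old, new):
        lines.append(f"- {a.id}: {a.summary} ({len(a.files)} file(s))")
    return "\n".join(lines)
```

A tester reading such a report sees which defects and features to exercise, instead of having to infer them from a list of changed files.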

Enable Parallel Development with Streams

As you deliver changes into a workspace, that workspace evolves. A logical workspace, consisting of a baselined set of versioned files plus activities that describe what has changed since the baseline, is called a "stream." Figure 5.7 shows a typical set of streams for parallel development.

Figure 5.7. Streams for Parallel Development.

This figure shows an integration stream and the streams of two developers, jsmith and mcarroll. Jsmith completes two changes, CR002 and CR005, and then delivers them for integration, as shown by the arrow. Mcarroll does CR003 in parallel and delivers it for integration as well. Jsmith recreates a baseline and then continues work.

(Adapted from Bellagio 2004.)
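The flow in Figure 5.7 can be modeled with a small data structure: a stream is a baseline plus the activities performed or delivered since that baseline. The Python sketch below is a simplified illustration; the `Stream` class and its method names are hypothetical, and activities are reduced to identifiers with no file contents or merging.

```python
from dataclasses import dataclass, field


@dataclass
class Stream:
    name: str
    baseline: tuple = ()                              # activity ids in the baseline
    activities: list = field(default_factory=list)    # activities since the baseline

    def contents(self):
        """Everything visible in this stream: baseline plus later activities."""
        return list(self.baseline) + self.activities

    def work_on(self, activity_id):
        self.activities.append(activity_id)

    def deliver_to(self, target):
        """Deliver each completed activity to the target stream as a whole."""
        for activity_id in self.activities:
            if activity_id not in target.contents():
                target.activities.append(activity_id)

    def make_baseline(self):
        """Freeze the current contents as the new baseline."""
        self.baseline = tuple(self.contents())
        self.activities = []
        return self.baseline

    def rebase_from(self, source):
        """Pick up the source's baseline, keeping only undelivered local work."""
        self.activities = [a for a in self.activities if a not in source.baseline]
        self.baseline = source.baseline


integration = Stream("integration")
jsmith, mcarroll = Stream("jsmith"), Stream("mcarroll")

jsmith.work_on("CR002")
jsmith.work_on("CR005")
jsmith.deliver_to(integration)      # jsmith delivers two changes
mcarroll.work_on("CR003")
mcarroll.deliver_to(integration)    # mcarroll delivers in parallel
integration.make_baseline()
jsmith.rebase_from(integration)     # jsmith now also sees CR003
```

After the final step, jsmith's stream contains all three activities and has no undelivered work, which matches the sequence the figure describes.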

Other stream variations are possible. For example, you can create a separate stream to stabilize each build and then, once the build is stable, deliver the changes back into the integration stream^[21] (see Figure 5.8).

^[21] See Berczuk 2002 for detailed guidance on using streams to manage integration.

Figure 5.8. Separate Stream to Stabilize the Build.

This figure shows changes delivered by a developer (mcarroll) to an integration stream, stabilized in a separate build stream, and then returned to the integration stream.

(Adapted from Bellagio 2004.)

Parallel development is much easier with streams and activities. Developers can work in different streams, and as they complete activities, their changes can be selectively delivered into integration and other streams. Maintaining each stream as a controlled environment, and each activity change set as an evolution of that environment, helps you to understand exactly which changes have been applied, and where.

Streams and activities make parallel development easier, since changes can be delivered as sets into parallel streams.

Use Activities and Streams to Manage Variants

Ideally, a software component is developed and then evolves over time. However, it is sometimes necessary to create more than one variant of the same component, for a number of reasons:

Functionality may be required to support variations in user needs.
A single product may be targeted at different hardware and software platforms or a different country, culture, or language.
Older versions of a product may need to be maintained to support an existing customer base.

Activities and streams can make it easier to manage multiple variants being developed in parallel.

Let's consider an example in which a company has a product line with three variants of a particular product; let's call them A, B, and C. A defect in the installer component for product A has been found. The component has been fixed for A and a baseline recreated. What do we do with products B and C?

Product B uses the same version of the installer (prior to the fix). The integrator for product B picks up the new baseline for the component. Tests for B may need to be updated, and are then rerun to ensure that nothing has broken.
Product C uses a different version of the component. The activity change set can be applied to C's product stream and the changes merged, as shown in Figure 5.9. A good merging facility can in most cases merge changes and prompt when there are conflicts, such as if the defect corrects code that does not exist in C. Because a fix has been merged into a different version of the component, inspection and testing are critical to ensure that this alternative version of the fix works.
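The two cases above differ in whether the target stream holds the same version the fix was written against. The Python sketch below illustrates that decision at the level of whole files; it is an assumption-laden simplification, since real merge facilities compare content line by line. The `FileChange` type, the `plan_delivery` function, and the file names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileChange:
    path: str
    base: str    # content the fix was written against
    fixed: str   # content after the fix


def plan_delivery(change_set, stream_files):
    """Classify each file in an activity change set for a target stream."""
    plan = {}
    for change in change_set:
        current = stream_files.get(change.path)
        if current is None:
            plan[change.path] = "conflict"          # the fixed code does not exist here
        elif current == change.base:
            plan[change.path] = "apply"             # same version: take the fix as is
        elif current == change.fixed:
            plan[change.path] = "already-applied"
        else:
            plan[change.path] = "merge"             # different version: merge, inspect, test
    return plan


fix = [FileChange("installer/setup.c", base="v1 code", fixed="v1 code + fix")]
product_b = {"installer/setup.c": "v1 code"}
product_c = {"installer/setup.c": "v2 code"}

plan_b = plan_delivery(fix, product_b)   # B shares the version that was fixed
plan_c = plan_delivery(fix, product_c)   # C has diverged and needs a merge
```

Anything classified as "merge" or "conflict" is where the inspection and testing called for above should be concentrated.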

Figure 5.9. Merging a Fix into Different Product Streams.

This figure shows three product streams using a common installer component. A fix in A is picked up by B. C is using a different version of the component, making a merge necessary.

Maintaining different evolutionary paths of a component is more expensive in the long run, as each change needs to be evaluated for applicability to the different versions and may need adjustment and specialized testing in those different versions. In general, it is better to use an architecture that isolates those elements that can change, using components that are easy to swap out or otherwise configure. For example, one telecommunications company had a large catalog of telephone switches, but underneath the hood, the software was exactly the same: the cheaper version simply had some of the features disabled. During development, some components would temporarily evolve independently, but the changes were always pulled back into the base. Figure 5.10 shows how this can be done, with a "main" stream used as the basis for two project streams, REL1 and REL2.

Figure 5.10. Parallel Projects.

This figure shows integration streams for two parallel projects, REL1 and REL2. REL1 initially diverges, but changes in REL1 are then delivered back to a "main" stream and later incorporated into the REL2 project.

(Adapted from Bellagio 2004.)

As these examples show, dealing with multiple streams is complicated, even with the advantages that activities provide. With activities, you can manage streams at a high level: the bug fixes and features that differentiate the streams. But without activities, you have to deal with all the individual file differences, an explosion of complexity that can be impossible to manage.

Activities let you manage streams in terms of the bug fixes and features that differentiate them, rather than getting lost in the complexity of individual file differences.

Other Methods

Agile methods, such as Scrum and XP, generally focus on the needs of smaller teams. XP in particular recommends a single code stream and "continuous integration," in which every few hours changes are incorporated and tested in the integration environment. This simple approach ensures continuous improvement.

Unified Process similarly recommends that integration be as continuous as possible and change management as simple as possible. However, the need to track changes increases with product and organizational complexity. RUP's guidance for configuration and change management extends to cover the needs of larger projects, multiple product lines, standards compliance, enterprise reuse, and enterprise architectures. These challenges increasingly demand that teams organize and deliver changes as meaningful "activity" sets.

Levels of Adoption

This practice can be adopted at different levels:

Basic. Version control of files and builds allows the history of individual files and builds to be tracked. Changes for a specific purpose are collected and integrated as a set.
This fits with the agile practice of continuous integration and requires little in terms of tools or training.
Intermediate. Isolated workspaces reduce interference among team members working in parallel. Activity-based control and delivery of change sets help to manage changes.
A good version control tool can help you keep track of your work and so increase your ability to work in an iterative development environment. Some overheads with tooling and training exist, but they are not a significant barrier for most projects, even most small projects.
Advanced. Shared components are managed across different product lines.
Managing components shared across different product lines requires disciplined change management. Such efforts are usually associated with large projects, or enterprise-level reuse and architecture efforts.

Related Practices

Practice 2: Execute Your Project in Iterations describes the importance of growing systems in increments that are continuously integrated. Incremental change and continuous integration are facilitated by the ability to deliver meaningful sets of changes.
Practice 13: Organize Around the Architecture describes how teams are organized around the components in the architecture. Team interactions are facilitated by effective version control of those components.

Additional Information

Information in the Unified Process

OpenUP/Basic describes a basic activity-based configuration and change management process suited to a typical small project. OpenUP/Basic assumes that the project has the tools and environment in place, and thus focuses on execution of the project, not on plans and setup.

RUP adds guidance to address tool selection, environment setup, configuration management plans, audits, multiple product lines, standards compliance, enterprise reuse, and enterprise architectures.

Additional Reading

For additional guidance on version control and change management, see the following:

David Bellagio and Tom Milligan. Software Configuration Management Strategies and IBM Rational ClearCase: A Practical Introduction, Second Edition. IBM Press, 2005.
S. Berczuk and B. Appleton. Software Configuration Management Patterns: Effective Teamwork, Practical Integration. Addison-Wesley, 2002.
Jason Leonard. "Simplifying Product Line Development Using UCM Streams." The Rational Edge, http://www.ibm.com/developerworks/rational/library/1748.html.
