Bruce MacIsaac
Effective version management is critical to managing complex software and its changes over time.
Problem
Many software development efforts involve multiple people and multiple teams, often geographically separated, working in parallel on interdependent software, developed over multiple iterations and targeted at multiple products, releases, and platforms. It is easy to lose track of what has changed and why, and how the pieces fit together. The results can have a serious impact on cost, schedule, and quality. This practice describes how to manage versions of files, components, and products while keeping chaos at bay.
Background
My wife's family loves to solve jigsaw puzzles, and I have spent many a Christmas puzzling with them. They are very good at it. However, whereas I, as a novice, may pick a piece out of the pile and spend ages trying to make it fit, they purposefully group similar pieces, connect them into meaningful chunks, and then deliver those chunks to the evolving puzzle. Version management is similar. If you treat each change to a file separately, you can easily lose track of where it fits into the bigger picture. If instead you organize changes into meaningful sets and deliver them into stable environments, you can work much more effectively.
Software developers working in parallel need stable workspaces into which changes can be incorporated. Integrators need to assemble the right versions of software so that the pieces work together. Testers and those responsible for final product delivery need to make sure that they are testing and delivering builds composed of the right pieces.
Figure 5.5. Delivering Meaningful Sets of Changes. To manage change effectively, identify meaningful sets of changes, and deliver them into stable environments.
Change management is becoming increasingly difficult for the following reasons:
These challenges demand that teams evolve from managing change at the individual file level to delivering meaningful sets of changes into stable workspace environments.
Applying the Practice
Effective version control requires secure storage, balance of control and access, and a logical approach to managing sets of changes at all levels.
Store Artifacts Securely
Most software projects today use some kind of version control system to track versions of individual files in a repository. This is a good fundamental practice, as it allows multiple developers to work on a file without losing each other's changes. It also allows developers to recover from data losses; easily back out changes to individual files; and track the history of changes, for example, to determine when a defect was introduced. Version control is not limited to source code. Other artifacts, including documents, data, models, and executables, should be placed under version control whenever you wish to avoid loss, recover earlier versions, and track history. The repository stores all versions of your files and so must be fault-tolerant, reliable, and backed up with appropriate disaster recovery procedures. In addition, you may need it to be scalable and distributed to support larger distributed teams.
Balance Control and Freedom of Access
In some development environments people fear that unauthorized persons will introduce defects, and so impose many controls to prevent such changes. In other environments developers are encouraged to make changes wherever they see something that can be improved. Which is the right approach? I consider the following to be a balanced approach: allow anyone to make changes, but require such changes to be codeveloped or reviewed by someone intimately familiar with the element being changed. For large projects, permissions should be organized around the team structure, as described in Practice 13: Organize Around the Architecture.
Freedom to make changes encourages collaboration in software development and enables improvements to be incorporated quickly, while involving an expert ensures that naïve developers don't introduce problems as they add "improvements."
Use Version Control for Components and Component Configurations
Version control shouldn't be limited to individual files. Of increasing importance is version control of components and configurations:
Whether your project is producing an entire system or a component in a larger system, you should create a baseline of that product at each major milestone. In an iterative development approach you should create a baseline at least at the end of each iteration. Baselines allow you to track progress between iterations and reproduce earlier versions. This can be helpful in identifying when and how defects were introduced, generating release notes, and gathering progress measures. More than just the source code should be baselined. Instructions for how to build the executable software and dependencies such as compiler version, operating system version, and command line options should also be baselined, to ensure that builds can be reproduced. This step becomes particularly important when a baseline is released to customers, as the ability to recreate a customer's build is important to providing system support and maintenance.
Provide Controlled Workspaces
There is a practical joke that works as follows: first you buy several sizes of a hat and give the one that fits to a friend as a gift. Then you periodically switch that hat with a larger or smaller one. Your friend will be convinced that his or her head is changing size. It is very difficult to reason in an unstable environment. A developer needs a workspace in which to make changes and test them without being impacted by unexpected changes other developers have made. The answer to the problem is to ensure that each developer has a private workspace where he or she can periodically recreate a baseline to pick up changes made by other developers in a controlled manner. It is important to pick up a consistent set of changes that completely implement a capability or fix a bug. It also helps to know what has changed (not just the details of what changed in each file, but what the new capabilities are and which bugs have been fixed) so that the impacts can be identified.
Similarly, testers and integrators need controlled workspaces that allow them to pick up complete sets of changes and know what has changed.
Organize Consistent Sets of Changes with Activities
When accepting changes into a controlled workspace, the first question you should ask is, "What changed?" The changes are easiest to understand if all those that serve a single purpose are identified and managed as a set. Let's call this set an "activity change set,"[20] because it is the result of performing a complete activity such as fixing a defect or adding a feature.
The simplest way to define a change set is to work on one activity at a time and then deliver the completed changes to the integration environment. This method works well, provided you have a single code stream and have no need to back out or track changes.
If you want to deliver activity change sets to multiple code streams, or track changes at the activity level, you will want tool support. Ideally, your version control environment allows you to indicate the activity you are working on (such as a feature or a defect); the environment then tracks your changes until you finish or until you switch to another activity. Figure 5.6 shows a typical flow for this way of working.
Figure 5.6. Making Changes Associated with an Activity. All the changes associated with the activity (reason for the change) are delivered as a consistent whole.
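The tool behavior described above can be illustrated with a minimal sketch. It is a toy model under stated assumptions, not the interface of any real version control product: the `Workspace` and `ChangeSet` classes, their methods, the file paths, and the activity descriptions are all hypothetical. The point is only that every check-in is attributed to the currently selected activity, and that delivery hands over the whole change set.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ChangeSet:
    """File versions created for one activity (hypothetical structure)."""
    activity_id: str
    purpose: str
    versions: dict[str, int] = field(default_factory=dict)  # path -> version


class Workspace:
    """Toy workspace that attributes every check-in to the current activity."""

    def __init__(self) -> None:
        self.change_sets: dict[str, ChangeSet] = {}
        self.current: ChangeSet | None = None
        self.latest: dict[str, int] = {}  # path -> latest version number

    def set_activity(self, activity_id: str, purpose: str = "") -> None:
        # Resume an existing activity or start a new one.
        self.current = self.change_sets.setdefault(
            activity_id, ChangeSet(activity_id, purpose)
        )

    def check_in(self, path: str) -> None:
        if self.current is None:
            raise RuntimeError("set an activity before changing files")
        self.latest[path] = self.latest.get(path, 0) + 1
        self.current.versions[path] = self.latest[path]

    def deliver(self, activity_id: str) -> ChangeSet:
        # Hand over the whole change set, never a subset of its files.
        delivered = self.change_sets.pop(activity_id)
        if self.current is delivered:
            self.current = None
        return delivered


ws = Workspace()
ws.set_activity("CR002", "fix installer defect")   # hypothetical purpose
ws.check_in("installer/setup.py")
ws.set_activity("CR005", "add export feature")     # hypothetical purpose
ws.check_in("export/writer.py")
ws.set_activity("CR002")                           # switch back to CR002
ws.check_in("installer/paths.py")

cr002 = ws.deliver("CR002")
print(cr002.activity_id, sorted(cr002.versions))
```

Because switching activities only changes which change set receives new check-ins, work on CR005 stays in the workspace while CR002 is delivered complete.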
Since all changes are associated with an activity, they can be delivered and accepted into other workspaces as a consistent whole (see Figure 5.6). It thus becomes possible to generate reports of how two baselines differ, not just as a list of files but as a list of activities that have been performed. This approach aids in creating release notes, assists testers in determining appropriate tests, and so on.
Enable Parallel Development with Streams
As you deliver changes into a workspace, that workspace evolves. A logical workspace, consisting of a baselined set of versioned files plus activities that describe what has changed since the baseline, is called a "stream." Figure 5.7 shows a typical set of streams for parallel development.
Figure 5.7. Streams for Parallel Development. This figure shows an integration stream and the streams of two developers, jsmith and mcarroll. Jsmith completes two changes, CR002 and CR005, and then delivers them for integration, as shown by the arrow. Mcarroll does CR003 in parallel and delivers it for integration as well. Jsmith recreates a baseline and then continues work. (Adapted from Bellagio 2004.)
Other stream variations are possible. For example, you can create a separate stream to stabilize each build and then, once the build is stable, deliver the changes back into the integration stream[21] (see Figure 5.8).
Figure 5.8. Separate Stream to Stabilize the Build. This figure shows changes delivered by a developer (mcarroll) to an integration stream, stabilized in a separate build stream, and then returned to the integration stream. (Adapted from Bellagio 2004.)
Parallel development is much easier with streams and activities. Developers can work in different streams, and as they complete activities, their changes can be selectively delivered into integration and other streams. Maintaining each stream as a controlled environment, and each activity change set as an evolution of that environment, helps you to understand exactly which changes have been applied, and where.
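As a rough illustration of the stream idea, the sketch below models a stream as a baseline plus the activities applied since that baseline, and replays the Figure 5.7 scenario. The `Stream` class and its `complete`, `deliver`, `make_baseline`, and `rebase` methods are assumptions made for this example; real tools track file versions as well as activity identifiers.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Stream:
    """Toy stream: a baseline plus the activities applied since it."""
    name: str
    baseline: frozenset[str] = frozenset()               # activities in baseline
    activities: list[str] = field(default_factory=list)  # applied since baseline

    def contents(self) -> set[str]:
        return set(self.baseline) | set(self.activities)

    def complete(self, activity_id: str) -> None:
        self.activities.append(activity_id)

    def deliver(self, target: Stream, only: set[str] | None = None) -> list[str]:
        """Deliver completed activities to target, optionally a chosen subset."""
        chosen = [a for a in self.activities if only is None or a in only]
        new = [a for a in chosen if a not in target.contents()]
        target.activities.extend(new)
        return new

    def make_baseline(self) -> frozenset[str]:
        # Freeze everything delivered so far into a new baseline.
        self.baseline = frozenset(self.contents())
        self.activities = []
        return self.baseline

    def rebase(self, baseline: frozenset[str]) -> None:
        # Pick up a consistent baseline; keep only local work not yet in it.
        self.activities = [a for a in self.activities if a not in baseline]
        self.baseline = baseline


integration = Stream("integration")
jsmith = Stream("jsmith")
mcarroll = Stream("mcarroll")

jsmith.complete("CR002")
jsmith.complete("CR005")
mcarroll.complete("CR003")

jsmith.deliver(integration)
mcarroll.deliver(integration)

bl = integration.make_baseline()
jsmith.rebase(bl)  # jsmith now also sees CR003, delivered by mcarroll
print(sorted(jsmith.contents()))
```

Passing `only={"CR002"}` to `deliver` would model selective delivery of a single completed activity.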
Use Activities and Streams to Manage Variants
Ideally, a software component is developed and then evolves over time. However, it is sometimes necessary to create more than one variant of the same component, for a number of reasons:
Activities and streams can make it easier to manage multiple variants being developed in parallel. Let's consider an example in which a company has a product line with three variants of a particular product (let's call them A, B, and C). A defect in the installer component for version A has been found. The component has been fixed for A and a baseline recreated. What do we do with products B and C?
Maintaining different evolutionary paths of a component is more expensive in the long run, as each change needs to be evaluated for applicability to the different versions and may need adjustment and specialized testing in those different versions. In general, it is better to use an architecture that isolates those elements that can change, using components that are easy to swap out or otherwise configure. For example, one telecommunications company had a large catalog of telephone switches, but underneath the hood, the software was exactly the same; the cheaper version simply had some of the features disabled. During development, some components would temporarily evolve independently, but the changes were always pulled back into the base. Figure 5.10 shows how this can be done, with a "main" stream used as the basis for two project streams, REL1 and REL2.
Figure 5.10. Parallel Projects. This figure shows integration streams for two parallel projects, REL1 and REL2. REL1 initially diverges, but changes in REL1 are then delivered back to a "main" stream and later incorporated into the REL2 project. (Adapted from Bellagio 2004.)
As these examples show, dealing with multiple streams is complicated, even with the advantages that activities provide. With activities, you can manage streams at a high level: the bug fixes and features that differentiate the streams. But without activities, you have to deal with all the individual file differences, an explosion of complexity that can be impossible to manage.
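The contrast between comparing streams by activity and comparing them by file can be made concrete with a small sketch. Everything here is illustrative: the mapping of activities to files, the contents of the two streams, and the helper functions `activity_diff` and `file_diff` are hypothetical, and only the stream names REL1 and REL2 come from the figure.

```python
# Hypothetical record of which files each activity touched.
ACTIVITY_FILES = {
    "CR002": {"installer/setup.py", "installer/paths.py"},
    "CR003": {"ui/menu.py"},
    "CR005": {"export/writer.py", "export/format.py", "ui/menu.py"},
}


def activity_diff(first: set[str], second: set[str]) -> dict[str, list[str]]:
    """Describe how two streams differ in terms of activities."""
    return {
        "only_in_first": sorted(first - second),
        "only_in_second": sorted(second - first),
    }


def file_diff(first: set[str], second: set[str]) -> list[str]:
    """List files touched by activities present in exactly one stream."""
    files: set[str] = set()
    for activity in first ^ second:
        files |= ACTIVITY_FILES[activity]
    return sorted(files)


rel1 = {"CR002", "CR003", "CR005"}  # assumed contents of the REL1 stream
rel2 = {"CR003"}                    # assumed contents of the REL2 stream

print(activity_diff(rel1, rel2))
print(file_diff(rel1, rel2))
```

In this made-up data, two activities separate the streams, but five files differ, and one of those files was also touched by an activity that both streams share. The file list alone does not say which differences belong together or why they were made.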
Other Methods
Agile methods, such as Scrum and XP, generally focus on the needs of smaller teams. XP in particular recommends a single code stream and "continuous integration," in which every few hours changes are incorporated and tested in the integration environment. This simple approach ensures continuous improvement. Unified Process similarly recommends that integration be as continuous as possible and change management as simple as possible. However, the need to track changes increases with product and organizational complexity. RUP's guidance for configuration and change management extends to cover the needs of larger projects, multiple product lines, standards compliance, enterprise reuse, and enterprise architectures. These challenges increasingly demand that teams organize and deliver changes as meaningful "activity" sets.
Levels of Adoption
This practice can be adopted at different levels:
Related Practices
Additional Information
Information in the Unified Process
OpenUP/Basic describes a basic activity-based configuration and change management process suited to a typical small project. OpenUP/Basic assumes that the project has the tools and environment in place, and thus focuses on execution of the project, not on plans and setup. RUP adds guidance to address tool selection, environment setup, configuration management plans, audits, multiple product lines, standards compliance, enterprise reuse, and enterprise architectures.
Additional Reading
For additional guidance on version control and change management, see the following: