Hatteras is an enterprise-class Software Configuration Management (SCM) product. The codename Hatteras comes from a lighthouse on the shores of North Carolina where the product is being developed. The final name of the product is Team Foundation, and it includes more than just the source control functionality. The Hatteras piece is referred to as Team Foundation Source Control (TFSC). The other pieces of the Team Foundation product are touched on in Chapter 18. I wanted to include this tool in this chapter as I just briefly talk about the upcoming VSTS tools in Chapter 18 but wanted to go into more details on TFSC. Another reason for me to include this section is that there are some important definitions that need to be added to our build dialect, such as all of the branching definitions. This tool has been completely designed and developed from scratch; in other words, this is not a new generation of Microsoft's infamous VSS. It provides standard source code version control functionality that scales across thousands of developers, such as Microsoft's own development teams. As part of the Visual Studio (VS) 2005 release, Hatteras provides integration with the Visual Studio IDE and with other enterprise tools such as the Visual Studio work item (bug) tracking tool. Hatteras also provides a standalone GUI, a command-line interface, and a Web-based interface. Let's define some new terms as they relate to TFSC:
Some of the features in TFSC are fairly standard among SCC tools:
What really sets TFSC apart from the competition is its powerful merging and branching features. I don't try to explain the entire product here, but just touch on why I think these two features are so cool. Merging Functionality in TFSCThe merging functionality in TFSC is centered on the following typical development scenarios:
How TFSC Addresses the ScenariosTFSC merging is designed to provide users with an extremely powerful and flexible tool for managing the contents of branches. Merges can be made into a single file or into a tree of related files. Merges can also migrate the entire change history of the specified source files or an individual change set or revision that might contain a specific fix or feature that should be migrated without moving other changes from the source in the process. Merging the entire change history prior to a given point in time is known as a catch-up merge (Scenarios 1 and 2), whereas selecting individual change sets or revisions to merge is known as a cherry-pick merge (Scenarios 3 and 4). The merge command also allows users to query for merge history and merge candidates and perform the actual merge operation. TFSC presents merge history and candidate merges as a list of change sets that have or can be migrated between a source and a target branch. Merges can be made to a subset of files in a change set, creating a situation in which a partial change set has been merged. In this case, TFSC represents the partial state of the merge and allows the user to finish merging the change set later. Merges are pending changes in TFSC. The user can choose to perform several merge operations within a workspace without committing changes following each merge. All these merges can be staged in the user's workspace and committed with a single check-in as a single change set. In addition, the pending merge operation can be combined with the checkout and rename commands to interject additional changes to the files that will be committed with the merge. Hopefully you followed this summary and are still with me. Now let's go into how branching works in TFSC. Branching in TFSCBranching is the SCM operation of creating an independent line of development for one or more files. In a sense, branching a file results in two identical copies of the original file that can be modified as desired. Changes in the old line are not, by default, reflected in the new line and vice versa. Explicit operations can be performed to merge changes from one branch into another. There are many different reasons for branching and many different techniques to accomplish it. In the most common scenarios, branching is reasonably simple, but branching can become complicated. A complex system with lots of branched files can be hard to visualize. I recommend mapping this with a visual product (such as Visio) so that the picture is clear. Following are a handful of scenarios in which branching is interesting. Any SCM team should adopt these definitions. Release BranchingWe've been working on a Version 1 release for a year now, and it is time to begin work on Version 2. We need to finish coding Version 1 fixing bugs, running tests, and so on but many of the developers are finished with their Version 1 work (other than occasional interruption for bug fixes) and want to start designing and implementing features for Version 2. To enable this, we want to create a branch off the Version 1 tree for the Version 2 work. Over time, we want to migrate all the bug fixes we make in the process of releasing Version 1 into the Version 2 code base. Furthermore, we occasionally find a Version 1 bug that happens to be fixed already in Version 2. We want to migrate the fix from the Version 2 tree into the Version 1 tree. Promotion ModelingPromotion modeling is equivalent to release branching, where each phase is a release. It is a development methodology in which source files go through stages. Source files might start in the development phase, be promoted to the test phase, and then go through integration testing, release candidate, and release. This phasing serves a couple of purposes. It allows parallel work in different phases, and it clearly identifies the status of all the sources. Separate branches are sometimes used for each phase of the development process. Developer IsolationA developer (or a group) needs to work on a new feature that will be destabilizing and take a long time to implement. In the meantime, the developer needs to be able to version his changes (check in intermediate progress, and so on). To accomplish this, he branches the code that he intends to work on and does all his work independently. Periodically, he can merge changes from the main branch to make sure that his changes don't get too far out of sync with the work of other developers. When he is done, he can merge his changes back into the main branch. Developer isolation also applies when semi-independent teams collaborate on a product. Each team wants to work with the latest version of its own source but wants to use an approved version of source from other teams. The teams can accomplish this in two ways. In the first way, the subscribing team "pulls" the snapshot that it wants into its configuration, and in the second way, the publishing team publishes the "approved" version for all the client teams to pick up automatically. Label BranchingWe label important points in time, such as every build that we produce. A partner team picks up and uses our published builds on a periodic basis, perhaps monthly. A couple of weeks after picking up a build, the team discovers a blocking bug. It needs a fix quickly but can't afford the time to go through the approval process of picking up an entirely new build. The team needs the build it picked up before plus one fix. To do this, we create a branch of the source tree that contains all the appropriate file versions that are labeled with the selected build number. We can fix the bug in that branch directly and migrate the changes into the "main" branch, or we can migrate the existing fix (if it had been done) from the "main" branch into the new partner build branch. Component BranchingWe have a component that performs a function (for simplicity, let's imagine it is a single file component). We discover that we need another component that does nearly the same thing but with some level of change. We don't want to modify the code to perform both functions; rather, we want to use the code for the old component as the basis for creating the new component. We could just copy the code into another file and check it in, but among other things, the new copy loses all the history of what brought it to this point. The solution is to branch the file. That way, both files can be modified independently, both can preserve their history, and bug fixes can be migrated between them if necessary. Partial BranchingPartial branching is equivalent to component branching, where the "component" is the versioned product. In this case, we work on a product that has a series of releases. We shipped the Everett release and are working on the Whidbey release. As a general rule, all artifacts that make up each version should be branched for the release (source, tools, specs, and so on). However, some versioned files aren't release specific. For example, we have an emergency contact list that has the home phone numbers for team members. When we update the list, we don't want to be bothered with having to merge the changes into each of the product version branches, yet the developers who are enlisted in each version branch want to be able to sync the file to their enlistment. Identifying Branches (Configurations)When a file is branched, it is as if a new file is created. We need a way to identify that new file. Historically, this has been done by including the version number of the file as part of the name of the file. In such a mechanism, the version number consists of a branch number and a revision number. A branch number is formed by taking the version number of the file to be branched, appending an integer, and then adding a second integer as a revision number. For example, 1.2 becomes 1.2.1.1 (where 1.2.1 is the branch number and 1 is the revision number). See Chapter 16 for more details on branch labeling. This is all well and good, but it quickly becomes unwieldy not only from the standpoint of dealing with individual files, but also from the standpoint of trying to pick version numbers apart to understand what they mean. To address these issues, the notion of "configurations" was developed. A configuration is a collection of files and their version number. Configurations generally have a human-readable name, such as Acme 1.0. Having named configurations is great, but before long, even that will get to be a problem. You will need a way to organize them. An interesting way to address this organization problem is to make configurations part of the actual source code hierarchy. This method of organization is natural because it is how people do it without version control. It avoids the problem of having to teach most people the concept of configuration, and it provides a great deal of flexibility in how you combine configurations. For example, two versions of an Acme product (where Version 2.0 is branched from Version 1.0) might look something like this:
Branching granularity has different approaches. In the traditional approach, branching is done on a file-by-file basis. Each file can be branched independently at different times, from different versions, and so on. Configurations help prevent this from becoming chaotic. They provide an umbrella to help people understand the purpose of the various branches. File-by-file branching is flexible, but you must take care to ensure that it doesn't get out of hand. In addition, file-by-file branching can be hard to visualize. Another technique is always to do branching globally. Whenever a branch is created, all files in the system are branched. (There are ways to do this efficiently, so it's not as bad as it sounds.) The upside of this global branching is that it is easy to understand and visualize. The downsides include the fact that it forces a new namespace (the branches namespace) and is less flexible. For example, I can't have a single configuration that includes two copies of the same file from different configurations, as in the previous component branching scenario. More ScenariosShelving and offline work are such excellent features that they alone justifies moving from whatever SCC tool you currently use to TFSC. Shelving Current Changes
Offline Checkout/Check-In
I have designed VBLs with customers using several different SCC tools. Some worked better than others, but what I really like about TFSC is that it is designed from the ground up to work most efficiently with the way that developers and projects interact. It's not necessary to customize the tool with hacks or tricks to get it to do what you want. All the features are there. |