A Review of ClearCase Concepts | IBM Rational ClearCase, Ant, and CruiseControl: The Java Developer's Guide to Accelerating and Automating the Build Process

ClearCase essentially comes in two flavors: Base ClearCase and Unified Change Management (UCM). Base ClearCase is the traditional method of ClearCase implementation; essentially it gives you a set of building blocks to define your own usage model. With a Base ClearCase implementation you must explicitly define the project's usage model via branches, labels, and other metadata types, and you have to manually (or via scripts) configure the user's workspaces to pick them up. UCM, on the other hand, is a predefined (but still customizable) ClearCase usage model. It is based on SCM best practices, and it hides (to a certain degree) the underlying metadata that is required to be implemented for a Base ClearCase process.

There is no additional cost to purchase UCM because it is built into the product; you either use it or you don't. It doesn't necessarily fit every project's development process needs. However, UCM has a number of additional features above and beyond Base ClearCase that you can make use of in your automation efforts. If you are new to ClearCase, I recommend looking at the UCM usage model first to see if it can work with your existing development process. Chances are that it can. This section discusses the fundamental concepts of both Base ClearCase and UCM so that you will be familiar with them for later chapters. For more details on the different usage models, refer to the ClearCase product manuals [ClearCase03] or Bellagio and Milligan [Bellagio05], which has been updated with additional content for UCM.

ClearCase Fundamentals

The following sections cover the fundamental ClearCase concepts and terms.

What Is a VOB?

The ClearCase Versioned Object Base (VOB) is a permanent repository for versioned files and directories, which in ClearCase are called elements. Essentially, a VOB is a database of element versions that maintains history and metadata about these file system elements as they evolve over time. ClearCase manages all access to its VOBs. All users access the data in a VOB via a workspace (called a view) configured to look into it. A development project may encompass multiple VOBs.

The VOBs (from a build point of view) contain all the source code that you will be building and releasing. Additionally, it is also possible to "stage" build-derived objects such as libraries and executables into a VOB if required.

What Is an Element?

A ClearCase element is a versioned object that resides in a VOB. Any data that can be stored in a native file can be created as an element under ClearCase version control, such as source code, libraries, documents, and log files. Each element is organized as a tree that iterates over a number of versions, as shown in Figure 2.4. Elements can be branched to support parallel development. Each element always has a /main branch, and its initial version is 0, which is an empty version. ClearCase supports unlimited branching and corresponding merging to synchronize modifications carried out on different branches.

Figure 2.4. ClearCase version tree

It is traditional for the branch's name to be in lowercase and to reflect its purpose, such as release2 or bugfix_52. Any particular version can also be labeled to indicate its importance or maturity. Typically, a label is placed across a set of compatible element versions, with the label name given in uppercase, such as REL1.0.

This book uses branches and labels to identify the set of files to build and the files that have already been built.

What Is a View?

A ClearCase view is an individual workspace that is created to select a set of compatible file and directory versions from across one or more VOBs. The view presents all the files and directories as a "native" file system. Typically a view is created per user and per task. For example, a view called fred_release1_dev is Fred's own development view for working on release 1. A view acts as a development sandbox and controls the private/public visibility of your work. For example, when you create a new file within your view, it is initially view-private: it exists only within that view, and no other users can see the file. Only when you add the file to source control do other users (who are working on the same branch) see the file. Similarly, when you check out a file, the checked-out copy resides only within your view. Any changes you make to that file while it is checked out are visible to other users only when you check the file back in.

Figure 2.5 illustrates an environment with two views working across multiple VOB repositories. The set of rules that defines which versions of the files each view picks up is called a configuration specification (described in more detail in the following section).

Figure 2.5. ClearCase view mechanism

The two types of views are snapshot and dynamic. A snapshot view is a traditional file system workspace. In other words, it contains physical instances of files and directories copied from the VOB. A dynamic view looks and behaves like a physical copy, but it actually uses the ClearCase multiversion file system (MVFS). MVFS allows dynamic views to access the server directly, rather like mounting a shared drive; very little content is downloaded to your local machine. In general, dynamic views are faster to create, faster to modify, and take up little disk space. Also, if you run builds within a dynamic view, ClearCase can audit what you are doing. For example, if you are building a Java library, ClearCase can audit all the versions of the files that the view selects and that are built into that library. This can be very useful for impact assessment and traceability purposes.

The Third of Two Types of Views

There is in fact a third type of view: a Web view. A Web view is effectively a variation of a snapshot view and is created by the ClearCase Web client or the ClearCase Remote Client (CCRC). The CCRC lets a thick client (based on the Eclipse framework) connect to the ClearCase server via HTTP. For more information on the CCRC, see Bellagio and Milligan [Bellagio05].

To bring a snapshot view in line with changes made by another user (but on the same branch), you need to initiate an action to update the view, which may take several minutes. With dynamic views, however, this process is transparent. As soon as a file is checked in on a branch, any other user's view configured to look at the same branch is updated automatically. One of the caveats with dynamic views involves build performance. Because each file system call goes through the MVFS, building a Java library within a dynamic view might take significantly longer than building the same library in a snapshot view. (This is particularly true on networks that have high latency.) You therefore need to weigh the build time and performance benefits of snapshot views against the auditing and traceability benefits of dynamic views. I will be using both types of views in the construction of our build environment.

Continuous Integration and Dynamic Views

If your build server is on the same subnet as your ClearCase server, or if you use fast network storage, it might be possible to reduce your "total build time" using dynamic views. This is because total build time also includes workspace setup (this setup is required for snapshot views but not for dynamic ones). By its very nature, Continuous Integration is very sensitive to total build time: the faster you can build, the faster you can expose any potential problems. Also, if there is an error, the compilation and unit testing would not be executed from beginning to end; however, the workspace setup phase would still need to be executed each time. I would therefore encourage you to run some test scenarios to see which approach is fastest in your own environment.

What Is a Config Spec?

As illustrated in Figure 2.5, each view specifies a set of rules to select the versions of files and directories to present to the user. This set of rules is called a configuration specification, or config spec for short. The config spec for a newly created Base ClearCase view is typically set to the following:

element * CHECKEDOUT
element * /main/LATEST

Each time a user accesses a VOB file or navigates a VOB directory, the view's config spec is evaluated to select the proper version of the file or directory. With each evaluation, parsing runs top to bottom. The evaluation stops as soon as a rule is encountered that successfully selects a version of the element in question. In the preceding example, the first line looks to see if a particular element has been checked out to the current view. If it has, the view selects the checked-out copy. If it hasn't, the view selects the latest version of the particular element that is on the /main branch. The * indicates that the rule applies to any element of any VOB.
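The first-match behavior described above can be sketched in a few lines of Python. This is only an illustrative model, not how ClearCase itself is implemented: the element dictionary, the DEFAULT_CONFIG_SPEC list, and the select_version function are hypothetical names, and the sketch understands only the two selectors used in the default config spec.

```python
from typing import Optional

# Hypothetical, simplified model of one element as seen from a view.
# "checked_out" is True if this view holds a checkout of the element;
# "branches" maps a branch path to the highest version number on it.
element = {
    "name": "Account.java",
    "checked_out": False,
    "branches": {"/main": 3},
}

# The default config spec as ordered (pattern, version selector) pairs.
DEFAULT_CONFIG_SPEC = [
    ("*", "CHECKEDOUT"),
    ("*", "/main/LATEST"),
]


def select_version(element: dict, config_spec: list) -> Optional[str]:
    """Return the version chosen by the first rule that matches, or None."""
    for pattern, selector in config_spec:
        if pattern != "*":
            continue  # this sketch only understands the "*" pattern
        if selector == "CHECKEDOUT":
            if element["checked_out"]:
                return "CHECKEDOUT"
        elif selector.endswith("/LATEST"):
            branch = selector[: -len("/LATEST")]
            if branch in element["branches"]:
                return f"{branch}/{element['branches'][branch]}"
    return None  # no rule selected a version of this element
```

Because the loop returns as soon as a rule selects a version, the order of the rules matters: a checked-out element never reaches the /main/LATEST rule.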

The following configuration specification is a more sophisticated example:

element * CHECKEDOUT
element * /main/release2_dev/LATEST
element * REL1 -mkbranch release2_dev
element * /main/0 -mkbranch release2_dev
load \RationalBank

This example has the same first line for matching checked-out elements. However, the second line, rather than matching elements on the /main branch, matches those that are on the release2_dev branch (which is off the /main branch). The third line is a bit more interesting. Here, rather than a branch, the starting point is a label (in this case, REL1, or release 1). If none of the previous rules matches, this line finds where in the version tree the label REL1 has been placed and then automatically creates a release2_dev branch off this version on checkout. The fourth line is included to match any new elements that are created, since they obviously have not had the label applied. Finally, the fifth line indicates a snapshot view. It is a load rule specifying what files to copy from the VOB; in this case, all the files from the RationalBank VOB. If there is a large amount of data across your VOBs, you can reduce the scope of the load rules to copy only the files you need, thus reducing the load time and the disk space required on your local machine.
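When scripting around views, it can help to separate the element rules of such a config spec from its load rules. The following Python sketch does this for the five-line example above. The parse_config_spec function and the dictionary keys are hypothetical, and the parser handles only the rule forms shown in this section, not the full config spec syntax.

```python
# The five-line config spec from the text, one rule per line.
CONFIG_SPEC = """\
element * CHECKEDOUT
element * /main/release2_dev/LATEST
element * REL1 -mkbranch release2_dev
element * /main/0 -mkbranch release2_dev
load \\RationalBank
"""


def parse_config_spec(text):
    """Split config spec text into element rules and load rules."""
    element_rules = []
    load_rules = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if fields[0] == "element":
            rule = {"pattern": fields[1], "selector": fields[2], "mkbranch": None}
            if "-mkbranch" in fields:
                # The branch name follows the -mkbranch option.
                rule["mkbranch"] = fields[fields.index("-mkbranch") + 1]
            element_rules.append(rule)
        elif fields[0] == "load":
            load_rules.append(fields[1])
    return element_rules, load_rules


element_rules, load_rules = parse_config_spec(CONFIG_SPEC)
# In this sketch, the presence of load rules is taken to mean a snapshot view.
is_snapshot_view = bool(load_rules)
```

Keeping the element rules in a list preserves their order, which matters because evaluation stops at the first rule that selects a version.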

Next Stop, UCM

Now that I have discussed the fundamental capabilities of Base ClearCase, we can move on to UCM. It is worth pointing out that UCM is built on top of the capabilities of Base ClearCase, implementing new objects. VOBs, views, branches, and element versions are all still visible behind the scenes. However, the fundamental tenet behind implementing UCM was to simplify the adoption of ClearCase by raising the level of abstraction and providing an out-of-the-box usage model. Rather than making you manually create your own branches and views and define your own config specs, UCM automates this process and introduces several high-level concepts: projects, components, baselines, streams, and activities.

What Is a Project?

When software development teams are organized into individual projects, each project may deliver a release of a product or some discrete and planned set of functionality. A UCM project is the physical realization of this concept. Making the project a physical object of the SCM system lets you automate and manage a number of aspects of security and collaboration. A UCM project lets you define scope and development policies, such as what components (or set of files and directories from the VOB repositories) the project is allowed to change. It also defines a common integration area where all the project changes are collected and integrated.

The ClearCase Command Line

ClearCase has a powerful character-based command-line interface called cleartool that exists on both UNIX and Windows. Creating scripts around ClearCase command-line invocations is a typical way of automating manual activities. The cleartool command can be invoked in single-command mode:

% cleartool command [options-and-args]

or in interactive mode:

% cleartool
cleartool> setview fred_release2_dev
cleartool> checkout Account.java
cleartool> quit

Since ClearCase was originally developed on UNIX and then ported to Windows (without modifying the cleartool command-line parser), the command-line interface uses UNIX-style parsing. Command-line options are styled with a dash (such as -long) instead of a slash, and all commands are case-sensitive. Help exists for every cleartool command. To find out how a particular command works and to see a description of all its parameters, use the following:

% cleartool man [command]

This command works on both UNIX and Windows. Help pages are included for many other topics as well, including VOB, view, and config_spec.
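As a minimal sketch of scripting around single-command mode, the following Python wrapper builds and runs one cleartool invocation at a time. The function names cleartool_command and run_cleartool are hypothetical, and the sketch assumes that cleartool is installed and on the PATH of the machine where run_cleartool is called.

```python
import subprocess


def cleartool_command(command, *args):
    """Build the argument list for one single-command-mode invocation."""
    return ["cleartool", command, *args]


def run_cleartool(command, *args):
    """Run cleartool and return its standard output; raises if it fails."""
    result = subprocess.run(
        cleartool_command(command, *args),
        capture_output=True,  # collect stdout and stderr instead of printing
        text=True,            # decode output as text rather than bytes
        check=True,           # raise CalledProcessError on a nonzero exit
    )
    return result.stdout
```

For example, run_cleartool("man", "checkout") would return the help text for the checkout command. Passing the arguments as a list rather than as one shell string avoids quoting problems with file names that contain spaces.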

What Are Components, Baselines, and Streams?

I have already discussed the concept of a VOB as a version repository; a UCM component takes this idea one step further. As with a VOB, a component allows you to organize files and directories into a particular structure representing part of an application or system architecture. However, a component is different in three ways. First, a VOB may contain many components. Second, a component can be made modifiable or read-only within a UCM project (thus allowing you to define consumer/producer relationships between projects). Third, each version of a component can be identified by a UCM baseline.

A UCM baseline is like a label, but it is treated as an object in its own right and collects additional information such as the baseline's level of maturity. An example is a baseline that has just been created as opposed to one that has gone through testing or into the live environment. Once created, UCM baselines are immutable, so they can be used to define higher-level configurations. An entire system, for example, can be assembled from a set of such component baselines. If you have multiple components, one of the more powerful UCM features you can use is composite baselines. Basically, a composite baseline allows you to create a baseline hierarchy: a single baseline that refers to a set of baselines, one from each component of your system or product. This makes it much easier to manage and recommend a baseline to developers, because you have to remember only one baseline rather than many. To create a composite baseline, you create a dependency between component baselines. For more details on implementing UCM composite baselines, refer to Bellagio and Milligan [Bellagio05].
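The idea of a baseline hierarchy can be pictured as a small data structure. The Python sketch below is only a conceptual model and does not reflect how UCM stores baselines; the Baseline class, the flatten function, and the component and baseline names (Core, Web, System, and so on) are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen mirrors the immutability of a baseline
class Baseline:
    name: str
    component: str
    members: tuple = ()  # baselines that a composite baseline depends on


def flatten(baseline):
    """Return {component: baseline name} for a baseline and all its members."""
    resolved = {baseline.component: baseline.name}
    for member in baseline.members:
        resolved.update(flatten(member))
    return resolved


# Hypothetical components and baselines, for illustration only.
core = Baseline("CORE_REL1", "Core")
web = Baseline("WEB_REL1", "Web")
system = Baseline("SYSTEM_REL1", "System", members=(core, web))

configuration = flatten(system)
```

Here a developer needs to know only the single composite baseline; the member baseline for each component follows from the dependencies.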

Also worthy of note is the concept of a UCM stream. A stream is like a Base ClearCase branch in that it catalogs the series of element versions that change over time. However, unlike a branch, a stream captures additional information: in this case, the UCM component baselines that it is configured for and the UCM activities that are being worked on in it (I discuss activities next). Streams also hide a lot of the complexity of configuration specifications, and users no longer need to edit and maintain them. Each UCM project has a single integration stream where changes are delivered and usually has one or more development streams per developer.

What Are Activities?

Perhaps the most fundamental difference between Base ClearCase and UCM is that UCM is an activity-based change management model. This means that changes to files are grouped according to the reason for their change, such as making an enhancement, fixing a defect, or doing simple code refactoring. Whenever you change a file in UCM, you are prompted for this activity; you cannot make a change without declaring this association. This has two benefits. First, it ensures that no files are changed without an associated reason, and second, it allows sets of changes to be referred to and integrated as a whole.

What this means is that when you want to deliver the files that you have been working on in your own workspace to the common project integration stream, you don't have to remember the exact set of files you changed. Instead, you can simply deliver the activity. Because the activity captures the change set for you as you are working, it can figure out the set of files to be delivered. This is illustrated in Figure 2.6.

Figure 2.6. ClearCase UCM activity and change set
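To make the relationship between an activity and its change set concrete, here is a small Python model. It is a conceptual sketch only: the Activity class, its method names, and the file names and version numbers are hypothetical and are not part of any ClearCase API.

```python
class Activity:
    """A named reason for change that accumulates its own change set."""

    def __init__(self, headline):
        self.headline = headline
        self.change_set = []  # (element name, version number) pairs

    def record(self, element_name, version_number):
        """Record a new version created while working on this activity."""
        self.change_set.append((element_name, version_number))

    def files_to_deliver(self):
        """Return the distinct elements touched by this activity, sorted."""
        return sorted({name for name, _ in self.change_set})


# Hypothetical activity and versions, for illustration only.
fix = Activity("Fix defect 52")
fix.record("Account.java", 1)
fix.record("Account.java", 2)
fix.record("Teller.java", 1)

to_deliver = fix.files_to_deliver()
```

Because every new version is recorded against the activity as the work happens, the list of files to deliver is derived rather than remembered.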

Finally, UCM activities and baselines also work in combination. When you create a new baseline, you say what activities to place the baseline on.

Through the use of activities and baselines, you can automate the process of determining what is different between one baseline and another. This comparison between baselines produces not only a list of files that have changed from one baseline to the next, but also a list of activities! This has enormous advantages. You can automatically generate release notes, help testers determine the necessary set of regression tests to run after the nightly build, and so on. For Software Build and Release Management, this should be music to your ears!
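The comparison can be pictured with a short Python sketch. It assumes a deliberately simplified model in which each baseline is a dictionary mapping activity names to the files they changed; the function names, activity names, and file names are hypothetical, and a real comparison would be obtained from ClearCase itself rather than computed this way.

```python
def compare_baselines(older, newer):
    """Return (new activities, changed files) between two baselines.

    Each baseline is modeled as {activity name: set of changed files}.
    """
    new_activities = sorted(set(newer) - set(older))
    changed_files = sorted(
        {name for activity in new_activities for name in newer[activity]}
    )
    return new_activities, changed_files


def release_notes(older, newer):
    """Format one release-note line per activity added since the older baseline."""
    activities, _ = compare_baselines(older, newer)
    return [f"- {activity}: {', '.join(sorted(newer[activity]))}" for activity in activities]


# Hypothetical baselines, for illustration only.
rel1 = {"Add account search": {"Account.java"}}
rel2 = {
    "Add account search": {"Account.java"},
    "Fix defect 52": {"Account.java", "Teller.java"},
}

activities, files = compare_baselines(rel1, rel2)
notes = release_notes(rel1, rel2)
```

The same list of activities that feeds the release notes could also be used to choose which regression tests to run after a build.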