4.2. What SCM Is and Is Not

A simple description of SCM is that it's a way to keep track of the different versions (the configuration part of SCM) of everything that is necessary for a software project over time. What is tracked is usually files of one kind or another, but could just as well be versions of entries in a database. SCM tools are usually separate applications from the filesystem, though this is by no means always the case.

Sometimes people confuse build tools and SCM tools, but the difference is simple. Keeping track of which files go into a product is the task of build tools. Keeping track of all the versions of those files as they change is the task of SCM tools. Some build tools can use SCM tools to obtain the files they need to build a product, but that doesn't make them SCM tools.

Using an SCM tool, you can recover older versions of files after the files have been changed. This is very useful when you make a mistake. One view of SCM is that it gives you the ability to retrieve a snapshot of the project at a moment in time and then allows you to move forward or backward in time from that point. You can often tag or label the project at different moments in time and then retrieve the files exactly as they were when the tag was applied.

You can also use an SCM tool to share your changes to files with other people in a controlled manner. Many SCM tools show the differences (or diffs) between two versions of a file, as well as who made the changes, when the changes were made, and which other files changed at the same time.

Many SCM tools also support the idea of branches, which are versions of files in parallel universes. What that means is that you can have two (or more) different versions of a file, both derived from a common version, and you can work with either version at the same time. Branches let you support an existing product made from one set of files, while you develop the next release based on different versions of those same files. Many SCM tools help you with merging changes between branches. Figure 4-4 (in Section 4.5.1, later in this chapter) shows this diagrammatically.
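
To make the ideas of snapshots, tags, and diffs concrete, here is a minimal sketch in Python. It is a toy, not a description of how any real SCM tool is built: the `TinyStore` class, its method names, and the sample file contents are all invented for illustration, and the whole project is assumed to fit in memory.

```python
import difflib


class TinyStore:
    """Hypothetical in-memory version store, for illustration only."""

    def __init__(self):
        self.snapshots = []  # each snapshot maps file name -> file text
        self.tags = {}       # tag name -> version number

    def commit(self, files):
        """Record a full copy of the project and return its version number."""
        self.snapshots.append(dict(files))
        return len(self.snapshots) - 1

    def tag(self, name, version):
        """Attach a label to a version so it can be retrieved by name later."""
        self.tags[name] = version

    def checkout(self, ref):
        """Return the files as they were at a version number or a tag name."""
        version = self.tags[ref] if isinstance(ref, str) else ref
        return dict(self.snapshots[version])

    def diff(self, name, old, new):
        """Return a unified diff of one file between two versions."""
        a = self.checkout(old).get(name, "").splitlines()
        b = self.checkout(new).get(name, "").splitlines()
        return list(
            difflib.unified_diff(a, b, f"{name}@{old}", f"{name}@{new}", lineterm="")
        )


store = TinyStore()
v0 = store.commit({"main.c": "int main() { return 0; }"})
store.tag("release-1.0", v0)
v1 = store.commit({"main.c": "int main() { return 1; }"})

recovered = store.checkout("release-1.0")  # the files as they were when tagged
changes = store.diff("main.c", v0, v1)     # what changed between the two versions
```

Storing a complete copy of every file for each version keeps the sketch short. Real tools typically record who made each change and when, and store versions far more compactly.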
SCM tools can be divided into two different kinds: centralized and distributed. Centralized tools store the different versions of the files in a central location, usually on a single server. Distributed tools store the different versions on multiple machines. The difference is somewhat blurred, since distributed tools can choose to use a single location (just like centralized tools), and some centralized tools support distributing their files to multiple servers. There are also SCM tools that support replication, where for performance reasons their files can be read from many different servers but are written to only one server. The difference sometimes simply comes down to how the tool was originally designed.

Another way in which SCM tools can differ is whether they expect each file to be changed by more than one person at a time. Some SCM tools stop other people from changing a file while you are editing it; this is known as a locking or serial model. Other tools expect you to resolve changes that other people may have made while you were all editing the same file; this is the concurrent model.

SCM tools also differ in how they declare who can read and write the files that they control. These permissions are often described using an access control list (ACL) for each file.

Some SCM tools use simple text files ("flat text") while others use a database to store their files. This is a sure source of discussion about the merit of each tool. On one hand, simple text files make it somewhat easier to detect corruption, and you can use existing, independent tools to inspect and edit the files. Text files scale well enough for most projects, and you don't have to be a database administrator to use them. On the other hand, databases have many useful properties such as atomic transactions and faster access times. Also, since flat text files generally don't scale as well as databases do, you might as well use a database right from the start. Databases also let you search more efficiently within the older versions of your files. Subversion (see Section 4.6.2, later in this chapter) allows you to choose either approach. The jury is still out on this choice, perhaps because tools based on the two different approaches are aimed at different-sized projects.

Some modern SCM tools support the concept of changesets. A changeset is a group of changes to the files controlled by the SCM tool that were made as one logical operation. The advantage of changesets is that they can be applied or later removed as a single operation.
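
The idea of applying and removing a changeset as a single operation can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the `Changeset` class, the `apply_changeset` and `revert_changeset` functions, and the file names are hypothetical, and each file's content is reduced to a short string.

```python
class Changeset:
    """Hypothetical changeset: a group of per-file edits made as one logical change."""

    def __init__(self, description, edits):
        self.description = description
        # file name -> (old text, new text); None means the file does not exist
        self.edits = edits


def _rewrite(files, edits, src, dst):
    """Return a copy of files with every edit moved from state src to state dst."""
    # Check every edit first, so that either all of them are applied or none are.
    for name, pair in edits.items():
        if files.get(name) != pair[src]:
            raise ValueError(f"{name} is not in the expected state")
    result = dict(files)
    for name, pair in edits.items():
        if pair[dst] is None:
            result.pop(name, None)  # the target state has no such file
        else:
            result[name] = pair[dst]
    return result


def apply_changeset(files, changeset):
    """Apply all edits in the changeset as one step."""
    return _rewrite(files, changeset.edits, 0, 1)


def revert_changeset(files, changeset):
    """Remove all edits in the changeset as one step."""
    return _rewrite(files, changeset.edits, 1, 0)


project = {"parser.c": "v1", "parser.h": "v1"}
fix = Changeset(
    "rename parse() to parse_input()",
    {"parser.c": ("v1", "v2"), "parser.h": ("v1", "v2"), "NEWS": (None, "renamed")},
)
after = apply_changeset(project, fix)
before = revert_changeset(after, fix)
```

Because the edits to `parser.c`, `parser.h`, and `NEWS` travel together, the project never ends up with the source file updated but the header left behind, and reverting restores all three files at once.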