We develop our graph-processing algorithms using an ADT that defines the fundamental tasks , using the standard mechanisms introduced in Chapter 4. Program 17.1 is the ADT interface that we use for this purpose. Basic graph representations and implementations for this ADT are the topics of Sections 17.3 through 17.5. Later in the book, whenever we consider a new graph-processing problem, we consider the algorithms that solve it and their implementations in the context of client programs and ADTs that access graphs through this interface. This scheme allows us to address graph-processing tasks ranging from elementary maintenance operations to sophisticated solutions of difficult problems.
The interface is based on our standard mechanism that hides representations and implementations from client programs (see Section 4.9). It provides the basic mechanisms that allow clients to build graphs (by constructing the graph and then adding the edges), to maintain the graphs (by removing some edges and adding others), and to test whether an edge exists. The contructor's second argument is a flag indicating whether the graph is directed; the interface also includes a method that allows clients to test this condition. Beyond these basic operations, the Graph interface of Program 17.1 also specifies the basic mechanism that we use to examine graphs: an iterator AdjList for processing the vertices adjacent to any given vertex. Our approach is to require that any such iterator must implement a Java interface that we use only for the purpose of processing the vertices adjacent to a given vertex, in a manner that will become plain when we consider clients and implementations. This interface is defined as follows : interface AdjList { int beg() int nxt() boolean end() } The first two of these methods are to return a vertex name (the first and the next , respectively, in a sequence of vertices); the third method is for testing whether there are more vertices to process. The Graph interface of Program 17.1 refers to a simple class that allows our programs to manipulate edges in a uniform way, which may be implemented as follows: class Edge { int v, w; Edge(int v, int w) { this.v = v; this.w = w; } } This implementation suffices for basic graph-processing algorithms; we consider a more sophisticated one in Section 20.1. The ADT in Program 17.1 is primarily a vehicle to allow us to develop and test algorithms; it is not a general-purpose interface. As usual, we work with the simplest interface that supports the basic graph-processing operations that we wish to consider. Defining such an interface for use in practical applications involves making numerous tradeoffs among simplicity, efficiency, and generality. We consider a few of these tradeoffs next; we address many others in the context of implementations and applications throughout this book. The graph constructor takes the maximum possible number of vertices in the graph as an argument so that implementations can allocate memory accordingly . We adopt this convention solely to make the code compact and readable. A more general graph ADT might include in its interface the capability to add and remove vertices as well as edges; this would impose more demanding requirements on the data structures used to implement the ADT. We might also choose to work at an intermediate level of abstraction and consider the design of interfaces that support higher-level abstract operations on graphs that we can use in implementations. We revisit this idea briefly in Section 17.5, after we consider several concrete representations and implementations.
A general graph ADT needs to take into account parallel edges and self- loops , because nothing prevents a client program from invoking insert with an edge that is already present in the graph (parallel edge) or with an edge whose two vertex indices are the same (self-loop). It might be necessary to disallow such edges in some applications, desirable to include them in other applications, and possible to ignore them in still other applications. Self-loops are trivial to handle, but parallel edges can be costly to handle, depending on the graph representation. In certain situations, including a remove parallel edges ADT operation might be appropriate; then, implementations can let parallel edges collect, and clients can remove or otherwise process parallel edges when warranted. We will revisit these issues when we examine graph representations in Sections 17.3 and 17.4.
Program 17.2 is a method that illustrates the use of the iterator class in the graph ADT. This method extracts a graph's set of edges and returns it in a client-supplied array. A graph is nothing more nor less than its set of edges, and we often need a way to retrieve a graph in this form, regardless of its internal representation. The order in which the edges appear in the array is immaterial and will differ from implementation to implementation. Program 17.3 is another example of the use of the iterator class in the graph ADT, to print out a table of the vertices adjacent to each vertex, as shown in Figure 17.7. The code in these two examples is quite similar and is similar to the code in numerous graph-processing algorithms. Remarkably, we can build all of the algorithms that we consider in this book on this basic abstraction of processing all the vertices adjacent to each vertex (which is equivalent to processing all the edges in the graph), as in these methods. As discussed in Section 17.5, it is convenient to package related graph-processing methods into a single class. Program 17.4 is an ADT interface for such a class, which is named GraphIO . It defines the show method of Program 17.3 and two methods for inserting into a graph edges taken from standard input (see Exercise 17.12 and Program 17.14 for implementations of these methods). We use GraphIO throughout the book for input/output and a similar class named GraphUtilities for utility methods such as the extract-edges method of Program 17.2.
Generally, the graph-processing tasks that we consider in this book fall into one of three broad categories:
Examples of the first are the number of connected components and the length of the shortest path between two given vertices in the graph; examples of the second are a spanning tree and the longest cycle containing a given vertex; examples of the third are whether two given vertices are in the same connected component. Indeed, the terms that we defined in Section 17.1 immediately bring to mind a host of computational problems. Our convention for addressing such tasks will be to build ADTs that are clients of the basic ADT in Program 17.1 but that, in turn , allow us to separate client programs that need to solve a problem at hand from implementations of graph-processing algorithms. For example, Program 17.5 is an interface for a graph-connectivity ADT. We can write client programs that use this ADT to create objects that can provide the number of connected components in the graph and that can test whether or not any two vertices are in the same connected component. We describe implementations of this ADT and their performance characteristics in Section 18.5, and we develop similar ADTs throughout the book. Typically, such ADTs include a preprocessing method (the constructor), private data fields that keep information learned during the preprocessing, and query methods that use this information to provide clients with information about the graph.
In this book, we generally work with static graphs, which have a fixed number of vertices V and edges E . Generally, we build the graphs by executing E invocations of insert , then process them either by using some ADT operation that takes a graph as argument and returns some information about that graph or by using objects of the kind just described to preprocess the graph so as to be able to efficiently answer queries about it. In either case, changing the graph by invoking insert or remove necessitates reprocessing the graph. Dynamic problems, where we want to intermix graph processing with edge and vertex insertion and removal, take us into the realm of online algorithms (also known as dynamic algorithms ), which present a different set of challenges. For example, the connectivity problem that we solved with union-find algorithms in Chapter 1 is an example of an online algorithm, because we can get information about the connectivity of a graph as we insert edges. The ADT in Program 17.1 supports insert edge and remove edge operations, so clients are free to use them to make changes in graphs, but there may be performance penalties for certain sequences of operations. For example, union-find algorithms may require reprocessing the whole graph if a client uses remove edge . For most of the graph-processing problems that we consider, adding or deleting a few edges can dramatically change the nature of the graph and thus necessitate reprocessing it. One of our most important challenges in graph processing is to have a clear understanding of performance characteristics of implementations and to make sure that client programs make appropriate use of them. As with the simpler problems that we considered in Parts 1 through 4, our use of ADTs makes it possible to address such issues in a coherent manner. Program 17.6 is an example of a graph-processing client. It uses the ADT of Program 17.1, the input-output class of Program 17.4 to read the graph from standard input and print it to standard output, and the connectivity class of Program 17.5 to find its number of connected components. We use similar but more sophisticated clients to generate other types of graphs, to test algorithms, to learn other properties of graphs, and to use graphs to solve other problems. The basic scheme is amenable for use in any graph-processing application. In Sections 17.3 through 17.5, we examine the primary classical graph representations and implementations of the ADT operations in Program 17.1. These implementations provide a basis for us to expand the interface to include the graph-processing tasks that are our focus for the next several chapters. The first decision that we face in developing an ADT implementation is which graph representation to use. We have three basic requirements. First, we must be able to accommodate the types of graphs that we are likely to encounter in applications (and we also would prefer not to waste space). Second, we should be able to construct the requisite data structures efficiently. Third, we want to develop efficient algorithms to solve our graph-processing problems without being unduly hampered by any restrictions imposed by the representation. Such requirements are standard ones for any domain that we consider ”we emphasize them again them here because, as we shall see, different representations give rise to huge performance differences for even the simplest of problems.
For example, we might consider an array of edges representation as the basis for an ADT implementation (see Exercise 17.16). That direct representation is simple, but it does not allow us to perform efficiently the basic graph-processing operations that we shall be studying . As we will see, most graph-processing applications can be handled reasonably with one of two straightforward classical representations that are only slightly more complicated than the array-of-edges representation: the adjacency-matrix or the adjacency-lists representation. These representations, which we consider in detail in Sections 17.3 and 17.4, are based on elementary data structures (indeed, we discussed them both in Chapters 3 and 5 as example applications of sequential and linked allocation). The choice between the two depends primarily on whether the graph is dense or sparse, although, as usual, the nature of the operations to be performed also plays an important role in the decision on which to use. Exercises
|