The goal of topological sorting is to be able to process the vertices of a DAG such that every vertex is processed before all the vertices to which it points. There are two natural ways to define this basic operation; they are essentially equivalent. Both tasks call for a permutation of the integers through V-1 , which we put in vertex-indexed arrays, as usual. Topological sort (relabel) Given a DAG, relabel its vertices such that every directed edge points from a lower-numbered vertex to a higher-numbered one (see Figure 19.21). Figure 19.21. Topological sort (relabeling)Given any DAG (top), topological sorting allows us to relabel its vertices so that every edge points from a lower-numbered vertex to a higher-numbered one (bottom). In this example, we relabel 4 , 5 , 7 , and 8 to 7 , 8 , 5 , and 4 , respectively, as indicated in the array tsI . There are many possible labelings that achieve the desired result .
Topological sort (rearrange) Given a DAG, rearrange its vertices on a horizontal line such that all the directed edges point from left to right (see Figure 19.22). Figure 19.22. Topological sorting ( rearrangement )This diagram shows another way to look at the topological sort in Figure 19.21, where we specify a way to rearrange the vertices, rather than relabel them. When we place the vertices in the order specified in the array ts , from left to right, then all directed edges point from left to right. The inverse of the permutation ts is the permutation tsI that specifies the relabeling described in Figure 19.21 .
As indicated in Figure 19.22, it is easy to establish that the relabeling and rearrangement permutations are inverses of one another: Given a rearrangement, we can obtain a relabeling by assigning the label to the first vertex on the list, 1 to the second label on the list, and so forth. For example, if an array ts has the vertices in topologically sorted order, then the loop for (i = 0; i < V; i++) tsI[ts[i]] = i; defines a relabeling in the vertex-indexed array tsI . Conversely, we can get the rearrangement from the relabeling with the loop for (i = 0; i < V; i++) ts[tsI[i]] = i; which puts the vertex that would have label first in the list, the vertex that would have label 1 second in the list, and so forth. Most often, we use the term topological sort to refer to the rearrangement version of the problem. Note that ts is not a vertex-indexed array. In general, the vertex order produced by a topological sort is not unique. For example,
are all topological sorts of the example DAG in Figure 19.6 (and there are many others). In a scheduling application, this situation arises whenever one task has no direct or indirect dependence on another and thus they can be performed either before or after the other (or even in parallel). The number of possible schedules grows exponentially with the number of such pairs of tasks. As we have noted, it is sometimes useful to interpret the edges in a digraph the other way around: We say that an edge directed from s to t means that vertex s "depends" on vertex t . For example, the vertices might represent terms to be defined in a book, with an edge from s to t if the definition of s uses t . In this case, it would be useful to find an ordering with the property that every term is defined before it is used in another definition. Using this ordering corresponds to positioning the vertices in a line such that edges all go from right to left ”a reverse topological sort . Figure 19.23 illustrates a reverse topological sort of our sample DAG. Figure 19.23. Reverse topological sortIn this reverse topological sort of our sample digraph, the edges all point from right to left. Numbering the vertices as specified by the inverse permutation tsI gives a graph where every edge points from a higher-numbered vertex to a lower-numbered vertex .
Now, it turns out that we have already seen an algorithm for reverse topological sorting: our standard recursive DFS! When the input graph is a DAG, a postorder numbering puts the vertices in reverse topological order. That is, we number each vertex as the final action of the recursive DFS method, as in the post array in the DFS implementation in Program 19.2. As illustrated in Figure 19.24, using this numbering is equivalent to numbering the nodes in the DFS forest in postorder, and gives a topological sort: The vertex-indexed array post gives the relabeling and its inverse the rearrangement depicted in Figure 19.23 ”a reverse topological sort of the DAG. Figure 19.24. DFS forest for a DAGA DFS forest of a digraph has no back edges (edges to nodes with a higher postorder number) if and only if the digraph is a DAG. The non-tree edges in this DFS forest for the DAG of Figure 19.21 are either down edges (shaded squares) or cross edges (unshaded squares). The order in which vertices are encountered in a postorder walk of the forest, shown at the bottom, is a reverse topological sort (see Figure 19.23) .
Property 19.11 Postorder numbering in DFS yields a reverse topological sort for any DAG . Proof : Suppose that s and t are two vertices such that s appears before t in the postorder numbering even though there is a directed edge s-t in the graph. Since we are finished with the recursive DFS for s at the time that we assign s its number, we have examined, in particular, the edge s-t . But if s-t were a tree, down, or cross edge, the recursive DFS for t would be complete, and t would have a lower number; however, s-t cannot be a back edge because that would imply a cycle. This contradiction implies that such an edge s-t cannot exist.
Thus, we can easily adapt a standard DFS to do a topological sort, as shown in Program 19.6. This implementation does a reverse topological sort: It computes the postorder numbering permutation and its inverse so that clients can relabel or rearrange vertices. Computationally, the distinction between topological sort and reverse topological sort is not crucial. We can simply change the order method to return postI[G.V()-1 - v] or we can modify the implementation in one of the following ways:
The proofs that these changes give a proper topological ordering are left for you to do as an exercise (see Exercise 19.97). To implement the first of the options listed in the previous paragraph for sparse graphs (represented with adjacency lists), we would need to use Program 19.1 to compute the reverse graph. Doing so essentially doubles our space usage, which may thus become onerous for huge graphs. For dense graphs (represented with an adjacency matrix), as noted in Section 19.1, we can do DFS on the reverse without using any extra space or doing any extra work, simply by exchanging rows and columns when referring to the matrix, as illustrated in Program 19.7. Next, we consider an alternative classical method for topological sorting that is more like breadth-first search (BFS) (see Section 18.7). It is based on the following property of DAGs. Property 19.12 Every DAG has at least one source and at least one sink . Proof : Suppose that we have a DAG that has no sinks. Then, starting at any vertex, we can build an arbitrarily long directed path by following any edge from that vertex to any other vertex (there is at least one edge, since there are no sinks), then following another edge from that vertex, and so on. But once we have been to V + 1 vertices, we must have seen a directed cycle, by the pigeonhole principle (see Property 19.6, which contradicts the assumption that we have a DAG. Therefore, every DAG has at least one sink. It follows that every DAG also has at least one source: its reverse's sink. From this fact, we can derive a topological-sort algorithm: Label any source with the smallest unused label, then remove it and label the rest of the DAG, using the same algorithm. Figure 19.25 is a trace of this algorithm in operation for our sample DAG. Figure 19.25. Topologically sorting a DAG by removing sourcesSince it is a source (no edges point to it) , can appear first in a topological sort of this example graph (left, top). If we remove (and all the edges that point from it to other vertices), then 1 and 2 become sources in the resulting DAG (left, second from top), which we can then sort using the same algorithm. This figure illustrates the operation of Program 19.8, which picks from among the sources (the shaded nodes in each diagram) using the FIFO discipline, though any of the sources could be chosen at each step. See Figure 19.26 for the contents of the data structures that control the specific choices that the algorithm makes. The result of the topological sort illustrated here is the node order 0 8 21 7 3 65 4 9 11 10 12.
Figure 19.26. Indegree table and queue tentsThis sequence depicts the contents of the indegree table (left) and the source queue (right) during the execution of Program 19.8 on the sample DAG corresponding to Figure 19.25. At any given point in time, the source queue contains the nodes with indegree . Reading from top to bottom, we remove the leftmost node from the source queue, decrement the indegree entry corresponding to every edge leaving that node, and add any vertices whose entries become to the source queue. For example, the second line of the table reflects the result of removing from the source queue, then (because the DAG has the edges 0-1 , 0-2 , 0-3 , 0-5 , and 0-6 ) decrementing the indegree entries corresponding to 1 , 2 , 3 , 5 , and 6 and adding 2 and 1 to the source queue (because decrementing made their indegree entries ). Reading the leftmost entries in the source queue from top to bottom gives a topological ordering for the graph .
Implementing this algorithm efficiently is a classic exercise in data-structure design ( see reference section ). First, there may be multiple sources, so we need to maintain a queue to keep track of them (any generalized queue will do). Second, we need to identify the sources in the DAG that remains when we remove a source. We can accomplish this task by maintaining a vertex-indexed array that keeps track of the indegree of each vertex. Vertices with indegree 0 are sources, so we can initialize the queue with one scan through the DAG (using DFS or any other method that examines all of the edges). Then, we perform the following operations until the source queue is empty:
Program 19.8 is an implementation of this method, using a FIFO queue, and Figure 19.26 illustrates its operation on our sample DAG, providing the details behind the dynamics of the example in Figure 19.25. The source queue does not empty until every vertex in the DAG is labeled, because the subgraph induced by the vertices not yet labeled is always a DAG, and every DAG has at least one source. Indeed, we can use the algorithm to test whether a graph is a DAG by inferring that there must be a cycle in the subgraph induced by the vertices not yet labeled if the queue empties before all the vertices are labeled (see Exercise 19.104). Processing vertices in topologically sorted order is a basic technique in processing DAGs. A classic example is the problem of finding the length of the longest path in a DAG. Considering the vertices in reverse topologically sorted order, the length of the longest path originating at each vertex v is easy to compute: Add one to the maximum of the lengths of the longest paths originating at each of the vertices reachable by a single edge from v . The topological sort ensures that all those lengths are known when v is processed, and that no other paths from v will be found afterwards. For example, taking a left-to-right scan of the reverse topological sort shown in Figure 19.23, we can quickly compute the following table of lengths of the longest paths originating at each vertex in the sample graph in Figure 19.21.
For example, the 6 corresponding to (third column from the right) says that there is a path of length 6 originating at , which we know because there is an edge 0-2 , we previously found the length of the longest path from 2 to be 5 , and no other edge from leads to a node having a longer path. Whenever we use topological sorting for such an application, we have several choices in developing an implementation:
All of these methods are used in DAG-processing implementations in the literature, and it is important to know that they are all equivalent. We will consider other topological-sort applications in Exercises 19.111 and 19.114 and in Sections 19.7 and 21.4. Exercises
|