Section 13.2. SortedSet and NavigableSet


13.2. SortedSet and NavigableSet

Set has one subinterface, SortedSet (Figure 13.4), which adds to the Set contract a guarantee that its iterator will traverse the set in ascending element order. SortedSet was itself extended in Java 6 by the interface NavigableSet (see Figure 13.5), which adds methods to find the closest matches to a target element. The only implementation of SortedSet before Java 6 was treeSet, which has been retrofitted with the methods required to implement the new interface. Since there is no platform implementation of SortedSet in Java 6 that does not also implement NavigableSet, it makes sense to discuss them in the same section. For new client code developed for the Java 6 platform, there is no need to use the SortedSet interface at all, but for the benefit of readers still constrained to use Java 5 we shall present the methods of the two interfaces separately in this section.

Figure 13-4. SortedSet


Figure 13-5. NavigableSet


In Chapter 3 we saw that element ordering can either be defined by the element class itself, if that implements Comparable, or it can be imposed by an external Comparator, supplied by a constructor such as this one, for treeSet:

 TreeSet(Comparator<? super E> comparator) 

Task does implement Comparable (its natural ordering is the natural ordering of its string representation), so we don't need to supply a separate comparator. Now merging two ordered lists, which was quite tricky using parallel iterators, is trivial if we get a SortedSet to do the work. Using the task collections of Example 12.1, it requires two lines of code:

 Set<Task> naturallyOrderedTasks = new TreeSet<Task>(mondayTasks); naturallyOrderedTasks.addAll(tuesdayTasks); assert naturallyOrderedTasks.toString().equals (   "[code db, code gui, code logic, phone Mike, phone Paul]"); 

This simplicity comes at a price, though; merging two sorted lists of size n is O(n), but adding n elements to a TreeSet of size n is O(n log n).

We could use SortedSet to add some function to the to-do manager. Until now, the methods of Collection and Set have given us no help in ordering our taskssurely one of the central requirements of a to-do manager. Example 13.1 defines a class PriorityTask which attaches a priority to a task. There are three priorities, HIGH, MEDIUM, and LOW, declared so that HIGH priority comes first in the natural ordering. To compare two PriorityTasks, we first compare their priorities; if the priorities are unequal, the higher priority tasks comes first, and if the priorities are equal, we use the natural ordering on the underlying tasks. To test whether two PriorityTasks are equal, we check whether they have the same priority and the same task. These definitions ensure that the natural ordering is consistent with equals (see Section 3.1). As when we defined tasks in Section 12.1, we have followed good practice by making PriorityTask immutable.

The class PriorityTask

 public enum Priority { HIGH, MEDIUM, LOW } public final class PriorityTask implements Comparable<PriorityTask> {   private final Task task;   private final Priority priority;   PriorityTask(Task task, Priority priority) {     this.task = task;     this.priority = priority;   }   public Task getTask() { return task; }   public Priority getPriority() { return priority; }   public int compareTo(PriorityTask pt) {     int c = priority.compareTo(pt.priority);     return c != 0 ? c : task.compareTo(pt.task);   }   public boolean equals(Object o) {     if (o instanceof PriorityTask) {       PriorityTask pt = (PriorityTask)o;       return task.equals(pt.task) && priority.equals(pt.priority);     } else return false;   }   public int hashCode() { return task.hashCode(); }   public String toString() { return task + ": " + priority; } } 

The following code shows SortedSet working with a set of PriorityTasks (in fact, we have declared a NavigableSet so that we can use the same set in later examples. But for the moment, we will just use the methods of SortedSet):

 NavigableSet<PriorityTask> priorityTasks = new TreeSet<PriorityTask>(); priorityTasks.add(new PriorityTask(mikePhone, Priority.MEDIUM)); priorityTasks.add(new PriorityTask(paulPhone, Priority.HIGH)); priorityTasks.add(new PriorityTask(databaseCode, Priority.MEDIUM)); priorityTasks.add(new PriorityTask(interfaceCode, Priority.LOW)); assert(priorityTasks.toString()).equals(   "[phone Paul: HIGH, code db: MEDIUM, phone Mike: MEDIUM, code gui: LOW]"); 

Could you not simply compare the priorities of the tasks, without using the string representation as a secondary key? A partial ordering like that would be useful if you want to preserve some aspects of the original ordering; for example, you might wish to sort tasks by priority but, within each priority, preserve the order in which they were added to the set. But the contract for SortedSet (and, as we shall see later, SortedMap) states that it will use the compare method of its Comparatoror, if it does not have one, the compareTo method of its elementsinstead of the elements' equals method to determine when elements are distinct. This means that if a number of elements compare as the same, the set will treat them as duplicates, and all but one will be discarded.

The methods defined by the SortedSet interface fall into three groups:

Getting the First and Last Elements

 E first()     // return the first element in the set E last()      // return the last element in the set 

If the set is empty, these operations throw NoSuchElementException.

Retrieving the Comparator

 Comparator<? super E> comparator() 

This method returns the set's comparator if it has been given one at construction time. The type Comparator<? super E> is used because a SortedSet parameterized on E can rely for ordering on a Comparator defined on any supertype of E. For example, recalling Section 3.3, a Comparator<Fruit> could be used with a SortedSet<Apple>.

Getting Range Views

 SortedSet<E> subSet(E fromElement, E toElement) SortedSet<E> headSet(E toElement) SortedSet<E> tailSet(E fromElement) 

The method subSet returns a set containing every element of the original set that is greater than or equal to fromElement and less than toElement. Similarly, the method headSet returns every element that is less than toElement, and tailSet returns every element that is greater than or equal to fromElement. Note that the arguments to these operations do not themselves have to be members of the set. The sets returned are half-open intervals: they are inclusive of the fromElementprovided it actually is a set member, of courseand exclusive of the toElement.

In our example, these methods could be useful in providing different views of the elements in priorityTasks. For instance, we can use headSet to obtain a view of the high- and medium-priority tasks. To do this, we need a special task that comes before all others in the task ordering; fortunately, we defined a class EmptyTask for just this purpose in Section 12.1. Using this, it is easy to extract all tasks that come before any low-priority task:

 PriorityTask firstLowPriorityTask =   new PriorityTask(new EmptyTask(), Priority.LOW); SortedSet<PriorityTask> highAndMediumPriorityTasks =   priorityTasks.headSet(firstLowPriorityTask); assert highAndMediumPriorityTasks.toString().equals(   "[phone Paul: HIGH, code db: MEDIUM, phone Mike: MEDIUM]"); 

In fact, because we know that tasks with empty details will never normally occur, we can also use one as the first endpoint in a half-open interval:

 PriorityTask firstMediumPriorityTask =   new PriorityTask(new EmptyTask(), Priority.MEDIUM); SortedSet<PriorityTask> mediumPriorityTasks =   priorityTasks.subSet(     firstMediumPriorityTask, firstLowPriorityTask); assert mediumPriorityTasks.toString().equals(   "[code db: MEDIUM, phone Mike: MEDIUM]"); 

Not all orderings can be treated so conveniently; suppose, for example, that we want to work with the set of all the medium-priority tasks up to and including the mikePhone task. To define that set as a half-open interval, users of SortedSet would need to construct the task that immediately follows the mikePhone task in the PriorityTask ordering, and for that you would need to know that the string that succeeds "Mike" in the natural ordering is "Mike\0" (that is, "Mike" with a null character appended). Fortunately, users of NavigableSet have a much more intuitive way of defining this set, as we shall see in a moment.

Notice that the sets returned by these operations are not independent sets but new views of the original SortedSet. So we can add elements to the original set and see the changes reflected in the view:

 PriorityTask logicCodeMedium =   new PriorityTask(logicCode, Priority.MEDIUM); priorityTasks.add(logicCodeMedium); assert mediumPriorityTasks.toString().equals(   "[code db: MEDIUM, code logic: MEDIUM, phone Mike: MEDIUM]"); 

The reverse applies also; changes in the view are reflected in the original set:

 mediumPriorityTasks.remove(logicCodeMedium); assert priorityTasks.toString().equals(   "[phone Paul: HIGH, code db: MEDIUM, phone Mike: MEDIUM, code gui: LOW]"); 

To understand how this works, think of all the possible values in an ordering as lying on a line, like the number line used in arithmetic. A range is defined as a fixed segment of that line, regardless of which values are actually in the original set. So a subset, defined on a SortedSet and a range, will allow you to work with whichever elements of the SortedSet currently lie within the range.

13.2.1.

13.2.1.1. NavigableSet

NavigableSet (see Figure 13.5) was introduced in Java 6 to supplement deficiencies in SortedSet. As we mentioned at the beginning of this section, new client code should use it in preference to SortedSet. It adds methods in four groups.

Getting the First and Last Elements

 E pollFirst()   // retrieve and remove the first (lowest) element,                 // or return null if this set is empty E pollLast()    // retrieve and remove the last (highest) element,                 // or return null if this set is empty 

These are analogous to the methods of the same name in Deque (see Section 14.4), and help to support the use of NavigableSet in applications which require queue functionality. For example, in the version of the to-do manager in this section, we could get the highest-priority task off the list, ready to be carried out, by means of this:

 PriorityTask nextTask = priorityTasks.pollFirst(); assert nextTask.toString().equals("phone Paul: HIGH"); 

Notice that although Deque also contains methods peekFirst and peekLastwhich allow clients to retrieve an element without removing itNavigableSet has no need of them, because their functions are already supplied by the methods first and last inherited from SortedSet.

Getting Range Views

 NavigableSet<E> subSet(E fromElement, boolean fromInclusive,                                       E toElement, boolean toInclusive) NavigableSet<E> headSet(E toElement, boolean inclusive) NavigableSet<E> tailSet(E fromElement, boolean inclusive) 

This group is an improvement on the methods of the same name in SortedSet, which return subsets that are always inclusive of the lower bound and exclusive of the higher one. The NavigableSet methods, by contrast, allow you to specify for each bound whether it should be inclusive or exclusive. This makes it much easier to define range views over some sets. We considered earlier the set containing all the medium-priority tasks up to and including the (medium-prioritized) mikePhone task. To obtain that set using SortedSet, we would have to define it as a half-open interval, using a little-known technicality of string ordering. But NavigableSet allows us to define it as a closed interval simply by specifying that the higher bound should be inclusive:

 PriorityTask mikePhoneMedium = new PriorityTask(mikePhone, Priority.MEDIUM); NavigableSet closedInterval = priorityTasks.subSet(   firstMediumPriorityTask, true, mikePhoneMedium, true); assert(closedInterval.toString()).equals(   "[code db: MEDIUM, phone Mike: MEDIUM]"); 

Getting Closest Matches

 E ceiling(E e)  // return the least element in this set greater than                 // or equal to e, or null if there is no such element E floor(E e)    // return the greatest element in this set less than                 // or equal to e, or null if there is no such element E higher(E e)   // return the least element in this set strictly                 // greater than e, or null if there is no such element E lower(E e)    // return the greatest element in this set strictly                 // less than e, or null if there is no such element 

These methods are useful for short-distance navigation. For example, suppose that we want to find, in a sorted set of strings, the last three strings in the subset that is bounded above by "x-ray", including that string itself if it is present in the set. NavigableSet methods make this easy:

 NavigableSet<String> stringSet = new TreeSet<String>(); Collections.addAll(stringSet, "abc", "cde", "x-ray" ,"zed"); String last = stringSet.floor("x-ray"); assert last.equals("x-ray"); String secondToLast =   last == null ? null : stringSet.lower(last); String thirdToLast =   secondToLast == null ? null : stringSet.lower(secondToLast); assert thirdToLast.equals("abc"); 

Notice that in line with a general trend in the design of the Collections Framework, NavigableSet returns null values to signify the absence of elements where, for example, the first and last methods of SortedSet would throw NoSuchElementException. For this reason, you should avoid null elements in NavigableSets, and in fact the newer implementation, ConcurrentSkipListSet, does not permit them (though treeSet must continue to do so, for backward compatibility).

Navigating the Set in Reverse Order

 NavigableSet<E> descendingSet()   // return a reverse-order view of                                   // the elements in this set Iterator<E> descendingIterator()  // return a reverse-order iterator 

Methods of this group make traversing a NavigableSet equally easy in the descending (that is, reverse) ordering. As a simple illustration, let's generalise the example above using the nearest-match methods. Suppose that, instead of finding just the last three strings in the sorted set bounded above by "x-ray", we want to iterate over all the strings in that set, in descending order:

 NavigableSet<String> headSet = stringSet.headSet(last, true); NavigableSet<String> reverseHeadSet = headSet.descendingSet(); assert reverseHeadSet.toString().equals("[x-ray, cde, abc]"); String conc = " "; for (String s : reverseHeadSet) {   conc += s + " "; } assert conc.equals(" x-ray cde abc "); 

If the iterative processing involves structural changes to the set, and the implementation being used is TReeSet (which has fail-fast iterators), we will have to use an explicit iterator to avoid ConcurrentModificationException:

 for (Iterator<String> itr = headSet.descendingIterator(); itr.hasNext(); ) {   itr.next(); itr.remove(); } assert headSet.isempty(); 

13.2.1. TreeSet

This is the first tree implementation that we have seen, so we should take a little time now to consider how trees perform in comparison to the other implementation types used by the Collections Framework.

Trees are the data structure you would choose for an application that needs fast insertion and retrieval of individual elements but which also requires that they be held in sorted order. For example, suppose you want to match all the words from a set against a given prefix, a common requirement in visual applications where a drop-down should ideally show all the possible elements that match against the prefix that the user has typed. A hash table can't return its elements in sorted order and a list can't retrieve its elements quickly by their content, but a tree can do both.

In computing, a tree is a branching structure that represents hierarchy. Computing trees borrow a lot of their terminology from genealogical trees, though there are some differences; the most important is that, in computing trees, each node has only one parent (except the root, which has none). An important class of tree often used in computing is a binary treeone in which each node can have at most two children. Figure 13.6 shows an example of a binary tree containing the words of this sentence in alphabetical order.

Figure 13-6. An ordered, balanced binary tree


The most important property of this tree can be seen if you look at any nonleaf nodesay, the one containing the word the: all the nodes below that on the left contain words that precede the alphabetically, and all those on the right, words that follow it. To locate a word, you would start at the root and descend level by level, doing an alphabetic comparison at each level, so the cost of retrieving or inserting an element is proportional to the depth of the tree.

How deep, then, is a tree that contains n elements? The complete binary tree with two levels has three elements (that's 22 - 1), and the one with three levels has seven elements (23 - 1). In general, a binary tree with n complete levels will have 2n -1 elements. Hence the depth of a tree with n elements will be bounded by log n (since 2log n = n). Just as n grows much more slowly than 2n, log n grows much more slowly than n. So contains on a large tree is much faster than on a list containing the same elements. It's still not as good as on a hash tablewhose operations can ideally work in constant timebut a tree has the big advantage over a hash table that its iterator can return its elements in sorted order.

Not all binary trees will have this nice performance, though. Figure 13.6 shows a balanced binary treeone in which each node has an equal number of descendants (or as near as possible) on each side. An unbalanced tree can give much worse performancein the worst case, as bad as a linked list (see Figure 13.7). treeSet uses a data type called a red-black tree, which has the advantage that if it becomes unbalanced through insertion or removal of an element, it can always be rebalanced in O(log n) time.

Figure 13-7. An unbalanced binary tree


The constructors for TReeSet include, besides the standard ones, one which allows you to supply a Comparator (see Section 3.4) and one which allows you to create one from another SortedSet:

 TreeSet(Comparator<? super E> c)               // construct an empty set which will be sorted using the               // specified comparator TreeSet(SortedSet<E> s)               // construct a new set containing the elements of the               // supplied set, sorted according to the same ordering 

The second of these is rather too close in its declaration to the standard "conversion constructor" (see Section 12.3):

 TreeSet(Collection<? extends E> c) 

As Joshua Bloch explains in Item 26 of Effective Java (Addison-Wesley), calling one of two constructor or method overloads which take parameters of related type can give confusing results. This is because, in Java, calls to overloaded constructors and methods are resolved at compile time on the basis of the static type of the argument, so applying a cast to an argument can make a big difference to the result of the call, as the following code shows:

 // construct and populate a verb$NavigableSet$ whose iterator returns its // elements in the reverse of natural order: NavigableSet<String> base = new TreeSet<String>(Collections.reverseOrder()); Collections.addAll(base, "b", "a", "c"); // call the two different constructors for verb$TreeSet$, supplying the // set just constructed, but with different static types: NavigableSet<String> sortedSet1 = new TreeSet<String>((Set<String>)base); NavigableSet<String> sortedSet2 = new TreeSet<String>(base); // and the two sets have different iteration orders: List<String> forward = new ArrayList<String>(); forward.addAll(sortedSet1); List<String> backward = new ArrayList<String>(); backward.addAll(sortedSet2); assert !forward.equals(backward); Collections.reverse(forward); assert forward.equals(backward); 

This problem afflicts the constructors for all the sorted collections in the Framework (treeSet, TReeMap, ConcurrentSkipListSet, and ConcurrentSkipListMap). To avoid it in your own class designs, choose parameter types for different overloads so that an argument of a type appropriate to one overload cannot be cast to the type appropriate to a different one. If that is not possible, the two overloads should be designed to behave identically with the same argument, regardless of its static type. For example, a PriorityQueue (Section 14.2.1) constructed from a collection uses the ordering of the original, whether the static type with which the constructor is supplied is one of the Comparator-containing types PriorityQueue or SortedSet, or just a plain Collection. To achieve this, the conversion constructor uses the Comparator of the supplied collection, only falling back on natural ordering if it does not have one.

treeSet is unsychronized and not thread-safe; its iterators are fail-fast.

13.2.2. ConcurrentSkipListSet

ConcurrentSkipListSet was introduced in Java 6 as the first concurrent set implementation. It is backed by a skip list, a modern alternative to the binary trees of the previous section. A skip list for a set is a series of linked lists, each of which is a chain of cells consisting of two fields: one to hold a value, and one to hold a reference to the next cell. Elements are inserted into and removed from a linked list in constant time by pointer rearrangement, as shown in Figure 13.8, parts (a) and (b) respectively.

Figure 13-8. Modifying a linked list


Figure 13.9 shows a skip list consisting of three linked lists, labelled levels 0, 1 and 2. The first linked list of the collection (level 0 in the figure) contains the elements of the set, sorted according to their natural order or by the comparator of the set. Each list above level 0 contains a subset of the list below, chosen randomly according to a fixed probability. For this example, let's suppose that the probability is 0.5; on average, each list will contain half the elements of the list below it. Navigating between links takes a fixed time, so the quickest way to find an element is to start at the beginning (the left-hand end) of the top list and to go as far as possible on each list before dropping to the one below it.

The curved arrows of Figure 13.9 shows the progress of a search for the element 55. The search starts with the element 12 at the top left of level 2, steps to the element 31 on that level, then finds that the next element is 61, higher than the search value. So it drops one level, and then repeats the process; element 47 is still smaller than 55, but 61 is again too large, so it once more drops a level and finds the search value in one further step.

Figure 13-9. Searching a skip list


Inserting an element into a skip list always involves at least inserting it at level 0. When that has been done, should it also be inserted at level 1? If level 1 contains, on average, half of the elements at level 0, then we should toss a coin (that is, randomly choose with probability 0.5) to decide whether it should be inserted at level 1 as well. If the coin toss does result in it being inserted at level 1, then the process is repeated for level 2, and so on. To remove an element from the skip list, it is removed from each level in which it occurs.

If the coin tossing goes badly, we could end up with every list above level 0 emptyor full, which would be just as bad. These outcomes have very low probability, however, and analysis shows that, in fact, the probability is very high that skip lists will give performance comparable to binary trees: search, insertion and removal all take O(log n). Their compelling advantage for concurrent use is that they have efficient lock-free insertion and deletion algorithms, whereas there are none known for binary trees.

The iterators of ConcurrentSkipListSet are weakly consistent.




Java Generics and Collections
Java Generics and Collections
ISBN: 0596527756
EAN: 2147483647
Year: 2006
Pages: 136

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net