The Set interface extends the Collection interface. It does not introduce new methods or constants, but it stipulates that an instance of Set contains no duplicate elements. The concrete classes that implement Set must ensure that no duplicate elements can be added to the set. That is, no two elements e1 and e2 can be in the set such that e1.equals(e2) is true .
The AbstractSet class is a convenience class that extends AbstractCollection and implements Set . The AbstractSet class provides concrete implementations for the equals method and the hashCode method. The hash code of a set is the sum of the hash codes of all the elements in the set. Since the size method and iterator method are not implemented in the AbstractSet class, AbstractSet is an abstract class.
Three concrete classes of Set are HashSet , LinkedHashSet , and TreeSet , as shown in Figure 22.4.
The HashSet class is a concrete class that implements Set . You can create an empty hash set using its no-arg constructor or create a hash set from an existing collection. A HashSet can be used to store duplicate-free elements. For efficiency, objects added to a hash set need to implement the hashCode method in a manner that properly disperses the hash code. Most of the classes in the Java API implement the hashCode method. For example, the hashCode in the Integer class returns its int value. The hashCode in the Character class returns the Unicode of the character. The hashCode in the String class returns s *31 n “1 + s 1 * 31 n “2 + ... + s n “1 , where s i is s.charAt(i) .
Recall that the hash codes of two objects must be the same if the two objects are equal. Two unequal objects may have the same hash code, but you should implement the hashCode method to avoid too many such cases. Additionally, it is required that invoking the hashCode method multiple times returns the same integer during one execution of the program.
Listing 22.1 gives a program that finds all the words used in a text. The program creates a hash set to store the words extracted from the text, and uses an iterator to traverse the elements in the set. The output of the program is shown in Figure 22.5.
1 import java.util.*; 2 3 public class TestHashSet { 4 public static void main(String[] args) { 5 // Create a hash set 6 Set<String> set = new HashSet<String>(); 7 8 // Add strings to the set 9 set.add( "London" ); 10 set.add( "Paris" ); 11 set.add( "New York" ); 12 set.add( "San Francisco" ); 13 set.add( "Beijing" ); 14 set.add( "New York" ); 15 16 System.out.println(set); 17 18 // Obtain an iterator for the hash set 19 Iterator iterator = set.iterator(); 20 21 // Display the elements in the hash set 22 while ( iterator.hasNext() ) { 23 System.out.print( iterator. next () + " " ); 24 } 25 } 26 } |
The strings are added to the set (lines 9 “14). "New York" is added to the set more than once, but only one is stored, because a set does not allow duplicates.
As shown in Figure 22.5, the strings are not stored in the order in which they are inserted into the set. There is no particular order for the elements in a hash set. To impose an order on them, you need to use the LinkedHashSet class, which is introduced in the next section.
Tip
You can simplify the code in lines 18 “24 using a JDK 1.5 foreach loop without using an iterator, as follows : for (Object element: set) System.out.print(element.toString() + " " ); This loop reads as "for each element in the set, do the following." The foreach loop can be used for arrays (see §6.2.7) as well as any instance of Collection . |
Caution
Since set is declared as Set<String> in line 6, all elements added to set must be strings. A compilation error would occur if a non-string object is added to set . |
LinkedHashSet was added in JDK 1.4. It extends HashSet with a linked list implementation that supports an ordering of the elements in the set. The elements in a HashSet are not ordered, but the elements in a LinkedHashSet can be retrieved in the order in which they were inserted into the set. A LinkedHashSet can be created by using its no-arg constructor.
Listing 22.2 rewrites the preceding example using LinkedHashSet . Simply replace HashSet by LinkedHashSet . The output of the program is shown in Figure 22.6.
1 import java.util.*; 2 3 public class TestLinkedHashSet { 4 public static void main(String[] args) { 5 // Create a hash set 6 Set<String> set = new LinkedHashSet<String>(); 7 8 // Add strings to the set 9 set.add( "London" ); 10 set.add( "Paris" ); 11 set.add( "New York" ); 12 set.add( "San Francisco" ); 13 set.add( "Beijing" ); 14 set.add( "New York" ); 15 16 System.out.println(set); 17 18 // Display the elements in the hash set 19 for (Object element: set) 20 System.out.print(element.toString() + " " ); 21 } 22 } |
A LinkedHashSet is created in line 6. As shown in Figure 22.6, the strings are stored in the order in which they are inserted. Since LinkedHashSet is a set, it does not store duplicate elements.
The LinkedHashSet maintains the order in which the elements are inserted. To impose a different order (e.g., increasing or decreasing order), you can use the TreeSet class introduced in the next section.
Tip
If you don't need to maintain the order in which the elements are inserted, use HashSet , which is more efficient than LinkedHashSet . |
SortedSet is a subinterface of Set , which guarantees that the elements in the set are sorted. Additionally, it provides the methods first() and last() for returning the first and last elements in the set, and headSet(toElement) and tailSet(fromElement) for returning a portion of the set whose elements are less than toElement and greater than fromElement .
TreeSet is a concrete class that implements the SortedSet interface. To create a TreeSet , use its no-arg constructor or use new TreeSet(Collection) . You can add objects into a tree set as long as they can be compared with each other. There are two ways to compare objects.
Use the Comparable interface. Since the objects added to the set are instances of Comparable , they can be compared using the compareTo method. The Comparable interface was introduced in §10.4, "Interfaces." Several classes in the Java API, such as String , Date , Calendar , and all the wrapper classes for the primitive types, implement the Comparable interface. This approach is referred to as natural order .
If the class for the elements does not implement the Comparable interfaces, or if you don't want to use the compareTo method in the class that implements the Comparable interface, specify a comparator for the elements in the set. This approach is referred to as order by comparator . It will be introduced in §22.4, "The Comparator Interface."
Listing 22.3 gives an example of ordering elements using the Comparable interface. The preceding example in Listing 22.2 displays all the words used in a text. The words are displayed in their insertion order. This example rewrites the preceding example to display the words in alphabetical order using the TreeSet class. Figure 22.7 shows a sample run of the program.
1 import java.util.*; 2 3 public class TestTreeSet { 4 public static void main(String[] args) { 5 // Create a hash set 6 Set<String> set = new HashSet<String>(); 7 8 // Add strings to the set 9 set.add( "London" ); 10 set.add( "Paris" ); 11 set.add( "New York" ); 12 set.add( "San Francisco" ); 13 set.add( "Beijing" ); 14 set.add( "New York" ); 15 16 TreeSet<String> treeSet = new TreeSet<String>(set); 17 System.out.println(treeSet); 18 19 // Display the elements in the hash set 20 for (Object element: set) 21 System.out.print(element.toString() + " " ); 22 } 23 } |
The example creates a hash set filled with strings, and then creates a tree set for the same strings. The strings are sorted in the tree set using the compareTo method in the Comparable interface.
The elements in the set are sorted once you create a TreeSet object from a HashSet object using new TreeSet(hashSet) (line 17). You may rewrite the program to create an instance of TreeSet using its no-arg constructor, and add the strings into the TreeSet object. Then, every time a string is added to the TreeSet object, the elements in it will be reordered. The approach used in the example is generally more efficient because it requires only a one-time sorting.
Note
All the classes in Figure 22.1 have at least two constructors. One is the no-arg constructor that constructs an empty collection. The other constructs instances from a collection. Thus the TreeSet class has the constructor TreeSet(Collection c) for constructing a TreeSet from a collection c . In this example, new TreeSet(hashSet) creates an instance of TreeSet from the collection hashSet . |
Tip
If you don't need to maintain a sorted set when updating a set, you can use a hash set, because it takes less time to insert and remove elements in a hash set. When you need a set to be sorted, you can convert it into a tree set. |