A dictionary is a collection that associates keys with values. You look up a key, and the dictionary provides you with the corresponding value. This is similar to the way a NameValueCollection works, except that a dictionary’s keys and values need not be strings, and a dictionary associates each key with a single value object.
Visual Studio provides several different kinds of dictionary classes that are optimized for different uses. Their differences come largely from the ways in which they store data internally. Although you don’t need to understand the details of how the dictionaries work internally, you do need to know how they behave so that you can pick the best one for a particular purpose.
Because all of the dictionary classes provide the same service (associating keys with values), they have roughly the same properties and methods. The following table describes some of the most useful of these.
Property/Method | Description |
---|---|
Add | Adds a key/value pair to the dictionary. |
Clear | Removes all key/value pairs from the dictionary. |
Contains | Returns True if the dictionary contains a specific key. |
CopyTo | Copies the dictionary’s data starting at a particular position into a one-dimensional array of DictionaryEntry objects. The DictionaryEntry class has Key and Value properties. |
Count | Returns the number of key/value pairs in the dictionary. |
Item | Gets or sets the value associated with a key. |
Keys | Returns a collection containing all of the dictionary’s keys. |
Remove | Removes the key/value pair with a specific key. |
Values | Returns a collection containing all of the dictionary’s values. |
The following sections describe different Visual Studio dictionary classes in more detail.
A ListDictionary stores its data in a linked list. In a linked list, each item is held in an object that contains its data plus a reference or link to the next item in the list. Figure 18-3 illustrates a linked list. This list contains the key/value pairs Appetizer/Salad, Entrée/Sandwich, Drink/Water, and Dessert/Cupcake. The link out of the Dessert/Cupcake item is set to Nothing, so the program can tell when it has reached the end of the list. A reference variable inside the ListDictionary class, labeled Top in Figure 18-3, points to the first item in the list.
Figure 18-3: Each item in a linked list keeps a reference to the next item in the list.
Adding and removing items in a linked list is simple. To add a new item at the top of the list, you create a new item, set its link to point to the item that is currently at the top, and then make the list’s Top variable point to the new item.
To remove a target item from the list, you find the item before the target and set its link to point to the item after the target. This is a lot easier than removing an item from the middle of an array of contiguous items and then shifting everything over to fill in the hole.
Figure 18-4 shows the linked list from Figure 18-3 with a new Side/Fries item added at the top and the Drink/Water item removed.
Figure 18-4: Adding and removing items from a linked list is relatively easy.
Linked lists make adding and removing items from the dictionary easy. By using links, the list can rearrange itself without your needing to slide items back and forth, as you might in a contiguous array.
Unfortunately, if the list grows long, finding items can take a long time. To find an item in the list, the program starts at the top and works its way down, following the links between items, until it finds the one it wants. If the list is short, that doesn’t take very long. If the list holds 100,000 items, this means potentially a 100,000-item crawl from top to bottom. That means a ListDictionary object’s performance degrades if it contains too many items.
If you only need to store a few hundred items in the dictionary and you don’t need to access them frequently, a ListDictionary is fine. If you need to store 100,000 entries, or if you need to access the dictionary’s entries a huge number of times, you may get better performance using a “bigger” object such as a Hashtable. A Hashtable has more overhead than a ListDictionary but is faster at accessing its entries.
A Hashtable looks a lot like a ListDictionary on the outside, but internally it stores its data in a very different way. Rather than using a linked list, this class uses a hash table to hold data.
A hash table is a data structure that allows extremely fast access to items using their keys. It works by mapping an object’s key into a bucket by calculating the key’s hash value.
For example, suppose that you want to make a dictionary associating Employee objects with their Social Security numbers. You could make an array holding 10 buckets and then map employees to bucket numbers using the last digit of their Social Security numbers. Employee 123-45-6789 goes in bucket 9, employee 111-22-3333 goes in bucket 3, and employee 865-29-8361 goes in bucket 1, as shown in Figure 18-5.
Figure 18-5: A Hashtable maps key values into buckets.
This kind of hash table is easy to enlarge. If you add a few dozen employees, you might use 100 buckets and map Social Security numbers using their last two digits. Then employee 123-45-6789 goes in bucket 89, employee 111-22-3333 goes in bucket 33, and employee 865-29-8361 goes in bucket 61.
More employees? Make more buckets! You can use 1000, 10,000, or more buckets if necessary.
To find an employee’s data in the dictionary, you need only to calculate the hash value (last digits of Social Security number) and then look in that bucket. For a large dictionary, this is much faster than digging through a linked list. If you have 10,000 employees perfectly divided one to a bucket in a table of 10,000 buckets, you need only to look in one bucket to find the data you want. That’s a lot faster than slogging through a 10,000-item linked list.
Of course, there is a catch. Actually, there are two catches.
First, if you have too many employees, you are eventually going to find two that map to the same bucket. Suppose that you have 10 buckets and two employees with Social Security numbers 732-45-7653 and 145-76-4583. These both map to bucket 3. This is called a key collision, and the hash table must use some sort of collision resolution policy to figure out what to do when two keys map to the same bucket. This policy is generally some simple method such as making a linked list in the bucket or letting the second item spill over into the following bucket. Whatever collision resolution scheme you use, it takes a little extra time to search through full buckets.
You can make the buckets less full if you use more of them. In this example, if you switched to 100 buckets, the two employees would map to buckets 53 and 83, so no collision occurs. That makes looking up these items faster.
The second catch is that you can make a hash table faster by adding more buckets, but using more buckets means using more space. When a hash table starts to fill up, collisions are common, so performance suffers. Add more space and performance improves, but there’s a lot of wasted space not occupied by any data. If the table grows too big, it may start to use up all of the system’s memory and starve the application for memory. The application must then page memory to disk when it needs more room, a process that can make the program extremely slow.
Those are the advantages and disadvantages of the Hashtable class. A Hashtable gives better performance than a ListDictionary, but takes more space.
In addition to the usual dictionary properties and methods, the Hashtable has a few that help manage the internal hash table that it uses to store its items.
One overloaded version of the Hashtable class’s constructor takes a parameter that tells how many items the table should initially be able to hold. If you know you are going to load 1000 items, you might make the table initially hold room for 1500. Then the program could add all the items without filling the table too much, so it would still give good performance. If you don’t set an initial size, the hash table might need to resize itself many times before it could hold 1000 items, and that will slow it down.
Another version of the constructor lets you specify the hash table’s load factor. The load factor is a number between 0.1 and 1.0 that gives the largest ratio of elements to buckets that the Hashtable will allow before it enlarges its internal table. For example, suppose that the hash table’s capacity is 100 and its load factor is 0.8. Then when it holds 80 elements, the Hashtable will enlarge its internal table.
A HybridDictionary is a cross between a ListDictionary and a Hashtable. If the dictionary is small, the HybridDictionary stores its data in a ListDictionary. If the dictionary grows too large, HybridDictionary switches to a Hashtable.
If you know that you will only need a few items, you can use a ListDictionary. If you know you will need to use a very large number of items, you can use a Hashtable. If you are unsure whether you will have few or many items, you can hedge your bet with a HybridDictionary.
Just as you can make strongly typed collections, you can also make strongly typed dictionaries. The idea is exactly the same. You derive your strongly typed class from a class that supports basic dictionary functionality. In this case, that parent class is DictionaryBase, and it provides a Dictionary object that your class can use to implement its dictionary features.
Next, you implement dictionary methods such as Add, Item, Keys, and so forth. Your code requires specific data types and uses the parent class’s Dictionary variable to do all the hard work through delegation.
The following table lists the standard methods provided by a strongly typed dictionary class. The third column indicates whether the DictionaryBase parent class automatically provides the method or whether you must delegate the method to the Dictionary object.
Method | Purpose | Provided By |
---|---|---|
Add | Adds a key/value pair to the dictionary | Dictionary |
Clear | Removes all key/value pairs from the dictionary | DictionaryBase |
Contains | Returns True if the dictionary contains a specific key | Dictionary |
CopyTo | Copies elements from the dictionary into an array of DictionaryEntry objects | DictionaryBase |
Count | Returns the number of key/value pairs in the dictionary | DictionaryBase |
Dictionary | Returns a Dictionary holding the dictionary’s key/value pairs | DictionaryBase |
InnerHashtable | Returns a Hashtable holding the dictionary’s key/value pairs | DictionaryBase |
Item | Returns the value corresponding to a specific key | Dictionary |
Keys | Returns an ICollection containing the dictionary’s keys | Dictionary |
Remove | Removes the key/value pair for a specific key | Dictionary |
Values | Returns an ICollection containing the dictionary’s values | Dictionary |
As is the case with strongly typed collections, you can add other, more specialized methods if they would be useful in your application.
The following code shows an EmployeeDictionary class with keys that are strings and values that are Employee objects. The Add method takes a string and an Employee object as parameters, and passes them on to the Dictionary object’s Add method. The Item property procedure gets or sets an item using a string key and an Employee object as a value. The property’s Get procedure uses the Dictionary object’s Item method to get the object for a particular key. It uses DirectCast to turn the resulting generic Object into an Employee. The property’s Set procedure uses the Dictionary object’s Item method to set the value for the key. The Keys and Values properties’ Get procedures simply pass their requests on to the Dictionary object. (Actually, the DictionaryBase class could provide these functions, since they return generic collections rather than, for example, collections of strings or Employee objects. For whatever reason, DictionaryBase doesn’t do that.) Finally, the Contains and Remove methods take strongly typed string parameters and pass their requests along to the Dictionary object.
Public Class EmployeeDictionary Inherits System.Collections.DictionaryBase ' Add a Dictionary entry. Public Sub Add(ByVal new_key As String, ByVal new_employee As Employee) Dictionary.Add(new_key, new_employee) End Sub ' Return an object with the given key. Default Public Property Item(ByVal key As String) As Employee Get Return DirectCast(Dictionary.Item(key), Employee) End Get Set(ByVal Value As Employee) Dictionary.Item(key) = Value End Set End Property ' Return a collection containing the Dictionary's keys. Public ReadOnly Property Keys() As ICollection Get Return Dictionary.Keys End Get End Property ' Return a collection containing the Dictionary's values Public ReadOnly Property Values() As ICollection Get Return Dictionary.Values End Get End Property ' Return True if the Dictionary contains this Employee. Public Function Contains(ByVal key As String) As Boolean Return Dictionary.Contains(key) End Function ' Remove this entry. Public Sub Remove(ByVal key As String) Dictionary.Remove(key) End Sub End Class
The following code uses the EmployeeDictionary class. It makes a new EmployeeDictionary object and adds some Employee objects to it. It then uses the Item method to locate a particular object and display its data.
Dim dict As New EmployeeDictionary dict.Add("123-45-6789", New Employee("Al", "Ankh")) dict.Add("111-22-3333", New Employee("Bertie", "Bithoin")) dict.Add("365-76-5476", New Employee("Carl", "Catabalpas")) dict.Add("832-77-6847", New Employee("Dee", "Divers")) Dim emp As Employee = dict.Item("365-76-5476") MessageBox.Show(emp.ToString)
How a class stores its data internally is generally not a developer’s concern. As long as it does its job, you shouldn’t care whether the DictionaryBase class stores its data in a linked list, hash table, or some other data structure. (Although the DictionaryBase class has an InnerHashtable method that returns its data in a Hashtable form, so perhaps that’s a hint.)
However, if you really want a strongly typed class that you know uses a ListDictionary instead of a hash table (or whatever CollectionBase uses), you could derive a strongly typed class from the ListDictionary class.
The following code shows the EmployeeListDictionary class, which is derived from the ListDictionary class. It uses the Shadows keyword to replace the ListDictionary class’s Add, Item, Contains, and Remove methods with new strongly typed versions. Those methods simply pass their requests along to the base class ListDictionary.
Public Class EmployeeListDictionary Inherits ListDictionary ' Add a Dictionary entry. Public Shadows Sub Add(ByVal new_key As String, ByVal new_employee As Employee) MyBase.Add(new_key, new_employee) End Sub ' Return an object with the given key. Default Public Shadows Property Item(ByVal key As String) As Employee Get Return DirectCast(MyBase.Item(key), Employee) End Get Set(ByVal Value As Employee) MyBase.Item(key) = Value End Set End Property ' Return True if the Dictionary contains this Employee. Public Shadows Function Contains(ByVal key As String) As Boolean Return MyBase.Contains(key) End Function ' Remove this entry. Public Shadows Sub Remove(ByVal key As String) MyBase.Remove(key) End Sub End Class
The StringDictionary class uses a hash table to manage keys and values that are all strings. Because it uses a hash table, it can handle very large data sets quickly.
Its methods are strongly typed to require strings, so they provide extra type checking that can make finding potential bugs easier. For that reason, you should use a StringDictionary instead of a generic ListDictionary or Hashtable if you want to work exclusively with strings.
The SortedList class acts as a Hashtable/Array hybrid. When you access a value by key, it acts as a hash table. When you access a value by index, it acts as an array containing items sorted by key value. For example, suppose that you add a number of Job objects to a SortedList named jobs using their priorities as keys. Then jobs(0) always returns the job with the smallest priority value.
The following code shows an example. It makes a SortedList and then adds four items with priorities. Notice that it adds leading zeros to priorities less than 10, so their alphabetical order is the same as their numeric order. It then loops through the jobs displaying their keys and values. Next, the code clears the SortedList and creates similar items using integer values instead of strings as keys. Again it loops through the items displaying their keys and values.
Dim jobs As New SortedList jobs.Add("02", "Two") jobs.Add("10", "Ten") jobs.Add("01", "One") jobs.Add("06", "Six") For i As Integer = 0 To jobs.Count - 1 Debug.WriteLine(jobs.GetKey(i).ToString() & ": " & _ jobs.GetByIndex(i).ToString()) Next i jobs.Clear() jobs.Add(2, "Two") jobs.Add(10, "Ten") jobs.Add(1, "One") jobs.Add(6, "Six") For i As Integer = 0 To jobs.Count - 1 Debug.WriteLine(jobs.GetKey(i).ToString() & ": " & _ jobs.GetByIndex(i).ToString()) Next i
The following text shows the results:
01: One 02: Two 06: Six 10: Ten 1: One 2: Two 6: Six 10: Ten
This code clears the SortedList before adding the second set of values with numeric keys. If it did not, the list would have raised an error when it tried to compare numeric and textual key values.