1.5 Data Structures

GLib has a number of standard implementations for common data structures, including lists, trees, and hash tables. As is the case for other GLib modules, the names for each data type's functions share a common prefix (for example, g_list_ for lists).

1.5.1 Strings

The standard fixed-length, NULL - terminated strings in C are occasionally error prone, not terribly easy to handle, and not to everyone's taste. Therefore, GLib provides an alternative called GString , similar to the length-tracked string in most Pascal implementations. A GString data structure grows upon demand so that there's never a question of falling off the end of the string. GLib manages its length at all times, and therefore, it doesn't need a special terminating character when it contains binary data. GLib functions that use this data type begin with g_string_ .

Note	The processing functions for GLib lists, arrays, and strings use a pointer to the data structure as their first parameter and return a pointer to the data structure as their return value. A typical statement might look like this:

 foo = g_string_do_something(foo, bar);

You should always remember to reassign the pointer to whatever the function returns, because you can lose track of the data if the function decides to alter its memory location.

This code shows how to create and initialize the GString type:

 #include <glib.h> GString *s1, *s2; s1 = g_string_new("Shallow Green"); s2 = g_string_sized_new(50); s2 = g_string_assign(s2, "Deep Purple"); g_print("%s\n", s2->str);

Here, g_string_new() takes a normal C string as a parameter and returns a pointer to a new GString string. On the other hand, if you want to assign a C string to a GString string that already exists, use

  string  = g_string_assign(  string  ,  c_string  );

The str field in GString points to the current contents of the string. As illustrated in the preceding example, you can use it just as you would a regular C string. GLib manages the NULL terminator for you.

Warning

The str field is read only. If you manage to write something to it, don't expect to be able to get it back or that your program will function in any meaningful sense. The value changes between g_string_*() calls.

If you have a fairly good idea of how much text you're going to store in a GString structure, use

 g_string_sized_new(  size  )

to reserve size bytes in advance. This can save some time later if you have a string that slowly grows.

As mentioned earlier, a GString string can contain binary data, including NULL bytes. Naturally, when you initialize these strings, you need to specify the length of the data along with the data itself, because there is no universal terminating character. To allocate such a string, use

 g_string_new_len(  initial_data  ,  length  )

Adding Characters

All of the functions below return GString * :

g_string_append(GString * gstring , const gchar * str )

Adds a str value to the end of gstring .
g_string_append_c(GString * gstring , gchar c )

Adds c to gstring .
g_string_append_unichar(GString * gstring , gunichar c )

Adds the Unicode character c to gstring .

If you want to insert something at the very beginning of a string, use the g_string_prepend functions; their names are otherwise as described in the preceding list.

The g_string_insert functions are the same, except that they take an additional index argument (for the index of where to insert in the string; the first index is 0):

g_string_insert(GString * gstring , gssize pos , const gchar * str )
g_string_insert_c(GString * gstring , gssize pos , gchar c )
g_string_insert_unichar(GString * gstring , gssize pos , gunichar c )

The following code demonstrates five of these functions on a string called s1 .

 s1 = g_string_assign(s1, "ar"); s1 = g_string_append(s1, "gh"); s1 = g_string_prepend(s1, "aa"); s1 = g_string_prepend_c(s1, 'A'); s1 = g_string_insert(s1, 4, "rr"); g_print("%s\n", s1->str);           /* prints "Aaaarrrgh" */

To insert binary data into a GString string, use these functions:

g_string_append_len(GString * gstring , gssize pos , const gchar * str , gssize length )
g_string_prepend_len(GString * gstring , gssize pos , const gchar * str , gssize length )
g_string_insert_len(GString * gstring , gssize pos , const gchar * str , gssize length )

Removing Characters

You can pull characters out of a GString string at an arbitrary location with

 g_string_erase(  string  ,  index  ,  num_to_remove  )

To chop a string down to a certain length, use

 g_string_truncate(  desired_length  )

where desired_length is the final length of the string. If you try to truncate a string to a size that is actually larger than the string, nothing happens. If, however, you also want the string's allocated size to grow to that length, this function can truncate and expand:

 g_string_set_size(  desired_length  )

Note	The new data at the end of such a newly expanded string is undefined . This result typically doesn't make a difference in practice, because GLib terminates the original string with a NULL byte.

Here are some examples of these functions in action:

 s1 = g_string_assign(s1, "Anyway"); s1 = g_string_erase(s1, 4, 1); /* s1 should now be "Anywy" */ s1 = g_string_truncate(s1, 3); g_print("%s\n", s1->str);          /* prints "Any" */

Miscellaneous String Functions

The following are miscellaneous string functions:

g_string_equal( string1, string2 )

Compares string1 and string2 and returns TRUE if they match. Note that this function is not like strcmp() .
g_string_hash( string )

Returns a 31-bit hash key for string . See Section 1.5.5 for more information on hash keys.
g_string_printf( string , format , ...)

Similar to sprintf() , except that it stores the output in GString string . The return value is a GString string and like the other string manipulation functions.
g_string_append_printf( string , format , ...)

Same as the preceding function, but appends the result to string instead of replacing the previous value.

Deallocating Strings

Deallocate GString string with

 g_string_free(  string  ,  free_orig  )

Here, free_orig is a gboolean value that indicates whether the string should be completely deallocated. If you do want to return all of the data to the free memory pool, g_string_free() returns NULL . However, if you want to keep the actual string data in memory, use FALSE as free_orig ; the function returns the str field of the structure that you just destroyed . Just remember that you're now responsible for deallocating that data as well with g_free() . Here are two examples:

 gchar *orig_str; orig_str = g_string_free(s1, TRUE); /* s1 and all of its fields are now gone;    orig_str is NULL */ orig_str = g_string_free(s2, TRUE); /* s1 is gone;    orig_str points to its old str field */

1.5.2 Lists

One of the simplest but most important data structures is the linked list. Implementing linked lists along with their elementary operations is more or less a finger exercise for experienced programmers.

That doesn't mean that their implementations are bug free, though, and who wants to write yet another linked-list library? GLib provides a GList data type and functions for doubly linked lists. (It also provides a GSList data type for singly linked lists, but this book doesn't cover those.)

Creating Lists

Creating a list is easy:

 #include <glib.h> GList *list = NULL;

In all of the examples that follow, list will be the general pointer to the list or the list's handle. You must initialize all new, empty lists to NULL .

Note	There's no special function to create a list; the NULL GList pointer is all you need.

A list consists of linked elements , sometimes called nodes . Each element is a GList structure that contains an untyped pointer ( gpointer ) to a block of data. Make sure you always know which pointer goes to an element and differentiate it from the pointer in the element that points to the actual data.

You can use any type that you like; the compiler won't care as long as you use the proper type cast. Most lists contain data of only one type; mixing types in lists requires an additional layer of bookkeeping and is somewhat inefficient and problematic .

Note	GLib manages the whole list with the pointer to the first node. However, a GList pointer also serves as a list iterator. When you program with GList structures, make sure that you keep careful track of your pointers.

Adding List Elements

To append a node to the end of a list, use g_list_append() :

 gint *data_ptr; data_ptr = g_new(gint, 1); *data_ptr = 42; list = g_list_append(list, (gpointer)data_ptr);

This fragment declares a new pointer data_ptr and sets it to a newly allocated data block. It then sets the memory block to the integer 42. Then g_list_append() takes the list handle as the first argument and the data pointer as the second; note that you must cast the data pointer to gpointer .

Note	All of the list examples in this section use integer data types.

As you might suspect from the name , g_list_prepend() operates just like the append function, except that it places the new element at the beginning of the list instead of the end.

Note

Keep in mind that g_list_prepend() doesn't need to run through the entire list from the list handle to find the end, and therefore is more efficient than its appending counterpart . If you need to add a lot nodes to a list, it is often faster to prepend them and then perhaps use g_list_reverse() if they are not in the desired order.

You can also insert elements at any arbitrary place in the list with

 g_list_insert(  list  ,  data  ,  index  )

Here, list and data are as usual, but the third parameter is an index. Note that the new element goes into the place just after index , not before. Here's an example:

 GList *tmp; /* Insert 2003 after the third element */ data_ptr = g_new(gint, 1); *data_ptr = 2001; list = g_list_insert(list, data_ptr, 3); /* Find the list element that was just inserted... */ tmp = g_list_find(list, data_ptr); /* ...and insert 2000 before that element */ data_ptr = g_new(gint, 1); *data_ptr = 2000; list = g_list_insert_before(list, tmp, data_ptr);

If you'd rather have a new element put in place before a certain element, try

 g_list_insert_before(  list  ,  node  ,  data  )

Notice that the parameters are different; here, the second parameter node is an element in list , not an index. The third parameter is the new data block; it will occupy a new node preceding node .

Navigating a List

The previous example used

 g_list_find(  list  ,  data  )

to find the node for data in list . This function searches the list and returns a pointer to a GList node that contains the same data block address if one happens to exist in the list (upon failure, it returns NULL ). This process is not particularly efficient, because a list may require complete traversal before the function finds (or fails to find) a certain node.

There are several other functions for moving around a list. It's perhaps best to illustrate how they work with an example:

 GList *list, *ptr; gint *data_ptr; gint pos, length;   << create a list in "list" variable >> /* point to element at position 3 */ ptr = g_list_nth(list, 3); /* point to the element _before_ position 3 */ ptr = g_list_nth_prev(list, 3); /* advance to the next element in the list */ ptr = g_list_next(ptr); /* record current position of ptr in list */ pos = g_list_position(list, ptr); /* move back one element */ ptr = g_list_prev(ptr); /* access the data in position 4 */ data_ptr = g_list_nth_data(list, 4); /* record position of data_ptr */ pos = g_list_index(list, data_ptr); /* change data in first and last elements to 42 */ ptr = g_list_first(list); *(gint *)(ptr->data) = 42; ptr = g_list_last(list); *(gint *)(ptr->data) = 42; /* also change the next-to-last element to 42 */ *(gint *)(ptr->prev->data) = 42; /* record the length of the list */ length = g_list_length(list);

The random-access functions in the preceding program are as follows :

g_list_nth( list , n ) : Returns the node at position n .
g_list_nth_prev( list , n ) : Returns the node just before position n .
g_list_nth_data( list , n ) : Returns a pointer to the data block of the node at position n .
g_list_first( list ) : Returns the first node in a list.
g_list_last( list ) : Returns the last node in a list.

Note	Keep in mind that a list's first position is 0.

If you have a pointer to node in a list, you can use it as the parameter for the following functions:

g_list_next( node ) : Returns the next node.
g_list_prev( node ) : Returns the previous node.

These operations pertain to a node's position (or index):

g_list_position( list , node ) : Returns the position of a node in a list.
g_list_index( list , data ) : Returns the position of a node in list that contains data ” basically, the reverse of g_list_nth_data() .
g_list_length( list ) : Returns list length.

If you know what you're doing, you can move list nodes around by following the pointers that make up a node, as the last few parts of the example code show. In addition to changing the memory behind data , you can follow the next and prev pointers to access adjacent nodes. Notice that the preceding example uses ptr->prev->data . You can use this approach only if you are absolutely sure that ptr->prev isn't NULL .

Removing Elements

Deleting elements from a list is a little different than you might expect. Remember that you are responsible for the management of each node's data, and just as you created space for each data block, you must also free this space when you're done with it. However, you don't have to worry about the actual nodes.

The functions for removing elements from a list are

 g_list_remove(  data  ) g_list_remove_all(  data  )

Notice that they do not take an index or a pointer to a node as the node to remove; instead, they want a pointer to the target node's data block . The idea is that because you probably need to do something with the data block anyway, you should always be able to find it after you delete a node that pointed to it.

This short example shows the functions in action:

 /* create a 42 */ data_ptr = g_new(gint, 1);  *data_ptr = 42; /* place three identical 42s at the beginning of a list */ list = g_list_prepend(list, (gpointer)data_ptr); list = g_list_prepend(list, (gpointer)data_ptr); list = g_list_prepend(list, (gpointer)data_ptr); /* remove the first 42 */ list = g_list_remove(list, (gconstpointer)data_ptr); /* remove the rest of them */ list = g_list_remove_all(list, (gconstpointer)data_ptr); /* free the 42 */ g_free(data_ptr);

If the list contains more than one instance of the data pointer, g_list_remove() deletes only the first node that contains the pointer; g_list_remove_all() removes all of them.

Iterating Through Lists

You can run a function on every element of a list at once, similar to mapping in a language like Lisp. This is called iteration ; the GList utility is

 g_list_foreach(  list  ,  func  ,  user_data  )

It's helpful to see an example first:

 void print_number(gpointer data_ptr, gpointer ignored) {   g_print("%d ", *(gint *)data_ptr); } g_list_foreach(list, print_number, NULL);

As usual, list is a list. func is a function with a prototype matching GFunc , and user_data is an untyped pointer. The GFunc definition is

 typedef void (*GFunc) (gpointer data, gpointer user_data);

When you put everything in this example together, g_list_foreach() steps through each element e of list , running print_number( e ->data, NULL) . Notice that the GType function takes the data of the element as the data argument, not the element itself. The second argument corresponds to the user_data parameter of g_list_foreach() . In this example, it is NULL and completely ignored by print_number() .

This example does involve user_data :

 void plus(gpointer data_ptr, gpointer addend_ptr) {   *(gint *)data_ptr += *(gint *)addend_ptr; } gint *num_ptr; /* Add 42 to each element */ num = 42; num_ptr = (gpointer)&num; g_list_foreach(list, plus, num_ptr); /* Subtract 65 from each element */ num = -65; num_ptr = (gpointer)&num; g_list_foreach(list, plus, num_ptr);

The only tricky part of this example is that plus accesses the actual addend data in a roundabout way; it takes a pointer in the form of addend_ptr and then dereferences it to get the value of the addend data. The example here uses this approach mostly to avoid loss of sign due to type casting.

Warning

When iterating over a list, don't add or delete any nodes from the list ” that is, unless you enjoy dancing with segmentation faults.

Of course, you may find this style of iteration excessive. It's fine to iterate like this instead:

 GList *l; gint *data_ptr; for(l = list; l; l = l->next) {   data_ptr = l->data;   ... }

Sorting Lists

If you have experience with the standard C library function qsort () , you'll have no problems with

 g_list_sort(  list  ,  comp_function  )

The return value is list sorted by comp_function . Here's a small example:

 gint gint_compare(gconstpointer ptr_a, gconstpointer ptr_b) {   gint a, b;   a = *(gint *)ptr_a;   b = *(gint *)ptr_b;   if (a > b) { return (1); }   if (a == b) { return (0); }   /* default: a < b */                 return (-1); } list = g_list_sort(list, gint_compare);

Here's the type definition for the comparison function:

 typedef gint (*GCompareFunc) (gconstpointer a, gconstpointer b);

To be specific, it takes two pieces of data as parameters (we'll call them a and b ) and returns one of the following:

A value less than 0 if a is less than b
0 if a is equal to b
A value greater than 0 if a is greater than b

As was the case with iteration, this function receives the elements' data block pointers as its parameters.

A second list sorting variant allows you to pass additional data to a comparison function. To use it, call

 g_list_sort_with_data(  list  ,  compare_func  ,  user_data  )

In this case, the comparison function has the GCompareDataFunc type and takes an additional data pointer argument.

Miscellaneous List Operations

Three functions take care of a few odds and ends with respect to lists:

 GList list2 = NULL; /* copy list into list2 */ list2 = g_list_copy(list); /* append list2 to the end of list1 */ list = g_list_concat(list, list2); list2 = NULL; /* reverse list */ list = g_list_reverse(list);

g_list_copy( list )

Creates a new copy of list and returns the copy.

Warning
This function creates copies of the nodes but does not copy the data blocks. Keep track of that memory.
g_list_concat( list , list2 )

Appends list2 to list and returns the result. This function does not make copies of any nodes; it uses the existing nodes. Therefore, be careful what you pass, and set the second list to NULL after running this function.
g_list_reverse( list )

Reverses the order of the nodes list .

Deallocating Lists

To return all of a list's nodes to the free memory pool, use g_free_list( list ) . Use this deallocation function only on the list's first element.

Warning

Keep in mind that GLib has no idea how you created the data blocks of your list elements; you're responsible for them and any memory holes that might have to do with them. If you're sure that each data block appears only once in the list, you can come up with a solution using g_list_foreach() . Just make sure you know what you're doing.

1.5.3 Arrays

GLib arrays ( GArray ) are like their C counterparts, except that they don't have a fixed size. They are much faster than GList structures for random access, but the potential cost of inserting data at various points within the array is higher, because they are just a set of contiguous blocks of memory in the actual implementation.

You can create an array with no particular starting size, or if you have an idea of how much space you need, you can preallocate it:

 #include <glib.h> GArray *array, *array2; /* array of integers, unspecified size */ array = g_array_new(TRUE,               /* use null terminator */                     FALSE,              /* don't blank memory */                     sizeof(gint));      /* element size */ /* array of unsigned chars, size 50 */ array2 = g_array_sized_new(FALSE,        /* no null terminator */                            TRUE,         /* zero memory */                            sizeof(guchar), /* element size */                            50);          /* create space for 50 */

Create an array with one of these functions:

 g_array_new(  null_terminated  ,  clear  ,  element_size  ) g_array_sized_new(  null_terminated  ,  clear  ,  element_size  ,  reserved_size  )

Here, null_terminated indicates the use of a NULL terminator, clear tells the function to zero out the memory before returning the array, and element_size is the byte length of each element in the array; reserved_size in g_array_sized_new() is the array's initial size. Upon success, these functions return the newly allocated array.

Note	It doesn't make too much sense to set the first two parameters to TRUE , because if you want to terminate your array with NULL , you don't want to have any NULL bytes that aren't actually the end of the array.

To access an element in an array, use the macro

 g_array_index(  a  ,  type  ,  i  )

where a is the array, type is the array's element type, and i is the index. Because this is a macro, you can write code like so (for example, to set the element at index 1 to 37):

 g_array_index(array, gint, 1) = 37;

Warning

This looks quite wrong in many ways, and you do need to be careful. In particular, you must be absolutely sure that the index exists in your array. Look at the len field of an array to check its length (for example, array->len in the preceding example). Also, although g_array_sized_new() preallocates space, its initial length is still zero.

To create elements in a GArray , you need to add them with a function or use g_array_set_size() .

Adding Elements

To add things to your GArray , you need to fill a regular C array with your data. You can add elements to an array in three places:

At the end: Use g_array_append_vals() .
At the beginning: Use g_array_prepend_vals() .
In the middle: Use g_array_insert_vals() .

This code illustrates how these functions work:

 gint c_array[3]; c_array[0] = 42; c_array[1] = 23; c_array[2] = 69; /* add the elements in c_array to the end of the GArray array */ array = g_array_append_vals(array, (gconstpointer)c_array, 3); /* insert 220 and DEADBEEF at index 1 */ c_array[0] = 220; c_array[2] = 0xdeadbeef; array = g_array_insert_vals(array, 1, (gconstpointer)c_array, 2);

There is a way to add a single item to an array, but you must have a variable that contains the data handy. Unfortunately, the names are confusing ” they are like the three macros in the preceding code, but end in val instead of vals . Because these are macros (and hence not terribly smart), you can't use a constant like 42 . In any case, here is a demonstration of g_array_prepend_val() :

 gint tmp; tmp = 380; /* insert 380 at the beginning of the array */ array = g_array_prepend_val(array, tmp);

Note	Of the functions here, only g_array_append_vals() has reasonable performance. The others must shift memory around; therefore, the characteristics of GArray in this respect are the opposite of GList .

If you want to create multiple elements in an array at once, use

 g_array_set_size(  array, size  )

You can then set the individual elements with g_array_index() .

Deleting Elements

You can remove elements in two ways. You can use g_array_remove_index() , which does what you would expect: It pulls an element at a given index out of the array:

 /* delete element at index 2 */ g_array_remove_index(array, 2);

This approach isn't terribly quick, so there is an alternative that replaces the deleted element with the last element in the array. However, if you care about the order of your array, this isn't for you:

 /* replace index 1 with last element and shorten array */ g_array_remove_index_fast(array, 1);

Sorting Arrays

If you perhaps called one too many g_array_remove_index_fast() functions, you can use g_array_sort() and g_array_sort_with_data() . These functions work just like their g_list_sort* counterparts; see Section 1.5.2.

Deallocating Arrays

As was the case with GString (see Section 1.5.1),

 g_array_free(  array  ,  preserve  )

the preserve Boolean value indicates whether you want to preserve the actual data in array or not. It returns a pointer to the data (type: gchar * ) if you use TRUE as the second parameter; you are responsible for deallocating this later with g_free() .

1.5.4 Trees

Another classic data structure is the tree. There are more types of trees than you probably want to know about (splay trees, threaded trees, red-black trees, and so forth), and if you really want to know about them, have a look at [Knuth] and [Cormen]. However, if you just want to use one of them, GLib's GTree type is a complete implementation of balanced binary trees.

Creating Trees

One of the most noteworthy things about GTree is that it doesn't just contain simple elements like GList and GArray . A leaf of a (search) tree not only contains some data, but also a key corresponding to that data. That key is available to help GLib sort through the tree and find the data. For example, you could use the customer number as a key in a customer database, a telephone number as the key for telephone information, the name of a participant spelled phonetically ” well, you get the idea.

You must define a comparison relation for your keys (greater or less than) so that the tree can be balanced. If you define it as GCompareFunc (see Section 1.5.2), you can call

 g_tree_new(  compare_func  )

to create your tree. However, if you opt for GCompareDataFunc instead, use

 g_tree_new_with_data(  comp_func  ,  comp_data  )

One step further is

 g_tree_new_full(  comp_func  ,  comp_data  ,  key_destroy_func  ,  value_destroy_func  )

which can also take care of your data's memory management with a pair of GDestroyNotify function definitions. You have to create these functions yourself; GLib calls value_destroy_func () when it needs to deallocate the data in a node, and key_destroy_func when it needs to free a node's key. It's a simple function prototype ” just a void function that takes a single untyped pointer as a parameter:

 typedef void (*GDestroyNotify) (gpointer data);

Note

In all of the functions described in this section, when you see something like "this or that will be freed," it means that GLib will do it, and only on the condition that you created the tree with g_tree_new_full() . Otherwise, you need to worry about it yourself. You probably don't want to think about that, though, because trees can get complicated.

As usual, GLib manipulates only the keys and data with untyped pointers. Furthermore, you do not reassign your GTree variables after every function call, as with the other types.

Enough talk; let's look at some actual code.

 #include <glib.h> GMemChunk *key_chunk; GTree *tree; /* compare gints; ignore extra data parameter */ gint key_cmp(gconstpointer a_ptr, gconstpointer b_ptr, gpointer ignored) {   gint a, b;   a = *(gint *)a_ptr;   b = *(gint *)b_ptr;   if (a < b)    { return (1); }   if (a == b)      { return (0); }   /* if a > b */  return (-1); } void free_key(gpointer key) {   g_mem_chunk_free(key_chunk, key); } void free_value(gpointer value) {   g_string_free((GString *)value, TRUE); } /* prepare memory for keys and values */ key_chunk = g_mem_chunk_create(gint, 1024, G_ALLOC_AND_FREE); /* create tree */ tree = g_tree_new_full(key_cmp,                        NULL,      /* data pointer, optional */                        free_key,                        free_value);

This program draws storage for the tree's keys from memory chunks and uses GString strings for its values. The three functions are the comparison, key deallocator, and value deallocator. You can see that once you have these three utility functions, you need only run a single function ” g_tree_new_full() ” to create the tree.

Adding and Replacing Nodes

Insert a new node into a tree with key key and value value with

 g_tree_insert(  tree  ,  key  ,  value  )

If key already exists in the tree, this function replaces the old value, and if the tree was created with deallocation functions, returns that value to the memory pool. It does not free the old key; instead, it frees the key that you just passed.

Therefore, you need to be careful if you want to use the key that you passed to g_tree_insert() after the function call. You may want to use this instead:

 g_tree_replace(  tree  ,  key  ,  value  )

It works just the same as insertion, but when it finds a node with a matching key, it deallocates the old value and the old key, replacing them with the new ones. Of course, if you use this version, you must make sure that the old key hasn't wandered somewhere else in your program.

Because both functions have pitfalls, the easiest way to avoid a core dump when using these functions is to reset any pointers to the key and value after placing them in a tree.

Here is an example:

 gint *key_ptr; GString *value; /* insert 42 into the tree */ key_ptr = g_chunk_new(gint, key_chunk); *key_ptr = 42; value = g_string_new("forty-two"); g_tree_insert(tree, key_ptr, value);

To create the node, you need to get a new memory chunk for the key and then a new GString for the value. Notice how this works in tandem with free_key() and free_value() , discussed earlier.

Finding Nodes

To find a node in tree matching key , use

 g_tree_lookup(  tree  ,  key  )

The return value is a pointer to the matching node's value if successful, or NULL if key isn't in the tree.

There's a slightly more complicated version:

 g_tree_lookup_extended(  tree  ,  key  ,  key_ptr_addr  ,  value_ptr_addr  )

Upon a match, this function sets the pointer behind key_ptr_addr to the key of the matching node, and likewise with value_ptr_addr and the matching value. The return value is TRUE if there's a match, and FALSE otherwise. Use this function only if you need to access the key in the tree for some reason (for example, if you didn't define a function to deallocate keys and need to do it by hand).

Warning

With g_tree_lookup_extended() , you can change keys that are in trees. Don't do this; GLib's tree navigation system won't be able to cope with the change.

Here are the functions in action:

 gint *key_ptr, *key_ptr2; /* look up 37 in the tree */ key_ptr = g_chunk_new(gint, key_chunk); *key_ptr = 37; value = (GString *) g_tree_lookup(tree, key_ptr); if (!value) {   g_print("%d not found in tree.\n", *key_ptr); } else {   g_print("%d found; value: %s.\n", *key_ptr, value->str); } /* See if 42 is in there */ *key_ptr = 42; if (!g_tree_lookup_extended(tree, key_ptr,                             (gpointer)&key_ptr2,                             (gpointer)&value)) {     g_print("%d not found in tree.\n", *key_ptr); } else {     g_print("%d found; value: %s.\n", *key_ptr, value->str); } g_mem_chunk_free(key_chunk, key_ptr);

Warning

If you choose not to provide key and value memory management functions when you create the tree, you need to know exactly what the keys and values look like in memory, and it's particularly important to keep track of your keys. For example, keys with pointers to any other data invite memory leaks.

Deleting Nodes

To completely remove a node from a tree, including its key and value, use

 g_tree_remove(  tree  ,  key  )

However, if you want to preserve the key and value, or you want to remove a node from a tree only temporarily, use

 g_tree_steal(  tree  ,  key  )

However, make sure that you have pointers to the original key and value before you run this, or you'll lose track of them. One way to do this is with g_tree_lookup_extended() ; the following code builds on the function call that you saw earlier:

 /* pull a node from the tree */ g_tree_steal(tree, key_ptr2); /* key_ptr2 and value contain the key and value (see above)--    now we'll throw them right back into the tree */ g_tree_insert(tree, key_ptr2, value); /* this time get rid of the node for good */ g_tree_remove(tree, key_ptr2);

Traversing a Tree

As with lists and arrays, you can iterate over a GTree tree. This is called traversing the tree, and you typically want to do it in the sort order of the nodes' keys. Use

 g_tree_foreach(  tree  ,  func  ,  data  )

to call func on every node in tree . Note that func has the GTraverseFunc definition:

 typedef gboolean (*GTraverseFunc)            (gpointer key, gpointer value, gpointer data);

The traversal goes in the order of the keys, with the smallest element first. The GTraverseFunc can halt the traversal at any time by returning TRUE ; otherwise, it returns FALSE to keep things moving (you could use this feature when looking for something). Here's an example that prints every node in the tree:

 /* print a node in a traversal */ gboolean print_node(gpointer key, gpointer value, gpointer ignored) {   g_print("[%d %s] ", *(gint *)key, ((GString *)value)->str);   return FALSE; } g_tree_foreach(tree, print_node, NULL);

This example uses the third parameter of the GTraverseFunc :

 /* add the keys; ignore the value */ gboolean sum_keys(gpointer key, gpointer value_ignored, gpointer sum) {     *(gint *)sum += *(gint*)key;     return FALSE; } gint sum = 0; g_tree_foreach(tree, sum_keys, &sum);

Tree Statistics

The following functions report on the size of a tree:

gint g_tree_nnodes( tree )

Returns the total number of nodes in tree .
gint g_tree_height()

Returns the height of tree .

Removing a Tree

To return a tree and its nodes to the free memory pool, use

 g_tree_destroy(  tree  )

Notice that this function doesn't end in _free like many of the other functions. If you provided deallocation functions for keys and values, this function also completely frees the keys and values. Otherwise, you're responsible for that memory.

1.5.5 Hash Tables

The last GLib data structure that this book covers is another perennial favorite: the hash table. These tables assign keys to values, and using an efficient internal representation, allow you to quickly access values using the key. The GLib data type for a hash table is GHashTable .

As with trees, you can choose any data type that you like for the keys and values of hash tables. An entry in a hash table consists of two untyped pointers: one for the key and the other for the type. GNOME software makes broad use of GHashTable because it can associate data between any two types. ^[3]

1.5.6 Creating Hash Tables

Use

 g_create_hash_table_new(  hash_func  ,  equal_func  )

to create a new hash table, returning the result as GHashTable. The two parameters are functions. The first is the hash function , with a type of GHashFunc. Following this is an equality test function (type GEqualFunc) that determines whether two keys are equal. Although you can probably use the built-in default functions, it never hurts to know the types of the following parameters:

 typedef guint (*GHashFunc)(gconstpointer key); typedef gboolean (*GEqualFunc)(gconstpointer a, gconstpointer b);

The equality function is simple; it takes two keys as parameters and, if they are equal, returns TRUE , or FALSE if they aren't equal.

Hash functions are a little trickier. They take a key as input and ( efficiently ) return a hash value , a guint integer that characterizes the key in some way. This isn't a unique mapping like a quark; you can't get a key back from a hash value. The important part about hash values is that they return values that are well distributed throughout the guint domain, even for similar keys.

If you want to know about the theory of hash values and algorithms, have a look at the algorithms books in the bibliography. For the most part, you will probably find that one of the following default hash functions fits your needs:

g_str_hash( string ) processes gchar * string into a hash value. If your keys are strings, use this function.
g_int_hash( int_ptr ) treats int_ptr as a pointer to a gint value and generates a hash value from the gint value. Use this function if your keys are of type gint * .
g_direct_hash( ptr ) uses ptr as the hash value. This function works when your keys are arbitrary pointers.

If you use one of these hash functions, there are corresponding key equality functions at your disposal: g_str_equal() , g_int_equal() , and g_direct_equal() .

Here is an example:

 GHashTable *hash1; hash1 = g_hash_table_new(g_direct_hash, g_direct_equal);

You are responsible for the memory management with hash tables created with g_hash_table_new() . However, just as in the case of trees in Section 1.5.4, the

 g_hash_table_new_full(  hash_func  ,  equal_func  ,  key_destroy  ,  value_destroy  )

function can deallocate your hash table entries automatically if you provide it with GDestroyNotify functions:

 GHashTable *hash2; hash2 = g_hash_table_new_full(g_str_hash, g_str_equal, g_free, g_free);

In this example, the keys and values of the hash table could be dynamically allocated C strings, because g_free() returns these to the free memory pool.

Note	You can combine a string and integer hash values with XOR into a hash function like this:

 struct vpair {   gchar *str;   int value; }; GHashFunc (struct vpair *p) {   return(g_str_hash(p->str)^p->value); }

Inserting and Replacing Values

Most GLib programmers add new values to a hash table with this function:

 g_hash_table_replace(  hash_table  ,  key  ,  value  )

Here is an example:

 SomeType *key; AnotherType *value; key = g_new(SomeType, 1); value = g_new(AnotherType, 1);   << ... >> g_hash_table_replace(hash1,                      key,                /* key */                      value);             /* value */ g_hash_table_replace(hash2,                      g_strdup("foo"),    /* key */                      g_strdup("bar"));   /* value */

As with many other GLib functions, the key and value are untyped pointers. If the key is already in the hash table, this function replaces the value corresponding to that key. However, if you created the table with deallocation functions to manage the key and value memory, g_hash_table_replace() frees the old key's memory because it isn't needed.

There is a seldom-used alternative to g_hash_table_replace() :

 g_hash_table_insert(  hash_table  ,  key  ,  value  )

This function works just like g_hash_table_replace() , except that it deallocates the new key if an old key in the hash table matches the new key. Therefore, you must be careful if you still have a pointer to key somewhere.

If you want to be completely safe, NULL out any pointers to the key (and value) that you add with either of these functions.

Note	The difference between these two functions with respect to the actual content of your keys is an issue only when the key equality function doesn't take all of the data in the key into account. If your keys use a simple data type, such as an integer or string, there is no difference.

To find out how many entries are in a hash table (that is, the number of key-value pairs), use

 g_hash_table_size(  table  )

Finding Values

The easiest way to find something in a hash table is to use

 g_hash_table_lookup(  table  ,  key  )

The return value is a pointer to the value, or NULL if the key isn't in any of the hash table's entries.

 gchar *key, *value; value = (gchar*)g_hash_table_lookup(hash2, "foo"); if (value) {   g_print("hash2{\"foo\"} = %s\n", value); } else {   g_print("foo isn't in the hash table\n"); }

If you need access to the keys and values in the hash table, try

 g_hash_table_lookup_extended(  table  ,  key  ,  key_addr  ,  ptr_addr  )

Here's an example:

 if (g_hash_table_lookup_extended(hash2,                                  "foo",                                  (gpointer)&key,                                  (gpointer)&value)) {   g_print("hash2{\"%s\"} = %s\n", key, value); } else {   g_print("foo isn't in the hash table\n"); }

This function takes two pointer addresses as the third and fourth parameters: one for a key and the other for a value. If the given key (the second parameter) is in the hash table, this function sets those pointers to the key and value inside the hash table. It returns TRUE upon success.

This function is useful when you need to deallocate your key and value manually.

Warning

g_hash_table_lookup_extended() gives you direct access to the keys in the table entries, meaning that you also have the ability to change them around. That's a really bad idea ” you risk inaccessible entries and key duplication.

Deleting Entries

To delete an entry in a hash table, use

 g_hash_table_remove(  table  ,  key  )

where table is the hash table and key is the entry's key. When successful, this function returns TRUE , and if you created the hash table with automatic key and value deallocation functions, it also frees the key and value memory. Note that this procedure does not try to deallocate the key that you gave as a parameter, so you can write statements with constants, such as this:

 g_hash_table_remove(hash2, "this");

You can also prevent the GLib hash table from trying to deallocate the key and value with

 g_hash_table_steal(  table  ,  key  )

Iterating Through Hash Tables

You can call a function on every single entry in a hash table, just as you can for lists, arrays, and trees. There are three ways to iterate:

void g_hash_table_foreach(GHashTable * table , GHFunc func , gpointer user_data )

Runs func on every entry in table . This function passes user_data as the third parameter to func .
void g_hash_table_foreach_remove(GHashTable * table , GHRFunc func , gpointer user_data )

Same as the preceding function, but removes the entry if func returns TRUE . This approach includes running any deallocation functions on the key and value that you might have specified when you created the hash table. This function is useful if you need to filter a hash table.
void g_hash_table_foreach_steal(GHashTable * table , GHRFunc func , gpointer user_data )

Same as the preceding function, but when removing entries, this function doesn't ever try to deallocate the key and data. This function is useful for moving entries to other hash tables or data structures.

Here is the type definition for GHFunc ( GHRFunc is the same, except that it returns gboolean ):

 typedef void (*GHFunc)(gpointer key, gpointer value, gpointer user_data);

The following is a short example. If you need to see something that involves the user_data parameter, check out the example for trees in Section 1.5.4.

 void print_entry(gpointer key, gpointer data, gpointer user_data) {                                      /* user_data not used */     g_print("key: %-10s     value: %-10s\n",             (gchar *)key, (gchar *)data); } g_print("Hash table entries:\n"); g_hash_table_foreach(hash2, print_entry, NULL);

Deleting Hash Tables

To completely remove a GHashTable , call

 g_hash_table_destroy(  hash_table  )

This approach includes the key and value data if you supplied deallocation functions when you created the hash table. Otherwise, you'll have to take care of the key and value in each entry yourself; g_hash_table_foreach() is a handy means of doing this.

^[3] This sets GLib hash tables apart from the hash tables (or dictionaries) of many interpreted languages; those typically allow only strings as keys.