 < Day Day Up > 

The variance in subjective interpretations of a presentation can be reduced through good design. This includes ensuring that presentation styles reflect the datamining task, and that the mapping between mining and graphical primitives is intuitive, takes into consideration the user's objectives and facilitates interaction techniques. Clearly there is no single best presentation technique for a datamining task as there are too many factors that depend upon both the user and the problem domain. The solution is therefore to create a flexible set of presentation formats for each mining task that can satisfactorily be applied to a wide range of problems.
Clustering refers to a group of automated techniques that objectively group items into classes based upon selected item attributes, maximising intraclass similarity and interclass dissimilarity. These techniques are used to discover attribute correlations and overall distribution patterns within sets of data, helping the user understand the natural grouping structure of the data (see Fasulo, 1999; Hartigan, 1975; Jain & Dubes, 1988; Rasmussen, 1992 for detailed discussions on clustering algorithms).
Clustering algorithms are based upon calculating the similarity measure of selected attributes between items. This requires the participating attributes to be in numeric form. Hence clustering visualisations are usually presented in coordinate space within which the item's coordinates are directly mapped to the environment's coordinates. The base criterion for a clustering visualisation is an understandable representation of both the participating items and the clusters to which they belong. However the visualisation of these criteria are subjective. The choice of mapping depends upon the type of clustering algorithm used, the objectives of the clustering task and the user's perception of how this information is best represented.
This section provides an overview of clustering visualisations that facilitate perception. The visualisations are divided into partitionbased and densitybased presentations. Gridbased techniques are incorporated within the densitybased section due to their common grounding. This is followed by discussions on volume rendering and projection, which deals with clustering visualisation issues. The section concludes with discussions on the inclusion of hierarchical, spatial and temporal semantics into clustering presentations.
Figure 2: Centroidbased planar clustering presentation
Partitionbased methods analytically subdivide items into a number of clusters such that items in a cluster are more similar to each other than they are to items in other clusters. Figure 2 illustrates atypical centroid presentation, consisting of planar presentation indicative of clustering over two attributes where each item is positioned according to the values of those attributes and shaded to indicate cluster membership, with the centroids indicated by square objects of the same shade.
Figure 3 presents two types of 3D partitioning visualisation. Figure 3a is a snapshot of the tool HBlob created by Sprenger, Brunella and Gross (2000). The figure clearly indicates the itempoints and represents their cluster membership as a translucent encasing. Figure 3b presents the same elements and although less informative than HBlob with respect to item membership, provides more information regarding cluster membership by correlating centroid size with membership.
Figure 3: Centroidbased 3D clustering presentations
A general problem with partitionbased clustering methods is that the shape of all discovered clusters are convex because a partition is equivalent to a Voronoi diagram and each cluster is contained within one of the Voronoi polygons (Sander, Ester, Kriegel & Xu, 1998). To overcome this limitation, densitybased algorithms were devised.
Figure 4: Densitybased presentation (Sander et al., 1998)
Densitybased algorithms regard clusters as dense regions of items which are separated by regions of low density. This group of algorithms relies upon pointdensity functions and a density threshold parameter to discover clusters of arbitrary shape. The first presentations of these algorithms were planar, as illustrated by Figure 4. The figure shows a DBSCAN (Ester, Kriegel, Sander & Xu, 1996) presentation that illustrates the discovery of arbitrarily shaped clusters, each represented by a different shade.
Figure 5 represents a more informative visualisation from the University of Halle's DENCLUE system (Hinneburg, Keim & Wawryniuk, 1999), which illustrates clustering over two attributes. The left image extends into 3D space indicating itempoint density within the planar coordinate space. The selected density threshold is represented as a slice parallel to the image base. The right image reflects the identified clusters as though looking down upon the threshold slice as it cuts through the 3D space.
Figure 5: Partitionbased density presentation (Hinneburg & Keim, 1999b)
Gridbased clustering is a densitybased method that optimises processing through the summarisation of point data. This is accomplished by mapping the data points to a grid of like dimensionality. Where the number of points within a particular cell exceed a density threshold, the cell is classed as dense and included as part of the cluster.
This type of clustering differs from other densitybased presentations, as the visualisation is of the cells not the datapoints. As indicated in Figure 6 the clustering can be performed at different levels of resolution by varying the grid cell size. This variance results in a tradeoff between clustering accuracy and processing time (Hinneburg & Keim, 1999a).
Figure 6: Gridbased clustering presentation with resolution variance (Sheikholeslami, Chatterjee & Zhang, 1998)
To this point we have assumed that clustering has involved only two or three item attributes, the results of which can be mapped directly onto coordinatespace presentations (single attribute clustering is presented using common techniques like piecharts and histograms, which are not discussed within this chapter). Presentation of clustering involving more than three item attributes is more difficult because projections of higher dimensional space are unfamiliar. There are two solutions to this problem: the reduction of attributes through nonlinear dimension reduction techniques prior to the clustering process, and the viewing of high dimensional results through projecting the data onto lower dimension subspaces.
Central to nonlinear dimension reduction techniques is the concept that, regardless of the dimensionality, a relative distance can be calculated between all item pairs (Li, Vel & Coomans, 1995). Calculation of this distance provides a basis for defining topologypreserving transformations that project each item from ndimensional space to two or threedimensional space whilst preserving the relative distances between each item. Several different techniques exist for performing these transformations including multidimensional scaling (Young, 1987) and springembedding systems (Bruss & Frick, 1996; Gross, Sprenger & Finger, 1997). Once the transformation has occurred the clustering is performed on two or three attributes which can then be directly mapped to viewing space.
Subspace projection techniques can be presented either statically or through animation. Static presentations appear as scatterplot matrices (Carr, Littlefield, Nicholson & Littlefield, 1987; Ward, 1994), each of which contains a different paired combination of the clustered attributes. Figure 7 illustrates this method with shading used to differentiate clusters. Animated subspace presentation is based upon the grandtour concept devised by Asimov (1985). This concept is an extension of datarotation for multidimensional datasets, whereby a tour of the high dimensional space is created by iterating through subspace projections via interpolation along a geodesic path, creating the illusion of smooth motion.
Figure 7: Scatterplot matrix projection
Brushing techniques (Ward, 1994) can be used within both the static and animated techniques to track items in separate subspace visualisations. For example, within scatterplot matrices the brushing of an item will result in it being highlighted in each matrix. Within animated projections such as GrandTour, brushing of an item will allow the user to track its movement from one subspace presentation to another. The way in which the point moves may elicit further information about the item and its relationships.
Tour paths can be selected in different ways including random, statistical and clusterseparation techniques. Random tours arbitrarily iterate through every set of different attributes. Statistical tours use explorative statistical techniques such as principal component analysis (Faloutsos & Lin, 1995) and projection pursuit indices (Asimov & Buja, 1995) to examine the statistical significance of variables and to decide which should be included in further analysis. Clusterseparation techniques (Yang, 2000) use clustercentroid positions to facilitate projection selection, each one based upon the mapping of the item to a threedimensional subspace determined by the centroids of four clusters whereby the distance between these centroids is maximised. A snapshot of this is presented in Figure 8.
Figure 8: Snapshot of clusterbased projection tour (Li et al., 1995)
It is often the case in large datasets that multiple items map to the same visualisation coordinate, which ultimately leads to misinterpretation of the visualisation as many items appears as one. The presentation of large numbers of items can result in a cluttered environment, making it difficult to comprehend. To alleviate these issues the results can be presented as an itemdensity map instead of a itempoint map using volume rendering techniques as illustrated in Figure 9. This technique promotes the perception of density at each location in the respective environment. Although not as accurate as itempoint representations due to the use of binning techniques to calculate the voxelised data to be rendered, it is a useful overview technique that can be used as the starting point for exploration of the results. A detail threshold can be defined within this type of presentation, at which point the representation could change to an itempoint map to give a more accurate, detailed representation.
Figure 9: Volume rendering (Yang, 2000)
Cluster algorithms that result in the identification of hierarchical clusters or clusters of differing strengths are based upon extensions to either partition or densitybased algorithms. The associated presentations reflect the underlying nature of the algorithm and incorporate hierarchical semantics or structure as illustrated in Figure 10.
Figure 10: Partitionbased hierarchical clustering
Partitionhierarchy presentations are based upon the result of a sequence of partitioning operations, whereby different levels of clusters are discovered by splitting or merging currently discovered clusters. Figure 11 presents two different representations of the same clustered dataset. Figure 11 a presents a planar circle bounded set of text itempoints positioned in accordance with selected feature values and represented as black points. The dissecting lines represent the hierarchical clustering of the dataset. Each area (numerically labelled) relates to a leafitem in an associated dendogram shown in Figure 11b, which clearly indicates the hierarchical nature of the clustering.
Figure 11: Hierarchical clustering planar presentation, comprised of a hierarchically partitioned space and associated dendogram (Fox, 2001)
Threedimensional presentations of partitionbased hierarchical clustering are typified in Figure 12, which shows a visualisation of the HBlob system that uses smooth translucent shapes (blobs) to indicate cluster boundaries (Sprenger et al., 2000). In sequence, these separate visualisations represent instances of an agglomerative Hblob session where the clusters are built up from individual datapoints. The different levels of clustering are represented separately to help perception, as the superimposition of all blobs upon a single image would result in a cluttered and occluded environment.
Figure 12: HBlob—Hierarchical clustering presentation (Sprenger et al., 2000)
Densityhierarchy clustering methods are based upon the selection of a sequence of density threshold levels. This is illustrated in Figure 13. By varying the densitythreshold level, different clusters are apparent. In this in stance the thresholds have been arbitrarily selected. However more intelligent techniques such as OPTICS's reachability plots have been developed (Ankherst, Breunig, Kriegel & Sander, 1999). Figure 14 illustrates an OPTICS visualisation of both the reachability plot and the resultant hierarchical clustering with arrows super imposed to illustrate the mapping between the two images.
Figure 13: DENCLUE—Density presentation of same data set with different thresholds indicating possible hierarchical extension (Hinneburg et al., 1999)
Figure 14: OPTICS—Reachability plot (Ankherst et al., 1999)
Conglomerative hierarchical visualisations exist within planar space (Figures 10, 11, 13, and 14). 3D space presentations commonly avoid conglomerative visualisations (Figure 12) and instead represent hierarchical clustering as a sequence of separate visualisations. This technique reduces occlusion, whereby the lower level clusters will be difficult to see because the higher level clusters will in effect hide them from view. For example, HBlob (Figure 12) does not use conglomerative presentation because although the membranes representing the clusters are translucent, lower level cluster membranes would remain unclear, especially in scenarios involving many levels.
In general the same techniques can be used to present spatial and nonspatial clustering results because of the common grounding of all clustering algorithms in distance metrics. The only difference is whether the coordinatespace mapping is direct or abstract, where a direct mapping refers to spatial semantics. However there are two techniques related to spatial clustering that are of interest from a presentation perspective: spatially extended objects and spatial obstacles.
Within many different application areas items occupy an area instead of a single point. The clustering of these spatially extended objects or polygons requires specialisation of the densitybased method whereby each object is considered to have abounding area instead of occupying a particular point. The results are then presented in a typical coordinatespace as illustrated by Figure 15.
Figure 15: Clustering of spatially extended objects
The incorporation of realworld physical constraints such as the presence of lakes and highways can effect clustering results. For example, Figure 16 shows how COD (Tung, Hou & Han, 2001) tackles this problem. The underlying algorithm incorporates knowledge of the spatial location of obstacles that constrain the clustering, such that a cluster cannot cross an obstacle.
Figure 16: COD—Clustering with obstructed distance
Clustering can incorporate temporal semantics through either incremental or instancebased techniques. Incremental techniques involve the realtime incorporation of new or modified data into the clustered result set. Instancebased techniques require the data set to be clustered in its entirety at specific instances in time. The presentation of temporal clustering may be either static as a sequence of static images, each of which represents the clustering at a particular time, or the temporal semantics can be incorporated through the use of animation. However, the actual presentation forms do not differ from those already presented.
Presentation difficulties may arise when trying to incorporate both hierarchical and temporal semantics in a clustering presentation, especially in 3D space. Hierarchical inclusion requires sequencing of images to avoid occlusion. If temporal semantics were also to be included as a sequence of images, a matrix of images would be required, each one representing a level of the hierarchy at a particular point in time. The most efficient way of presenting these semantics to gether would be in a single planar partitionbased hierarchical presentation, using interpolation along an intuitive path to incorporate time.
Association mining, or group affiliation, refers to a group of techniques that discover relationships between items based on frequency metrics (see Agrawal, Imielinski & Swami, 1993; Han & Kamber, 2001; Srikant, Vu, & Agrawal, 1997, for further discussion). Associated presentations incorporate representations of the items and their relationships. These presentations fall into two classes: matrixbased and graphbased methods.
Two types of matrix structure used in the presentation of association rules are 2D matrices and mosaic plots. 2D matrices map the antecedent and consequent to separate axes with the third axis indicating the relationship strength, as illustrated by Figure 17a. Figure 17b is a visualisation from SGI's MINESET tool, which follows the same matrix design. Matrixbased visualisations are useful when small numbers of item sets are to be presented. However they degenerate as the underlying result set increases in size and complexity as every new combination of valid antecedents or consequents is appended to each axis in the presentation, resulting in an order of n^{2} rate of matrix growth. This may lead to large presentations that are cumbersome, occluded and hence difficult to understand.
Wong, Whitney and Thomas (1999) have tried to minimise some of these matrix problems through the implementation of a ruletoitembased matrix which is based upon the premise that an item can only occur once in a rule. The technique improves upon previous matrix presentations in that the matrix growth is linear when new rules are appended and the matrix is less sparse. Occlusion is also improved by displaying the associated support and confidence data as wall plates. (See Figure 18).
Figure 17: Matrixbased association presentations
Figure 18: Rule vs. item association matrix (Wong et al., 1999)
Hofman, Siebes and Wilhelm (2000) created an alternative form of matrix presentation termed interactive mosaic plots. This presentation technique provides more information about the discovered association rules by representing the strengths of all associations between the included items. Figure 19 illustrates this technique, with the antecedents represented along the x axis, the consequents along the y axis, and confidence displayed as colouring of the relevant bar. The rules are interpreted by analysing the concatenation of the antecedents (where black is existence and white is nonexistence). Hence, from Figure 19, the rightmost bar indicates a strong association between he ineken & coke & chicken → sardines as all antecedent items are black. In comparison the fifth bar from the left (widest) shows a weak association between heineken & not coke & not chicken → sardines.
Figure 19: Mosaic plots (Hofman, Siebes & Wilhelm, 2000)
An important strength of this type of presentation is that all associations between a set of items can be analysed, with the differences in confidence between the different combinations being an indication of the interestingness of a rule. Figure 19 indicates an interesting rule as the confidence difference between the rule heineken & coke & chicken → sardines and all other combinations in the plot is relatively large. This presentation technique has been designed to be interactive so that users can arbitrarily specify sets of antecedents and consequents. However the system is designed for focused discovery, where the set of attributes under consideration is small (less than 10). These techniques would become cumbersome and difficult to interpret with large numbers of items.
Graphbased techniques present items as nodes and associations as the linking of nodes. Presentations vary in the placement of the nodes and the representation of metadata, including direction, confidence and support. This presentation type displays association rules in a more concise manner than that of matrixbased techniques. However as the number of items increase, graphbased visualisations become cluttered and hence also hard to interpret.
Rule Graph (Klemettinen, Mannila, Ronkainen & Verkano, 1994), illustrated in Figure 20, is a comprehensive directed graph presentation. In this presentation alphalabelled nodes represent instance items, with the arc thickness and label representing the association's confidence and support. Rule Graph uses rule templates to reduce the complexity of the presentation by allowing users to create display filters through template manipulation. This is indicated in the figure by items E through J, which have been removed from the display and appear to the right of the image. This allows the user to focus on rule subsets aiding in presentation comprehension.
Figure 20: Rule Graph (Klemettinen et al., 1994)
Directed associated visualisation (DAV) (Hao, Dayal, Hsu, Sprenger & Gross, 2000) is a 3D presentation technique that maps the items and relationships to positions and vertices on a sphere, using weighted edges to indicate confidence and arrows for direction. DAV, shown in Figure 21, distributes items equally on a spherical surface (Figure 21a). Based upon physics' principles of masses and spring, a support matrix is then created that relates the strength of the association between items in terms of spring tension. The spherical structure is then relaxed and a state of low local minimum energy is reached (Figure 21b), resulting in an item's relative position reflecting their associations. The direction and confidence of each vertex is then calculated (Figure 21c) and finally presented to the user.
Figure 21: DAV process (Hao, Dayal, Hsu, Sprenger & Gross, 2000)
Association mining over different levels of abstraction implies that items belong to a taxonomy and that interesting rules may be discovered by mining associations at not only the item level but also at higher levels in the taxonomy. Related presentations incorporate this hierarchical structure with associations that may be discovered amongst the differing levels of the taxonomy. The FIDO research group at Flinders University has created an effective visualisation technique incorporating hierarchical semantics known as CARV (concentric association rule visualisation). This technique is capable of displaying both singlelevel and hierarchical association mining results as illustrated in Figure 22.
Figure 22: Concentric association rule visualisation
Figure 22a displays singlelevel association mining results in a circular format with relationships indicated by arcs and nodes (a rules antecedents and consequents are indicated by shading the arc ends). Only those items that are significant within the results are displayed. If an association exists in both directions between two items, for example A → B and B → A, then the display bends the line so that they don't overlap. Figure 22b displays a hierarchical instance that follows the same principles; however, the different levels of the hierarchy are visible and the items are positioned below their ancestor to aid in the understanding of the hierarchical structure.
Spatial association mining is the discovery of relationships among sets of spatially oriented items, possibly influenced by nonspatial predicates. For example, the spatial association rule is (caravan park) and closeto (water body) → has (boat hire) consists of spatial antecedents and a nonspatial consequent. Spatial items use spatial predicates to represent topological relationships between spatial objects, as illustrated in the above example with the predicates is and closeto other predicates include intersects, contains, adjacentto, leftof and covers.
Koperski and Han (1995) have undertaken the extensive research into this field. Although the algorithmic development is advanced, in general presentation is intextual form. Visual representations need to incorporate spatial predicates as well as regular association presentation elements. At present there is no intuitive means of doing so. A current technique by Koperski and Han uses a regular association graph annotated with textual spatial predicates (Figure 23).
Figure 23: Geominer—Spatial association rule visualisation
The inclusion of temporal semantics within association mining algorithms involves the discovery of discrete events that frequently occur in the same arrangement along a time line. Therefore temporal mining is the study of the temporal element arrangement, whereas association mining is the study of item relationships. Like spatial mining, temporal mining requires the incorporation of a set of defined temporal predicates. For example, tea after➙ cinema consists of two events, tea and cinema, with the predicate  after➙ indicating their temporal arrangement.
Figure 24 shows a temporal presentation designed by Rainsford and Roddick (2000), that incorporates temporal by representing the temporal predicates (centre column) through which each rule (line) passes. In this way the temporal relationships between the antecedent (left) and consequent (right) can be seen, with shading representing each rule's confidence. Although restricted to simple rules (single antecedent and consequent), this visualisation technique efficiently presents rules with diverse temporal predicates.
Figure 24: Temporal association mining (Rainsford & Foddick, 2000)
A specialisation of temporal mining is sequential mining, in which the patterns being discovered are constrained to sequentially discrete events. This focus on a particular temporal aspect facilitates visualisation design as only a single temporal predicate is to be represented. Research by Wong, Cowley, Foote, Jurrus and Thomas (2000) focus on sequential mining and to this end they have developed a visualisation for the presentation of temporal patterns discovered from newspaper article topics over a period of time. This is illustrated in Figure 25, where the topics are listed on the yaxis and the time line along the xaxis. The patterns found at various times are displayed in a shade representing level of support. The four dashed circles highlight the presence of the same two patterns within the time period.
Figure 25: Sequential mining presentation (Wong et al., 2000)
The technique by Wong et al. (2000) provides a mixture of qualitative and quantitative information because it shows the patterns while providing information regarding the times at which they occurred. A more qualitative and concise presentation of the interesting information (the two reoccurring patterns) might have been achieved through the use of Rainsford's presentation method (Rainsford & Roddick, 2000). However this would be dependent upon the underlying algorithm.
 < Day Day Up > 
