3. Animation Toolkit

Access 97 is used as the primary database for storing the animations represented in the XML mediator DTD. The animation toolkit uses Visual Basic and Java applications to operate on the stored animation models and motion sequences. It relies on existing solutions in the literature for applying motion sequences to different models and for retargeting motion sequences to meet different constraints. The current implementation uses the motion mapping technique proposed in [8] to apply the motion of one model to a different model, and the inverse kinematics technique proposed in [13] to retarget motion sequences to meet new constraints. The implementation of the XML mediator animation database toolkit has two phases:

Pre-processing phase: generates the XML representation from animation formats such as VRML or MPEG-4 and stores it in Microsoft Access. The current implementation does not handle SMIL animations.
Animation generation phase: reuses the models and motion sequences in the animation database by means of the spatial, temporal, and motion adjustment operations. Depending on the format in which the animations are to be generated, this phase applies the appropriate mapping techniques for models and motion sequences.

3.1 Pre-Processing Phase

In the pre-processing phase, animations described in different formats are first mapped onto the general XML representation, as shown in Figure 17.6. When an XML document is parsed, it is represented in memory as a hierarchical tree structure that supports operations such as adding data or querying for data. The W3C (World Wide Web Consortium) provides a standard recommendation for building this tree structure, called the XML Document Object Model (DOM). Any parser that adheres to this recommendation is called a DOM-based parser. A DOM-based parser provides a programmable library, the DOM Application Programming Interface (DOM API), that allows the data in an XML element to be accessed and modified by manipulating the nodes of the DOM tree. The current implementation uses the DOM-based parser available from Xj3D (eXtensible Java 3D, developed by the Web3D Consortium) [14]. Once the XML DOM tree is built, it can be stored in a database. The current implementation uses Microsoft Access for storing and manipulating the XML DOM tree. This animation database is used in the second phase for reusing models and motion sequences.
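The step from a parsed DOM tree to database rows can be sketched as follows. This is a minimal illustration, not the toolkit's code: it uses Python's `xml.dom.minidom` (a DOM implementation) in place of the Xj3D parser, and the XML fragment, its element names, and the row layout `(node_id, parent_id, tag, attributes)` are all assumptions made for the example.

```python
from xml.dom import minidom

# Hypothetical XML fragment; the element and attribute names are illustrative
# and are not taken from the mediator DTD.
SAMPLE = """<Scene id="s1">
  <Model name="robot">
    <Joint name="shoulder"/>
    <Joint name="elbow"/>
  </Model>
  <Motion name="Walking"/>
</Scene>"""


def flatten(xml_text):
    """Walk the DOM tree and return (node_id, parent_id, tag, attributes) rows."""
    doc = minidom.parseString(xml_text)
    rows = []

    def visit(node, parent_id):
        node_id = len(rows) + 1  # ids are assigned in document order
        attrs = {name: value for name, value in node.attributes.items()}
        rows.append((node_id, parent_id, node.tagName, attrs))
        for child in node.childNodes:
            # Skip whitespace text nodes; descend into element nodes only.
            if child.nodeType == child.ELEMENT_NODE:
                visit(child, node_id)

    visit(doc.documentElement, None)
    return rows


rows = flatten(SAMPLE)
for row in rows:
    print(row)
```

Each row keeps a reference to its parent, so the hierarchy can be rebuilt from a flat table; a relational store such as Access could hold rows of this shape.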

Figure 17.6: Block diagram for the Pre-processing phase

3.1.1 Metadata Generation

Querying the animation databases relies on the metadata associated with animation models and motions. While it is difficult to generate the metadata fully automatically, it is possible to provide semi-automatic techniques. In the first stage, users of this animation toolkit should manually create reference templates for motions and models. These reference templates are used for comparison with the XML animation representation in order to generate the metadata.

3.2 Animation Generation Phase

In this second phase, users can query the animation database for models and motion sequences. The resulting models and/or motion sequences can be reused to generate new animation sequences. The authoring of a new animation starts by collecting the necessary objects and motions for the scene. This is done by invoking the Query Module. As shown in Figure 17.9, the Query Module interacts directly with the database to retrieve the animation, object, or motion the user has requested. The user interacts with the system through a Graphical User Interface (GUI) and can manipulate the query results using the authoring operations. If the result is an object, it can be edited by the spatial operations before it is added to the new scene graph. If the result is a motion, it can be passed to the temporal operations to change its time constraints. It can then either be sent to the motion adjustment operations for further editing or be added directly to the scene graph. The user can manipulate the nodes of the scene graph by applying the authoring operations repeatedly.
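A temporal operation of the kind described above can be sketched as a rescaling of key times. The keyframe representation (a list of `(time, value)` pairs) and the function names `change_duration` and `change_speed` are assumptions for illustration; the toolkit's own operations are not specified at this level of detail.

```python
def change_duration(keyframes, new_duration):
    """Rescale key times so the motion lasts new_duration seconds."""
    start = keyframes[0][0]
    old_duration = keyframes[-1][0] - start
    if old_duration <= 0 or new_duration <= 0:
        raise ValueError("durations must be positive")
    scale = new_duration / old_duration
    # Key values are untouched; only the time stamps are stretched or squeezed.
    return [(start + (t - start) * scale, value) for t, value in keyframes]


def change_speed(keyframes, factor):
    """factor > 1 plays the motion faster; factor < 1 slows it down."""
    if factor <= 0:
        raise ValueError("speed factor must be positive")
    duration = keyframes[-1][0] - keyframes[0][0]
    return change_duration(keyframes, duration / factor)


# Hypothetical joint-angle track: (time in seconds, angle in degrees).
walk = [(0.0, 0.0), (1.0, 30.0), (2.0, 0.0)]
slow = change_duration(walk, 4.0)
fast = change_speed(walk, 2.0)
print(slow)
print(fast)
```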

Motion adjustment operations use inverse kinematics. For ease of understanding, we illustrate inverse kinematics in a 2D case. Figure 17.7 shows a movement of the articulated figure in which the end effector reaches the target. This motion sequence represents the original motion.

Figure 17.7: The original motion sequence of a 2D articulated figure with 3 DOFs in joint space

Now we move the target point to a new position (Figure 17.8(a)) and show how inverse kinematics can be applied to adjust the original motion, i.e., to retarget it and generate a new motion (Figure 17.8(b)). The motion in Figure 17.8(b) is obtained by applying the following inverse kinematics equation:

$$\Delta\theta = J^{+}\,\Delta x + (I - J^{+}J)\,\Delta z \tag{17.1}$$

Figure 17.8: Retargeting to generate a new motion

where:

∆θ	is the unknown vector in the joint variation space, of dimension n.
∆x	describes the main task as a variation of the end effector position and orientation in Cartesian space. For example, in Figure 17.8(b), the main task assigned to the end of the chain is to follow a curve or a line in the plane under the small movements hypothesis. The dimension m of the main task is usually less than or equal to the dimension n of the joint space.
J	is the Jacobian matrix of the linear transformation, representing the differential behavior of the controlled system over the dimensions specified by the main task.
J⁺	is the unique pseudo-inverse of J providing the minimum norm solution which realizes the main task (Figure 17.8 (a) and (b)).
I	is the n × n identity matrix of the joint variation space.
(I-J⁺J)	is a projection operator on the null space of the linear transformation J. Any element belonging to this joint variation sub-space is mapped by J into the null vector in the Cartesian variation space.
∆z	describes a secondary task in the joint variation space. This task is partially realized via the projection on the null space. In other words, the second part of the equation does not modify the achievement of the main task for any value of ∆z. Usually ∆z is calculated so as to minimize a cost function.

The inverse kinematics solver applies Equation (17.1) with no secondary task (∆z = 0) to the situation in Figure 17.8(a). Using (1) the offset vector ∆x between the target and the end effector position at each frame, and (2) the configuration at each frame as the initial posture, the solver automatically generates a new configuration in joint space that reaches the new target. Motion retargeting is handled in this way.
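The update ∆θ = J⁺∆x + (I − J⁺J)∆z can be sketched for a planar chain with three joints. This is an illustrative sketch only and may differ from the solver of [13]: the link lengths, the initial posture, the target, and the step clamp `max_step` are assumptions, and the closed form J⁺ = Jᵀ(JJᵀ)⁻¹ used here is valid only when J has full row rank, i.e., away from singular postures.

```python
import math

LINKS = (1.0, 1.0, 1.0)  # assumed link lengths of a planar 3-DOF chain


def end_effector(theta):
    """Forward kinematics: end effector position for joint angles theta."""
    x = y = phi = 0.0
    for length, angle in zip(LINKS, theta):
        phi += angle
        x += length * math.cos(phi)
        y += length * math.sin(phi)
    return (x, y)


def jacobian(theta):
    """2 x n Jacobian of the end effector position with respect to theta."""
    phis, phi = [], 0.0
    for angle in theta:
        phi += angle
        phis.append(phi)
    n = len(theta)
    row_x = [-sum(LINKS[i] * math.sin(phis[i]) for i in range(j, n)) for j in range(n)]
    row_y = [sum(LINKS[i] * math.cos(phis[i]) for i in range(j, n)) for j in range(n)]
    return [row_x, row_y]


def pseudo_inverse(J):
    """Right pseudo-inverse J+ = J^T (J J^T)^-1 (requires full row rank)."""
    a = sum(v * v for v in J[0])
    b = sum(u * v for u, v in zip(J[0], J[1]))
    d = sum(v * v for v in J[1])
    det = a * d - b * b
    inv = [[d / det, -b / det], [-b / det, a / det]]
    return [[J[0][j] * inv[0][k] + J[1][j] * inv[1][k] for k in range(2)]
            for j in range(len(J[0]))]


def ik_step(theta, dx, dz=None):
    """One update: dtheta = J+ dx + (I - J+ J) dz."""
    n = len(theta)
    dz = dz or [0.0] * n
    J = jacobian(theta)
    Jp = pseudo_inverse(J)
    main = [Jp[j][0] * dx[0] + Jp[j][1] * dx[1] for j in range(n)]
    # Null-space term: dz - J+ (J dz).
    Jdz = [sum(J[r][j] * dz[j] for j in range(n)) for r in range(2)]
    null = [dz[j] - (Jp[j][0] * Jdz[0] + Jp[j][1] * Jdz[1]) for j in range(n)]
    return [m + z for m, z in zip(main, null)]


def retarget(theta, target, steps=200, max_step=0.05):
    """Repeat small steps (dz = 0) so the linearisation stays valid."""
    theta = list(theta)
    for _ in range(steps):
        x, y = end_effector(theta)
        dx = [target[0] - x, target[1] - y]
        dist = math.hypot(dx[0], dx[1])
        if dist < 1e-9:
            break
        if dist > max_step:
            dx = [c * max_step / dist for c in dx]
        theta = [t + d for t, d in zip(theta, ik_step(theta, dx))]
    return theta


original = [0.3, 0.4, 0.5]   # posture of one frame of the original motion (assumed)
new_target = (1.5, 1.5)      # moved target (assumed)
adjusted = retarget(original, new_target)
print(end_effector(adjusted))
```

Applying `retarget` to the posture of every frame, with that frame's target offset, corresponds to the per-frame procedure described above. Passing a non-zero `dz` to `ik_step` changes the joint update without changing the first-order end effector displacement, which is the role of the projection operator (I − J⁺J).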

Other operations manipulate the scene graph structure. The toolkit uses VB's Tree View Control Component to implement the scene graph structure, which supports functions such as delete, add, and search. After the objects and the motion have been edited, the scene graph reflects the animation desired by the user. The resulting models and motion sequences can then be converted to the required animation format. The animation mapper module carries out this conversion. Figure 17.9 shows the different modules of the animation toolkit and their interactions.

Figure 17.9: XML Mediator-based Animation Toolkit

3.2.1 Animation Mapper

The Animation Mapper module helps in integrating responses to a query (or a set of queries) that are in different animation formats. For instance, resolving a user query may require applying the motion in a VRML animation to an MPEG-4 model and generating a new MPEG-4 animation. To carry out these actions, information regarding the motion is extracted from the motion and the model tables. This information is then integrated with the metadata content, along with the key values, timing, and sensor information present in the respective nodes, to build the required new animation.

While mapping a scene from one format to another format, care must be taken to handle unsupported types (e.g., the nodes supported by MPEG-4 but not by VRML). For this purpose, the node table definition includes the File Type that maps the different nodes (for VRML and MPEG-4) and elements (for PowerPoint) according to the corresponding file type. While incorporating a node in a new animation sequence, it is first checked whether the node type is supported in the desired output format. Node types that are not supported in the output format have to be eliminated.
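The support check described above can be sketched as a lookup against the node table. The table contents, the `NODE_TABLE` layout, and the function name `filter_nodes` are hypothetical; the entries are illustrative and not an authoritative list of what either format supports.

```python
# Hypothetical node table: node type -> file types in which it is supported.
NODE_TABLE = {
    "Transform": {"VRML", "MPEG-4"},
    "PositionInterpolator": {"VRML", "MPEG-4"},
    "TimeSensor": {"VRML", "MPEG-4"},
    "Layer2D": {"MPEG-4"},  # assumed here to be unavailable in VRML
}


def filter_nodes(node_types, output_format):
    """Split node types into those kept and those eliminated for output_format."""
    kept, dropped = [], []
    for node_type in node_types:
        # Unknown node types are treated as unsupported.
        if output_format in NODE_TABLE.get(node_type, set()):
            kept.append(node_type)
        else:
            dropped.append(node_type)
    return kept, dropped


scene = ["Transform", "Layer2D", "TimeSensor", "PositionInterpolator"]
kept, dropped = filter_nodes(scene, "VRML")
print(kept)
print(dropped)
```

Returning the eliminated node types alongside the kept ones lets a caller report what was lost in the conversion rather than dropping it silently.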

3.3 Query Processing

As discussed above, users can query the XML-based animation database for available animation models and motion sequences. The resulting models and motions can be combined using the spatial, temporal, and motion adjustment operations to generate new animation sequences. The current implementation of the XML-based animation database handles two different types of queries:

Query on Metadata
Query by Example

Query on Metadata: Normally, a user's query is resolved based on the metadata associated with a model and/or a motion. For instance, a query can ask for a Walking motion. This query is applied to the metadata associated with the motion table and retrieves all the motions whose motion name is Walking. The motion sequences are extracted along with the corresponding model, as stored in the model table. The result is displayed in the format requested by the user. If the user is satisfied with the motion, he or she can retain it; otherwise, the user can choose a different motion. The user can also make changes to the displayed motion, for instance by modifying the speed of the motion or the duration of the animation (i.e., extending or reducing the animation time). These operations are supported as part of the motion adjustment operations through a set of Graphical User Interfaces (GUIs). The modified animation can be saved as a new entry in the database under the same motion name to serve future requests.
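A metadata query of this kind can be sketched as a join between a motion table and a model table. The sketch uses Python's `sqlite3` module as a stand-in for Microsoft Access, and the table layout, column names, and sample rows are assumptions made for the example.

```python
import sqlite3

# In-memory database standing in for the Access animation database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE model (model_id INTEGER PRIMARY KEY, model_name TEXT);
CREATE TABLE motion (motion_id INTEGER PRIMARY KEY, motion_name TEXT,
                     model_id INTEGER REFERENCES model(model_id));
INSERT INTO model VALUES (1, 'robot'), (2, 'dog');
INSERT INTO motion VALUES (10, 'Walking', 1), (11, 'Running', 1), (12, 'Walking', 2);
""")


def find_motions(connection, motion_name):
    """Return (motion_id, motion_name, model_name) for every matching motion."""
    cursor = connection.execute(
        "SELECT mo.motion_id, mo.motion_name, m.model_name "
        "FROM motion AS mo JOIN model AS m ON m.model_id = mo.model_id "
        "WHERE mo.motion_name = ? ORDER BY mo.motion_id",
        (motion_name,),  # parameter binding avoids building SQL from user input
    )
    return cursor.fetchall()


results = find_motions(conn, "Walking")
for row in results:
    print(row)
```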

Query by Example: Users can also provide an example animation model and/or motion and query the database for similar models and/or motions. The example can be provided as a VRML, MPEG-4, or PowerPoint file. This file is parsed and the tables in the database are updated appropriately. A new XML file is also created, and new sceneId and ObjectId values are assigned. This XML file of the model and/or animation is compared with the reference templates used for models and/or motions in the metadata generation phase. The resolution of queries by example uses the same set of comparison algorithms discussed in Section 3.1.2 for metadata generation.

3.3.1 Performance of Query Resolution

The comparison of the example (in the query) with the reference motions/models can be carried out in real time because the example is compared against only a fixed number of reference templates. For example, given 10,000 animation models and motions in the database, there might be only 10–15 reference templates for different motion sequences. (The number may be slightly higher for models.) Hence, the example query is compared only with these 10–15 reference templates. Once a matching template is identified, models/motions in the database can be retrieved based on the metadata associated with that reference template. In the current implementation, the XML database holds animation models and motion sequences whose sizes are of the order of 100–200 Kbytes. For these file sizes, the time for one reference template comparison is of the order of microseconds. Note that during the pre-processing stage motions and models are stored separately in different tables. Hence, a size range of 100–200 Kbytes for simple motions and models is quite reasonable.
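The two-stage lookup described above can be sketched as follows. The comparison itself is a stand-in: the feature vectors, the Euclidean distance, and the names `TEMPLATES`, `METADATA_INDEX`, and `query_by_example` are assumptions for illustration and do not reproduce the toolkit's comparison algorithms.

```python
import math

# Hypothetical reference templates: motion name -> short feature vector.
TEMPLATES = {
    "Walking": [0.0, 0.5, 0.0, -0.5],
    "Running": [0.0, 1.0, 0.0, -1.0],
    "Jumping": [0.0, 0.2, 1.0, 0.2],
}

# Hypothetical metadata index: motion name -> ids of the stored motions.
METADATA_INDEX = {
    "Walking": [10, 12],
    "Running": [11],
    "Jumping": [],
}


def closest_template(example):
    """Stage 1: compare the example with every reference template."""
    return min(TEMPLATES, key=lambda name: math.dist(TEMPLATES[name], example))


def query_by_example(example):
    """Stage 2: retrieve stored motions through the matched template's metadata."""
    name = closest_template(example)
    return name, METADATA_INDEX[name]


match, motion_ids = query_by_example([0.1, 0.6, 0.0, -0.4])
print(match, motion_ids)
```

The number of comparisons in the first stage depends on the number of templates, not on the number of motions stored in the database, which is why the lookup cost stays roughly constant as the database grows.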

The animation model/motion size, and hence the timings, might increase when complex animations are considered (e.g., a continuous sequence of different motions such as walking, running, and jumping). We are in the process of populating the database with such complex animation models and motion sequences to measure the performance of queries on complex examples of motions or models.