Performing Data Mapping and Transformation with XSLT | Developing .NET Enterprise Applications

A common issue surrounding all application integration projects is data mapping and transformation. Data mapping refers to the process of lining up values in one data source against values in another data source. Transformation refers to the process of changing the source data so that it can be stored in the target data field.

Understanding the Role of Mapping Tables

Data mapping can be as minimal as a field-to-field mapping. An issue priority within the DemoApp table might map directly to a priority field within the IssueTracker table. In more complicated cases, you may need to merge two or more fields in a source table and map them to a single field in another table. The source table may contain first and last name in separate fields, whereas the target table stores a single value for name. It quickly becomes clear that you need a structured mechanism to map the source data to the target data. This mechanism is traditionally referred to as a mapping table. The mapping table typically describes the following:

The source database, table, and column
The target database, table, and column
The transformation function used to convert the data

Because application integration is a constantly changing activity, data field mapping tables are usually kept external to the application. Alongside the mapping table sits the transformation library, which contains the custom algorithms for converting source data into target data. Enterprise Application Integrators (EAI) often refer to such a library as Business Rules.
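As a rough illustration of how a mapping table and a transformation library fit together, the following C# sketch pairs each source column with a target column and the name of a transformation function. The class names, column names, and the Copy and Trim24 functions are assumptions made for this example; they are not part of the DemoApp or IssueTracker applications.

```csharp
using System;
using System.Collections.Generic;

// One row of a hypothetical mapping table: where a value comes from,
// where it goes, and which transformation converts it.
public class MappingEntry
{
    public string Source;     // source "database.table.column"
    public string Target;     // target "database.table.column"
    public string Transform;  // name of a function in the transformation library

    public MappingEntry(string source, string target, string transform)
    {
        Source = source;
        Target = target;
        Transform = transform;
    }
}

public static class MappingDemo
{
    // The transformation library: named functions that convert a source value.
    public static readonly Dictionary<string, Func<string, string>> Library =
        new Dictionary<string, Func<string, string>>
        {
            { "Copy", value => value },
            { "Trim24", value => value.Length > 24 ? value.Substring(0, 24) : value }
        };

    // Assumed entries; in practice these would be loaded from an external store.
    public static readonly List<MappingEntry> Table = new List<MappingEntry>
    {
        new MappingEntry("DemoApp.Issues.Priority", "IssueTracker.Issues.Priority", "Copy"),
        new MappingEntry("DemoApp.Issues.Summary", "IssueTracker.Issues.Title", "Trim24")
    };

    // Finds the entry for a source column and applies its transformation.
    public static string Apply(string source, string value)
    {
        foreach (MappingEntry entry in Table)
        {
            if (entry.Source == source)
            {
                return Library[entry.Transform](value);
            }
        }
        throw new KeyNotFoundException("No mapping for " + source);
    }

    public static void Main()
    {
        Console.WriteLine(Apply("DemoApp.Issues.Priority", "2"));
        Console.WriteLine(Apply("DemoApp.Issues.Summary", "Application crashes when saving an issue"));
    }
}
```

Keeping the entries as data rather than code is what allows the table to change without recompiling the application.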

Understanding the Role of Transformations

Data transformation can also be as minimal as changing field types. A character array of 32 bytes might need to be trimmed to fit into a character array of 24 bytes. An integer might need to become a long. In more complicated cases, the data itself might need to be intelligently modified. A source data table might store the first name and then last name in a single name column, and the target stores the last name, a comma, and then the first name. A transformation is needed to parse the source data and modify it to be correctly stored in the target database.
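The name example above could be handled by a small transformation function. The following C# sketch is one possible approach; the NameTransform class is hypothetical, and it assumes that the last space in the value separates the first name from the last name, which will not hold for every real-world name.

```csharp
using System;

public static class NameTransform
{
    // Converts "First Last" into "Last, First".
    // Assumption: the last space separates first name(s) from the last name.
    // Values that contain no space are returned unchanged.
    public static string ToLastFirst(string name)
    {
        string trimmed = name.Trim();
        int split = trimmed.LastIndexOf(' ');
        if (split < 0)
        {
            return trimmed;
        }
        string first = trimmed.Substring(0, split).Trim();
        string last = trimmed.Substring(split + 1);
        return last + ", " + first;
    }

    public static void Main()
    {
        Console.WriteLine(ToLastFirst("JP Batson"));  // Batson, JP
        Console.WriteLine(ToLastFirst("Anke"));       // Anke
    }
}
```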

Implementing Field Mapping and Transformation with XSLT

The benefit of building an XML-based integration is that it can leverage XSLT style sheets to offer a powerful way to dynamically transform and present data. Just as Cascading Style Sheets (CSS) offer richness to static Hypertext Markup Language (HTML) pages, XSLT style sheets extend the value of XML data. XSLT not only offers the ability to present XML data but also to transform it into completely new data. It provides a mechanism for packaging, exchanging, and presenting XML data.

Like traditional data mapping tables, XSLT can transform data so it can be exchanged between different business systems. XSLT supports the mapping of one XML document into another. One application may represent issue data based on one schema, and another application may represent it with another schema. With XSLT, you can transform the data into an XML representation that matches the target application's schema.
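To make the schema-to-schema idea concrete, the sketch below applies a style sheet that rewrites DemoDat rows into a different element layout. The target element names (IssueTracker, Issue, Number) and the sample source document are assumptions for illustration only. The code uses XslCompiledTransform, the class that replaced XslTransform in later versions of the .NET Framework.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Xsl;

public static class SchemaMappingDemo
{
    // Assumed source document in the DemoDat export format.
    public const string Source = @"
<DemoDat>
  <Table><ID>2053</ID><ComposedBy>JP Batson</ComposedBy></Table>
  <Table><ID>2054</ID><ComposedBy>Anke</ComposedBy></Table>
</DemoDat>";

    // Maps each <Table> row to a hypothetical <Issue> element.
    public const string StyleSheet = @"
<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>
  <xsl:output method='xml' omit-xml-declaration='yes' />
  <xsl:template match='/'>
    <IssueTracker>
      <xsl:for-each select='/DemoDat/Table'>
        <Issue>
          <Number><xsl:value-of select='ID' /></Number>
          <EnteredBy><xsl:value-of select='ComposedBy' /></EnteredBy>
        </Issue>
      </xsl:for-each>
    </IssueTracker>
  </xsl:template>
</xsl:stylesheet>";

    // Compiles the style sheet, applies it to the source, and returns the result.
    public static string Transform(string sourceXml, string styleSheetXml)
    {
        XslCompiledTransform transform = new XslCompiledTransform();
        using (XmlReader styleReader = XmlReader.Create(new StringReader(styleSheetXml)))
        {
            transform.Load(styleReader);
        }
        using (XmlReader sourceReader = XmlReader.Create(new StringReader(sourceXml)))
        using (StringWriter output = new StringWriter())
        {
            transform.Transform(sourceReader, null, output);
            return output.ToString();
        }
    }

    public static void Main()
    {
        Console.WriteLine(Transform(Source, StyleSheet));
    }
}
```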

XSLT can also dynamically transform XML documents into HTML documents for Web browser access. Transformations are useful not only for backward compatibility for older browsers but also for transforming data so it can be rendered on new Web browser-enabled devices. Transformations can also convert XML data into many other document formats as needed.

XSLT serves as an ideal solution for mapping source XML documents to target XML documents in a flexible way. However, it is essentially an entirely different language to master. This chapter introduces some of the basics of XSLT. For a deeper understanding of XSLT and how to use it, I recommend XML Programming: Web Applications and Web Services with JSP and ASP by Tom Myers and Alexander Nakhimovsky (Apress, 2002).

XSLT is composed of multiple instructions that control the formatting of an output document. Appendix D, "Using XSLT Functions," summarizes the most common XSLT instructions. Applying an XSLT style sheet to one of the source XML documents appearing earlier will produce a new XML document in a format that the target application's adapter can more easily understand. The most common XSLT instructions relate to copying values, evaluating values, looping through groups of elements, and organizing instructions into templates.

Copying Data with XSLT

Because the primary function of XSLT is to transform one XML document into another, it is not surprising that the most common instruction is to perform a straight copy of values from the source document to the target document. The <xsl:copy-of> instruction does just that:

    <?xml version="1.0"?>
    <DemoDat
        xmlns:xsl='http://www.w3.org/1999/XSL/Transform'
        xsl:version='1.0' >
        <Issues>
            <xsl:copy-of select='/DemoDat/Table/ID' />
        </Issues>
    </DemoDat>

In this case, the <xsl:copy-of> instruction makes a literal copy of the specified source elements into the target document. The only element referenced in the select attribute is <ID>, so only those nodes are produced in the target document. Applying the previous XSLT statements to the source document presented in Listing 12-7 produces the following output:

    <?xml version="1.0" encoding="utf-16"?>
    <DemoDat>
        <Issues>
            <ID>2053</ID>
            <ID>2054</ID>
        </Issues>
    </DemoDat>

Evaluating Data with XSLT

XSLT also provides instructions for performing evaluations against source values (see Listing 12-16). You can use the <xsl:if> instruction to test a value in the source document and write output to the target document only when the test succeeds; <xsl:if> has no else branch. When several alternatives are needed, use the <xsl:choose> instruction. Inside it, each <xsl:when> instruction uses the same test syntax to evaluate source values. If no <xsl:when> matches, the <xsl:otherwise> instruction supplies a default value.

Listing 12-16: Using the <xsl:if> and <xsl:choose> Instructions to Evaluate Source Data

    <?xml version="1.0"?>
    <DemoDat
        xmlns:xsl='http://www.w3.org/1999/XSL/Transform'
        xsl:version='1.0' >
        <Issues>
            <xsl:if test='/DemoDat/Table/ID &lt; 1'>
                <xsl:text>INVALID ID</xsl:text>
            </xsl:if>
            <Severity>
                <xsl:choose>
                    <xsl:when test='(/DemoDat/Table/Severity) = 1'>
                        <xsl:text>Important</xsl:text>
                    </xsl:when>
                    <xsl:when test='(/DemoDat/Table/Severity) = 2'>
                        <xsl:text>Mild</xsl:text>
                    </xsl:when>
                    <xsl:when test='(/DemoDat/Table/Severity) = 3'>
                        <xsl:text>Unimportant</xsl:text>
                    </xsl:when>
                    <xsl:otherwise>
                        <xsl:text>Unknown</xsl:text>
                    </xsl:otherwise>
                </xsl:choose>
            </Severity>
        </Issues>
    </DemoDat>

In this case, the <xsl:if> instruction evaluates the ID value in the source document. If the value is less than 1 (written as &lt; 1, because a literal < is not allowed inside an XML attribute value), an error message is written to the output. Because the severity text depends on which value is present, the <xsl:choose> instruction is used. In each case, the test attribute compares an element value against a fixed value. Because the paths are absolute, each test considers every matching element in the document and succeeds if any one of them satisfies the comparison. Applying the previous XSLT statements to the source document presented in Listing 12-7 produces the following output:

    <?xml version="1.0" encoding="utf-16"?>
    <DemoDat>
        <Issues>
            <Severity>Important</Severity>
        </Issues>
    </DemoDat>

Looping Through Data with XSLT

Another significant XSLT instruction relates to iterating through a collection of elements. The <xsl:for-each> instruction cycles through specified elements for processing. This instruction lets the XSLT style sheet process multiple export records during batch processing:

    <?xml version="1.0"?>
    <DemoDat
        xmlns:xsl='http://www.w3.org/1999/XSL/Transform'
        xsl:version='1.0' >
        <Issues>
            <xsl:for-each select='/DemoDat/Table'>
                <EnteredBy>
                    <xsl:value-of select='ComposedBy' />
                </EnteredBy>
            </xsl:for-each>
        </Issues>
    </DemoDat>

The <xsl:for-each> instruction begins with a select attribute that identifies the set of nodes to loop over. Inside the loop, each <xsl:value-of> instruction is evaluated relative to the node for the current iteration. Therefore, its select attribute should use a relative path, such as the child element name ComposedBy, rather than an absolute path from the document root. Applying the previous XSLT statements to the source document presented in Listing 12-7 produces the following output:

    <?xml version="1.0" encoding="utf-16"?>
    <DemoDat>
        <Issues>
            <EnteredBy>JP Batson</EnteredBy>
            <EnteredBy>Anke</EnteredBy>
        </Issues>
    </DemoDat>

The XML resulting from the processing spans all rows of the export file. Additional processing may occur within the for-each operation, including additional for-each instructions.
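As an example of additional processing inside a loop, the sketch below places an <xsl:choose> block inside <xsl:for-each> and uses relative paths, so that each row is evaluated on its own. The sample source document, including its Severity values, and the <Issue> wrapper element are assumptions for illustration; they are not taken from Listing 12-7.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Xsl;

public static class PerRowSeverityDemo
{
    // Assumed source document; the Severity values are invented for this example.
    public const string Source = @"
<DemoDat>
  <Table><ID>2053</ID><ComposedBy>JP Batson</ComposedBy><Severity>1</Severity></Table>
  <Table><ID>2054</ID><ComposedBy>Anke</ComposedBy><Severity>3</Severity></Table>
</DemoDat>";

    // The tests use the relative path 'Severity', so they apply to the current row only.
    public const string StyleSheet = @"
<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>
  <xsl:output method='xml' omit-xml-declaration='yes' />
  <xsl:template match='/'>
    <Issues>
      <xsl:for-each select='/DemoDat/Table'>
        <Issue>
          <EnteredBy><xsl:value-of select='ComposedBy' /></EnteredBy>
          <Severity>
            <xsl:choose>
              <xsl:when test='Severity = 1'>Important</xsl:when>
              <xsl:when test='Severity = 2'>Mild</xsl:when>
              <xsl:when test='Severity = 3'>Unimportant</xsl:when>
              <xsl:otherwise>Unknown</xsl:otherwise>
            </xsl:choose>
          </Severity>
        </Issue>
      </xsl:for-each>
    </Issues>
  </xsl:template>
</xsl:stylesheet>";

    // Compiles the style sheet, applies it to the source, and returns the result.
    public static string Transform(string sourceXml, string styleSheetXml)
    {
        XslCompiledTransform transform = new XslCompiledTransform();
        using (XmlReader styleReader = XmlReader.Create(new StringReader(styleSheetXml)))
        {
            transform.Load(styleReader);
        }
        using (XmlReader sourceReader = XmlReader.Create(new StringReader(sourceXml)))
        using (StringWriter output = new StringWriter())
        {
            transform.Transform(sourceReader, null, output);
            return output.ToString();
        }
    }

    public static void Main()
    {
        Console.WriteLine(Transform(Source, StyleSheet));
    }
}
```

With relative tests, the two rows in this sample yield different severity text, whereas an absolute path such as /DemoDat/Table/Severity would give the same answer for every row.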

Organizing XSLT Instructions into Templates

XSLT also provides a mechanism for creating reusable instruction sets, known as templates. These templates allow XSLT to be organized into modular and reusable blocks. The <xsl:template> element defines the starting point for a template block. All statements within this block are processed like a normal style sheet:

    <xsl:template name='GetIssueID' >
        <IssueID>
            <xsl:copy-of select='/DemoDat/Table/ID' />
        </IssueID>
    </xsl:template>

The GetIssueID template can be called from another template or style sheet using the <xsl:call-template> instruction:

    <xsl:template name='AppDemoImport' >
        <DemoDat>
            <Table>
                <xsl:call-template name='GetIssueID' />
            </Table>
        </DemoDat>
    </xsl:template>

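Applying the previous XSLT statements to the source document presented in Listing 12-7 produces the following output. The <IssueID> element generated by GetIssueID appears inside the <Table> element written by the calling template:

    <?xml version="1.0" encoding="utf-16"?>
    <DemoDat>
        <Table>
            <IssueID>
                <ID>2053</ID>
                <ID>2054</ID>
            </IssueID>
        </Table>
    </DemoDat>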

Templates can also process parameters just like methods. You define parameters within the template using the <xsl:param> instruction. The instruction specifies the parameter name and default value. In the content region of the template, you reference the parameter by prefixing the variable with a $, such as $id:

    <xsl:template name='GetSpecificDescription' >
        <xsl:param name='id' select='/DemoDat/Table/ID' />
        <Table>
            <Description><xsl:value-of select='$id' /></Description>
        </Table>
    </xsl:template>

Other templates or style sheets can supply parameters using the <xsl:with-param> instruction:

    <xsl:call-template name='GetSpecificDescription'>
        <xsl:with-param name='id' >1</xsl:with-param>
    </xsl:call-template>
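The template fragments above need a surrounding <xsl:stylesheet> element and an entry point that calls them. The following sketch shows one way to assemble them: a template matching the document root loops over the rows and passes each ID to GetSpecificDescription through <xsl:with-param>. The sample source document, the default value of 0, and the PassingParametersDemo class are assumptions made for this example.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Xsl;

public static class PassingParametersDemo
{
    // Assumed source document in the DemoDat export format.
    public const string Source = @"
<DemoDat>
  <Table><ID>2053</ID></Table>
  <Table><ID>2054</ID></Table>
</DemoDat>";

    // The root template is the entry point; it calls the named template once per row.
    public const string StyleSheet = @"
<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>
  <xsl:output method='xml' omit-xml-declaration='yes' />
  <xsl:template match='/'>
    <DemoDat>
      <xsl:for-each select='/DemoDat/Table'>
        <xsl:call-template name='GetSpecificDescription'>
          <xsl:with-param name='id' select='ID' />
        </xsl:call-template>
      </xsl:for-each>
    </DemoDat>
  </xsl:template>
  <xsl:template name='GetSpecificDescription'>
    <xsl:param name='id' select='0' />
    <Table>
      <Description><xsl:value-of select='$id' /></Description>
    </Table>
  </xsl:template>
</xsl:stylesheet>";

    // Compiles the style sheet, applies it to the source, and returns the result.
    public static string Transform(string sourceXml, string styleSheetXml)
    {
        XslCompiledTransform transform = new XslCompiledTransform();
        using (XmlReader styleReader = XmlReader.Create(new StringReader(styleSheetXml)))
        {
            transform.Load(styleReader);
        }
        using (XmlReader sourceReader = XmlReader.Create(new StringReader(sourceXml)))
        using (StringWriter output = new StringWriter())
        {
            transform.Transform(sourceReader, null, output);
            return output.ToString();
        }
    }

    public static void Main()
    {
        Console.WriteLine(Transform(Source, StyleSheet));
    }
}
```

Here the parameter is supplied through a select attribute, so it receives the current row's ID; supplying the value as element content, as in the fragment above, passes fixed text instead.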

Using a combination of XSLT instructions, you can create different style sheets for each application-to-application mapping. Depending upon the incoming source data, the integration server's ProcessRequest method is called upon to load the appropriate style sheet and perform the data transformation. Listing 12-17 presents the ProcessRequest method responsible for loading and applying style sheets to perform transformations.

Listing 12-17: Invoking XSLT Processing Within the Integration Server

    //namespaces required by this method
    using System;
    using System.Diagnostics;
    using System.Xml;
    using System.Xml.Xsl;

    public void ProcessRequest( string strData )
    {
        string strOutput = "";
        System.IO.StringWriter sWriter = null;
        try
        {
            //strings are immutable, so keep the result of each Replace call
            strData = strData.Replace( "\r", "" );
            strData = strData.Replace( "\n", "" );
            //initialize the source document
            XmlDataDocument xmlDoc = new XmlDataDocument();
            xmlDoc.LoadXml( strData );
            //initialize the transformation engine
            XslTransform xslTransformer = new XslTransform();
            //verbatim string so the backslash is not read as a tab escape
            xslTransformer.Load( @"c:\transformation.xsl" );
            //initialize the output string writer
            sWriter = new System.IO.StringWriter();
            //transform the document
            xslTransformer.Transform( xmlDoc, null, sWriter );
            //forward the response to the destination adapter
            strOutput = sWriter.GetStringBuilder().ToString();
            SendToAdapter( "http://127.0.0.1:3202", strOutput );
        }
        catch( Exception x )
        {
            EventLog systemLog = new EventLog();
            systemLog.Source = "IssueTracker";
            systemLog.WriteEntry( x.Message, EventLogEntryType.Error, 0 );
            systemLog.Dispose();
        }
        finally
        {
            //the writer is still null if an exception occurred before it was created
            if( sWriter != null )
            {
                sWriter.Close();
            }
        }
        return;
    }

The ProcessRequest method takes the adapter-formatted XML data, removes any line breaks found, and creates an XmlDataDocument object. Next, the XSLT style sheet is loaded into an XslTransform object. A StringWriter is also created to capture the generated XML output in memory. The Transform method performs all of the work by reading the source document, applying the XSLT style sheet, and writing the output to the StringWriter. The resulting string is then passed to the SendToAdapter method.
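The listing loads a single fixed style sheet. One way to choose a style sheet based on the incoming source data is sketched below: compiled style sheets are registered under the root element name of the documents they handle. The StyleSheetRouter class, its method names, and the choice of the root element name as the lookup key are assumptions for this example, not part of the integration server. The sketch uses XslCompiledTransform, which superseded XslTransform beginning with version 2.0 of the .NET Framework.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Xsl;

public class StyleSheetRouter
{
    // Compiled style sheets keyed by the root element name of the source document.
    private readonly Dictionary<string, XslCompiledTransform> transforms =
        new Dictionary<string, XslCompiledTransform>();

    // Compiles a style sheet once and stores it for later requests.
    public void Register(string rootName, string styleSheetXml)
    {
        XslCompiledTransform transform = new XslCompiledTransform();
        using (XmlReader reader = XmlReader.Create(new StringReader(styleSheetXml)))
        {
            transform.Load(reader);
        }
        transforms[rootName] = transform;
    }

    // Picks the style sheet that matches the document's root element and applies it.
    public string Process(string sourceXml)
    {
        XmlDocument document = new XmlDocument();
        document.LoadXml(sourceXml);
        string rootName = document.DocumentElement.Name;

        XslCompiledTransform transform;
        if (!transforms.TryGetValue(rootName, out transform))
        {
            throw new InvalidOperationException("No style sheet registered for " + rootName);
        }
        using (StringWriter output = new StringWriter())
        {
            transform.Transform(document, null, output);
            return output.ToString();
        }
    }
}

public static class RouterDemo
{
    // Hypothetical style sheet for documents whose root element is <DemoDat>.
    public const string DemoDatStyleSheet = @"
<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>
  <xsl:output method='xml' omit-xml-declaration='yes' />
  <xsl:template match='/'>
    <Issues><xsl:copy-of select='/DemoDat/Table/ID' /></Issues>
  </xsl:template>
</xsl:stylesheet>";

    public static void Main()
    {
        StyleSheetRouter router = new StyleSheetRouter();
        router.Register("DemoDat", DemoDatStyleSheet);
        Console.WriteLine(router.Process("<DemoDat><Table><ID>2053</ID></Table></DemoDat>"));
    }
}
```

Compiling each style sheet once at registration avoids reloading it from disk on every request, which matters when many export records arrive in a batch.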

The XML generated from the transformation should be specific to another adapter capable of inserting the data into the integrated application. The results of this method either can be posted back to a message queue or can be delivered to an adapter method via remoting.

The direct database adapter and the file exchange adapter may be different in their implementations, but they both produce XML that represents application-specific data. The next step is to send this application-specific data to the integration server, where it can be mapped and transformed into data that is specific to the new application. Before the adapter and integration server can communicate, you need to define a communication mechanism.