Extensible Markup Language (XML) lets you specify your own markup language, with tags defined in a Document Type Definition (DTD) or XML Schema. XML can be used to specify the content of messages between servers, whether the two servers are within one enterprise or form a business-to-business connection. The critical factor is that the parties agree on the message schema, which is specified as an XML DTD or Schema. An XML parser extracts specific content from the message stream. Your design needs to consider whether to use an event-based approach, for which the SAX API is appropriate, or to navigate the tree structure of the document using the DOM API.
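The difference between the two parsing styles can be sketched with Python's standard library. This is only an illustration: the `order` message, its `item` elements and `sku` attributes, and the helper names are assumptions made up for this example, not part of any real message schema.

```python
import xml.sax
from xml.dom import minidom

# Hypothetical message exchanged between two servers.
MESSAGE = b"""<?xml version="1.0"?>
<order id="42">
  <item sku="A1">Widget</item>
  <item sku="B2">Gadget</item>
</order>"""


class ItemHandler(xml.sax.ContentHandler):
    """Event-based (SAX): react to each start tag as the parser streams by."""

    def __init__(self):
        super().__init__()
        self.skus = []

    def startElement(self, name, attrs):
        if name == "item":
            self.skus.append(attrs["sku"])


def skus_via_sax(data):
    handler = ItemHandler()
    xml.sax.parseString(data, handler)
    return handler.skus


def skus_via_dom(data):
    """Tree-based (DOM): build the whole document in memory, then navigate it."""
    doc = minidom.parseString(data)
    return [el.getAttribute("sku") for el in doc.getElementsByTagName("item")]
```

Both functions return the same list of SKUs. The SAX version never holds the whole document in memory, which tends to matter for large messages; the DOM version is usually more convenient when the application needs to move around the tree.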
The IBM XML4J XML parser was made available through the Apache open source organization under the Xerces name. For open source XML frameworks, see:
XML documents are defined using DTDs or XML Schemas.
DTDs are a basic XML definition language, inherited from the SGML specification. The DTD specifies what markup tags can be used in the document and what their structures are.
DTDs have two major problems:
Poor data typing: In DTDs, elements can only be specified as EMPTY, ANY, element content, or mixed element-and-text content, and there is no standard way to specify null values for elements.
Data typing like date formats, numbers, or other common data types cannot be specified in the DTD, so an XML document may comply with the DTD but still have data type errors that can only be detected by the application.
Not defined in XML: A DTD is written in its own syntax, which is not itself XML. This makes a DTD difficult to manipulate with XML tools.
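The data typing problem can be illustrated with a small sketch. The document below is structurally consistent with its inline DTD, because `shipDate` and `quantity` are declared only as text, yet both values are wrong for the application. The element names and the `type_errors` helper are hypothetical, and Python's standard parser does not validate against the DTD; the point is only that the type checks end up in application code.

```python
from datetime import date
import xml.etree.ElementTree as ET

# Hypothetical document: the DTD can only say that these elements hold text.
DOC = """<?xml version="1.0"?>
<!DOCTYPE order [
  <!ELEMENT order (shipDate, quantity)>
  <!ELEMENT shipDate (#PCDATA)>
  <!ELEMENT quantity (#PCDATA)>
]>
<order><shipDate>2004-02-31</shipDate><quantity>ten</quantity></order>"""


def type_errors(xml_text):
    """Application-level checks that a DTD has no way to express."""
    root = ET.fromstring(xml_text)
    errors = []
    try:
        date.fromisoformat(root.findtext("shipDate"))
    except ValueError:
        errors.append("shipDate is not a valid date")
    try:
        int(root.findtext("quantity"))
    except ValueError:
        errors.append("quantity is not an integer")
    return errors
```

Calling `type_errors(DOC)` reports both problems, even though nothing in the DTD is violated.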
To solve these problems, the World Wide Web Consortium (W3C) specified a new standard to define XML documents called XML Schema. XML Schema provides the following advantages over DTDs:
Strong typing for elements and attributes
Standardized way to represent null values for elements
Key mechanism that is directly analogous to relational database foreign keys
Defined as XML documents, making them programmatically accessible
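Because a schema is itself an XML document, an ordinary XML parser can read it. The sketch below assumes a small hypothetical schema for an `order` message and lists each typed element declaration along with whether it may be nil; the `declared_types` helper is an illustration, not a schema processor.

```python
import xml.etree.ElementTree as ET

# Namespace prefix used by ElementTree for XML Schema elements.
XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical schema: typed elements, one of which may carry a null value.
SCHEMA = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="order">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="shipDate" type="xs:date" nillable="true"/>
        <xs:element name="quantity" type="xs:positiveInteger"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""


def declared_types(schema_text):
    """Map each typed element name to (declared type, nillable flag)."""
    root = ET.fromstring(schema_text)
    return {
        el.get("name"): (el.get("type"), el.get("nillable") == "true")
        for el in root.iter(XS + "element")
        if el.get("type") is not None
    }
```

The same kind of inspection is much harder with a DTD, which would need a dedicated DTD parser.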
Even though XML Schema is a more powerful technology for defining XML documents, it is also a lot harder to work with, so DTDs are still widely used. Additionally, simple documents that do not need strong typing can be easily defined using DTDs, with results similar to those of XML Schema.
Which one to use depends on the complexity of the messages and the validation requirements of the application. In many cases both a DTD and an XML Schema are provided, so the application can use either, depending on its requirements.
Remember that validating an XML document against an XML Schema is expensive. Perform validation only when it is necessary.
Extensible Stylesheet Language Transformations (XSLT) is a W3C specification for transforming XML documents into other XML documents. XSLT is part of the Extensible Stylesheet Language (XSL), a stylesheet language for XML (as CSS2 is for HTML). Unlike CSS2, XSL is also a transformation language.
A transformation expressed in the XSLT language defines a set of rules for transforming a source tree to a result tree, and it is expressed in the form of a stylesheet.
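The idea of a rule set that maps a source tree to a result tree can be sketched in plain Python. This is not XSLT and does not use an XSLT processor; it only mimics the template-per-element style under the assumption of a hypothetical `order` document, with rule names invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical source document.
SOURCE = "<order><item sku='A1'>Widget</item><item sku='B2'>Gadget</item></order>"


def order_rule(src, apply):
    """Rule for <order>: emit a <ul> and apply the rules to each child."""
    ul = ET.Element("ul")
    for child in src:
        ul.extend(apply(child))
    return [ul]


def item_rule(src, apply):
    """Rule for <item>: emit an <li> built from the text and the sku attribute."""
    li = ET.Element("li")
    li.text = f"{src.text} ({src.get('sku')})"
    return [li]


# The rule set plays the role of a stylesheet: one rule per source element name.
RULES = {"order": order_rule, "item": item_rule}


def apply_rules(node):
    rule = RULES.get(node.tag)
    return rule(node, apply_rules) if rule else []


def transform(xml_text):
    (result,) = apply_rules(ET.fromstring(xml_text))
    return ET.tostring(result, encoding="unicode")
```

Here `transform(SOURCE)` turns the order into an HTML-style list. In a real XSLT stylesheet the rules would be `xsl:template` elements and the processor would do the matching.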
An XSLT processor transforms a source document into a result document. A number of XSLT processors are currently available. DataPower has introduced an XSL just-in-time (JIT) compiler, which reduces the time taken for the XSL transformation.
The XSLT processor has a performance overhead, so online processing of larger documents can be slow.
XML security is an important issue, particularly where organizations use XML to interchange data across the Internet. Several new XML security specifications are working their way through three standards bodies: the World Wide Web Consortium (W3C), the Internet Engineering Task Force (IETF), and the Organization for the Advancement of Structured Information Standards (OASIS). We highlight a few of them here:
XML Signature Syntax and Processing is a specification for digitally signing electronic documents using XML syntax. According to the W3C, "XML Signatures provide integrity, message authentication, and/or signer authentication services for data of any type, whether located within the XML that includes the signature or elsewhere."
A key feature of the protocol is the ability to sign parts of an XML document rather than the document in its entirety. This is necessary because an XML document might contain elements that will change as the document is passed along, or various elements that will be signed by different parties.
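Why covering only part of a document is useful can be shown with a digest over one canonicalized element. This sketch is not an XML Signature implementation: it computes only a SHA-256 digest, with no key and no signature, and the `msg`, `routing`, and `payload` elements are hypothetical. It assumes Python 3.8 or later for `ElementTree.canonicalize`.

```python
import hashlib
import xml.etree.ElementTree as ET


def part_digest(xml_text, tag):
    """Digest of one child element only, taken over its canonical (C14N) form."""
    part = ET.fromstring(xml_text).find(tag)
    canonical = ET.canonicalize(ET.tostring(part, encoding="unicode"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical message: the routing element changes in transit, the payload must not.
ORIGINAL = "<msg><routing hops='1'/><payload><amount>100</amount></payload></msg>"
FORWARDED = "<msg><routing hops='2'/><payload><amount>100</amount></payload></msg>"
TAMPERED = "<msg><routing hops='2'/><payload><amount>900</amount></payload></msg>"
```

The payload digest is the same for `ORIGINAL` and `FORWARDED`, although the routing data changed, and differs for `TAMPERED`. A digest over the whole document would have changed in both cases.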
WebSphere Studio provides you with the ability to create (using a wizard) and verify XML digital signatures.
XML Encryption allows encryption of digital content, such as Graphics Interchange Format (GIF) images or XML fragments. It supports encrypting parts of an XML document while leaving other parts open, encrypting the XML document itself, and super-encrypting data (that is, encrypting an XML document when some elements have already been encrypted).
XML Key Management Specification (XKMS) establishes a standard for XML-based applications to use Public Key Infrastructure (PKI) when handling digitally signed or encrypted XML documents. XML Signature addresses message integrity and signer authentication, but not the issues of trust in keys that a PKI addresses.
Security Assertion Markup Language (SAML) is the first industry standard for secure e-commerce transactions using XML. It aims to standardize the exchange of user identities and authorizations by defining how this information is to be presented in XML documents, regardless of the underlying security systems in place.
For further discussion, see the Sun ONE article Riddle Me This: Is Your XML Data Safe? by Brett Mendel:
XML has many advantages over other technologies. Some of the factors that have influenced the wide acceptance of XML are:
Acceptability of use for data transfer
XML is a standard way of putting information in a format that can be processed and exchanged across different hardware devices, operating systems, software applications, and the Web.
Uniformity and conformity
XML gives you a common format that can be developed upon and is accepted industry-wide.
Simplicity and openness
Information coded in XML is human readable.
Separation of data and display
The representation of the data is separated from the presentation and formatting of the data for display in a browser or other device.
XML has been accepted widely by the information technology and computing industry. Numerous tools and utilities are available, along with new products for parsing and transforming XML data to other data, or for display.
Some XML issues to consider are:
While XML tags can allow software to recognize meaningful content within documents, this is only useful to the extent that the software reading the document knows what the tagged content means in human terms, and knows what to do with it.
When multiple applications use XML to communicate with each other, they need to agree on the tag names they use. While industry-specific standard tag definitions often exist, you can still declare your own non-standard tags.
XML documents tend to be larger in size than other forms of data representation.
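The size overhead can be seen by serializing the same record two ways. The record and helper names below are hypothetical, and JSON is used only as one example of another data representation; the exact ratio depends on the data and on how the tags are named.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical record to serialize.
RECORD = {"sku": "A1", "name": "Widget", "quantity": "10"}


def as_xml(rec):
    """One child element per field; every field name appears twice (open and close tag)."""
    root = ET.Element("item")
    for key, value in rec.items():
        ET.SubElement(root, key).text = value
    return ET.tostring(root, encoding="unicode")


def as_json(rec):
    """Compact JSON; every field name appears once."""
    return json.dumps(rec, separators=(",", ":"))
```

For this record `len(as_xml(RECORD))` is larger than `len(as_json(RECORD))`, mainly because each tag name is repeated in the closing tag.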