Introducing the XML Family | Web Services Security

Although XML stands for eXtensible Markup Language, the acronym “XML” is commonly used to describe not only XML itself, but also the ever-growing family of related technologies. The core XML specification itself is relatively simple. In fact, it is so simple that many people, when looking at XML for the first time, find it difficult to understand how such a simple technology could be credited with such world-changing powers. The trick is not just to look at the language, which is a syntax for presenting data in a structured manner, but to also look at the surrounding technologies that act on XML in order to define it, transform it, transmit it, and secure it. This collection of technologies surrounding the core XML specification represents the real power of XML.

XML: A Syntax to Define Markup Languages

XML is a specification, defined by the W3C (World Wide Web Consortium), which defines a syntax used to define markup languages. This may seem like a roundabout definition. XML defines a syntax to structure a document, using markup. Documents used for specific purposes (for example, legal agreements defined by the LegalXML group) are examples of documents that have been marked up using XML. Because they use XML, they can be manipulated by the great quantity of XML-enabled applications available. To see the advantages of XML, let’s look at a before-and-after example.

Structured Documents

It is sometimes said that every single document is an example of a structured document, because every document has some internal logic that can be leveraged in order to break it into parts, or to translate it into another format. The problem, though, is finding this structure. This takes time and effort. In the past, it was common practice for applications to define their own unique structural markup, but today applications must share their data as easily as possible and so it makes sense to standardize a structural markup language—and that is the reason for XML.

Various methods of structuring data have been proposed over the past 30 years. Many of these have found their way into commercial use. EDI (Electronic Data Interchange) documents, for example, mainly use the offset distance from the left of the page as a means of structuring documents. The following line, taken from a UN/EDIFACT EDI purchase order, contains the name and address of a supplier:

NAD SU  JOE BLOGGS INC        101 SOME STREET         BOSTON, MA 12345     US

The structural rules for this EDI document state that the first three characters specify the type of data in the line. By examining the specification for this particular type of EDI document, we find that NAD means Name and Address. The characters in the fifth and sixth positions identify the role of the entity whose name and address we are viewing—in this case, it’s a supplier (SU stands for Supplier). The rest of the line contains the name and address itself, in fixed blocks for each line of the address, along with a state, ZIP code, and country code. Information about the structure of EDI documents is contained in lengthy specification documents. These documents explain what the various codes mean, and the offsets to use to pick information out of the EDI file.
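To make the offset-based approach concrete, here is a minimal Python sketch of picking fields out of a fixed-width record. The field widths in LAYOUT are assumptions chosen for illustration, not values from any EDI specification, and the lookup tables stand in for the lengthy specification documents described above.

```python
# Hypothetical fixed-width layout: (field name, start offset, width).
# These offsets are illustrative assumptions, not taken from a real EDI spec.
LAYOUT = [
    ("segment", 0, 3),
    ("role", 4, 2),
    ("name", 8, 22),
    ("street", 30, 24),
    ("city_state_zip", 54, 21),
    ("country", 75, 2),
]

# The meaning of each code lives outside the data, as in a specification document.
SEGMENT_NAMES = {"NAD": "Name and Address"}
ROLE_NAMES = {"SU": "Supplier"}


def build_record(values):
    """Place each value at its offset, padding the rest of the line with spaces."""
    length = max(start + width for _, start, width in LAYOUT)
    line = [" "] * length
    for name, start, width in LAYOUT:
        text = values[name][:width]
        line[start:start + len(text)] = text
    return "".join(line)


def parse_record(line):
    """Slice each field out of the line by offset and strip the padding."""
    return {name: line[start:start + width].strip() for name, start, width in LAYOUT}


record = build_record({
    "segment": "NAD",
    "role": "SU",
    "name": "JOE BLOGGS INC",
    "street": "101 SOME STREET",
    "city_state_zip": "BOSTON, MA 12345",
    "country": "US",
})
fields = parse_record(record)
print(SEGMENT_NAMES[fields["segment"]], "/", ROLE_NAMES[fields["role"]], "/", fields["name"])
```

Without the LAYOUT table and the two lookup dictionaries, the record itself says nothing about what its contents mean.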

The disadvantage of the EDI approach is that the semantic information about the data is missing from the document itself. The information about what NAD and SU actually mean is contained in a separate specification document. This is also a problem for comma-separated files. Name-value pairs, such as those used in Windows initialization files and Java configuration files, provide the semantic information, but are too unsophisticated to be used for complex, nested information.

XML defines a new method of structuring documents, based on SGML. The same EDI fragment could be rendered in XML, as shown here in Listing 1-1.

Listing 1-1

<NameAndAddress Role="Supplier">
  <CompanyName>Joe Bloggs Inc</CompanyName>
  <AddressLine>101 SOME STREET</AddressLine>
  <AddressLine>BOSTON</AddressLine>
  <AddressLine>MA</AddressLine>
  <ZipCode>12345</ZipCode>
  <CountryCode>US</CountryCode>
</NameAndAddress>

This XML fragment contains the same raw data as the corresponding EDI fragment, but it adds descriptive information about the data. This descriptive information is contained in elements, sometimes called tags, which are surrounded by angle brackets. “NameAndAddress” is one element in the example, and AddressLine is another. Where additional information relating to an element is required, this can be added as an attribute. The role information relates specifically to the NameAndAddress, because it is the name and address of a supplier. The lines of the address are nested under the NameAndAddress element.
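As a rough illustration of how an application reads elements and attributes, the following sketch parses the fragment from Listing 1-1 with Python's standard xml.etree.ElementTree module. The variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

# The fragment from Listing 1-1.
LISTING_1_1 = """\
<NameAndAddress Role="Supplier">
  <CompanyName>Joe Bloggs Inc</CompanyName>
  <AddressLine>101 SOME STREET</AddressLine>
  <AddressLine>BOSTON</AddressLine>
  <AddressLine>MA</AddressLine>
  <ZipCode>12345</ZipCode>
  <CountryCode>US</CountryCode>
</NameAndAddress>
"""

root = ET.fromstring(LISTING_1_1)

# The attribute belongs to the NameAndAddress element itself.
role = root.get("Role")

# Child elements are nested under NameAndAddress and are looked up by name.
company = root.findtext("CompanyName")
address_lines = [line.text for line in root.findall("AddressLine")]

print(root.tag, role, company, address_lines)
```

Unlike the fixed-width record, no external table of offsets is needed: the element and attribute names travel with the data.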

Verbosity

As you can see from the example, XML is quite verbose. The W3C (World Wide Web Consortium) recommends that XML elements and attribute names not be abbreviated. Therefore, NameAndAddress is used instead of NAD. The meaning of the data is more obvious than for the EDI file. This means, of course, that XML documents are larger than other types of structured documents, even though they may convey the same meaning. This book is about Web Services, and Web Services involve computer-to-computer communication. Therefore, it is tempting to imagine that there is a trade-off between the human readability of XML and the compactness of the data.

The argument goes: “This data is being sent from one computer to another computer. No humans are reading it, so why does it have to be human readable? Surely it would be better for the data to be more compact and quicker to send.” This argument sounds convincing, but it neglects the full context of XML communication. Compression is available at lower layers of the communications stack. XML documents generally compress very well because they contain many repeating strings of text in element and attribute names. Chapter 5 shows how a compression algorithm can be used in conjunction with a digital signature, so that a document can be compressed before or after being signed.
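The effect of repeated element names on compression can be seen with a small experiment. The sketch below uses Python's standard zlib module on a document built by repeating the Listing 1-1 record. The Suppliers wrapper element and the repeat count are assumptions made for the demonstration, and because every record is identical the result is more favorable than for real documents, so treat it as an illustration rather than a benchmark.

```python
import zlib

# One record, as in Listing 1-1, written on a single line.
record = (
    '<NameAndAddress Role="Supplier">'
    "<CompanyName>Joe Bloggs Inc</CompanyName>"
    "<AddressLine>101 SOME STREET</AddressLine>"
    "<AddressLine>BOSTON</AddressLine>"
    "<AddressLine>MA</AddressLine>"
    "<ZipCode>12345</ZipCode>"
    "<CountryCode>US</CountryCode>"
    "</NameAndAddress>"
)

# Hypothetical wrapper element holding 200 copies of the record.
document = ("<Suppliers>" + record * 200 + "</Suppliers>").encode("utf-8")

compressed = zlib.compress(document)
restored = zlib.decompress(compressed)

print("original bytes:  ", len(document))
print("compressed bytes:", len(compressed))
```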

Tip

Don’t give in to the temptation to cut down on the size of XML element and attribute names. Instead, ensure that compression is implemented at lower layers of your Web Services rollout.

Document Type Definitions and XML Schema

XML that follows the syntax rules defined by the W3C XML specification is called well-formed XML. These syntax rules include the following requirements:

Every start tag must have a matching end tag with the same element name.
Elements must be properly nested; there should be no “overlapping” tags.
Attribute values must be surrounded by quotes.
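An XML parser rejects any document that breaks these rules. As a minimal sketch, the helper below (is_well_formed is a name chosen here for illustration) relies on the parser in Python's standard xml.etree.ElementTree module and reports whether parsing succeeded.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


samples = {
    "matching tags": '<Name Role="Supplier">Joe</Name>',
    "mismatched end tag": "<Name>Joe</Company>",
    "overlapping tags": "<Name><First>Joe</Name></First>",
    "unquoted attribute value": "<Name Role=Supplier>Joe</Name>",
}

for label, text in samples.items():
    print(label, "->", is_well_formed(text))
```

Note that this checks well-formedness only. Whether the document uses the right elements in the right places is a separate question.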

The extensible nature of XML means that anybody can create an XML document, provided that it is well formed. However, if the XML document is not understood by anyone else, it has limited use because one of the goals of XML is to facilitate the sharing of information. That is why consortia such as OASIS (Organization for the Advancement of Structured Information Standards) and industry-specific groups such as RosettaNet (for the electronics procurement industry) provide central points for organizations to agree on XML definitions. Many of the security specifications discussed in this book include definitions of permitted XML syntax (for example, to define a SAML assertion or an XML Signature). Let’s take a look at what it means to define the structure of XML.

XML is based on SGML (Standard Generalized Markup Language), a metalanguage that predates the World Wide Web. SGML includes a means of defining which particular elements and attributes may be used in a document, and XML inherits this mechanism. These definition files are called DTDs, or Document Type Definitions. DTDs allow organizations to standardize their use of XML so that each organization can understand the other’s documents. Remember that the XML specification defines only syntax, so if two organizations decide they’re just going to communicate “using XML,” that doesn’t mean very much. The question is, what type of XML?

Let’s have a look at an example of a DTD. Listing 1-2 shows a DTD that defines the XML from Listing 1-1.

Listing 1-2

<?xml version="1.0"?>
<!DOCTYPE AddressingInfo [
  <!ELEMENT NameAndAddress (CompanyName, AddressLine+, CountryCode)>
  <!ELEMENT CountryCode (#PCDATA)>
  <!ELEMENT CompanyName (#PCDATA)>
  <!ELEMENT AddressLine (#PCDATA)>
  <!ELEMENT ZipCode (#PCDATA)>
  <!ATTLIST NameAndAddress Role CDATA #REQUIRED>
]>

This DTD opens with a line to state which XML version it supports. Then, the DOCTYPE directive states that this defines a piece of data called AddressingInfo. This means that a document that is governed by this DTD can refer to this fact by referring to the DTD, also using the DOCTYPE directive:

<!DOCTYPE AddressingInfo SYSTEM "http://www.example.com/AddressingInfo.dtd">

This line tells an XML processor to validate the document against the AddressingInfo DTD. If the document conforms to the DTD, it is said to be a valid document.

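The first ELEMENT declaration tells us that NameAndAddress contains three kinds of subelements (or “children”), in this order: CompanyName, one or more AddressLine elements (the + suffix means “one or more”), and CountryCode. The declarations that follow define each child element as PCDATA. PCDATA means “parsed character data,” meaning that the elements are to contain textual data that can be parsed by a text parser. Note that ZipCode is declared as an element but does not appear in the NameAndAddress content model, so a validating parser would accept the ZipCode element in Listing 1-1 only if ZipCode were added to that content model. After the elements comes the attribute. NameAndAddress has only a single attribute, Role, which contains character data (CDATA) and is mandatory (#REQUIRED).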
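One way to read a content model such as (CompanyName, AddressLine+, CountryCode) is as a regular expression over the sequence of child element names. The sketch below is a hand translation for illustration only and is not a DTD validator; the function names and the MODEL_WITH_ZIP variant are assumptions introduced here. It checks the content model exactly as written in Listing 1-2 against the children of the fragment in Listing 1-1.

```python
import re
import xml.etree.ElementTree as ET

# Hand translation of (CompanyName, AddressLine+, CountryCode).
# Each child name is followed by a space so that names cannot run together.
MODEL_AS_WRITTEN = re.compile(r"CompanyName (AddressLine )+CountryCode ")

# Hypothetical variant that also expects ZipCode before CountryCode.
MODEL_WITH_ZIP = re.compile(r"CompanyName (AddressLine )+ZipCode CountryCode ")


def child_sequence(element):
    """Return the child element names in document order, space-terminated."""
    return "".join(child.tag + " " for child in element)


def matches(model, element):
    """True if the whole child sequence matches the translated content model."""
    return model.fullmatch(child_sequence(element)) is not None


listing_1_1 = ET.fromstring(
    '<NameAndAddress Role="Supplier">'
    "<CompanyName>Joe Bloggs Inc</CompanyName>"
    "<AddressLine>101 SOME STREET</AddressLine>"
    "<AddressLine>BOSTON</AddressLine>"
    "<AddressLine>MA</AddressLine>"
    "<ZipCode>12345</ZipCode>"
    "<CountryCode>US</CountryCode>"
    "</NameAndAddress>"
)

print("as written:", matches(MODEL_AS_WRITTEN, listing_1_1))
print("with ZipCode:", matches(MODEL_WITH_ZIP, listing_1_1))
```

The order of names matters in both patterns, which mirrors the comma in a DTD content model: children must appear in the sequence given.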

As is apparent from Listing 1-2, DTDs define XML documents but do not actually use XML themselves. This is a disadvantage, because it means another language for the XML implementer to learn.

This is not the only disadvantage of DTDs. DTDs are limited in the extent to which fine-grained constraints can be placed on XML data. Simple constraints are possible (declaring an element to be mandatory, for example), but the rules for element contents are limited. For example, if an element called MusicalNote is intended to contain a musical note that may only be A through G, then DTDs provide no way to enforce that rule. A document would still be valid (that is, conforming to the DTD) even if it contained a nonexistent H note.

XML Schema, an alternative to DTDs, was created to address these deficiencies. More complex rules about XML documents can be created than were possible in DTDs, and these rules are themselves written in XML. However, XML Schema has proven to be a mixed success: many argue that it is overly complex and difficult to learn. Nevertheless, most of the XML and Web Services security specifications described in this book contain XML Schema definitions. The basics of XML Schema are simple, and in fact are no more difficult to understand than DTDs.

Keeping with our example, the XML Schema defining the XML in Listing 1-1 is shown in Listing 1-3.

Listing 1-3

<?xml version="1.0" encoding="UTF-8" ?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Global element declaration: lets a NameAndAddress root element be validated -->
  <xs:element name="NameAndAddress" type="NameAndAddress"/>
  <xs:complexType name="NameAndAddress">
    <xs:sequence>
      <!-- Elements in a sequence are required by default (minOccurs="1") -->
      <xs:element name="CompanyName" type="xs:string"/>
      <xs:element name="AddressLine" maxOccurs="unbounded" type="xs:string"/>
      <!-- ZipCode precedes CountryCode, matching the order in Listing 1-1 -->
      <xs:element name="ZipCode" type="zipCodeType"/>
      <xs:element name="CountryCode" type="countryCodeType"/>
    </xs:sequence>
    <xs:attribute name="Role" type="xs:string" use="required"/>
  </xs:complexType>
  <xs:simpleType name="zipCodeType">
    <xs:restriction base="xs:string">
      <xs:pattern value="[0-9]{5}"/>
    </xs:restriction>
  </xs:simpleType>
  <xs:simpleType name="countryCodeType">
    <xs:restriction base="xs:string">
      <xs:pattern value="[A-Z]{2}"/>
    </xs:restriction>
  </xs:simpleType>
</xs:schema>

When the DTD (see Listing 1-2) and the XML Schema are compared, it is apparent that the XML Schema is more complex and allows for more constraints on the XML to be defined. In addition, XML Schema allows data types to be defined. In Listing 1-3, a data type called zipCodeType is defined to mean a five-character string composed of the digits 0 through 9. Similarly, a data type called countryCodeType restricts the data contained in the CountryCode element to two uppercase letters.
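The pattern restrictions in Listing 1-3 are regular expressions that must match the entire element content. The following sketch emulates them with Python's re.fullmatch; it is not schema validation, only an illustration of what the patterns accept. The MusicalNote pattern is a hypothetical addition showing how the A-through-G rule mentioned earlier could be expressed in the same way.

```python
import re

# Patterns copied from Listing 1-3. A schema pattern must match the whole
# value, so re.fullmatch is used rather than re.search.
ZIP_CODE_PATTERN = re.compile(r"[0-9]{5}")
COUNTRY_CODE_PATTERN = re.compile(r"[A-Z]{2}")

# Hypothetical pattern for a MusicalNote element (not part of Listing 1-3).
MUSICAL_NOTE_PATTERN = re.compile(r"[A-G]")


def conforms(pattern, value):
    """True if the whole value matches the pattern."""
    return pattern.fullmatch(value) is not None


print(conforms(ZIP_CODE_PATTERN, "12345"))
print(conforms(ZIP_CODE_PATTERN, "1234"))
print(conforms(COUNTRY_CODE_PATTERN, "US"))
print(conforms(MUSICAL_NOTE_PATTERN, "H"))
```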

Security specifications require a great deal of accuracy. Differences between the implementations of security recommendations have dogged the information security industry over the past 10 years. Therefore, XML Schema is an important tool used to define Web Services security technologies, and we will see that XML Signature, XML Encryption, WS-Security, and others all make use of XML Schema.

XPath

XPath is another XML-related specification that is used in Web Services security. XPath defines a syntax to pinpoint just a portion of an XML document. This is useful if just part of an XML document is to be encrypted or digitally signed, or if an access control decision is being made based on the contents of an XML document. To sign just the contents of the CompanyName element in Listing 1-1, we would select that element using the following XPath expression:

/NameAndAddress/CompanyName

This pinpoints the CompanyName element, which is located underneath the NameAndAddress element. XPath gets much more complicated than this, of course, but this information is not immediately relevant for an understanding of how XPath is used in Web Services security.
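As a small illustration of selecting part of a document, the sketch below uses Python's standard xml.etree.ElementTree module, which supports only a limited subset of XPath and evaluates paths relative to an element. For that reason, the path is written relative to the NameAndAddress root element as ./CompanyName rather than in the absolute form shown above; a full XPath implementation would accept the absolute form.

```python
import xml.etree.ElementTree as ET

# The fragment from Listing 1-1, written on a single line.
root = ET.fromstring(
    '<NameAndAddress Role="Supplier">'
    "<CompanyName>Joe Bloggs Inc</CompanyName>"
    "<AddressLine>101 SOME STREET</AddressLine>"
    "<AddressLine>BOSTON</AddressLine>"
    "<AddressLine>MA</AddressLine>"
    "<ZipCode>12345</ZipCode>"
    "<CountryCode>US</CountryCode>"
    "</NameAndAddress>"
)

# Relative path from the NameAndAddress root element to its CompanyName child.
company = root.find("./CompanyName")

# The same syntax can select every matching child, here all address lines.
address_lines = root.findall("./AddressLine")

print(company.tag, company.text)
print(len(address_lines), "address lines selected")
```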