Web Service Security Issues

Before we can address Web services security, we must first understand where potential threats may occur and how they may occur. In this section, we look at the issues surrounding Web services security by following the flow of a Web service invocation, starting from the client application, through the actual business logic implementation of the service, and back to the client again. Throughout this flow, we analyze the possible threats and the best practices for addressing them.

Data Protection and Encryption

Data protection refers to the management of transmitted messages so that the contents of each message arrive at the destination intact, unaltered, and not viewed by anyone along the way. The concept of data protection is made up of the sub-concepts of data integrity and data privacy:

Data integrity: When data is transmitted from one entity to another entity, the receiving entity must have faith that the data is in fact what the sending entity transmitted. The concept of data integrity assures the recipient that the data was neither damaged nor intercepted by a third-party and altered while in transit.
Data privacy: Data integrity assures the recipient that the data has been received unaltered and intact. Data privacy builds on this and assures the recipient that the contents of the data have not been viewed by any third-party.

Data protection is critical within a Web services environment as personal information, such as credit card numbers, and competitive organizational information, such as customer contacts and employee names, will be exchanged between Web services.

Encryption techniques are used to implement data integrity and data privacy. The most commonly used of these solutions is the Secure Sockets Layer (SSL) protocol. The SSL protocol creates a secure tunnel between the origination and destination computers based on public-key encryption techniques. The protocol also supports authentication of the destination computer (the server) to the origination computer (the client), and optionally supports authentication of the origination computer.

However, SSL provides only point-to-point data protection. In many instances the Web service provider may itself forward the request to be ultimately handled by another computer or even a person. A legacy mainframe may actually fulfill requests that are forwarded to it from another computer that simply "wraps" the capabilities of the application running on the mainframe as a Web service. As shown in Figure 8-3, after the termination of the SSL connection (at the Web services server) all the data is left insecure.

Figure 8-3. SSL provides point-to-point data protection, leaving forwarded data insecure.

The inability to provide end-to-end security among multiple parties is a major drawback of the SSL protocol within Web services environments, where it is routine for multiple entities to be involved. Consider purchasing a book from an online merchant using a credit card. The purchase order for a book is sent from the user to the merchant, who then forwards the credit card information to a bank. The information about the book, e.g., book title and quantity ordered, needs to be viewable only by the online merchant, not by both the merchant and the bank. Similarly, the credit card information must be viewable only by the bank. The ability to selectively encrypt various parts of a message using different keys is a critical requirement for Web services.

Moreover, SSL involves a large amount of overhead in encrypting and decrypting an entire message. Oftentimes, a message is comprised of a mixture of both secure and insecure information. Returning to our book purchase order, information about the book can be sent as clear text, while the credit card number must be encrypted. For these types of messages, encrypting the entire message using SSL adds needless overhead.

The SSL protocol provides point-to-point data protection between two parties but has the following weaknesses:

It does not provide end-to-end data protection between multiple parties.
It does not support selectively encrypting segments of a message.

XML Encryption complements SSL and provides end-to-end security that addresses these two weaknesses. To understand how XML Encryption works, consider the following XML code fragment representing our purchase order for a book:

<?xml version='1.0'?>
<PurchaseOrder>
  <Cart>
    <Item>
      <Title>Developing Enterprise Web Services</Title>
      <Quantity>21</Quantity>
    </Item>
  </Cart>
  <Payment>
    <PaymentType>VISA</PaymentType>
    <Number>123456789000</Number>
    <Expiration>01-23-2024</Expiration>
  </Payment>
</PurchaseOrder>

The purchase order is comprised of XML elements that specify the contents of the shopping cart as well as the payment details. The <Cart> element contains items to be purchased. Each <Item> element contains the title of the book as well as the desired quantity. The <Payment> element contains subelements that specify the type of credit card, the credit card number, and the credit card expiration date.

Using XML Encryption, we can selectively encrypt an entire element or the contents of an element. In the above book purchase order, the only element that must be encrypted is the credit card number denoted by the <Number> element. The resulting document after using XML Encryption to encrypt the credit card number is as follows:

<?xml version='1.0'?>
<PurchaseOrder>
  <Cart>
    <Item>
      <Title>Developing Enterprise Web Services</Title>
      <Quantity>21</Quantity>
    </Item>
  </Cart>
  <Payment>
    <PaymentType>VISA</PaymentType>
    <EncryptedData xmlns='http://www.w3.org/2001/04/xmlenc#'
        Type='http://www.w3.org/2001/04/xmlenc#Element'>
      <CipherData>
        <CipherValue>A23B45C56</CipherValue>
      </CipherData>
    </EncryptedData>
    <Expiration>01-23-2024</Expiration>
  </Payment>
</PurchaseOrder>

The encrypted data is specified within the <EncryptedData> element. The Type attribute specifies that an element has been encrypted, and the xmlns attribute specifies the namespace used. The actual encrypted data appears as the contents of <CipherValue>.

In some cases, it is advantageous to encrypt only the contents of the element and not the element itself. Using XML Encryption to do this results in the following encrypted document:

<?xml version='1.0'?>
<PurchaseOrder>
  <Cart>
    <Item>
      <Title>Developing Enterprise Web Services</Title>
      <Quantity>21</Quantity>
    </Item>
  </Cart>
  <Payment>
    <PaymentType>VISA</PaymentType>
    <Number>
      <EncryptedData xmlns='http://www.w3.org/2001/04/xmlenc#'
          Type='http://www.w3.org/2001/04/xmlenc#Content'>
        <CipherData>
          <CipherValue>A23B45C56</CipherValue>
        </CipherData>
      </EncryptedData>
    </Number>
    <Expiration>01-23-2024</Expiration>
  </Payment>
</PurchaseOrder>

Again, the encrypted data is specified within the <EncryptedData> element. This time, the Type attribute specifies that the contents of an element have been encrypted and the element tags, <Number> and </Number>, appear as clear text. The actual encrypted data still appears as the contents of <CipherValue>.
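
To make the difference between encrypting an element and encrypting only its contents concrete, the following Python sketch performs both transformations on a trimmed-down purchase order using only the standard library. It is a minimal illustration under stated assumptions, not a conforming XML Encryption implementation: the toy_cipher function is a hypothetical stand-in that is not secure and is not one of the algorithms defined by the specification, the key is made up, and the <EncryptionMethod> and <KeyInfo> children that a real toolkit would normally emit are omitted. The xenc prefix is also our own choice; the listings above use a default namespace declaration instead.

```python
import base64
import hashlib
import xml.etree.ElementTree as ET

XENC_NS = "http://www.w3.org/2001/04/xmlenc#"
ET.register_namespace("xenc", XENC_NS)  # prefix chosen for this sketch only


def toy_cipher(data: bytes, key: bytes) -> bytes:
    """Stand-in XOR keystream cipher. NOT secure; applying it twice decrypts."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(b ^ s for b, s in zip(data, stream))


def build_encrypted_data(plaintext: bytes, key: bytes, kind: str) -> ET.Element:
    """Wrap the ciphertext as EncryptedData/CipherData/CipherValue."""
    enc = ET.Element(f"{{{XENC_NS}}}EncryptedData", {"Type": XENC_NS + kind})
    cipher_data = ET.SubElement(enc, f"{{{XENC_NS}}}CipherData")
    cipher_value = ET.SubElement(cipher_data, f"{{{XENC_NS}}}CipherValue")
    cipher_value.text = base64.b64encode(toy_cipher(plaintext, key)).decode("ascii")
    return enc


def encrypt_element(parent: ET.Element, child: ET.Element, key: bytes) -> None:
    """Type=...#Element: replace the whole child, tags included."""
    index = list(parent).index(child)
    tail, child.tail = child.tail, None  # keep surrounding whitespace outside
    enc = build_encrypted_data(ET.tostring(child), key, "Element")
    enc.tail = tail
    parent.remove(child)
    parent.insert(index, enc)


def encrypt_content(element: ET.Element, key: bytes) -> None:
    """Type=...#Content: keep the element's own tags, encrypt what is inside."""
    content = (element.text or "") + "".join(
        ET.tostring(c, encoding="unicode") for c in element
    )
    for c in list(element):
        element.remove(c)
    element.text = None
    element.append(build_encrypted_data(content.encode("utf-8"), key, "Content"))


ORDER_XML = (
    "<PurchaseOrder><Payment><PaymentType>VISA</PaymentType>"
    "<Number>123456789000</Number>"
    "<Expiration>01-23-2024</Expiration></Payment></PurchaseOrder>"
)

# Element mode: <Number> disappears and <EncryptedData> takes its place.
order = ET.fromstring(ORDER_XML)
payment = order.find("Payment")
encrypt_element(payment, payment.find("Number"), key=b"demo key")
print(ET.tostring(order, encoding="unicode"))

# Content mode: <Number> stays visible, only its contents are replaced.
order2 = ET.fromstring(ORDER_XML)
number = order2.find("Payment/Number")
encrypt_content(number, key=b"demo key")
print(ET.tostring(order2, encoding="unicode"))
```

In both modes the rest of the purchase order stays readable, which is the property the merchant and bank scenario relies on. A production system would use a toolkit that implements the specification's algorithms and key handling rather than anything resembling toy_cipher.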

Finally, we can also use XML Encryption to encrypt the entire message as follows:

<?xml version='1.0'?>
<EncryptedData xmlns='http://www.w3.org/2001/04/xmlenc#'
    Type='http://www.isi.edu/in-notes/iana/assignments/media-types/text/xml'>
  <CipherData>
    <CipherValue>A23B45C56</CipherValue>
  </CipherData>
</EncryptedData>

This time, the entire document, including all the tags and their values, is encrypted and appears as the value of the <CipherValue> element. Because the data prior to encryption was XML, the value of the Type attribute (of <EncryptedData>) is now set to http://www.isi.edu/in-notes/iana/assignments/media-types/text/xml, the official type definition by the Internet Assigned Numbers Authority (IANA) for XML.

Interestingly enough, XML Encryption can also be used to encrypt non-XML documents. For instance, encrypting a JPEG image file using XML Encryption results in this document:

<?xml version='1.0'?>
<EncryptedData xmlns='http://www.w3.org/2001/04/xmlenc#'
    Type='http://www.isi.edu/in-notes/iana/assignments/media-types/jpeg'>
  <CipherData>
    <CipherValue>A23B45C56</CipherValue>
  </CipherData>
</EncryptedData>

There is little difference between the entire XML document that was encrypted and the encrypted JPEG image. The only structural difference is the value of the Type attribute of <EncryptedData>. For the XML document, the value of Type was set to http://www.isi.edu/in-notes/iana/assignments/media-types/text/xml, while in this case the value of Type is set to http://www.isi.edu/in-notes/iana/assignments/media-types/jpeg, the official IANA type definition for JPEG images. Of course, the actual encrypted data will also be different.

Toolkits are available that facilitate the process of encrypting documents using XML Encryption. Implementations of XML Encryption are included within IBM's XML Security Suite and Baltimore Technologies' KeyToolsXML.

More information about XML Encryption can be found in the W3C's technical report XML Encryption Syntax and Processing at http://www.w3.org/TR/xmlenc-core/.

Authentication and Authorization

Authentication refers to verifying that the identity of an entity is in fact that which it claims to be. The entity trying to have its identity authenticated is known as the principal. The evidence used to prove the principal's identity is known as the credentials. If the correct credentials are used, the principal is assumed to be who it claims to be.

Credentials can be misappropriated. Passwords, for example, are easy to steal, while retinal scan data and thumbprints are more difficult.

In a Web services environment, a Web service provider may need to be authenticated by the Web service requester before the service is invoked and personal information is sent. The requester may also need to be authenticated by the provider before the service is rendered and critical information is sent back in the reply.

In many simple service invocations that do not involve the exchange of personal information or where there is no charge for the service invocation, authentication is unnecessary. For example, a client application that queries a free weather report Web service neither needs to authenticate the provider nor does the provider need to authenticate the requester.

After a principal's identity has been authenticated, authorization mechanisms are used to determine what the user (or application) will be allowed to access. Information about the user, such as subscription levels, is used to allow the appropriate level of access. For example, a Web service may have twenty operations, of which only five are available for access by some users while all twenty are available for other users. Or, particular endpoints of a Web service may be made available for premier customers, while standard customers must share just a single endpoint.
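
As a minimal sketch of such an authorization check, the following Python fragment maps subscription levels to the operations each level may invoke. The level names and all operation names other than GetRealTimeQuote are hypothetical and chosen only for illustration; a real service would typically load this mapping from a policy store rather than hard-code it.

```python
# Hypothetical subscription levels and operation names, for illustration only.
ALLOWED_OPERATIONS = {
    "standard": {"GetDelayedQuote", "GetCompanyProfile"},
    "premier": {"GetDelayedQuote", "GetCompanyProfile", "GetRealTimeQuote"},
}


def is_authorized(subscription_level: str, operation: str) -> bool:
    """Return True only if the already-authenticated principal's level allows the operation."""
    # Unknown levels get an empty set, so the check denies by default.
    return operation in ALLOWED_OPERATIONS.get(subscription_level, set())


print(is_authorized("premier", "GetRealTimeQuote"))
print(is_authorized("standard", "GetRealTimeQuote"))
```

Note that the check runs only after authentication has established who the principal is; the function itself makes no attempt to verify identity.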

Authorization is increasingly important within Web services environments. Web services expose data as well as processes and operations to programmatic access. For the most part, access to this type of information was previously channeled through humans. These human beings acted as checkpoints that safeguarded the information from unauthorized access. With Web services providing programmatic access, authorization schemes must act as the checkpoints.

A variety of technologies and approaches can be used to implement authentication and authorization for Web services. These approaches can generally be classified as system-level approaches, application-level approaches, or third-party approaches.

System-level approaches do not require custom application (or Web service) programming to implement. Nor do they require any changes to the Web service if the authentication approach is changed. Usually, the operating system or the Web server handles authentication and authorization prior to forwarding the SOAP request to the Web service.

Common system-level approaches to authentication include basic passwords, encrypted passwords, and digital certificates. Digital certificates require that each user obtain a certificate verifying his or her identity. Since the use of certificates today is limited, this approach does not present a viable mass-market authentication scheme. In Microsoft Windows-based systems, both password and certificate credentials are checked against valid user accounts, which necessitates creating accounts before users can access a Web service.

Application-level approaches to authentication require custom development, and usually have to be modified with changes to the authentication mechanism. Sometimes, system-level approaches are insufficient or require too much overhead. For example, the overhead of creating and maintaining individual Windows user accounts may outweigh the benefits of using system-level passwords.

Application-level authentication approaches can pass credentials as part of the SOAP message. In this case, the Web service must parse the credentials as well as implement authentication and authorization mechanisms itself. The credentials can be transmitted as part of the SOAP header or the SOAP body. In the case where credentials are passed as part of the SOAP header, a service other than the called Web service may parse and authorize the invocation. Such a modularized solution allows the development of system-level schemes, and ensures that the Web service consumes computer cycles processing only valid and authorized requests.

SOAP on top of HTTP exposes the credentials as clear text, which makes this information easy to misappropriate. SSL can be used to encrypt all SOAP messages sent to the operations of the Web service. Unfortunately, SSL imposes significant performance overhead compared with HTTP alone.

For operations where security can be loosened a bit, alternatives that are less of a performance drain are available. For instance, an authentication operation may be added to the Web service itself. SSL can be used to send SOAP messages to this operation so the credentials are not in the clear. Once a user has been authenticated, the Web service can return a token or a session key that can be used for subsequent SOAP messages. Although the session key can be stolen, the credentials themselves (username and password) are never exposed, so it is less critical to encrypt the session key. Accordingly, HTTP alone can be used for the remaining operations. Another method is to use HTTP cookies for the session information instead of the SOAP header or body.
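
The authenticate-once, then-use-a-token pattern described above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than a hardened design: the in-memory USERS and SESSIONS dictionaries and the function names are hypothetical, passwords are held in clear text only to keep the example short, and token expiry is omitted.

```python
import hmac
import secrets

# Hypothetical in-memory stores; a real service would keep salted password
# hashes and expire sessions after some time.
USERS = {"MyUserName": "MyPassword"}
SESSIONS = {}


def authenticate(username: str, password: str):
    """Check credentials (sent over SSL) and return a fresh session token, or None."""
    expected = USERS.get(username)
    if expected is None:
        return None
    # compare_digest avoids leaking how many leading characters matched.
    if not hmac.compare_digest(expected.encode("utf-8"), password.encode("utf-8")):
        return None
    token = secrets.token_hex(16)  # 16 random bytes, hex-encoded
    SESSIONS[token] = username
    return token


def principal_for(token: str):
    """Resolve a token from a later request back to the authenticated principal."""
    return SESSIONS.get(token)


token = authenticate("MyUserName", "MyPassword")
print(principal_for(token))
```

Subsequent requests carry only the token, so a stolen token grants access to the session but does not reveal the username and password.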

Figure 8-4 depicts a SOAP envelope that uses the optional SOAP header specification to pass username and password credentials. Before the SOAP body, the SOAP header is defined that includes UserName and Password elements. SOAP messages that either lack a header or present incorrect credentials will not be allowed to invoke the GetRealTimeQuote method.

Figure 8-4. Passing username-password credentials as part of the SOAP header.

<soap:Envelope
    xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Header>
    <AuthHeader xmlns="http://tempuri.org/">
      <UserName>MyUserName</UserName>
      <Password>MyPassword</Password>
    </AuthHeader>
  </soap:Header>
  <soap:Body>
    <GetRealTimeQuote xmlns="http://tempuri.org/">
      <symbol>HPQ</symbol>
    </GetRealTimeQuote>
  </soap:Body>
</soap:Envelope>

Integrated development environments (IDEs), Web services platforms and tools facilitate generating and parsing SOAP headers and bodies so that developers do not have to write the parsing code themselves.
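
Where no such tooling is at hand, or simply to see what the generated code has to do, the credentials in Figure 8-4 can be extracted by hand. The following Python sketch uses the standard library's ElementTree parser; the function name extract_credentials and the app prefix are our own, and the sketch assumes the header layout shown in the figure.

```python
import xml.etree.ElementTree as ET

# Prefixes are local to this sketch; only the namespace URIs must match the message.
NS = {
    "soap": "http://schemas.xmlsoap.org/soap/envelope/",
    "app": "http://tempuri.org/",
}

ENVELOPE = """<soap:Envelope
    xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Header>
    <AuthHeader xmlns="http://tempuri.org/">
      <UserName>MyUserName</UserName>
      <Password>MyPassword</Password>
    </AuthHeader>
  </soap:Header>
  <soap:Body>
    <GetRealTimeQuote xmlns="http://tempuri.org/">
      <symbol>HPQ</symbol>
    </GetRealTimeQuote>
  </soap:Body>
</soap:Envelope>"""


def extract_credentials(envelope_xml: str):
    """Return (username, password) from the SOAP header, or None if absent."""
    root = ET.fromstring(envelope_xml)
    auth = root.find("soap:Header/app:AuthHeader", NS)
    if auth is None:
        return None  # no header: the request should be rejected
    user = auth.findtext("app:UserName", namespaces=NS)
    password = auth.findtext("app:Password", namespaces=NS)
    if user is None or password is None:
        return None
    return user, password


print(extract_credentials(ENVELOPE))
```

A service would call such a function before dispatching to GetRealTimeQuote and refuse any request for which it returns None or for which the credentials do not check out.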

Third-party authentication services may also be available. Single sign-on capabilities are of particular interest to Web services environments, which are comprised of a large number of heterogeneous services, each of which may use different authentication mechanisms. Requiring the service requester to maintain and manage a large number of credentials for the variety of authentication schemes used by the different Web services within the environment is difficult and impractical.

With a single sign-on scheme, service requesters need only maintain a single credential. The third-party single sign-on service manages and maps the single credential held by service requesters to each of the service providers. The complexities of maintaining, managing, and revoking authentication credentials and authorization access lists are handled by the third-party service provider. Two examples of single sign-on services are Microsoft Passport and the Liberty Alliance.

Non-Repudiation and Signatures

Protecting the confidentiality of messages is important within any secure environment. But data privacy is just one piece of the security puzzle. Alongside privacy are the following related and equally important issues:

Data authenticity: This verifies the identity of the sender of a message. The concept of data authenticity answers the question: Who sent this message?
Data integrity: This verifies that the message data that was received was in fact the same data that was sent by the sender, and the information was not modified in any way in transit. The concept of data integrity answers the question: Is this data really what the sender sent?
Non-repudiation: It provides a means to prove that a sender sent a particular message, and does not allow the sender to later disavow having sent it. The concept of non-repudiation answers the question: Can the sender deny having sent this message?

The means to support the important issues of authenticity, integrity, and non-repudiation are not provided by the standard security mechanisms, such as SSL and passwords, that we have already discussed.

These issues are addressed by the concept of digital signatures. Digital signatures are similar to standard handwritten signatures, and allow the receiver of a document to verify that the source from which it came has created (or viewed) and validated the contents of the document. They also support the ethic of accountability in that the identity of the person who validated the document can be proved and the person can be held accountable for that validation.

Consider the creation of purchase orders within organizations. Purchase orders (POs) are documents that allow an organization to purchase components or services from a vendor. Usually, companies buy components in large volumes and errors in the components purchased, delivery dates, or payment terms may result in potentially large losses either from lost revenue or from increased costs.

The steps necessary to create a PO are usually complex and involve critical decisions made by a number of different people. The process may proceed as follows:

An engineer researches the different components that can be used within the system being developed and makes a recommendation for the desired part.
A project manager determines when the components must be delivered to allow the proper manufacturing and assembly of the overall system.
A vice president (or someone else with the appropriate authority) authorizes the purchase.
An accountant specifies the bank account number (or other payment means) by which to pay for the purchase.
A purchasing officer takes the information from the engineer and project manager and identifies the vendors and distributors that sell that component. He also negotiates the best price and payment terms.

In each step, different people's areas of expertise are brought to bear. Each of these people must be accountable for their actions only and not for those of others. If the engineer makes a mistake in identifying the appropriate component, only she should be held accountable; the purchasing officer who made the actual purchase should not be liable. Similarly, the purchasing officer must be confident that the component part number specified on the PO is as the engineer specified and has not been modified (either intentionally or accidentally) by someone else.

The XML Signatures specification specifies a technology that meets these needs, and is well suited for use within Web services environments. The key characteristics of the XML Signature technology are:

XML support: The XML Signatures technology provides a standard means by which the actual signature is represented in XML. Since Web services are based on XML, having a standard mechanism for representing digital signatures within XML environments is important.
Selective signatures: The XML Signatures technology provides a means to selectively sign different parts of a document with different signatures. As we saw in the example of writing a purchase order, different parts of a document (the purchase order) may need to be signed by different people (or entities). Usually within Web service environments, multiple Web services may coordinate and collaborate to accomplish a unit of work (e.g., create a purchase order). Clearly, the need for selective signatures is critical within Web service environments.
Simple archiving: Signed documents are important not only during transmission between parties, but also as an archival medium. As demonstrated by the purchase order example, signed documents can be used as a means to prove and enforce accountability and liability. In order to do so, signed documents must be easily archived so that both the contents of a document as well as its signature(s) can be easily retrieved at a later time if needed. Since the XML Signature technology supports signatures that are inlined with the actual XML document (as opposed to separate signature files), it presents a simple means by which signed documents can be archived and later retrieved.
Supports references: A document to be signed may not directly contain all of its contents, but instead contain references to remote content. For example, in the purchase order example, a photograph of the component to be purchased may be referenced but not directly contained within the document. Even though the image may not be contained within the document, the signature for that part (and other related parts) of the document must include the referenced image. The XML Signature technology supports the use and signing of referenced document content.

Now that we have seen the benefits of digital signatures as well as the unique features of the XML Signature technology, we next turn to the process of signing documents.

The basic technology behind signatures is simple and is based on Public Key Infrastructure (PKI) technologies. The basic process is as follows:

The document that is to be signed is transformed using the private key of the sender.
When the document is received, the receiver transforms the received document using the public key of the sender. Since only a transformation using the public key of the sender can undo the initial transformation using the private key of the sender, the receiver can be certain that the owner of the private key has signed the document.

For this digital signature to have any validity, the receiver must have confidence in the authenticity of the public key and that it actually belongs to the entity the receiver thinks it belongs to. Otherwise, an imposter can claim to be a different entity, transform the document using his private key, and provide his public key to the receiver. To safeguard against this situation, a certificate issued by a trusted Certificate Authority is used to match a public key with the actual entity.
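
The private-key and public-key transformations can be illustrated with textbook RSA in a few lines of Python. This is a toy sketch only: the key below is built from the tiny demonstration primes 61 and 53 and offers no security, no padding scheme is applied, and the document is first reduced to a small number with SHA-256 (our choice) so that it fits under the toy modulus. Real signatures rely on a vetted cryptographic library and on keys bound to an identity by a certificate, as discussed above.

```python
import hashlib

# Textbook RSA key from tiny demonstration primes (p = 61, q = 53).
N = 3233   # modulus, p * q
E = 17     # public exponent
D = 2753   # private exponent; (E * D) % ((61 - 1) * (53 - 1)) == 1


def digest_as_int(document: bytes) -> int:
    """Reduce the document to a number smaller than the toy modulus."""
    return int.from_bytes(hashlib.sha256(document).digest(), "big") % N


def sign(document: bytes) -> int:
    """Sender: transform the reduced document with the private key."""
    return pow(digest_as_int(document), D, N)


def verify(document: bytes, signature: int) -> bool:
    """Receiver: undo the transformation with the public key and compare."""
    return pow(signature, E, N) == digest_as_int(document)


purchase_order = b"<PurchaseOrder>...</PurchaseOrder>"
signature = sign(purchase_order)
print(verify(purchase_order, signature))            # genuine signature
print(verify(purchase_order, (signature + 1) % N))  # forged signature
```

Only the holder of D can produce a value that the public pair (N, E) maps back to the document's reduced form, which is what lets the receiver attribute the document to the key's owner.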

Now that we have discussed the fundamental concepts underlying digital signatures, the steps in generating a signature using the XML Signature technology are as follows:

Identify the resources that are to be signed by using a Uniform Resource Identifier (URI). The resource identified by the URI can be located remotely and available over a network, or can be located within the signature document itself. Each of these resources is located within <Reference> elements.

Figure 8-5 depicts how each of the resources that are to be signed is enumerated. The two resources shown refer to an HTML document (http://www.example.org/index.html) and to a data object (ComponentSuggestionForProjectX) that is located within the document itself. Other resources such as JPEG images (http://www.example.org/logo.jpg), or XML documents (http://www.example.org/data.xml) can also be referenced.

Figure 8-5. Identifying the resources to be signed.
```
<Signature xmlns="http://www.w3.org/2000/09/xmldsig#">
  <Reference URI="http://www.example.org/index.html">
  </Reference>
  <Reference URI="ComponentSuggestionForProjectX">
  </Reference>
  <Object>
    .
    .
    .
  </Object>
</Signature>
```
The next step is to calculate the digest of each of the enumerated resources. A digest is a unique thumbprint of the actual resource that is calculated using a digest algorithm such as the Secure Hash Algorithm (SHA-1) or the MD5 algorithm. A digest is similar to a hash and results in a smaller representation of the original data. The transformations using private and public keys are performed on the smaller digest value and not the original data, largely because of the performance implications of transforming large amounts of data. On receiving the signed document, the receiver calculates the digest and uses the sender's public key to transform the digest and verify the signature.
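
As a small illustration of this step, the Python sketch below computes a SHA-1 digest with the standard library and base64-encodes it, which is the form in which digest values appear in the listings that follow. The function name digest_value and the sample resource bytes are our own; in practice the bytes would be the (possibly transformed) content fetched from the URI.

```python
import base64
import hashlib


def digest_value(resource: bytes) -> str:
    """Return the base64-encoded SHA-1 digest of a resource's bytes."""
    return base64.b64encode(hashlib.sha1(resource).digest()).decode("ascii")


# Hypothetical resource content standing in for the referenced HTML page.
page = b"<html><body>Component suggestion for Project X</body></html>"
print(digest_value(page))
print(digest_value(page + b" "))  # any change to the bytes changes the digest
```

A SHA-1 digest is always 20 bytes, so its base64 form is always 28 characters long regardless of the size of the resource.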

Figure 8-6 shows how the digests are associated with each resource. Each <Reference> element specifies the URI for the actual resource and contains a <DigestMethod> element that specifies the algorithm used to calculate the digest, as well as a <DigestValue> element that contains the actual calculated digest value.

Figure 8-6. Specifying the digest values for each of the identified resources that are to be signed.
```
<Signature xmlns="http://www.w3.org/2000/09/xmldsig#">
  <Reference URI="http://www.example.org/index.html">
    <DigestMethod
       Algorithm="http://www.w3.org/2000/09/xmldsig#sha1" />
    <DigestValue>HlSGWGJiTAg4loR1BEI9238H3f3=</DigestValue>
  </Reference>
  <Reference URI="ComponentSuggestionForProjectX">
    <DigestMethod
       Algorithm="http://www.w3.org/2000/09/xmldsig#sha1" />
    <DigestValue>kLsgTYdTIAG4UoB1rt972H48FHR=</DigestValue>
  </Reference>
  <Object>
    .
    .
    .
  </Object>
</Signature>
```
The next step is to collect all of the <Reference> elements that will be signed together into a <SignedInfo> element, then calculate the digest and private key transformations (signature).

Before the digests and signatures can be calculated, the <SignedInfo> element must be canonicalized. The canonical form of an XML document takes into account the fact that logically identical XML documents can have different physical representations. These differences can stem from issues such as the use of whitespace, the use of quotation marks around element attribute values, the inclusion of default attributes, and the lexicographic ordering of attributes and namespace declarations. As a simple example, consider the following two XML fragments for a hotel room booking:
```
<room bedtype="king" smoking="no">
<room smoking="no" bedtype="king">
```
Although both fragments are logically identical and convey the same information, their representations are physically different. That is, the two fragments do not have the same sequence of bytes. Left uncorrected, this would result in different digest values being computed by the sender and the receiver of a signed document; converting both to canonical form first removes the discrepancy.
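
The effect is easy to reproduce. The Python sketch below digests two physically different spellings of the room element, first as raw text and then after canonicalization. One assumption to note: the standard library's xml.etree.ElementTree.canonicalize function (Python 3.8 and later) implements Canonical XML 2.0, whereas the signature listings in this section name the earlier Canonical XML 1.0 algorithm; for a simple fragment like this one the visible normalization (attribute order, quoting, whitespace inside tags) is the same idea.

```python
import hashlib
import xml.etree.ElementTree as ET

# Two logically identical fragments with different byte sequences.
fragment_a = '<room bedtype="king" smoking="no"/>'
fragment_b = "<room  smoking='no'   bedtype='king'></room>"


def sha1_hex(text: str) -> str:
    """Hex SHA-1 digest of the UTF-8 encoding of the text."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


# Digests of the raw text disagree because the bytes differ.
raw_match = sha1_hex(fragment_a) == sha1_hex(fragment_b)

# After canonicalization both fragments serialize identically.
canonical_a = ET.canonicalize(fragment_a)
canonical_b = ET.canonicalize(fragment_b)
canonical_match = sha1_hex(canonical_a) == sha1_hex(canonical_b)

print(raw_match, canonical_match)
print(canonical_a)
```

This is why the canonicalization algorithm is recorded inside the signature itself: the receiver must apply exactly the same normalization before recomputing the digest.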

Figure 8-7 illustrates how all of the <Reference> and <SignedInfo> elements are combined together with the calculated signature inside of a <Signature> element.

Figure 8-7. Combining all of the <Reference> elements into a <SignedInfo> element, and calculating the signature.
```
<Signature xmlns="http://www.w3.org/2000/09/xmldsig#">
  <SignedInfo>
    <CanonicalizationMethod
       Algorithm="http://www.w3.org/TR/2001/REC-xml-c14n-20010315" />
    <SignatureMethod
       Algorithm="http://www.w3.org/2000/09/xmldsig#dsa-sha1" />
    <Reference URI="http://www.example.org/index.html">
      <DigestMethod
         Algorithm="http://www.w3.org/2000/09/xmldsig#sha1" />
      <DigestValue>HlSGWGJiTAg4...</DigestValue>
    </Reference>
    <Reference URI="ComponentSuggestionForProjectX">
      <DigestMethod
         Algorithm="http://www.w3.org/2000/09/xmldsig#sha1" />
      <DigestValue>kLsgTYdTIAG4UoB1r...</DigestValue>
    </Reference>
  </SignedInfo>
  <SignatureValue>
    HlSGWGJiTAg4lB1rt972H48FHR=
  </SignatureValue>
  <Object>
    .
    .
    .
  </Object>
</Signature>
```

The digital certificate for the sender may also be provided within the signature. The X.509 digital certificate for the sender including the sender's public key would provide all the information necessary to confidently verify the signature. The certificate information would be placed within a <KeyInfo> element as follows:

<KeyInfo>
  <X509Data>
    .
    .
    .
  </X509Data>
</KeyInfo>

We have now created a digital signature and have successfully signed the resources to be transmitted from the sender to the receiver.

On receiving the signed document, the receiver must simply follow these steps to verify the signature as well as the integrity of the received data:

Calculate the digest of the <SignedInfo> element using the digest algorithm specified in the <SignatureMethod> element.
Use the public key of the sender (from the <KeyInfo> element or from external sources) to verify the signature of the digest.
Calculate the digests of each of the resources (within each <Reference> element) using the algorithm specified in the <DigestMethod> element. Compare the calculated values with those specified within the <DigestValue> of each <Reference> element to verify the integrity of the data.
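
The last of these steps, checking each reference's digest, can be sketched with the Python standard library alone. The sketch below is deliberately partial: it does not verify the signature over <SignedInfo>, which requires a real cryptographic library, it applies no transforms, and it accepts only the SHA-1 digest method. The function name check_references and the idea of handing the resources in as a dictionary keyed by URI are our own assumptions; a real verifier would dereference the URIs itself.

```python
import base64
import hashlib
import xml.etree.ElementTree as ET

DSIG = {"ds": "http://www.w3.org/2000/09/xmldsig#"}
SHA1_URI = "http://www.w3.org/2000/09/xmldsig#sha1"


def digest_value(resource: bytes) -> str:
    """Base64-encoded SHA-1 digest, the form used inside <DigestValue>."""
    return base64.b64encode(hashlib.sha1(resource).digest()).decode("ascii")


def check_references(signature_xml: str, resources: dict) -> dict:
    """Map each Reference URI to True if its recomputed digest matches."""
    root = ET.fromstring(signature_xml.strip())
    results = {}
    for ref in root.iterfind(".//ds:Reference", DSIG):
        uri = ref.get("URI")
        method = ref.find("ds:DigestMethod", DSIG).get("Algorithm")
        if method != SHA1_URI:
            raise ValueError("unsupported digest method: " + method)
        expected = ref.findtext("ds:DigestValue", namespaces=DSIG).strip()
        results[uri] = digest_value(resources[uri]) == expected
    return results


# Hypothetical resource content and a signature skeleton built around it.
uri = "http://www.example.org/index.html"
page = b"<html><body>Hello</body></html>"
signature_xml = f"""
<Signature xmlns="http://www.w3.org/2000/09/xmldsig#">
  <SignedInfo>
    <Reference URI="{uri}">
      <DigestMethod Algorithm="{SHA1_URI}" />
      <DigestValue>{digest_value(page)}</DigestValue>
    </Reference>
  </SignedInfo>
</Signature>
"""

print(check_references(signature_xml, {uri: page}))
print(check_references(signature_xml, {uri: page + b"tampered"}))
```

Because the digest values sit inside the signed <SignedInfo> element, a receiver that has verified the signature and then sees every reference check succeed can conclude that none of the referenced resources changed after signing.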

These simple steps are used to create and later verify digital signatures using the XML Signatures specification. Digitally signed documents provide a means to verify the authenticity and integrity of a document as well as a means to implement non-repudiation.
