Security Basics | Securing Web Services with WS-Security: Demystifying WS-Security, WS-Policy, SAML, XML Signature, and XML Encryption


The Web is an interconnected global information system that provides resources suitable for consumption directly by humans. In this model, security is critical for many of these resources (login-password authentication at restricted sites, SSL encryption of credit cards and other personally identifiable confidential information). It only makes sense, then, that application-to-application Web services need at least this much security as well.

In fact, because Web services expose critical and valuable XML-encoded business information, it is essential to understand Web services security fully. For one thing, trade secret pilfering is already a large problem, and without security, Web services might even make this situation worse. The reason is that Web services can be thought of as letting in strange, new users who might take your company's valuable business secrets out.

This section covers basic security concepts to establish the vocabulary that will be used throughout this book. Keeping communications secret is the heart of security. The science of keeping messages secret is called cryptography. Cryptography is also used to guarantee trust in a known identity across a network by "binding" that identity to a message that you can see, interpret, and trust. An identity asserting itself must be authenticated by a trust authority to a previously established identity known to the authority for the binding to be valid. After you know the identity, authorization allows you to specify what the individual with that identity is allowed to do. When you receive a secret message, you need to know that nothing in the message has been changed in any way since it was published, an attribute called integrity. When cryptography successfully keeps a message secret, it has satisfied the requirement for confidentiality. At times, you might want to know that someone who received confidential information cannot deny that she received it, an important security concept called non-repudiation.

Most of these core security concepts depend on encryption technologies, so before you look at any of them more closely, take a look at the fundamentals of encryption.

Shared Key and Public Key Technologies

We won't get very far in our security discussions without bumping into shared key and public key technologies. They, in turn, stem from cryptography. We will briefly introduce these concepts here so that we can apply them where needed throughout the rest of the book.

Cryptography

Cryptology is the branch of mathematics focused on designing algorithms to keep information (data) secret. Cryptography is the work of applying these algorithms to secure systems, protocols, applications, and messages ^[4].

^[4] Recommended text on cryptography: Applied Cryptography by Bruce Schneier (John Wiley & Sons, 1996).

The first and most important area of cryptography to discuss is encryption. Encryption is the basis for XML Encryption and also for XML Signature, which encrypts a digested form of a message. The message digest encrypted in a digital signature is created using another important cryptographic algorithm called a hash function, which is a special class of one-way function that creates a small, fixed-size output that is, for practical purposes, unique to each input message and that is not, in practice, reversible. In the Web services arena, you will find common uses for both shared key ^[5] and public key encryption.

^[5] The terms shared key, secret key, and symmetric key are used interchangeably in various texts. To be consistent in this book, we choose to use the term shared key throughout, but occasionally context requires that we also use the term symmetric key.
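The hash function properties described above can be sketched in a few lines of Python. This is only an illustration: the choice of SHA-256 (from the standard hashlib module), the helper name digest, and the sample messages are assumptions, not something the text prescribes.

```python
import hashlib


def digest(message: bytes) -> str:
    """Return the SHA-256 digest of a message as a hexadecimal string."""
    return hashlib.sha256(message).hexdigest()


# Hypothetical sample messages of very different sizes.
short_digest = digest(b"Pay Bob $10")
long_digest = digest(b"Pay Bob $10" * 1000)
altered_digest = digest(b"Pay Bob $11")

# The output size is fixed (64 hex characters, or 256 bits) whatever the input size.
print(len(short_digest), len(long_digest))

# Changing a single character of the input produces a different digest.
print(short_digest != altered_digest)
```

Nothing in the digest lets you work backward to the message, which is why a signature can safely operate on the digest rather than on the full message.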

A message that is completely readable and is in no way scrambled or disguised is called plaintext. Plaintext is unencrypted data. Encryption is the process of scrambling or disguising plaintext by applying a cryptographic algorithm to produce ciphertext. Ciphertext is encrypted data. Decryption reverses the encryption process and turns ciphertext back into its original plaintext form. These concepts are shown in Figure 1.2.

Figure 1.2. The relationships between plaintext and ciphertext, and the encryption and decryption processes that transform them.


The goal of encryption, and thereby the way to achieve confidentiality, is to create ciphertext from plaintext that is undecipherable to anyone except the intended recipient. The encryption process is performed by a special cryptographic algorithm that applies seemingly random permutations to the message, permutations that are in fact reversible under the right circumstances.

The algorithms for encryption and decryption require a key, which is a special numeric value that is required as a parameter for the algorithm to perform its task. The wrong key will get garbage out, not the correct output.

Shared key algorithms use the same key for both encryption and decryption (symmetrically) and are relatively fast. Public key algorithms use two different but mathematically related keys, a public and private pair, for encryption and decryption (asymmetrically) and are primarily used for secure shared key distribution and digital signatures.

Throughout this book, we will use the term shared key when we refer to symmetric encryption and public key when we refer to asymmetric encryption. We will use the term subject to refer to the holder of a key. A subject may be an individual or an entity (computer).

The magic is in the keys. But what is a key, and what does it have to do with encryption?

Keys

A key is a set of bits that acts as an input parameter to a crypto-algorithm. Think of the crypto-algorithm like the lock on your house door. That lock is standard, and so is your door. Lots of other people have doors and locks that are outwardly exactly the same as yours. But inside the lock on your door are some unique (or almost unique ^[6]) settings of tumblers that exactly match your and only your key.

^[6] "Almost unique" because, as with door locks, there is no absolute certainty that two keys are different. But the chance of two keys being the same is infinitesimally small, just as is the chance that your key will happen to open a neighbor's door lock.

Algorithms for encryption and decryption do not need to be and normally are not kept secret. It is the key that is kept secret. It is an important fundamental principle of cryptography that the algorithms be public, standard, and widely distributed and carefully scrutinized. This principle ensures that all the world's cryptographers fully shake out the algorithms for any security flaws.

The key is the variable that makes the algorithm result unique and secret. For some crypto-algorithms, the key may be a random number. For others, such as public key algorithms, it must be carefully chosen, which is a complex, time-consuming mathematical operation by itself. The key space needs to be large, so that the number of possible keys is too great for a guessing attack to succeed. Different algorithms require different key lengths for good security. Most keys today are typically 200 bits or larger.
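To make the point about key space concrete, here is a rough back-of-the-envelope calculation in Python. The attacker speed of one trillion guesses per second and the sample key lengths are purely hypothetical assumptions. The estimate covers only exhaustive guessing, which is the relevant attack for shared keys; public key algorithms are attacked mathematically rather than by trying every key.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def key_space_size(key_bits: int) -> int:
    """Number of distinct keys that can be formed from key_bits bits."""
    return 2 ** key_bits


def years_to_exhaust(key_bits: int, guesses_per_second: float) -> float:
    """Years needed to try every key at the given (assumed) guessing rate."""
    seconds = key_space_size(key_bits) / guesses_per_second
    return seconds / SECONDS_PER_YEAR


# Hypothetical attacker speed, chosen only for illustration.
GUESSES_PER_SECOND = 1e12

for bits in (40, 56, 128, 256):
    years = years_to_exhaust(bits, GUESSES_PER_SECOND)
    print(f"{bits:3d} bits: {key_space_size(bits):.3e} keys, {years:.3e} years to try them all")
```

Each extra bit doubles the key space, so modest increases in key length translate into enormous increases in guessing effort.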

Shared Key Cryptography

Shared key cryptography uses the same key to encrypt and decrypt the data. This requires that both communicating parties share the same key and, vitally important, keep it secret from the rest of the world. As shown in Figure 1.3, plaintext is encrypted into ciphertext by the sender using the shared secret key. The ciphertext is then decrypted by the receiver using the same shared secret key.

Figure 1.3. The shared key (symmetric) encryption process.


The advantage of shared key encryption/decryption is that the algorithms are fast and can operate on arbitrarily sized messages. The disadvantage is that it is very difficult to manage a shared key that must be kept secret across a network between message sender and recipient. Within Web services security, you will run into shared key cryptography as the basis of Secure Sockets Layer (SSL) security and as the foundation for XML Encryption. Much effort has been put into XML Encryption to take care of most of its details for you. But you will be exposed at the minimum to choices you will have to make about algorithms, key information, and the like, so it is important that you gain a foundation in these concepts.
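The shared key process can be sketched with a deliberately simplified toy cipher. This is not an algorithm used by SSL or XML Encryption, and it must not be used to protect real data; it is an assumption-laden illustration that derives a keystream from the key with SHA-256 and XORs it with the message. The names keystream and xor_cipher and the sample message are hypothetical.

```python
import hashlib
import secrets


def keystream(key: bytes, length: int) -> bytes:
    """Derive `length` pseudo-random bytes from the key (toy construction only)."""
    stream = bytearray()
    counter = 0
    while len(stream) < length:
        stream.extend(hashlib.sha256(key + counter.to_bytes(8, "big")).digest())
        counter += 1
    return bytes(stream[:length])


def xor_cipher(key: bytes, data: bytes) -> bytes:
    """XOR data with the keystream; the same call both encrypts and decrypts."""
    return bytes(d ^ k for d, k in zip(data, keystream(key, len(data))))


# Sender and receiver must both hold this key and keep it secret.
shared_key = secrets.token_bytes(16)
wrong_key = secrets.token_bytes(16)

plaintext = b"Purchase order 1234: 500 widgets"
ciphertext = xor_cipher(shared_key, plaintext)

# The same shared key recovers the plaintext.
print(xor_cipher(shared_key, ciphertext) == plaintext)

# A different key yields garbage rather than the original message.
print(xor_cipher(wrong_key, ciphertext) == plaintext)
```

The sketch also shows why key management is the hard part: anyone who obtains shared_key can run the same function and read every message.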

Public Key Cryptography

Public key cryptography uses a pair of keys: a private key and a public key. Whichever one is used to encrypt the data is not the one used to decrypt the data; only the other half of the pair can decrypt the data. Of vital importance is that the private keys are never shared. Only the public key can be, and it is widely distributed to others. We repeat that it is an absolute tenet of public key cryptography that each subject keeps his private key confidential, never sharing it with anyone.

Either key can be used to encrypt, but only the matching key from the pair can then be used to decrypt. In Figure 1.4, the sender uses the recipient's public key to encrypt the plaintext message into ciphertext. The resulting ciphertext is sent to the recipient, who uses her own private key to decrypt the ciphertext back into the original plaintext message.

Figure 1.4. The public key (asymmetric) encryption process.


If you want to make sure only the recipient can read your message, use that person's public key to encrypt, and then he and only he using his private key can decrypt. If you want everyone who gets your message to know it came from you and only you, use your private key to encrypt and then the recipients can use your public key to decrypt. Because you keep your private key highly secure, the message could have been encrypted only by you.
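Both uses of a key pair can be illustrated with textbook RSA and deliberately tiny numbers. This is a sketch under stated assumptions: the primes 61 and 53, the exponent 17, the helper name apply_key, and the message value 65 are chosen only for illustration, and real systems use very large keys plus padding schemes that are omitted here. The modular-inverse form of pow requires Python 3.8 or later.

```python
from typing import Tuple

# Toy key generation with tiny primes (real keys use primes hundreds of digits long).
p, q = 61, 53
n = p * q                  # modulus, included in both keys
phi = (p - 1) * (q - 1)
e = 17                     # public exponent
d = pow(e, -1, phi)        # private exponent: modular inverse of e

public_key = (e, n)        # distributed widely
private_key = (d, n)       # never shared


def apply_key(key: Tuple[int, int], value: int) -> int:
    """Raise value to the key's exponent modulo n (textbook RSA, no padding)."""
    exponent, modulus = key
    return pow(value, exponent, modulus)


message = 65  # a message encoded as a number smaller than n

# Confidentiality: encrypt with the recipient's public key,
# decrypt with the recipient's private key.
ciphertext = apply_key(public_key, message)
recovered = apply_key(private_key, ciphertext)
print(recovered == message)

# Proof of origin: transform with the sender's private key,
# and let anyone check the result with the sender's public key.
signature = apply_key(private_key, message)
print(apply_key(public_key, signature) == message)
```

The same function serves both directions; only the choice of which key is applied first determines whether the result gives secrecy or proof of origin.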

Now that you have a basic understanding of encryption and digital signature, we can establish working definitions for critical security concepts that will be used throughout this book.

Security Concepts and Definitions

Authentication, authorization, integrity, confidentiality, and non-repudiation are critical concepts for your understanding of Web services security, so the following sections provide a bit more detail on each one.

Authentication

Authentication involves comparing provided identity information (a "challenge") to something already stored about this individual. Authentication is classically divided into three types: something you know, something you have, or something you are:

Something you know: PIN, password, pass phrase, shared secret
Something you have: Key, card, token
Something you are: Biometrics such as fingerprint, retinal scan, voice print, palm print

Single-factor authentication (using just one of the preceding types) is the simplest but is not very strong. Stealing or guessing a password is easy, and the rightful owner has no way to prove that this has happened. Furthermore, because a password is something you know, you can tell it to someone or she can guess it, and with no other factor checked, nothing stops that person from getting into the system.
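A minimal sketch of single-factor, "something you know" authentication follows: the provided password is compared with something already stored about the individual. Storing a salted PBKDF2 hash instead of the password itself is an assumption added here as common practice, not something the text specifies, and the function names and iteration count are hypothetical.

```python
import hashlib
import hmac
import secrets
from typing import Tuple

ITERATIONS = 100_000  # assumed work factor, chosen only for illustration


def enroll(password: str) -> Tuple[bytes, bytes]:
    """Create the stored record for a new identity: a random salt and a derived hash."""
    salt = secrets.token_bytes(16)
    stored = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, stored


def authenticate(password: str, salt: bytes, stored: bytes) -> bool:
    """Compare the provided password against the stored record."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    # Constant-time comparison avoids leaking how many bytes matched.
    return hmac.compare_digest(candidate, stored)


salt, stored = enroll("correct horse battery staple")
print(authenticate("correct horse battery staple", salt, stored))
print(authenticate("guess123", salt, stored))
```

Note that the check succeeds for anyone who presents the right password, which is exactly the weakness of relying on a single factor.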

Two-factor authentication, also known as strong authentication, is much stronger and is considered the standard when authentication is for anything of high value. Either something you have or something you are is added to the something you know category of shared secret. Something you have is typically a card, token, or device of some sort. Something you are is a biometric such as a fingerprint, retinal scan, or voice print. Outside military applications, authentication stronger than two-factor is rarely found. To strengthen two-factor authentication, you must strengthen the process used to create the individual factors.

How rigorous authentication needs to be, and which factors to use for one-, two-, or multi-factor authentication, depends on the level of trust you need. What is it that you are trying to protect? If the Web service integrates vendors into your supply chain, and they have no access to critical corporate or customer data, the level of authentication may be lower than for a Web service that connects an employer to its 401(k) provider and represents an employee requesting fund reallocation.

Authorization

Authorization is the process of establishing what someone who has been authenticated is allowed to do. The entity receiving the request for service will be granting permissions for each identity to access certain items.

Most Web services coming in over the public network to an enterprise require authentication; it is not usually acceptable to provide services that you expect to be paid for without knowing who is using them. So fundamentally, authorization requires authentication. Additionally, Web services frequently expose vital business data to the requestor, who must be identified and not remain anonymous. Exceptions to this rule are free services that do not care who uses them. If a service provides different levels of access depending on who is using it, that service also requires authorization to determine, based on identity, what services are accessible to whom.

One way that authorization is implemented is through a set of credentials that a subject identity carries and presents; those credentials are then mapped into access to certain restricted items. Alternatively, rights can be attached to restricted content, and these rights are mapped to identities and the permissions they will be granted to this content.
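The first approach, credentials mapped to access, can be sketched as a pair of lookup tables. Everything named here (the identities, the roles used as credentials, the operation names, and the is_authorized function) is hypothetical and exists only to illustrate the mapping.

```python
from typing import Dict, Set

# Hypothetical credentials (roles) carried by each authenticated identity.
CREDENTIALS: Dict[str, Set[str]] = {
    "alice@supplier.example": {"vendor"},
    "bob@payroll.example": {"benefits-admin"},
}

# Hypothetical mapping from each credential to the operations it permits.
PERMISSIONS: Dict[str, Set[str]] = {
    "vendor": {"submit_invoice", "read_order_status"},
    "benefits-admin": {"read_order_status", "reallocate_funds"},
}


def is_authorized(identity: str, operation: str) -> bool:
    """Return True if any credential held by the identity permits the operation."""
    roles = CREDENTIALS.get(identity, set())
    return any(operation in PERMISSIONS.get(role, set()) for role in roles)


print(is_authorized("alice@supplier.example", "submit_invoice"))
print(is_authorized("alice@supplier.example", "reallocate_funds"))
print(is_authorized("unknown@nowhere.example", "read_order_status"))
```

An identity that has not been authenticated holds no credentials and is therefore denied everything, which mirrors the point that authorization depends on authentication.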

On the HTML-based Web, authorization has typically been very coarse grained and either gives access to entire sections of a Web site or denies access completely. With Web services, on the other hand, very fine-grained control specifying access to messages, parts of messages, or content carried by a message is possible. Unlike many security concepts, authorization is actually very easy to understand, but it turns out to be exceedingly complex technologically as well as socially. You will see why in Chapter 6, "Portable Identity, Authentication, and Authorization," which discusses standards such as SAML that are involved with this aspect of identity.

Integrity

Integrity is an assertion that no one has tampered with a message since it was initially created. This assures the sender and the receiver that every bit produced by the sender is received by the recipient in precisely unaltered form. In cryptographic terms, data integrity is accomplished by using digital signatures. Messages in which data integrity is required ^[7] must explicitly or implicitly include the identity and credentials of the sender to enable this kind of message-level security. Why? Because proving integrity means proving no bits have been changed in the message, which involves sending something with the message that no one in the middle could fraudulently create. That, in turn, requires signing that data with a key that only the sender could have had (more on this in Chapter 4, "Safeguarding the Identity and Integrity of XML Messages"). Message integrity- and identity-related issues (authentication and authorization) are often inter-related. Ironically, no matter how sophisticated your security technology becomes, the core security issue comes down to this: Do you know whom you are dealing with and do you trust those people?

^[7] On the Web, for example, message integrity is not required and not possible. You request HTML documents from Web sites and assume you are getting what was sent; because the risk of a bit or a word or even the entire document having been modified is low, you don't worry about message integrity. When the message is a patient record, a purchase order, or a contract, as you expect Web services to carry, you care a lot about integrity.
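The idea of sending something with the message that no one in the middle could fraudulently create can be sketched with a keyed hash (HMAC) from the Python standard library. Be aware that this swaps in a simpler technique than the one the text describes: an HMAC relies on a key shared by sender and receiver, whereas a digital signature uses a private key that only the sender holds. The key, the messages, and the function names are hypothetical.

```python
import hashlib
import hmac


def protect(key: bytes, message: bytes) -> bytes:
    """Compute an integrity tag over the message using a secret key."""
    return hmac.new(key, message, hashlib.sha256).digest()


def is_intact(key: bytes, message: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare it with the one that accompanied the message."""
    return hmac.compare_digest(protect(key, message), tag)


# Hypothetical key known only to sender and receiver.
key = b"key known only to sender and receiver"
message = b"<order><item>widget</item><qty>500</qty></order>"
tag = protect(key, message)

# The unaltered message verifies.
print(is_intact(key, message, tag))

# A message changed in transit no longer matches its tag.
tampered = message.replace(b"500", b"900")
print(is_intact(key, tampered, tag))
```

Someone in the middle can alter the message but, lacking the key, cannot compute a matching tag for the altered version.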

Confidentiality

Confidentiality is keeping the message secret. This process requires encryption, which scrambles the message in such a way that only authorized identities can decrypt and see the data. To do this, you exchange a shared secret and an algorithm for encrypting and decrypting the message. You could imagine Bob and Alice agreeing that they were going to encrypt their messages to each other by shifting each letter forward in the alphabet by some number of positions (the algorithm) and that the number would be 2 for the next five days (the key). Thus, the message from Bob to Alice would look scrambled to a reader, and even if you knew the algorithm, you would have to do some analysis (not much in this case) to figure out how to decode the message. In the real world, these algorithms are very challenging mathematical functions with keys that are very large numbers, and the analysis is computationally infeasible even with modern computers.
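Bob and Alice's scheme is small enough to write out in full. The sketch below assumes messages use only the 26 uppercase English letters (other characters pass through unchanged); the function names and the sample message are hypothetical.

```python
import string
from typing import List

ALPHABET = string.ascii_uppercase


def shift_text(text: str, key: int) -> str:
    """Shift each letter by `key` positions, wrapping around the alphabet."""
    shifted = []
    for ch in text.upper():
        if ch in ALPHABET:
            shifted.append(ALPHABET[(ALPHABET.index(ch) + key) % 26])
        else:
            shifted.append(ch)  # leave spaces and punctuation unchanged
    return "".join(shifted)


def encrypt(plaintext: str, key: int) -> str:
    return shift_text(plaintext, key)


def decrypt(ciphertext: str, key: int) -> str:
    return shift_text(ciphertext, -key)


def brute_force(ciphertext: str) -> List[str]:
    """The 'analysis': with only 26 possible keys, simply try them all."""
    return [decrypt(ciphertext, key) for key in range(26)]


secret = encrypt("MEET ALICE", 2)
print(secret)
print(decrypt(secret, 2))
print("MEET ALICE" in brute_force(secret))
```

Knowing the algorithm but not the key, an eavesdropper needs at most 26 attempts, which is why real algorithms rely on key spaces far too large to search.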

With typical contemporary encryption strategies, you provide the algorithm openly and rely on the strength of the key to keep the encryption secure. The trick is keeping the key secret. You could keep the key secret by giving it to the recipients in some way outside the message exchange, such as mailing it to them or phoning them. However, this approach doesn't scale up very well to large numbers of participants, so exchanging this key usually requires Public Key Infrastructure (PKI) technology, which is designed to manage and exchange keys. Chapter 3, "The Foundations of Distributed Message-Level Security," describes PKI in more detail, and Chapter 5, "Ensuring Confidentiality of XML Messages," goes into a lot of detail about encryption, specifically XML Encryption.

Non-repudiation

As digital transactions are used in more and more legal contracts and as the acceptability of digital signatures becomes commonplace, the legal aspects of identity will become critical for Web services. First and foremost among those legal aspects of identity is non-repudiation.

Non-repudiation proves that one identity sent the data only to another identity. This then proves that this specific transaction was entered into by the recipient, and neither party can later refute or deny that it occurred. If the transaction is challenged legally, a contract that was supposedly executed must be shown to have been entered into by both parties. Each party must have seen the contract as signed, and their identities (traditionally confirmed by validating "wet" signatures on paper and by notary witnesses) must have been confirmed at the time of signing. These are difficult, and as yet legally unchallenged, tenets to uphold in a digital and anonymous world, but that day is coming.

Non-repudiation depends on public key cryptography technology. You prove that one identity sent the data only to another identity because the sender used the recipient's public key, and it is only the recipient with his secret private key who can decrypt the data. To achieve legal non-repudiation, more is needed, such as a separate time-and-date-stamp notary to prove when the transaction occurred as well as independent verification of the participants' identities.
