The Role of Cryptography in Security | Core Security Patterns: Best Practices and Strategies for J2EE, Web Services, and Identity Management

The past few years have seen major advances in security technologies, especially in the area of cryptography. The advent of the one-way hash algorithm opened up opportunities for both verifying data integrity (with algorithms such as MD5 and SHA-1) and, to a lesser extent, protecting data through obfuscation (as with UNIX passwords using "crypt"). Encryption algorithms such as symmetric ciphers have evolved from the government-endorsed DES (Data Encryption Standard), a mainstay from the seventies through today in many government and commercial systems, to the latest algorithms such as RC4, IDEA, Blowfish, and the newly government-endorsed AES (Advanced Encryption Standard), a.k.a. Rijndael.

But perhaps the most compelling recent achievement in cryptography has been the advent of asymmetric ciphers (also known as "public key cryptography"). Before asymmetric ciphers, the sender of a message that is secured with a symmetric cipher would need to communicate the key value used to encrypt the message to the receiver via a separate secure communications channel. In 1976, Whitfield Diffie and Martin Hellman developed a method that would allow two parties to communicate over an unsecured communications channel (for example, e-mail) and derive a secret key value that would be known only to them, even if others were eavesdropping on the communication [DiffieHellman]. In 1977, Ron Rivest, Adi Shamir, and Leonard Adleman developed the RSA asymmetric cipher, where one key value is used to encrypt a message but another key value is used to decrypt the message. The technology is based on the difficulty of quickly factoring the product of two large prime numbers. The exact details are beyond the scope of this book. Suffice it to say that with asymmetric ciphers, the headache of key management is greatly reduced. Let's take a closer look at the popular cryptographic algorithms and how their use contributes to achieving security goals.

Cryptographic Algorithms

Although cryptography has been studied for years, its value has only recently (with the tremendous increase in the use of networking) been recognized. One normally associates cryptography with confidentiality via data encryption, but some cryptographic algorithms, such as the one-way hash function and digital signatures, are more concerned with data integrity than confidentiality.

This chapter will introduce you to the following cryptographic algorithms: one-way hash functions, symmetric ciphers, asymmetric ciphers, digital signatures, and digital certificates. For more information about understanding and implementing cryptographic algorithms in Java, refer to Chapter 4.

One-Way Hash Function Algorithms

One-way hash functions are algorithms that take as input a message (any string of bytes, such as a text string, a Word document, a JPG file) and generate as output a fixed-size number referred to as the "hash value" or "message digest." The size of the hash value depends on the algorithm used, but it is usually between 128 and 256 bits.

The purpose of a one-way hash function is to create a short digest that can be used to verify the integrity of a message. In communication protocols such as TCP/IP, message integrity is often verified using a checksum or CRC (cyclic-redundancy check). The sender of the message calculates the checksum of the message and sends it along with the message, and the receiver recalculates the checksum and compares it to the checksum that was sent. If they do not match, the receiver assumes the message was corrupted during transit and requests that the sender resend the message. These methods are fine when the expected cause of the corruption is due to electronic glitches or some other natural phenomena, but if the expected cause is an intelligent adversary with malicious intent, something stronger is needed. That is where cryptographically strong one-way hash functions come in.

A cryptographically strong one-way hash function is designed in such a way that it is computationally infeasible to find two messages that compute to the same hash value. With a checksum, a modestly intelligent adversary can fairly easily alter the message so that the checksum calculates to the same value as the original message's checksum. Doing the same with a CRC is not much more difficult. But a cryptographically strong one-way hash function makes this task all but impossible.

Two examples of cryptographically strong one-way hash algorithms are MD5 and SHA-1. MD5 was created by Ron Rivest (of RSA fame) in 1992 [RFC1321] and produces a 128-bit hash value. SHA-1 was created by the National Institute of Standards and Technology (NIST) in 1995 [FIPS1801] and produces a 160-bit hash value. SHA-1 is slower to compute than MD5 but is considered stronger because it creates a larger hash value. For references to other hash algorithms with various hash value sizes, see [Hashfunc01].
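The digest sizes quoted above can be checked with the standard java.security.MessageDigest class. The following is only a minimal sketch; the class name DigestSizes and its helper methods are illustrative names chosen here, not part of any library.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class DigestSizes {
    /** Returns the digest of the message under the named algorithm. */
    static byte[] digest(String algorithm, String message) {
        try {
            MessageDigest md = MessageDigest.getInstance(algorithm);
            return md.digest(message.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(algorithm + " is not available", e);
        }
    }

    /** Formats bytes as lowercase hexadecimal. */
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Whatever the input length, each algorithm emits a fixed-size value.
        for (String algorithm : new String[] {"MD5", "SHA-1", "SHA-256"}) {
            byte[] d = digest(algorithm, "WINNERS USE JAVA");
            System.out.println(algorithm + " (" + d.length * 8 + " bits): " + hex(d));
        }
    }
}
```

The reported lengths should be 128 bits for MD5 and 160 bits for SHA-1, regardless of how long the input message is.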

The Breaking of the SHA-1 One-Way Hash Algorithm

In February 2005, researchers at China's Shandong University released a preliminary paper demonstrating the ability to find two messages that, when run through the SHA-1 algorithm, produce the same hash value; this is known as a "collision." The purpose and benefit of one-way hash algorithms are that they produce unique hash values for different messages, and finding collisions should be impossible without trying all possibilities, which is known as a "brute force" attack. A brute force attack against SHA-1 would involve trying 2**80 (about 10**24, or 1 million billion billion) hash operations. The new research shows that it is possible to find a collision with SHA-1 in 2**69 (about 5 x 10**20, or 500 billion billion) hash operations.

For practical purposes, this is no big deal. For example, no one is going to be able to change your Web address on your X.509 certificate and have it still look like you signed it. But it is a chink in the armor, and more chinks will appear as better attacks against all cryptographic algorithms are researched. The general advice is that one should start thinking about avoiding SHA-1 and opt for newer one-way hash algorithms, such as SHA-256 or SHA-512.

As an example of using a hash function, suppose an open-source development project posts its product, which is available for download, on the Web at several mirror sites. On their main site, they also have available the result of an MD5 hash performed on the whole download file. If an attacker breaks into one of the mirror sites, and inserts some malicious code into the product, he would need to be able to adjust other parts of the code so that the output of the MD5 would be the same as it was before. With a checksum or CRC, the attacker could do it, but MD5 is specifically designed to prevent this. Anyone who downloads the altered file and checks the MD5 hash will be able to detect that the file is not the original.
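The mirror-site check described above might look roughly like the following sketch. It hashes in-memory byte arrays rather than real files, and the class name DownloadCheck, the method names, and the sample contents are hypothetical. The algorithm is passed as a parameter so that "MD5" (as in the text) can be swapped for a newer algorithm such as "SHA-256".

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class DownloadCheck {
    /** Hashes the file contents with the named algorithm ("MD5" mirrors the text). */
    static byte[] digest(String algorithm, byte[] fileContents) {
        try {
            return MessageDigest.getInstance(algorithm).digest(fileContents);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(algorithm + " is not available", e);
        }
    }

    /** True when the downloaded bytes hash to the digest published on the main site. */
    static boolean matchesPublished(String algorithm, byte[] downloaded, byte[] published) {
        return MessageDigest.isEqual(digest(algorithm, downloaded), published);
    }

    public static void main(String[] args) {
        // Stand-ins for the real download; the contents are made up for illustration.
        byte[] release = "product-1.0 contents".getBytes(StandardCharsets.UTF_8);
        byte[] published = digest("MD5", release); // value posted on the main site

        byte[] fromMirror =
                "product-1.0 contents + malicious code".getBytes(StandardCharsets.UTF_8);

        System.out.println("clean copy matches:    " + matchesPublished("MD5", release, published));
        System.out.println("tampered copy matches: " + matchesPublished("MD5", fromMirror, published));
    }
}
```

The check is only as trustworthy as the place the published digest comes from, which is why the text has it posted on the main site rather than on the mirrors.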

Another example: Suppose two parties are communicating over a TCP/IP connection. TCP uses a CRC check on its messages, but as discussed earlier, a CRC can be defeated. So, for additional security, suppose that the two parties are using an application protocol on top of TCP that attaches an MD5 hash value at the end of each message. Suppose an attacker lies at a point in between the two communicating parties in such a way that he can change the contents of the TCP stream. Would he be able to defeat the MD5 check?

It turns out he can. The attacker simply alters the data stream, and then recalculates the MD5 hash on the new data and attaches that. The two communicating parties have no other resource against which to check the MD5 value, because the communicating data could be anything, such as an on-the-fly conversation over an instant message channel. To prevent this, one combines a hash function with a secret key value. A standard way to do this is defined as the HMAC [RFC2104].
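As a rough illustration of the keyed-hash idea, the sketch below uses the standard javax.crypto.Mac class with the "HmacSHA256" algorithm. The choice of SHA-256 rather than MD5 is an assumption made here, and the class name HmacDemo, its methods, and the sample key are hypothetical.

```java
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;

class HmacDemo {
    static final String ALGORITHM = "HmacSHA256";

    /** Computes the HMAC tag that the sender attaches to the message. */
    static byte[] tag(byte[] sharedKey, String message) {
        try {
            Mac mac = Mac.getInstance(ALGORITHM);
            mac.init(new SecretKeySpec(sharedKey, ALGORITHM));
            return mac.doFinal(message.getBytes(StandardCharsets.UTF_8));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HMAC computation failed", e);
        }
    }

    /** The receiver recomputes the tag with the shared key and compares. */
    static boolean verify(byte[] sharedKey, String message, byte[] receivedTag) {
        return MessageDigest.isEqual(tag(sharedKey, message), receivedTag);
    }

    public static void main(String[] args) {
        byte[] sharedKey = "key known only to both parties".getBytes(StandardCharsets.UTF_8);
        byte[] attackerKey = "attacker's guess".getBytes(StandardCharsets.UTF_8);

        byte[] goodTag = tag(sharedKey, "meet at noon");
        // The attacker alters the message and recomputes a tag, but without the shared key.
        byte[] forgedTag = tag(attackerKey, "meet at midnight");

        System.out.println("original accepted: " + verify(sharedKey, "meet at noon", goodTag));
        System.out.println("forgery accepted:  " + verify(sharedKey, "meet at midnight", forgedTag));
    }
}
```

Unlike the plain MD5 check, the attacker in the middle cannot simply recalculate the value, because producing a matching tag requires the secret key.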

With hash functions, as with any cryptographic algorithm, the wise developer uses a tried-and-true published algorithm instead of developing one from scratch. The tried-and-true algorithms have undergone much scrutiny, and for every MD5 and SHA-1 there are many others that have fallen because of vulnerabilities and weaknesses.

We will discuss ciphers next. There are two types of ciphers, symmetric and asymmetric. We will start with symmetric ciphers, which have been around for centuries and are the cornerstone of data privacy.

Symmetric Ciphers

Symmetric ciphers are mechanisms that transform text in order to conceal its meaning. Symmetric ciphers provide two functions: message encryption and message decryption. They are referred to as symmetric because both the sender and the receiver must share the same key to encrypt and then decrypt the data. The encryption function takes as input a message and a key value. It then generates as output a seemingly random sequence of bytes roughly the same length as the input message. The decryption function is just as important as the encryption function. The decryption function takes as input the same seemingly random sequence of bytes output by the first function and the same key value, and generates as output the original message. The term "symmetric" refers to the fact that the same key value used to encrypt the message must be used to successfully decrypt it.

The purpose of a symmetric cipher is to provide message confidentiality. For example, if Alice needs to send Bob a confidential document, she could use e-mail; however, e-mail messages have about the same privacy as a postcard. To prevent the message from being disclosed to parties unknown, Alice can encrypt the message using a symmetric cipher and an appropriate key value and e-mail that. Anyone looking at the message en route to Bob will see the aforementioned seemingly random sequence of bytes instead of the confidential document. When Bob receives the encrypted message, he feeds it and the same key value used by Alice into the decrypt function of the same symmetric cipher used by Alice, which will produce the original message, the confidential document (see Figure 2-1).

Figure 2-1. Encryption using a symmetric cipher
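The encrypt and decrypt functions shown in Figure 2-1 can be sketched with the standard javax.crypto.Cipher class. This is only an illustration under stated assumptions: the choice of AES in GCM mode with a 128-bit key and a random 12-byte initialization vector is made here, not taken from the text, and the class name SymmetricDemo and its methods are hypothetical.

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;

class SymmetricDemo {
    static final String TRANSFORMATION = "AES/GCM/NoPadding";

    /** Generates the key value that Alice and Bob must both hold. */
    static SecretKey newKey() {
        try {
            KeyGenerator generator = KeyGenerator.getInstance("AES");
            generator.init(128);
            return generator.generateKey();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("AES is not available", e);
        }
    }

    /** A fresh initialization vector; it is not secret and travels with the ciphertext. */
    static byte[] newIv() {
        byte[] iv = new byte[12];
        new SecureRandom().nextBytes(iv);
        return iv;
    }

    static byte[] encrypt(SecretKey key, byte[] iv, byte[] plaintext) {
        return run(Cipher.ENCRYPT_MODE, key, iv, plaintext);
    }

    static byte[] decrypt(SecretKey key, byte[] iv, byte[] ciphertext) {
        return run(Cipher.DECRYPT_MODE, key, iv, ciphertext);
    }

    private static byte[] run(int mode, SecretKey key, byte[] iv, byte[] input) {
        try {
            Cipher cipher = Cipher.getInstance(TRANSFORMATION);
            cipher.init(mode, key, new GCMParameterSpec(128, iv));
            return cipher.doFinal(input);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("cipher operation failed", e);
        }
    }

    public static void main(String[] args) {
        SecretKey shared = newKey();   // must reach Bob over a secure channel
        byte[] iv = newIv();
        byte[] document = "confidential document".getBytes(StandardCharsets.UTF_8);

        byte[] ciphertext = encrypt(shared, iv, document);   // Alice
        byte[] recovered = decrypt(shared, iv, ciphertext);  // Bob

        System.out.println(new String(recovered, StandardCharsets.UTF_8));
    }
}
```

Note that the same SecretKey object is passed to both functions; that shared value is exactly what makes the cipher symmetric.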

An example of a simple symmetric key cipher is the rotate, or Caesar, cipher. With the rotate cipher, a message is encrypted by substituting one letter at a time with a letter n positions ahead in the alphabet. If, for example, the value of n (the "key value" in a loose sense) is 3, then the letter A would be substituted with the letter D, B with E, C with F, and so on. Letters at the end would "wrap around" to the beginning; W would be substituted with Z, X would be substituted with A, Y with B, Z with C, and so on. So, a plaintext message of "WINNERS USE JAVA" encrypted with the rotate cipher with a key value of 7 would result in the ciphertext "DPUULYZ BZL QHCH." Even without the aid of a computer, the rotate cipher is quite easily broken; one need only try all possible key values, of which there are 26, to crack the code.
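The rotate cipher is small enough to write out in full. The sketch below handles only the uppercase letters A through Z and passes every other character through unchanged; the class name RotateCipher and its methods are illustrative names.

```java
class RotateCipher {
    /** Shifts each uppercase letter 'key' positions ahead, wrapping around after Z. */
    static String encrypt(String message, int key) {
        int shift = Math.floorMod(key, 26); // also makes negative keys work
        StringBuilder out = new StringBuilder();
        for (char c : message.toCharArray()) {
            if (c >= 'A' && c <= 'Z') {
                out.append((char) ('A' + (c - 'A' + shift) % 26));
            } else {
                out.append(c); // spaces and punctuation are left alone
            }
        }
        return out.toString();
    }

    /** Decryption is encryption with the opposite shift. */
    static String decrypt(String ciphertext, int key) {
        return encrypt(ciphertext, -key);
    }

    public static void main(String[] args) {
        String ciphertext = encrypt("WINNERS USE JAVA", 7);
        System.out.println(ciphertext);

        // Breaking the cipher: simply try every possible key value.
        for (int key = 0; key < 26; key++) {
            System.out.println(key + ": " + decrypt(ciphertext, key));
        }
    }
}
```

The brute-force loop in main prints all 26 candidate plaintexts, one of which is readable, which is why the rotate cipher offers no real protection.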

However, there are plenty of published symmetric ciphers from which to choose that have held up to a great deal of scrutiny. Some examples include DES, IDEA, AES (Rijndael), Twofish, and RC2. For references to these and other symmetric ciphers, see [WeiDai01], which is also a great starting point for other cryptographic references.

With symmetric ciphers, as with any cryptographic algorithm, the wise developer uses a tried-and-true published algorithm instead of developing one from scratch. The tried-and-true algorithms have undergone much scrutiny, and for every Rijndael and Twofish, there are many others that have fallen because of vulnerabilities and weaknesses [RSA02].

Symmetric ciphers are available in two types: block ciphers and stream ciphers. Block ciphers encrypt blocks of data (blocks are typically 8 bytes or 16 bytes) at a time. Stream ciphers are relatively new and are generally faster than block ciphers. However, it seems that block ciphers are more popular, probably because they have been around longer, and there are many free choices available [RSA02]. Examples of block ciphers include DES, IDEA, AES (Rijndael), and Blowfish. Examples of stream ciphers are RC4 and WAKE. Ron Rivest's RC4 leads the stream cipher popularity contest, because it is used with SSL in all Web browsers, but it can only be used via a license with RSA. Also, block ciphers can be used in modes where they can emulate stream cipher behavior [FIPS81]. An excellent free reference on the use of the modes, as well as cryptography in general, is available at [RSA01].

Advanced Encryption Standard (AES)

Among the symmetric ciphers, the AES deserves extra attention because it replaced DES as the symmetric cipher standard endorsed by the United States government. The AES algorithm, known as "Rijndael," was developed by Belgian cryptographers Joan Daemen and Vincent Rijmen and was selected from 21 entries in a contest held by NIST (National Institute of Standards and Technology).

Since 1977, DES has been the U.S. federal government's standard method for encrypting sensitive information [AES01]. But as computing power increased, DES was in danger of being too easily compromised. In 1987, the U.S. government started Capstone, a project to standardize encryption that included an algorithm called Skipjack. However, the algorithm was kept secret, so cryptographers could not openly analyze it for weaknesses. Due to these and other circumstances [RSA03], Skipjack's popularity never got off the ground, although the algorithm was finally published in 1998 [Schneier04].

In 1997, NIST announced that the replacement for DES, called AES, would be an algorithm selected from an open number of entries. Anyone could submit an algorithm, all algorithms would be made public, and the winner would be selected on a number of factors, including speed and ease of implementation. On October 2, 2000, the Rijndael algorithm was selected over finalists such as Bruce Schneier's Twofish and Ron Rivest's RC6. Rijndael is a block cipher with a key length of 128, 192, or 256 bits and a block length of 128, 192, or 256 bits.

The AES contest was a landmark activity in computer security, because it demonstrated the acceptance of the concept that open scrutiny of an encryption algorithm offered better security in the long run than a secretly developed algorithm. To understand how to use AES in Java applications, refer to Chapter 4.

Asymmetric Ciphers

Asymmetric ciphers provide the same two functions as symmetric ciphers: message encryption and message decryption. There are two major differences, however. First, the key value used in message decryption is different from the key value used for message encryption. Second, asymmetric ciphers are thousands of times slower than symmetric key ciphers. But asymmetric ciphers offer a phenomenal advantage in secure communications over symmetric ciphers.

To explain this advantage, let's review the earlier example of using a symmetric cipher. Alice encrypts a message using key K and sends it to Bob. When Bob receives the encrypted message, he uses key K to decrypt the encrypted message and recover the original message. This scenario introduces the question of how Alice sends the key value used to encrypt the message to Bob. The answer is that Alice must use a separate communication channel, one that is known to be secure (that is, no one can listen in on the communication), when she sends the key value to Bob.

The requirement for a separate, secure channel for key exchanges using symmetric ciphers invites even more questions. First, if a separate, secure channel exists, why not send the original message over that? The usual answer is that the secure channel has limited bandwidth, such as a secure phone line or a trusted courier. Second, how long can Alice and Bob assume that their key value has not been compromised (that is, become known to someone other than themselves) and when should they exchange a fresh key value? Dealing with these questions and issues falls within the realm of key management.

Key management is the single most vexing problem in using cryptography. Key management involves not only the secure distribution of key values to all communication parties, but also management of the lifetime of the keys, determination of what actions to take if a key is compromised, and so on. Alice and Bob's key management needs may not be too complicated; they could exchange a password over the phone (if they were certain that no one was listening in) or via registered mail. But suppose Alice needed to securely communicate not just with Bob but with hundreds of other people. She would need to exchange (via trusted phone or registered mail) a key value with each of these people and manage this list of keys, including keeping track of when to exchange a fresh key, handling key compromises, handling key mismatches (when the receiver cannot decrypt the message because he has the wrong key), and so on. Of course, these issues would apply not just to Alice but to Bob and everyone else; they all would need to exchange keys and endure these key management headaches. (There actually exists an ANSI standard, X9.17 [ANSIX9.17], on key management for DES.)

To make matters worse, if Alice needs to send a message to hundreds of people, she will have to encrypt each message with its own key value. For example, to send an announcement to 200 people, Alice would need to encrypt the message 200 times, one encryption for each recipient. Obviously, symmetric ciphers for secure communications require quite a bit of overhead.

The major advantage of the asymmetric cipher is that it uses two key values instead of one: one for message encryption and one for message decryption. The two keys are created during the same process and are known as a key pair. The one for message encryption is known as the public key; the one for message decryption is known as the private key. Messages encrypted with the public key can only be decrypted with its associated private key. The private key is kept secret by the owner and shared with no one. The public key, on the other hand, may be given out over an unsecured communication channel or published in a directory.

Using the earlier example of Alice needing to send Bob a confidential document via e-mail, we can show how the exchange works with an asymmetric cipher. First, Bob e-mails Alice his public key. Alice then encrypts the document with Bob's public key, and sends the encrypted message via e-mail to Bob. Because any message encrypted with Bob's public key can only be decrypted with Bob's private key, the message is secure from prying eyes, even if those prying eyes know Bob's public key. When Bob receives the encrypted message, he decrypts it using his private key and recovers the original document.

Figure 2-2 illustrates the process of encrypting and decrypting with the public and private keys.

Figure 2-2. Encryption using an asymmetric cipher
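The exchange in Figure 2-2 can be sketched with the standard Java cryptography classes. The 2048-bit key size and the OAEP padding transformation are assumptions chosen here for illustration, and the class name AsymmetricDemo and its methods are hypothetical.

```java
import javax.crypto.Cipher;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;

class AsymmetricDemo {
    static final String TRANSFORMATION = "RSA/ECB/OAEPWithSHA-256AndMGF1Padding";

    /** Creates the public and private key in one step: the key pair. */
    static KeyPair newKeyPair() {
        try {
            KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
            generator.initialize(2048);
            return generator.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("RSA is not available", e);
        }
    }

    /** Anyone holding the public key can encrypt. */
    static byte[] encrypt(PublicKey recipientPublicKey, byte[] message) {
        try {
            Cipher cipher = Cipher.getInstance(TRANSFORMATION);
            cipher.init(Cipher.ENCRYPT_MODE, recipientPublicKey);
            return cipher.doFinal(message);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("encryption failed", e);
        }
    }

    /** Only the holder of the matching private key can decrypt. */
    static byte[] decrypt(PrivateKey recipientPrivateKey, byte[] ciphertext) {
        try {
            Cipher cipher = Cipher.getInstance(TRANSFORMATION);
            cipher.init(Cipher.DECRYPT_MODE, recipientPrivateKey);
            return cipher.doFinal(ciphertext);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("decryption failed", e);
        }
    }

    public static void main(String[] args) {
        KeyPair bob = newKeyPair();            // Bob publishes bob.getPublic()
        byte[] document = "confidential document".getBytes(StandardCharsets.UTF_8);

        byte[] ciphertext = encrypt(bob.getPublic(), document);  // Alice
        byte[] recovered = decrypt(bob.getPrivate(), ciphertext); // Bob

        System.out.println(new String(recovered, StandardCharsets.UTF_8));
    }
}
```

RSA used this way can only encrypt short messages (well under the key size), which is one more reason real systems use the combined approach described later in this section.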

If Bob needs to send some edits on the document back to Alice, he can do so by having Alice send him her public key; he then encrypts the edited document using Alice's public key and e-mails the secured document back to Alice. Again, the message is secure from eavesdroppers, because only Alice's private key can decrypt the message, and only Alice has her private key.

Note the very important difference between using an asymmetric cipher and a symmetric cipher: No separate, secure channel is needed for Alice and Bob to exchange a key value to be used to secure the message. This solves the major problem of key management with symmetric ciphers: getting the key value communicated to the other party. With asymmetric ciphers, the key value used to send someone a message is published for all to see. This also solves another symmetric key management headache: having to exchange a key value with each party with whom one wishes to communicate. Anyone who wants to send a secure message to Alice uses Alice's public key.

Figure 2-3. Bob's public key cannot decrypt what it encrypted

Recall that one of the differences between asymmetric and symmetric ciphers is that asymmetric ciphers are much slower, up to thousands of times slower [WeiDai02]. This issue is resolved in practice by using the asymmetric cipher to communicate an ephemeral symmetric key value and then using a symmetric cipher and the ephemeral key to encrypt the actual message. The symmetric key is referred to as ephemeral (meaning to last for a brief time) because it is only used once, for that exchange. It is not persisted or reused, the way traditional symmetric key mechanisms require. Going back to the earlier example of Alice e-mailing a confidential document to Bob, Alice would first create an ephemeral key value to encrypt the document with a symmetric cipher. Then she would create another message, encrypting the ephemeral key value with Bob's public key, and then send both messages to Bob. Upon receipt, Bob would first decrypt the ephemeral key value with his private key and then decrypt the secured document with the ephemeral key value (using the symmetric cipher) to recover the original document.

Figure 2-4 depicts using a combination of asymmetric and symmetric ciphers.

Figure 2-4. Using a combination of asymmetric and symmetric ciphers
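The combined scheme in Figure 2-4 might be sketched as follows. The specific algorithms (AES in GCM mode for the document, RSA with OAEP padding for the ephemeral key) are assumptions made for this illustration, and HybridDemo, Envelope, seal, and open are hypothetical names.

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.SecureRandom;

class HybridDemo {
    static final String RSA = "RSA/ECB/OAEPWithSHA-256AndMGF1Padding";
    static final String AES = "AES/GCM/NoPadding";

    /** What Alice sends: the encrypted ephemeral key plus the encrypted document. */
    static final class Envelope {
        final byte[] encryptedKey;
        final byte[] iv;
        final byte[] ciphertext;

        Envelope(byte[] encryptedKey, byte[] iv, byte[] ciphertext) {
            this.encryptedKey = encryptedKey;
            this.iv = iv;
            this.ciphertext = ciphertext;
        }
    }

    static KeyPair newRsaKeyPair() {
        try {
            KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
            generator.initialize(2048);
            return generator.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("RSA is not available", e);
        }
    }

    /** Alice: encrypt the document with a one-time AES key, then encrypt that key for Bob. */
    static Envelope seal(PublicKey recipient, byte[] document) {
        try {
            KeyGenerator generator = KeyGenerator.getInstance("AES");
            generator.init(128);
            SecretKey ephemeral = generator.generateKey(); // used once, never stored

            byte[] iv = new byte[12];
            new SecureRandom().nextBytes(iv);
            Cipher aes = Cipher.getInstance(AES);
            aes.init(Cipher.ENCRYPT_MODE, ephemeral, new GCMParameterSpec(128, iv));
            byte[] ciphertext = aes.doFinal(document);

            Cipher rsa = Cipher.getInstance(RSA);
            rsa.init(Cipher.ENCRYPT_MODE, recipient);
            byte[] encryptedKey = rsa.doFinal(ephemeral.getEncoded());

            return new Envelope(encryptedKey, iv, ciphertext);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("sealing failed", e);
        }
    }

    /** Bob: recover the ephemeral key with the private key, then decrypt the document. */
    static byte[] open(PrivateKey recipient, Envelope envelope) {
        try {
            Cipher rsa = Cipher.getInstance(RSA);
            rsa.init(Cipher.DECRYPT_MODE, recipient);
            SecretKey ephemeral = new SecretKeySpec(rsa.doFinal(envelope.encryptedKey), "AES");

            Cipher aes = Cipher.getInstance(AES);
            aes.init(Cipher.DECRYPT_MODE, ephemeral, new GCMParameterSpec(128, envelope.iv));
            return aes.doFinal(envelope.ciphertext);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("opening failed", e);
        }
    }

    public static void main(String[] args) {
        KeyPair bob = newRsaKeyPair();
        Envelope sent = seal(bob.getPublic(), "confidential document".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(open(bob.getPrivate(), sent), StandardCharsets.UTF_8));
    }
}
```

Only the 16-byte ephemeral key passes through the slow asymmetric cipher; the document itself, however large, is handled by the fast symmetric cipher.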

Some examples of asymmetric ciphers are RSA, Elgamal, and ECC (elliptic-curve cryptography). RSA is by far the most popular in use today. Elgamal is another popular asymmetric cipher. It was developed in 1985 by Taher Elgamal and is based on the Diffie-Hellman key exchange, which allows two parties to communicate publicly yet derive a secret key value known only to them [Diffie-Hellman].

Diffie-Hellman, developed by Whitfield Diffie and Martin Hellman in 1976, is considered the first asymmetric cipher, though the concept of an asymmetric cipher may have been invented in the U.K. six years earlier. Diffie-Hellman is different from RSA in that it is not an encryption method; it creates a secure numeric value that can be used as a symmetric key. In a Diffie-Hellman exchange, the sender and receiver each generate a random number (kept private) and a value derived from the random number (made public). The two parties then exchange the public values. The power behind the Diffie-Hellman algorithm is its ability to generate a shared secret. Once the public values have been exchanged, each party can then use its private number and the other's public value to generate a symmetric key, known as the shared secret, which is identical to the other's. This key can then be used to encrypt data using a symmetric cipher. One advantage Diffie-Hellman has over RSA is that every time keys are exchanged, a new set of values is used. With RSA, if an attacker managed to capture your private key, they could decrypt all your future messages as well as any message exchange captured in the past. However, RSA keys can be authenticated (as with X.509 certificates), preventing man-in-the-middle attacks, to which a Diffie-Hellman exchange is susceptible.
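The arithmetic behind the exchange can be shown with deliberately tiny numbers. In the sketch below the public parameters (prime 23, generator 5) and the private numbers 6 and 15 are toy values chosen for readability; real exchanges use primes of 2048 bits or more and would normally go through javax.crypto.KeyAgreement rather than hand-written arithmetic. The class and method names are hypothetical.

```java
import java.math.BigInteger;

class ToyDiffieHellman {
    // Public parameters known to everyone, including eavesdroppers (toy-sized).
    static final BigInteger P = BigInteger.valueOf(23);
    static final BigInteger G = BigInteger.valueOf(5);

    /** The value each party derives from its private number and makes public. */
    static BigInteger publicValue(BigInteger privateNumber) {
        return G.modPow(privateNumber, P);
    }

    /** Combines one's own private number with the other party's public value. */
    static BigInteger sharedSecret(BigInteger myPrivate, BigInteger theirPublic) {
        return theirPublic.modPow(myPrivate, P);
    }

    public static void main(String[] args) {
        BigInteger alicePrivate = BigInteger.valueOf(6);   // kept secret by Alice
        BigInteger bobPrivate = BigInteger.valueOf(15);    // kept secret by Bob

        BigInteger alicePublic = publicValue(alicePrivate); // sent in the clear
        BigInteger bobPublic = publicValue(bobPrivate);     // sent in the clear

        BigInteger aliceSecret = sharedSecret(alicePrivate, bobPublic);
        BigInteger bobSecret = sharedSecret(bobPrivate, alicePublic);

        System.out.println("Alice derives " + aliceSecret + ", Bob derives " + bobSecret);
    }
}
```

Both parties arrive at the same number because (g^a)^b and (g^b)^a are equal modulo p, while an eavesdropper sees only the two public values.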

Digital Signature

Digital signatures are used to guarantee the integrity of the message sent to a recipient by representing the identity of the message sender. This is done by signing the message using a digital signature, which is the unique by-product of asymmetric ciphers. Although the public key of an asymmetric cipher generally performs message encryption and the private key generally performs message decryption, the reverse is also possible. The private key can be used to encrypt a message, which would require the public key to decrypt it. So, Alice could encrypt a message using her private key, and that message could be decrypted by anyone with access to Alice's public key. Obviously, this behavior does not secure the message; by definition, anyone has access to Alice's public key (it could be posted in a directory) so anyone can decrypt it. However, Alice's private key, by definition, is known to no one but Alice; therefore, a message that is decrypted with Alice's public key could not have come from anyone but Alice. This is the idea behind digital signatures.

Digital signatures are the only mechanisms that make it possible to ascertain the source of a message using an asymmetric cipher. Encrypting a message with a private key is a form of digital signature. However, as we discussed before, asymmetric ciphers are quite slow. Alice could use the technique presented in the previous section of creating an ephemeral key to encrypt the message, and then encrypt the ephemeral key with her private key. But encrypting the message is a wasted effort, because anyone can decrypt it. Besides, the point of the exercise is not to secure the message but to prove it came from Alice.

The solution is to perform a one-way hash function on the message, and encrypt the hash value with the private key. For example, Alice wants to confirm a contract with Bob. Alice can edit the contract's dotted line with "I agree," then perform an MD5 hash on the document, encrypt the MD5 hash value with her private key, and send the document with the encrypted hash value (the digital signature) to Bob. Bob can verify that Alice has agreed to the document by checking the digital signature; he also performs an MD5 hash on the document, and then he decrypts the digital signature with Alice's public key. If the MD5 hash value computed from the document contents equals the decrypted digital signature, then Bob has verified that it was Alice who digitally signed the document.

Figure 2-5 shows how a digital signature is created.

Figure 2-5. Digital signature

Figure 2-6 shows the process of verifying a digital signature.

Figure 2-6. Verifying a digital signature

Moreover, Alice cannot say that she never signed the document; she cannot refute the signature, because only she holds the private key that could have produced the digital signature. This ensures non-repudiation.
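The sign-and-verify flow of Figures 2-5 and 2-6 can be sketched with the standard java.security.Signature class, which performs the hashing and the private-key operation in one step. The "SHA256withRSA" algorithm is an assumption made here (the example in the text uses MD5), and the class name SignatureDemo and its methods are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.Signature;

class SignatureDemo {
    static final String ALGORITHM = "SHA256withRSA"; // hash, then sign the hash

    static KeyPair newKeyPair() {
        try {
            KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
            generator.initialize(2048);
            return generator.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("RSA is not available", e);
        }
    }

    /** Alice: produce the digital signature with her private key. */
    static byte[] sign(PrivateKey signerPrivateKey, byte[] document) {
        try {
            Signature signature = Signature.getInstance(ALGORITHM);
            signature.initSign(signerPrivateKey);
            signature.update(document);
            return signature.sign();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("signing failed", e);
        }
    }

    /** Bob: check the signature against the document with Alice's public key. */
    static boolean verify(PublicKey signerPublicKey, byte[] document, byte[] signatureBytes) {
        try {
            Signature signature = Signature.getInstance(ALGORITHM);
            signature.initVerify(signerPublicKey);
            signature.update(document);
            return signature.verify(signatureBytes);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("verification failed", e);
        }
    }

    public static void main(String[] args) {
        KeyPair alice = newKeyPair();
        byte[] contract = "Contract terms ... I agree".getBytes(StandardCharsets.UTF_8);
        byte[] altered = "Contract terms ... I disagree".getBytes(StandardCharsets.UTF_8);

        byte[] signature = sign(alice.getPrivate(), contract);

        System.out.println("original verifies: " + verify(alice.getPublic(), contract, signature));
        System.out.println("altered verifies:  " + verify(alice.getPublic(), altered, signature));
    }
}
```

The document itself travels unencrypted; the signature only proves who signed it and that it has not changed since.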

Digital Certificates

A digital certificate is a document that uniquely identifies information about a party. It contains a party's public key plus other identification information that is digitally signed and issued by a trusted third party, also referred to as a Certificate Authority (CA). A digital certificate is also known as an X.509 certificate and is commonly used to solve problems associated with key management.

As explained earlier in this chapter, the advent of asymmetric ciphers has greatly reduced the problem of key management. Instead of requiring that each party exchange a different key value with every other party with whom they wish to communicate over separate, secure communication channels, one simply exchanges public keys with the other parties or posts public keys in a directory.

However, another problem arises: How is one sure that the public key really belongs to Alice? In other words, how is the identity of the public key's owner verified? Within a controlled environment, such as within a company, a central directory may have security controls that ensure that the identities of public keys' owners have been verified by the company. But what if Alice runs a commerce Web site, and Bob wishes to securely send Alice his credit card number? Alice may send Bob her public key, but Mary (an adversary sitting on the communication between Alice and Bob) may intercept the communication and substitute her public key in place of Alice's. When Bob sends his credit card number using the received public key, he is unwittingly handing it to Mary, not Alice.

One method to verify Alice's public key is to call Alice and ask her directly to verify her public key, but because public keys are large (typically 1024 bits, or 128 bytes), for Alice to recite her public key value would prove too cumbersome and is prone to error. Alice could also verify her public key fingerprint, which is the output of a hash function performed on her public key. If one uses the MD5 hash function for this purpose, the hash value is 128 bits or 16 bytes, which would be a little more manageable.
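A fingerprint of this kind could be computed along the following lines. The sketch hashes the key's standard encoded form (the bytes returned by getEncoded()) and prints it as colon-separated hexadecimal; that display format, the 2048-bit key size, and the names KeyFingerprint, newRsaPublicKey, and fingerprint are assumptions made for illustration.

```java
import java.security.KeyPairGenerator;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.security.PublicKey;

class KeyFingerprint {
    /** Generates a throwaway RSA public key to fingerprint (2048 bits assumed). */
    static PublicKey newRsaPublicKey() {
        try {
            KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
            generator.initialize(2048);
            return generator.generateKeyPair().getPublic();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("RSA is not available", e);
        }
    }

    /** Hashes the encoded public key and formats the digest as AA:BB:CC... */
    static String fingerprint(String algorithm, PublicKey key) {
        try {
            byte[] digest = MessageDigest.getInstance(algorithm).digest(key.getEncoded());
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < digest.length; i++) {
                if (i > 0) {
                    sb.append(':');
                }
                sb.append(String.format("%02X", digest[i]));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(algorithm + " is not available", e);
        }
    }

    public static void main(String[] args) {
        PublicKey alice = newRsaPublicKey();
        // 16 bytes to read over the phone instead of the whole key.
        System.out.println("MD5 fingerprint: " + fingerprint("MD5", alice));
    }
}
```

Reading 16 hexadecimal pairs aloud is far less error-prone than reciting the full key, and a newer algorithm such as "SHA-256" can be passed in at the cost of a longer string.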

But suppose Bob does not know Alice personally and therefore could not ascertain her identity with a phone call? Bob needs a trusted third party to vouch for Alice's public key. This need is met, in part, by a digital certificate.

For example, assume Charlie is a third party that both Alice and Bob trust. Alice sends Charlie her public key, plus other identifying information such as her name, address, and Web site URL. Charlie verifies Alice's public key, perhaps by calling her on the phone and having her recite her public key fingerprint. Then Charlie creates a document that includes Alice's public key and identification, and digitally signs it using his private key, and sends it back to Alice. This signed document is the digital certificate of Alice's public key and identification, vouched for by Charlie.

Now, when Bob goes to Alice's Web site and wants to securely send his credit card number, Alice sends Bob her digital certificate. Bob verifies Charlie's signature on the certificate using Charlie's public key (assume Bob has already verified Charlie's public key), and if the signature is good, Bob can be assured that, according to Charlie, the public key within the certificate is associated with the identification within the certificate, namely Alice's name, address, and Web site URL. Bob can encrypt his credit card number using the public key with confidence that only Alice can decrypt it.

Figure 2-7 illustrates how a digital certificate is used to verify Alice's identity.

Figure 2-7. Verifying an identity using a digital certificate

Suppose Mary (an adversary) decides to intercept the communication between Alice and Bob, and replaces Alice's public key with her own within the digital certificate. When Bob verifies Charlie's signature, the verification will fail because the contents of the certificate have changed. Recall that to check a digital signature, one decrypts the signature with the signer's public key, and the result should equal the output of the hash function performed on the document. Because the document has changed, the hash value will not be the same as the one Charlie encrypted with his private key.

Figure 2-8 shows what happens if an adversary (Mary) tries to alter Alice's certificate.

Figure 2-8. Adversary (Mary) alters certificate

Verification of a digital certificate can also be a multilevel process; this is known as verifying a certificate chain. In the previous example, it was assumed that Bob had already verified Charlie's public key. Let's now assume that Bob does not know Charlie or Alice but does have in his possession the pre-verified public key of Victor, and that Charlie has obtained a digital certificate from Victor. When Bob needs to secure information being sent to Alice, Alice sends Bob not only her digital certificate signed by Charlie, but Charlie's certificate signed by Victor. Bob verifies Charlie's signature on Alice's certificate using Charlie's public key, and then verifies Victor's signature on Charlie's public key using Victor's public key. If all signatures are good, Bob can be assured that Victor vouches for Charlie and that Charlie vouches for Alice.

In practice, Victor's public key will be distributed as a certificate that was self-signed. A self-signed certificate is known as a root certificate. So in the example, there are really three certificates involved: Alice's certificate signed by Charlie, Charlie's certificate signed by Victor, and Victor's certificate, also signed by Victor. These three certificates make up the certificate chain.

Figure 2-9 shows how certificates can be chained together to verify identity.

Figure 2-9. A certificate chain
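The chain-checking logic can be illustrated with a toy model. The Cert class below is emphatically not an X.509 certificate: it holds only a subject name, a public key, an issuer name, and the issuer's signature over the first two, and it omits validity dates, revocation, and every other real-world check. All class and method names are hypothetical; production code would use the java.security.cert APIs instead.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.Signature;
import java.util.Arrays;

class ToyCertChain {
    /** Toy certificate: identity + public key, signed by the issuer. */
    static final class Cert {
        final String subject;
        final PublicKey subjectKey;
        final String issuer;
        final byte[] signature;

        Cert(String subject, PublicKey subjectKey, String issuer, byte[] signature) {
            this.subject = subject;
            this.subjectKey = subjectKey;
            this.issuer = issuer;
            this.signature = signature;
        }
    }

    static KeyPair newKeyPair() {
        try {
            KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
            generator.initialize(2048);
            return generator.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("RSA is not available", e);
        }
    }

    /** The bytes the issuer vouches for: subject name, a zero separator, encoded key. */
    static byte[] signedContent(String subject, PublicKey key) {
        byte[] name = subject.getBytes(StandardCharsets.UTF_8);
        byte[] encodedKey = key.getEncoded();
        byte[] content = new byte[name.length + 1 + encodedKey.length];
        System.arraycopy(name, 0, content, 0, name.length);
        System.arraycopy(encodedKey, 0, content, name.length + 1, encodedKey.length);
        return content;
    }

    /** The issuer binds a name to a key by signing both with its private key. */
    static Cert issue(String subject, PublicKey subjectKey, String issuer, PrivateKey issuerKey) {
        try {
            Signature signer = Signature.getInstance("SHA256withRSA");
            signer.initSign(issuerKey);
            signer.update(signedContent(subject, subjectKey));
            return new Cert(subject, subjectKey, issuer, signer.sign());
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("issuing failed", e);
        }
    }

    static boolean signedBy(Cert cert, PublicKey issuerKey) {
        try {
            Signature verifier = Signature.getInstance("SHA256withRSA");
            verifier.initVerify(issuerKey);
            verifier.update(signedContent(cert.subject, cert.subjectKey));
            return verifier.verify(cert.signature);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("verification failed", e);
        }
    }

    /** chain[0] is the end entity; the last entry is the self-signed root. */
    static boolean verifyChain(Cert[] chain, PublicKey trustedRootKey) {
        if (chain.length == 0) {
            return false;
        }
        Cert root = chain[chain.length - 1];
        boolean rootIsTrusted =
                Arrays.equals(root.subjectKey.getEncoded(), trustedRootKey.getEncoded());
        if (!rootIsTrusted || !signedBy(root, trustedRootKey)) {
            return false;
        }
        for (int i = 0; i < chain.length - 1; i++) {
            Cert issuerCert = chain[i + 1];
            if (!chain[i].issuer.equals(issuerCert.subject)
                    || !signedBy(chain[i], issuerCert.subjectKey)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        KeyPair victor = newKeyPair(), charlie = newKeyPair(), alice = newKeyPair();
        Cert rootCert = issue("Victor", victor.getPublic(), "Victor", victor.getPrivate());
        Cert charlieCert = issue("Charlie", charlie.getPublic(), "Victor", victor.getPrivate());
        Cert aliceCert = issue("Alice", alice.getPublic(), "Charlie", charlie.getPrivate());

        Cert[] chain = {aliceCert, charlieCert, rootCert};
        System.out.println("chain valid: " + verifyChain(chain, victor.getPublic()));
    }
}
```

Bob needs only Victor's pre-verified public key; everything else he needs arrives in the chain and is checked link by link, and a substituted key anywhere in the chain breaks the signature that covers it.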

In this example, Victor acts as a CA. He is in the business of being a trusted authority who verifies an individual's identification, verifies that individual's public key, and binds them together in a document that he digitally signs. CAs play an important part in the issuance and revocation of digital certificates.

The Role of CA in Issuing Certificates

In a trusted communication using digital certificates, a CA plays the role of the entity that issues a public key certificate. The certificate is the CA's assertion that the public key contained in the certificate belongs to a specific person, associated organization, or server host. Other information related to the person noted in the certificate is also provided. The CA is obligated to verify the information when a user requests a certificate and sends a Certificate Signing Request (CSR) to the CA. After verifying the information, the CA signs the request and returns a certificate to the user (usually in X.509 certificate format). It is important to note that the relying parties also trust the certificate information issued by the CA. The relying party can decrypt the CA's signature using the CA's public key, which assures the relying party that the certificate was issued by the CA named in the certificate.

In order to trust a certificate, the relying party has to trust the root certificate in its hierarchical chain. The CA, therefore, provides the trusted root certificate, and the CA is responsible for verifying (out-of-band) the identities named in the certificates it signs. Digital certificates are pervasive on the Web today because many CA root certificates come bundled in the Web browsers. This eliminates the need for Web users to import and verify root certificates, although employees of organizations that have their own root certificates must often still do so, because those certificates are not bundled by default in the browsers. One of the most popular certificate authorities is VeriSign. Most of the secure Web sites on the Internet have their certificates verified (and signed) by VeriSign.

All Web browsers that support HTTPS employ the SSL (Secure Sockets Layer) protocol, which in turn uses X.509 certificates. S/MIME (Secure MIME) and PEM (Privacy Enhanced Mail), both used to secure e-mail, also use X.509 certificates. Along with the public key and identification information, X.509 certificates include validity dates that indicate from what point in time the certificate is valid and when it expires. Other information is included as well; see [RFC2459]. The examples in this chapter have described the X.509 certificate structure, which is a hierarchical trust model; a signing authority signs each digital certificate once, and the top authority self-signs its own certificate. Other certificate structures use different trust models; PGP (Pretty Good Privacy), which is another program used to secure e-mail, uses its own type of digital certificate that can be signed more than once. Trust models are discussed later in this chapter.

The Role of CA in Revocation of Certificates

The CA is also responsible for revoking a certificate if it discovers that the certificate was issued on the basis of false verification, or that the identified user does not adhere to the CA-mandated policy requirements or has violated its policies. Revocation is also necessary for a variety of other reasons: a user may leave an organization, an organization with an SSL Web site might go out of business, or a private key may be compromised.

As part of the revocation process, the CA lists the revoked certificate, identified by its serial number, in a certificate revocation list (CRL). The CRL is a list of certificates that are considered revoked, that are no longer valid, and that should not be trusted by any system or user. It is important to note that once a revoked certificate passes its expiration date, it is automatically removed from the CRL.

Using Certificate Revocation Lists (CRL)

To verify a certificate, it is important to check the appropriate CRL to make sure the signer's certificate has not been revoked. CAs usually publish their latest CRLs in repositories identified by a URL. It is also possible to subscribe to these repositories, download the CA-listed CRLs, and build an in-house CRL repository that can be searched when verifying the signer's information. Most Web servers and Web browsers provide a facility for verifying certificates using CRLs.
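The bookkeeping behind a CRL lookup can be sketched as a toy in-memory list keyed by serial number. This is an illustrative model only: the class ToyRevocationList, its method names, and the serial numbers are hypothetical. In the JDK, real CRLs are represented by java.security.cert.X509CRL, whose isRevoked method performs the lookup against a parsed CRL.

```java
import java.math.BigInteger;
import java.time.Instant;
import java.util.HashMap;
import java.util.Map;

final class ToyRevocationList {

    // Serial number of each revoked certificate, mapped to that certificate's expiration date.
    private final Map<BigInteger, Instant> revoked = new HashMap<>();

    // Called by the (hypothetical) CA when it revokes a certificate.
    void revoke(BigInteger serialNumber, Instant certificateExpiry) {
        revoked.put(serialNumber, certificateExpiry);
    }

    // Called by a relying party before trusting a certificate.
    boolean isRevoked(BigInteger serialNumber) {
        return revoked.containsKey(serialNumber);
    }

    // Drops entries for certificates that have expired by the given time;
    // an expired certificate is rejected on its validity dates anyway.
    void pruneExpired(Instant now) {
        revoked.values().removeIf(expiry -> !expiry.isAfter(now));
    }

    int size() {
        return revoked.size();
    }

    public static void main(String[] args) {
        ToyRevocationList crl = new ToyRevocationList();
        Instant now = Instant.now();
        crl.revoke(BigInteger.valueOf(1001), now.plusSeconds(86_400)); // still within validity
        crl.revoke(BigInteger.valueOf(1002), now.minusSeconds(86_400)); // already expired

        System.out.println("1001 revoked: " + crl.isRevoked(BigInteger.valueOf(1001)));
        crl.pruneExpired(now);
        System.out.println("entries after pruning: " + crl.size());
    }
}
```

Pruning keeps the list from growing without bound, which matters because relying parties download the whole list to perform a check.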

Using the Online Certificate Status Protocol (OCSP)

An alternative to verifying certificates against CA-maintained CRLs is the Online Certificate Status Protocol (OCSP), defined in RFC 2560 [RFC2560]. In this method, the CA publishes the list of revoked certificates to an OCSP-enabled directory. This could be done using a CRL or an LDAP update. The CA then maintains an OCSP responder application that uses the data in the OCSP directory to answer a query about a particular certificate with a "good," "revoked," or "unknown" response. This allows CAs to create plug-ins for Web browsers that automatically check for certificate revocations. Because OCSP is a standard protocol, application developers can also write code to query the OCSP responder. The drawbacks to OCSP are that it does not allow for out-of-band verification and that it may be slow, because each response must be signed by the CA and the CA may be overwhelmed with requests as the number of users querying it and the number of revoked certificates grow.
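As one possible starting point for application code, the sketch below configures the JDK's built-in revocation checker to query an OCSP responder. It assumes Java 8 or later, where java.security.cert.PKIXRevocationChecker is available; the responder URL and the class name OcspCheckerSketch are hypothetical, and no network request is made until a certificate path is actually validated.

```java
import java.net.URI;
import java.security.NoSuchAlgorithmException;
import java.security.cert.CertPathValidator;
import java.security.cert.PKIXRevocationChecker;
import java.util.EnumSet;

final class OcspCheckerSketch {

    // Builds a revocation checker that asks the given OCSP responder and
    // does not fall back to CRLs if the OCSP check cannot be completed.
    static PKIXRevocationChecker newOcspChecker(URI responder) {
        try {
            CertPathValidator validator = CertPathValidator.getInstance("PKIX");
            PKIXRevocationChecker checker =
                    (PKIXRevocationChecker) validator.getRevocationChecker();
            checker.setOcspResponder(responder);
            checker.setOptions(EnumSet.of(PKIXRevocationChecker.Option.NO_FALLBACK));
            return checker;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical responder address, used for illustration only.
        PKIXRevocationChecker checker =
                newOcspChecker(URI.create("http://ocsp.example.test"));
        System.out.println("responder: " + checker.getOcspResponder());
        System.out.println("options:   " + checker.getOptions());
    }
}
```

The checker would then be registered on a PKIXParameters instance with addCertPathChecker before the certificate path is passed to CertPathValidator.validate. Choosing NO_FALLBACK here is a design assumption; omitting it lets the checker fall back to CRLs when the responder cannot be reached.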