IPsec: A Security Architecture for IP | Deploying Site-to-Site IPsec VPNs

IPsec: A Security Architecture for IP

IPsec protects IPv4 and IPv6 traffic as it transits a network between end hosts or security gateways.

Note

In this book, the term security gateway or simply gateway refers to Cisco routers, PIX Firewalls, Adaptive Security Appliances (ASA), or VPN concentrators that provide IPsec security services to end hosts or other devices on internal networks as they communicate with end hosts or other devices on external networks.

The terms IPsec peer and IPsec device refer to IPsec-enabled end hosts or security gateways.

IPsec consists of a number of elements, including the following:

Cryptographic algorithms
Security protocols
Security associations
IPsec databases
SA and key management techniques

These elements together provide the following security services to IP:

Access control: IPsec can control access to resources such as an end host or networks behind a security gateway.
Connectionless integrity: IPsec can detect modifications to IP packets regardless of the order in which they are sent or received. If an attacker modifies packets in transit between IPsec-enabled hosts or security gateways, these packets are dropped by the receiving host or security gateway.
Data origin authentication: IPsec verifies that received messages were transmitted by the supposed sender and not by another source masquerading as that sender. Packets sent by an attacker are dropped by IPsec-enabled hosts or security gateways.

Note that connectionless integrity and data origin authentication are collectively known as authentication.
Replay protection: IPsec-enabled hosts or security gateways drop any duplicate IPsec packets that they receive.
Data confidentiality: Data confidentiality hides data and prevents it from being disclosed to an attacker. IPsec uses encryption algorithms to provide data confidentiality.
Limited traffic flow confidentiality: In some cases, even if an attacker is unable to determine the exact nature of protected data, the attacker might still find information such as the identities of communicating devices, the frequency of transmission, and even packet sizes useful. IPsec provides limited protection against an attacker being able to obtain this information.

Cryptographic Algorithms

IPsec relies on a number of cryptographic algorithms to authenticate and encrypt user packets, including the following:

Authentication algorithms
Encryption algorithms
Public key cryptographic algorithms

The sections that follow discuss these algorithm types in greater detail.

Authentication Algorithms

In IPsec, a number of hash algorithms provide connectionless integrity and data origin authentication (authentication). These algorithms are discussed in this section.

Hash Algorithms

A hash algorithm is a type of cryptographic algorithm that takes a message of an arbitrary length as its input and produces a fixed-length output value that is characteristic of the message input and no other message. The output value produced by a hash algorithm is variously called a hash value, a fingerprint, or a message digest.

Two common hash algorithms are Message Digest 5 (MD5) and the Secure Hash Algorithm (SHA-1). MD5 produces a 128-bit hash, and SHA-1 produces a 160-bit hash.

As an example of the operation of a hash algorithm, the message "Comparing, Designing, and Deploying Virtual Private Networks by Mark Lewis" produces the output hash value (in hexadecimal) of 0x164727408b1e20f1c97b6952b1cb425c4ffa8864 using SHA-1, and the message "Troubleshooting Virtual Private Networks by Mark Lewis" produces the output hash value 0xecaae55ceff1d90ebff79def877796e033538f4a using SHA-1.
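The fixed digest lengths described above can be checked with a short Python sketch that uses the standard hashlib module. The byte encoding of each message (UTF-8, no trailing newline) is an assumption; a different encoding or stray whitespace would produce digests that differ from the values quoted in the text.

```python
import hashlib

def digest_hex(algorithm: str, message: str) -> str:
    """Return the hexadecimal digest of message under the named hash algorithm."""
    # Assumption: the message is encoded as UTF-8 with no trailing newline.
    return hashlib.new(algorithm, message.encode("utf-8")).hexdigest()

messages = [
    "Comparing, Designing, and Deploying Virtual Private Networks by Mark Lewis",
    "Troubleshooting Virtual Private Networks by Mark Lewis",
]

for message in messages:
    for algorithm in ("md5", "sha1"):
        value = digest_hex(algorithm, message)
        # Each hexadecimal numeral carries 4 bits of the digest.
        print(f"{algorithm:>4} ({4 * len(value)} bits): 0x{value}")
```

Whatever the input length, MD5 yields 32 hexadecimal numerals (128 bits) and SHA-1 yields 40 (160 bits).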

Cryptographic hash algorithms have two important characteristics:

It should be computationally infeasible to find two different messages that produce the same output hash value. So, for example, it should be infeasible to find another message that would produce the hash value 0x164727408b1e20f1c97b6952b1cb425c4ffa8864 (the hash value produced by the input message "Comparing, Designing, and Deploying Virtual Private Networks by Mark Lewis") using the SHA-1 algorithm.
It should not be feasible to reverse the hash algorithm to produce the original (input) message from the hash value. So, given only the hash value 0x164727408b1e20f1c97b6952b1cb425c4ffa8864, it would not be possible to find out that the original message was "Comparing, Designing, and Deploying Virtual Private Networks by Mark Lewis."

Note

See the following URLs if you would like to try out the MD5 and SHA-1 hash algorithms yourself:

http://block111.servehttp.com/hash

http://pajhome.org.uk/crypt/md5/

Now, because a hash algorithm provides a fingerprint of a message, you might think that if a host or security gateway simply sends a hash value along with a message this would be enough to ensure that an attacker would not be able to tamper with that message. But this is not the case. Figure 6-2 illustrates transmission of a message with its corresponding hash value.

Figure 6-2. "Authenticating" a Message with a Hash Value

To keep the following example as simple to follow as possible, hash values for the messages "Comparing, Designing, and Deploying Virtual Private Networks by Mark Lewis" and "Troubleshooting Virtual Private Networks by Mark Lewis" have been truncated to 16 bits (4 hexadecimal numerals).

In Figure 6-2, the London gateway transmits a message ("Comparing, Designing, and Deploying Virtual Private Networks by Mark Lewis") along with its corresponding (truncated) SHA-1 hash value, 0x1647.

If an attacker modifies the message to be "Troubleshooting Virtual Private Networks by Mark Lewis," the Paris gateway will detect this when it receives the message, as shown in Figure 6-3.

Figure 6-3. An Attacker Modifies the Message

In Figure 6-3, the London gateway calculates a hash value for "Comparing, Designing, and Deploying Virtual Private Networks by Mark Lewis" (truncated to 0x1647) and attaches this hash value to the message before sending the message to the Paris gateway.

An attacker alters the message to "Troubleshooting Virtual Private Networks by Mark Lewis" as it transits the Internet to the Paris gateway. The Paris gateway then receives the message (now "Troubleshooting Virtual Private Networks by Mark Lewis") and calculates a hash value (truncated to 0xecaa). The locally calculated hash value (0xecaa) does not match the attached hash value (0x1647, which corresponds to the message "Comparing, Designing, and Deploying Virtual Private Networks by Mark Lewis"), and so the Paris gateway drops the message.

At this point, it might appear that a hash algorithm can prevent an attacker from altering a message; unfortunately, that is not the case (see Figure 6-4).

Figure 6-4. An Attacker Modifies the Message and the Hash Value

In Figure 6-4, the London gateway calculates a hash value for "Comparing, Designing, and Deploying Virtual Private Networks by Mark Lewis" (truncated to 0x1647), and attaches this hash value to the message. The London gateway then sends the message to the Paris gateway.

The attacker intercepts and alters the message (to "Troubleshooting Virtual Private Networks by Mark Lewis") and adds the new hash value 0xecaa (which corresponds to the message "Troubleshooting Virtual Private Networks by Mark Lewis").

Finally, the Paris gateway receives the message ("Troubleshooting Virtual Private Networks by Mark Lewis") and calculates a hash value (truncated to 0xecaa). The locally calculated hash value (0xecaa) matches the attached hash value (0xecaa), and so the Paris gateway accepts the message.

So, if the attacker alters the attached hash value at the same time as altering the message itself, the message will be accepted by the receiving gateway. Clearly, a simple hash value is not enough to protect a message from tampering.
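The two scenarios in Figures 6-3 and 6-4 can be reproduced with a minimal Python sketch. The function names are hypothetical, and full (untruncated) SHA-1 values are used rather than the 16-bit values shown in the figures.

```python
import hashlib

def attach_hash(message: bytes) -> tuple[bytes, str]:
    """Sender side: return the message together with its unkeyed SHA-1 hash value."""
    return message, hashlib.sha1(message).hexdigest()

def receiver_accepts(message: bytes, attached_hash: str) -> bool:
    """Receiver side: recompute the hash and compare it with the attached value."""
    return hashlib.sha1(message).hexdigest() == attached_hash

original = b"Comparing, Designing, and Deploying Virtual Private Networks by Mark Lewis"
forged = b"Troubleshooting Virtual Private Networks by Mark Lewis"

message, hash_value = attach_hash(original)

# Figure 6-3 scenario: only the message is altered, so the hash values disagree.
print("message altered only:", receiver_accepts(forged, hash_value))

# Figure 6-4 scenario: the attacker also recomputes and replaces the hash value.
forged_message, forged_hash = attach_hash(forged)
print("message and hash altered:", receiver_accepts(forged_message, forged_hash))
```

The first check fails and the second succeeds, because anyone can compute an unkeyed hash value for any message.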

Message Authentication Code (MAC) and Hashed Message Authentication Code (HMAC) Algorithms

As stated in the previous section, a simple hash value is not enough to ensure that a message is not tampered with in transit between IPsec-enabled hosts or security gateways.

What is needed is to add something else (apart from the message itself) as an input to the hash algorithm such that only the sending and receiving hosts or security gateways can correctly calculate and verify the hash value. This something is a shared key. If the attacker does not know the shared key, any attempt to modify the message or the hash or both will result in the receiving gateway discarding the message, as shown in Figure 6-5.

Figure 6-5. An Attacker Modifies the Message and the Hash

Notice in Figure 6-5 that the hash value calculated by the London gateway (0xa45e) no longer corresponds to the output of the regular SHA-1 hash algorithm. The output of regular SHA-1 for the "Comparing, Designing, and Deploying Virtual Private Networks by Mark Lewis" message is (when truncated to 16 bits) 0x1647. This difference in hash values (0xa45e and 0x1647) is due to the addition of a shared key as an input to the SHA-1 hash algorithm.

A hash algorithm that uses the message and a shared key as inputs is known as a Message Authentication Code (MAC) algorithm.

One further characteristic of a MAC (apart from ensuring that a message cannot be tampered with undetected) is that a receiver can use it for data origin authentication (to verify that a specific sender did, indeed, send the message). The MAC can be used for data origin authentication because only the sender and receiver know the shared key and can correctly calculate MACs. If an attacker attempts to modify a MAC, the receiver will detect this attempt because the attacker does not possess the shared key.

However, that is not quite the end of the story. Regular MD5 and SHA-1 have been shown to be vulnerable to attack, and so IPsec uses a strengthened MAC algorithm called a Hashed Message Authentication Code (HMAC). Figure 6-6 illustrates an HMAC algorithm.

Figure 6-6. HMAC Algorithm

Two HMAC algorithms that are often used with IPsec are MD5-HMAC-96 and SHA-HMAC-96. Whereas regular MD5 produces a 128-bit output and SHA-1 produces a 160-bit output, MD5-HMAC-96 and SHA-HMAC-96 both produce a truncated output of 96 bits. One advantage of this truncation of output is that less information is available to an attacker, and so the attacker has less chance of being able to successfully crack the HMAC.
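As a rough illustration of a keyed, truncated hash, the following Python sketch computes an HMAC with SHA-1 using the standard hmac module and keeps the leftmost 96 bits. The key values and function names are hypothetical, and the sketch authenticates a bare message rather than a real IPsec packet.

```python
import hashlib
import hmac

ICV_BYTES = 12  # 96 bits, the truncated output length described above

def hmac_sha1_96(key: bytes, message: bytes) -> bytes:
    """Compute HMAC with SHA-1 and keep only the leftmost 96 bits."""
    return hmac.new(key, message, hashlib.sha1).digest()[:ICV_BYTES]

def receiver_accepts(key: bytes, message: bytes, attached_mac: bytes) -> bool:
    """Recompute the MAC with the shared key and compare in constant time."""
    return hmac.compare_digest(hmac_sha1_96(key, message), attached_mac)

shared_key = b"hypothetical-shared-key"   # known only to London and Paris
attacker_key = b"attacker-guess"          # the attacker does not know shared_key

original = b"Comparing, Designing, and Deploying Virtual Private Networks by Mark Lewis"
forged = b"Troubleshooting Virtual Private Networks by Mark Lewis"

mac = hmac_sha1_96(shared_key, original)
print("untouched message:", receiver_accepts(shared_key, original, mac))

# The attacker replaces the message and recomputes the MAC with a guessed key.
forged_mac = hmac_sha1_96(attacker_key, forged)
print("forged message and MAC:", receiver_accepts(shared_key, forged, forged_mac))
```

Unlike the unkeyed hash, replacing both the message and the MAC does not help the attacker, because a valid MAC cannot be computed without the shared key.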

Note

MAC and HMAC algorithms are often referred to simply as hash algorithms.

Encryption Algorithms

Encryption is a process by which data is rendered incomprehensible to anyone other than those allowed to view that data. Data in its unencrypted form is known as plaintext or cleartext. Data in its encrypted form is known as ciphertext.

The process of encryption changes a plaintext to a ciphertext; the process of changing the ciphertext back to its unencrypted plaintext form is known as decryption.

IPsec uses symmetric encryption algorithms for bulk encryption of data. Symmetric encryption algorithms use a key as an input (along with a plaintext) to produce a ciphertext. The same key is used as an input (with the ciphertext) to reproduce the plaintext. So, when one IPsec VPN gateway encrypts data with a certain key, another gateway will require that same key to decrypt the data.

Figure 6-7 illustrates encryption and decryption using a symmetric encryption algorithm.

Figure 6-7. Encryption and Decryption Using a Symmetric Encryption Algorithm

The main characteristics of symmetric encryption algorithms are as follows:

The same key is required for encryption and decryption.
The ciphertext is compact.
Symmetric encryption algorithms are relatively fast and can therefore be used for bulk encryption.
Key management and distribution is complex. IPsec peers must use the same key, and so distribution of keys in a large-scale network can be challenging.

There are two types of symmetric encryption algorithms:

Block ciphers
Stream ciphers

The sections that follow describe each type of symmetric encryption algorithm in more detail.

Block Ciphers

Block ciphers encrypt and decrypt a block of plaintext or ciphertext at one time and can operate in a number of modes, including the following:

Electronic Codebook (ECB) mode
Cipher-Block Chaining (CBC) mode
Cipher Feedback (CFB) mode
Output Feedback (OFB) mode

In ECB mode, each block of plaintext is encrypted independently. One disadvantage of ECB mode is that any two blocks of identical plaintext will produce identical ciphertext. So, for example, the plaintext block "Hello" would always produce the same ciphertext block.

In ECB mode, an attacker might then be able to analyze patterns of identical blocks of ciphertext within the complete ciphertext. In addition, a ciphertext produced using ECB mode might be vulnerable to cut-and-paste attacks (the substitution of blocks of ciphertext).

The CBC, CFB, and OFB modes of operation introduce an element of feedback into the encryption of any given block of plaintext.

In CBC mode, prior to encryption, each block of plaintext is first XOR'd (exclusive OR'd) with the ciphertext corresponding to the previous block of data. Because there is no previous block of data, the first block of plaintext is first XOR'd with a special random value called an initialization vector (IV) prior to encryption.

Figure 6-8 illustrates encryption using CBC mode (with, in this example, an eight-character block size).

Figure 6-8. Encryption Using CBC Mode

In Figure 6-8, the first block of eight characters of the message "Comparing, Designing, and Deploying Virtual Private Networks by Mark Lewis" ("Comparin") has already been encrypted, producing the ciphertext E!5u{<90. Note that the first block of plaintext was XOR'd with the (random) IV before being input into the encryption algorithm.

The next block of eight characters of plaintext ("g, Desig") is then XOR'd with the ciphertext produced by the previous block of data (E!5u{<90). The resulting data is then input into the encryption algorithm along with the (symmetric) encryption key, producing the &*aZHlo ciphertext.

The process of encrypting each block is continued until all the blocks of characters in the message have been encrypted. If the final block of data in the plaintext does not equal the encryption algorithm's block size, it is padded prior to encryption.
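The difference between ECB and CBC chaining can be sketched in Python. The block cipher here is a deliberately trivial stand-in (XOR with the key, then a byte rotation) and is not secure; it exists only so that the chaining logic is runnable without third-party libraries. The PKCS#7-style padding is also an assumption, because the text does not name a padding scheme.

```python
import os

BLOCK = 8  # eight-character blocks, as in the CBC illustration

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def toy_encrypt_block(key: bytes, block: bytes) -> bytes:
    """Toy stand-in for a real block cipher (NOT secure): XOR, then rotate left."""
    mixed = xor(block, key)
    return mixed[1:] + mixed[:1]

def toy_decrypt_block(key: bytes, block: bytes) -> bytes:
    """Inverse of toy_encrypt_block: rotate right, then XOR."""
    return xor(block[-1:] + block[:-1], key)

def pad(data: bytes) -> bytes:
    """PKCS#7-style padding (an assumption): append n bytes, each of value n."""
    n = BLOCK - len(data) % BLOCK
    return data + bytes([n]) * n

def unpad(data: bytes) -> bytes:
    return data[:-data[-1]]

def blocks(data: bytes) -> list[bytes]:
    return [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)]

def ecb_encrypt(key: bytes, plaintext: bytes) -> bytes:
    # Every block is encrypted independently of the others.
    return b"".join(toy_encrypt_block(key, b) for b in blocks(pad(plaintext)))

def cbc_encrypt(key: bytes, iv: bytes, plaintext: bytes) -> bytes:
    previous, out = iv, []
    for block in blocks(pad(plaintext)):
        # XOR with the previous ciphertext block (or the IV), then encrypt.
        previous = toy_encrypt_block(key, xor(block, previous))
        out.append(previous)
    return b"".join(out)

def cbc_decrypt(key: bytes, iv: bytes, ciphertext: bytes) -> bytes:
    previous, out = iv, []
    for block in blocks(ciphertext):
        out.append(xor(toy_decrypt_block(key, block), previous))
        previous = block
    return unpad(b"".join(out))

key = b"toy-key!"          # 8 bytes, hypothetical
iv = os.urandom(BLOCK)     # random initialization vector
repeated = b"AAAAAAAA" * 2  # two identical plaintext blocks

ecb = ecb_encrypt(key, repeated)
cbc = cbc_encrypt(key, iv, repeated)
print("ECB blocks identical:", ecb[:BLOCK] == ecb[BLOCK:2 * BLOCK])
print("CBC blocks identical:", cbc[:BLOCK] == cbc[BLOCK:2 * BLOCK])
```

In ECB mode the two identical plaintext blocks produce identical ciphertext blocks, whereas in CBC mode the feedback from the previous block should make them differ.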

Two examples of block ciphers are the Data Encryption Standard (DES) and the Advanced Encryption Standard (AES). The U.S. National Institute of Standards and Technology has published both of these algorithms as Federal Information Processing Standards (FIPS), and each can operate in CBC mode.

DES encrypts/decrypts 64-bit blocks of plaintext/ciphertext at a time and uses a 64-bit key. Eight bits of the key are used for parity, and so the effective key length is 56 bits.

Triple DES (3DES) is a derivation of the DES algorithm that uses different keys to (typically) encrypt, then decrypt, and finally encrypt each block of plaintext. 3DES has an effective key length of 168 bits (3 * 56 bits).

DES has in recent years proved to be vulnerable to brute-force attacks. In a brute-force attack, each possible key is tried in an attempt to recover the plaintext from a ciphertext.
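To give a feel for the numbers involved, the following Python sketch computes the size of the key space for the effective key lengths quoted above and the time needed to try every key. The search rate is a purely hypothetical figure chosen for illustration, not a measurement of any real hardware.

```python
DES_BITS = 64 - 8               # 64-bit key minus 8 parity bits
TRIPLE_DES_BITS = 3 * DES_BITS  # three independent DES keys

KEYS_PER_SECOND = 10 ** 12      # hypothetical attacker speed, for illustration only
SECONDS_PER_YEAR = 365 * 24 * 3600

def key_space(effective_key_bits: int) -> int:
    """Number of distinct keys for a given effective key length."""
    return 2 ** effective_key_bits

def exhaustive_search_years(effective_key_bits: int) -> float:
    """Time to try every key at the hypothetical rate above, in years."""
    return key_space(effective_key_bits) / KEYS_PER_SECOND / SECONDS_PER_YEAR

for name, bits in (("DES", DES_BITS), ("3DES", TRIPLE_DES_BITS)):
    print(f"{name}: {bits}-bit key, {key_space(bits):.3e} keys, "
          f"{exhaustive_search_years(bits):.3e} years to try them all")
```

The sketch only counts keys; it says nothing about attacks that are cleverer than trying every key in turn.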

Although 3DES currently remains secure, it is slow, and so in 1997, NIST initiated a competition to find a replacement for DES. In October 2000, the Rijndael algorithm was selected as the winner of the competition and in 2001 was published as a FIPS.

AES encrypts/decrypts 128-bit blocks of plaintext/ciphertext in one go, and can use key lengths of 128 bits, 192 bits, or 256 bits.

Note

If you would like to see a block cipher (AES) in action, you might like to download the following Flash animation:

http://www.esat.kuleuven.ac.be/~rijmen/rijndael/Rijndael_Anim_exe.zip

Stream Ciphers

Stream ciphers, in contrast to block ciphers, operate on the plaintext (usually) a single bit at a time. Stream ciphers are fast, usually faster than block ciphers.

Examples of stream ciphers are RC4 and the Software-Optimized Encryption Algorithm (SEAL). RC4 was designed by Ron Rivest, and SEAL was designed by Phil Rogaway and Don Coppersmith and is optimized for 32-bit processors.
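RC4 is compact enough to sketch in a few lines of Python. This follows the widely published description of the algorithm, which generates a keystream one byte (rather than one bit) at a time and XORs it with the data. It is intended only to illustrate how a stream cipher operates, not as code for protecting real traffic.

```python
def rc4(key: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt data with RC4 (the same operation does both)."""
    # Key scheduling: permute the state array S under control of the key.
    S = list(range(256))
    j = 0
    for i in range(256):
        j = (j + S[i] + key[i % len(key)]) % 256
        S[i], S[j] = S[j], S[i]

    # Keystream generation: one keystream byte is XOR'd with each data byte.
    i = j = 0
    out = bytearray()
    for byte in data:
        i = (i + 1) % 256
        j = (j + S[i]) % 256
        S[i], S[j] = S[j], S[i]
        out.append(byte ^ S[(S[i] + S[j]) % 256])
    return bytes(out)

key = b"hypothetical-key"
plaintext = b"Comparing, Designing, and Deploying Virtual Private Networks by Mark Lewis"

ciphertext = rc4(key, plaintext)
print(ciphertext.hex())
print(rc4(key, ciphertext))  # applying the same keystream again recovers the plaintext
```

Because encryption is a simple XOR with the keystream, the ciphertext is exactly as long as the plaintext and no padding is needed.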

Public Key Cryptographic Algorithms

In the previous two sections, symmetric cryptographic algorithms (that require the sender and receiver of a message to be in possession of the same key) were discussed. Public key (asymmetric) cryptographic algorithms differ from symmetric cryptographic algorithms in a number of ways, the most fundamental of which is that a pair of keys is required (one public, one private), rather than the single key required for symmetric cryptographic algorithms.

Public key algorithms have a number of characteristics, including the following:

Public key algorithms are much slower than symmetric algorithms and are therefore not suitable for bulk encryption.
The ciphertext produced by public key algorithms is not compact.
Public key algorithms do not have the same key distribution and management problems as symmetric algorithms. Key distribution consists of the publication of each device's public key.
Public key algorithms can variously be used for encryption, for digital signatures, and for symmetric key exchange.

Some popular public key algorithms are Diffie-Hellman; Rivest, Shamir, and Adleman (RSA); the Digital Signature Algorithm (DSA); and ElGamal. Diffie-Hellman is used for key exchange, RSA and ElGamal can be used for encryption or digital signatures, and DSA can, as the name suggests, be used to create digital signatures.

Encryption Using Public Key Algorithms

As previously discussed, symmetric encryption algorithms use the same key for both encryption and decryption, but public key algorithms use one key for encryption and another for decryption.

Public key algorithms require a public key and a mathematically associated private key. Any data encrypted using the public key must be decrypted using the corresponding private key. Similarly, any data encrypted with the private key must be decrypted using the corresponding public key.

Any data encrypted using the public key cannot be decrypted using the public key, and any data encrypted using the private key cannot be decrypted using the private key. Figures 6-9 and 6-10 illustrate public key encryption.

Figure 6-9. Encryption with the Private Key, and Decryption with the Corresponding Public Key

Figure 6-10. Encryption with the Public Key, and Decryption with the Corresponding Private Key
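The relationship between the two keys can be illustrated with textbook RSA in Python. The primes below are far too small to be secure and no padding is applied; they are chosen only so that the numbers are easy to follow. All variable and function names are hypothetical.

```python
# Toy RSA parameters (insecure, for illustration only).
p, q = 61, 53
n = p * q                      # modulus, shared by both keys
phi = (p - 1) * (q - 1)
e = 17                         # public exponent:  public key is (e, n)
d = pow(e, -1, phi)            # private exponent: private key is (d, n); needs Python 3.8+

def apply_key(value: int, exponent: int) -> int:
    """Encrypt or decrypt a number (smaller than n) with the given exponent."""
    return pow(value, exponent, n)

message = 65

# Figure 6-10 direction: encrypt with the public key, decrypt with the private key.
ciphertext = apply_key(message, e)
print(ciphertext, "->", apply_key(ciphertext, d))

# Figure 6-9 direction: encrypt with the private key, decrypt with the public key.
signed = apply_key(message, d)
print(signed, "->", apply_key(signed, e))

# Applying the same key twice does not recover the message.
print(apply_key(ciphertext, e) == message)
```

Each key undoes the other, but neither key undoes itself.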

Digital Signatures

Digital signatures are encrypted hash values and can be used to prove that a particular entity sent, authorized, or vouches for a certain communication. Figure 6-11 shows an example of the use of a digital signature.

Figure 6-11. Using a Digital Signature

In Figure 6-11, Host A calculates a hash value on the message "Comparing, Designing, and Deploying Virtual Private Networks by Mark Lewis" and then encrypts this hash using its private key, giving a digital signature.

Host A attaches this digital signature to the message (signs the message) and sends it to Host B. Host B now calculates its own hash value on the message. It then decrypts the hash value from Host A using Host A's public key. If Host B's and Host A's hash values match, Host B accepts the message.

If the hash values match, Host B knows two things:

That the message has not been tampered with. If the message had been tampered with, the hash value calculated by Host B would not match the (encrypted) hash value sent by Host A.
That the message could only have been sent by Host A. The digital signature consists of a hash value encrypted using Host A's private key (which only Host A has, assuming that it has not been compromised); so if the hash value can be successfully decrypted using Host A's public key, it must have come from Host A.
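The sign-and-verify sequence described above can be sketched in Python with textbook RSA and SHA-1. The key pair is built from two well-known Mersenne primes purely for convenience, and no signature padding is applied, so this is an insecure illustration of the idea rather than a usable signature scheme. The function names are hypothetical.

```python
import hashlib

# Toy RSA key pair for Host A (insecure: the primes are publicly known).
p, q = 2**107 - 1, 2**127 - 1
n = p * q
e = 65537                                # Host A's public exponent
d = pow(e, -1, (p - 1) * (q - 1))        # Host A's private exponent (Python 3.8+)

def hash_value(message: bytes) -> int:
    """SHA-1 hash of the message as an integer (160 bits, smaller than n)."""
    return int.from_bytes(hashlib.sha1(message).digest(), "big")

def sign(message: bytes) -> int:
    """Host A: encrypt the hash value with the private key."""
    return pow(hash_value(message), d, n)

def verify(message: bytes, signature: int) -> bool:
    """Host B: decrypt the signature with Host A's public key and compare hashes."""
    return pow(signature, e, n) == hash_value(message)

message = b"Comparing, Designing, and Deploying Virtual Private Networks by Mark Lewis"
signature = sign(message)

print("genuine message:", verify(message, signature))
print("tampered message:", verify(b"Troubleshooting Virtual Private Networks by Mark Lewis", signature))
```

Host B needs only Host A's public key to verify, which is why the signature also identifies the sender.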

Key Exchange with Diffie-Hellman

Diffie-Hellman is a public key algorithm that allows two peers to establish a secret key that only they know while communicating over an insecure channel such as the Internet. Diffie-Hellman relies on modular exponentiation. Figure 6-12 illustrates a Diffie-Hellman exchange.

Figure 6-12. Diffie-Hellman Exchange

In Figure 6-12, the following events occur:

1.	London and Paris select a large prime number (p) and a generator number (g). The numbers p and g are not secret values and can be known publicly.
2.	London and Paris each select a random number. London's random number is called LRand, and Paris's random number is called PRand.
3.	London and Paris calculate public values. London's public value is calculated as follows: LPub=g^LRand mod p. Paris's public value is calculated as follows: PPub=g^PRand mod p.
4.	London and Paris exchange their public values, LPub and PPub.
5.	London and Paris calculate the shared key. London calculates the shared key as follows: shared key=PPub^LRand mod p. Paris calculates the shared key as follows: shared key=LPub^PRand mod p.
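The five steps above map directly onto Python's built-in modular exponentiation. In this sketch the prime p is a known Mersenne prime and the generator g is an arbitrary small value; both are assumptions made for illustration and are not the parameter groups that IPsec implementations actually use.

```python
import secrets

def public_value(g: int, private: int, p: int) -> int:
    """Step 3: public value = g^private mod p."""
    return pow(g, private, p)

def shared_key(peer_public: int, private: int, p: int) -> int:
    """Step 5: shared key = peer_public^private mod p."""
    return pow(peer_public, private, p)

# Step 1: p and g are public (toy-sized values, assumed for illustration).
p = 2**127 - 1
g = 3

# Step 2: each side picks a secret random number.
LRand = secrets.randbelow(p - 2) + 1
PRand = secrets.randbelow(p - 2) + 1

# Step 3: each side computes its public value.
LPub = public_value(g, LRand, p)
PPub = public_value(g, PRand, p)

# Step 4: LPub and PPub are exchanged over the insecure channel.

# Step 5: both sides arrive at the same key without ever sending it.
london_key = shared_key(PPub, LRand, p)
paris_key = shared_key(LPub, PRand, p)
print(london_key == paris_key)
```

Both results equal g^(LRand * PRand) mod p, which is why the two sides agree even though neither random number is ever transmitted.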

Security Protocols: AH and ESP

IPsec uses two security protocols, the Authentication Header (AH) and the Encapsulating Security Payload (ESP). The following two sections discuss AH and ESP.

Authentication Header (AH)

AH is a packet header that provides the following security services:

Connectionless integrity
Data origin authentication
Optional replay protection

AH is IP protocol 51.

Figure 6-13 shows the AH header format.

Figure 6-13. AH Header Format

AH fields have the following functions:

Next Header: This indicates the type of header that comes after the AH header (for example, a value of 6 is contained in this field if the next header is TCP).
Payload Length: The length of the AH in 32-bit words minus 2.
Reserved: This is reserved for future use.
Security Parameter Index (SPI): This is used by the receiving gateway to identify the security association (SA, described later in this chapter) to which this packet corresponds.
Sequence Number: A per-packet counter that allows replay protection to be enabled.
Integrity Check Value (ICV): A cryptographic value (hash) corresponding to the user packet being protected. The receiving IPsec VPN gateway uses the ICV to authenticate the packet.
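The field layout can be made concrete with a small Python sketch that packs and parses an AH header using the struct module. The field widths (8-bit Next Header and Payload Length, 16-bit Reserved, 32-bit SPI and Sequence Number) follow the AH specification; the 96-bit ICV length is an assumption that matches the truncated HMAC output discussed earlier, and the ICV contents here are a zero-filled placeholder rather than a real hash.

```python
import struct

# Next Header, Payload Length, Reserved, SPI, Sequence Number (network byte order).
AH_FIXED = struct.Struct("!BBHII")

def build_ah(next_header: int, spi: int, sequence: int, icv: bytes) -> bytes:
    total_len = AH_FIXED.size + len(icv)
    if total_len % 4:
        raise ValueError("AH length must be a multiple of 32 bits")
    payload_len = total_len // 4 - 2   # length in 32-bit words, minus 2
    return AH_FIXED.pack(next_header, payload_len, 0, spi, sequence) + icv

def parse_ah(data: bytes) -> dict:
    next_header, payload_len, _reserved, spi, sequence = AH_FIXED.unpack_from(data)
    total_len = (payload_len + 2) * 4  # convert the field back to bytes
    return {
        "next_header": next_header,
        "length_bytes": total_len,
        "spi": spi,
        "sequence": sequence,
        "icv": data[AH_FIXED.size:total_len],
    }

# Illustrative values: IP-in-IP (4) as the next header, plus a placeholder ICV.
header = build_ah(next_header=4, spi=0x04DE55DF, sequence=130, icv=bytes(12))
fields = parse_ah(header)
print(len(header), header[1], hex(fields["spi"]), fields["sequence"])
```

With a 96-bit ICV the header is 24 bytes long (six 32-bit words), so the Payload Length field itself carries the value 4.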

AH can operate in two modes:

Transport mode
Tunnel mode

In transport mode, the AH header is inserted between the original IP header and header of the next protocol (such as TCP, UDP, or ICMP) of the user packet being protected. Figure 6-14 shows AH transport mode.

Figure 6-14. AH Transport Mode

In AH transport mode, the whole packet is authenticated except any mutable fields in the original IP header (fields that may change during transit between IPsec-enabled hosts or security gateways). Mutable fields include Time-To-Live (TTL), Type of Service (ToS), and Header Checksum.
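One way to picture how mutable fields are excluded is to set them to zero before the ICV is calculated, so that changes made in transit do not affect the result. The Python sketch below does this for the three IPv4 fields named above and then computes a truncated HMAC over the packet. It is a simplified illustration under several assumptions: only those three fields are masked (the AH specification treats some further fields as mutable), the header has no IP options, and the helper names and key are hypothetical.

```python
import hashlib
import hmac
import struct

def make_ip_header(src: bytes, dst: bytes, ttl: int, tos: int = 0) -> bytes:
    """Build a minimal 20-byte IPv4 header (length and checksum left as 0)."""
    return struct.pack("!BBHHHBBH4s4s", 0x45, tos, 0, 0, 0, ttl, 51, 0, src, dst)

def mask_mutable_fields(ip_header: bytes) -> bytes:
    masked = bytearray(ip_header)
    masked[1] = 0                  # Type of Service
    masked[8] = 0                  # Time-To-Live
    masked[10:12] = b"\x00\x00"    # Header Checksum
    return bytes(masked)

def ah_icv(key: bytes, ip_header: bytes, ah_fixed_fields: bytes, payload: bytes) -> bytes:
    """ICV over the masked IP header, the AH (with a zeroed ICV field), and the payload."""
    icv_placeholder = bytes(12)
    data = mask_mutable_fields(ip_header) + ah_fixed_fields + icv_placeholder + payload
    return hmac.new(key, data, hashlib.sha1).digest()[:12]

key = b"hypothetical-shared-key"
host_a, host_b = bytes([10, 1, 1, 2]), bytes([10, 2, 2, 2])
ah_fixed_fields = struct.pack("!BBHII", 1, 4, 0, 0x04DE55DF, 130)
payload = b"ICMP echo request (placeholder bytes)"

sent = ah_icv(key, make_ip_header(host_a, host_b, ttl=64), ah_fixed_fields, payload)
received = ah_icv(key, make_ip_header(host_a, host_b, ttl=63), ah_fixed_fields, payload)
print("TTL decremented in transit, ICV still matches:", sent == received)
```

A change to an immutable field, such as the source address, alters the ICV and causes the packet to fail authentication.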

Typically, AH transport mode is used to protect user packets as they transit a network between IPsec-enabled end hosts or devices. It is also possible, however, for security gateways to use AH transport mode to protect tunneling protocols such as Generic Routing Encapsulation (GRE) and Layer Two Tunneling Protocol (L2TP).

In tunnel mode, the AH header along with a new IP header are prepended to the user packet being protected. Figure 6-15 shows AH tunnel mode.

Figure 6-15. AH Tunnel Mode

In AH tunnel mode, the whole packet is authenticated except mutable fields in the new IP header. AH tunnel mode can be used to protect user packets as they transit a network between end hosts, but it is typically used to protect user packets as they transit a network between security gateways.

So, that is the theory. However, it is definitely worth taking a look at what happens in practice. Figure 6-16 shows a sample IPsec VPN.

Figure 6-16. Sample IPsec VPN

In Figure 6-16, Host A (10.1.1.2) sends a user packet to Host B (10.2.2.2). When the user packet arrives at the London gateway (192.168.1.1), it is encapsulated in IPsec (AH).

The IPsec packet then transits the Internet to the Paris gateway (192.168.2.2). The Paris gateway decapsulates the user packet (by authenticating it and removing the AH header) and sends it onward to Host B.

Figure 6-17 shows the packet capture of the packet sent from Host A to Host B. The packet was captured at a point between the London and Paris IPsec VPN gateways. Highlighted line 2 in Figure 6-17 shows that this is an AH packet.

Figure 6-17. AH Packet Capture

If you take a look at the line directly above highlighted line 2, you can see the new (outer) IP header. This new IP header includes source and destination IP addresses 192.168.1.1 (the London gateway) and 192.168.2.2 (the Paris gateway).

If you look below highlighted line 2, you can see another IP header (source address 10.1.1.2 [Host A], destination address 10.2.2.2 [Host B]). This is the original IP header of the encapsulated user packet.

Below the original IP header, you can see an Internet Control Message Protocol (ICMP) header. In fact, this is the ICMP header of a ping packet from Host A to Host B.

So, there are two IP headers: a new IP header and an original IP header. This is an AH tunnel mode packet (see Figure 6-15).

Directly below highlighted line 2, you can see the fields in the AH header:

The Next Header field shows that the next header after the AH header in the packet is IPIP (0x04). This indicates that the next header is the original IP header.
The (Payload) Length field contains a value of 24.
The SPI field contains a value of 0x04de55df. This value identifies the AH SA on the Paris gateway and will enable the Paris gateway to correctly process (authenticate) the packet when it arrives.
The Sequence (number) field contains a value of 130.
The value contained in the ICV field is not explicitly shown.

Encapsulating Security Payload (ESP)

ESP is a packet header that provides the following:

Connectionless integrity
Data origin authentication
Optional replay protection
Data confidentiality
Limited traffic flow confidentiality (available only in tunnel mode)

Notice that ESP provides all the same security services as AH, as well as data and limited traffic flow confidentiality. For this reason, AH is now rarely used. ESP is IP protocol 50. Figure 6-18 shows the ESP header (and trailers).

Figure 6-18. ESP Header Format

Many of the fields in the ESP header have the same function as those in the AH. Fields not already described have the following functions:

Payload Data: This is the user packet data. This field may also contain an Initialization Vector (IV) and Traffic Flow Confidentiality (TFC) padding. Some encryption algorithms use an IV to encrypt the first block of user packet data. TFC padding is used to hide user traffic flow characteristics such as user packet size.
Padding: Used to ensure that user packet data is a multiple of a certain number of bytes (this may be required by the encryption algorithm that you use) and to ensure that the Pad Length and Next Header fields are right aligned with a 4-byte boundary within the overall packet.
Pad Length: Indicates the number of bytes in the Padding field.
ICV: This is an optional field and has the same function as the ICV contained within the AH. The ICV field is present only if ESP authentication is configured.
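The padding arithmetic can be sketched in Python. The helper below computes how many padding bytes are needed so that the payload, Padding, Pad Length, and Next Header fields together end on the required boundary, and then assembles an unencrypted, unauthenticated ESP packet. Filling the padding with the values 1, 2, 3, and so on follows the default described in the ESP specification; the function names and sample values are hypothetical, and encryption and the ICV are deliberately left out.

```python
import struct

def build_esp_trailer(payload_len: int, next_header: int, align: int = 4) -> bytes:
    """Return Padding + Pad Length + Next Header for a payload of the given size."""
    # The two 1-byte fields (Pad Length, Next Header) count toward the alignment.
    pad_len = (-(payload_len + 2)) % align
    padding = bytes(range(1, pad_len + 1))   # 1, 2, 3, ...
    return padding + struct.pack("!BB", pad_len, next_header)

def build_esp(spi: int, sequence: int, payload: bytes, next_header: int,
              align: int = 4) -> bytes:
    """Assemble SPI, Sequence Number, payload, and trailer (no encryption, no ICV)."""
    header = struct.pack("!II", spi, sequence)
    return header + payload + build_esp_trailer(len(payload), next_header, align)

# A 5-byte payload needs 1 byte of padding to reach a 4-byte boundary.
packet = build_esp(spi=0xBFB55B99, sequence=934, payload=b"hello", next_header=4)
print(len(packet), packet[-2], packet[-1])

# A cipher with a 16-byte block size needs more padding for a 100-byte payload.
print(build_esp_trailer(100, next_header=4, align=16)[-2])
```

In a real packet the payload and trailer would then be encrypted, and the ICV (if configured) appended after the trailer.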

ESP also operates in two modes:

Transport mode
Tunnel mode

In ESP transport mode, the ESP header is inserted between the original IP header of the user packet and the header of the next layer protocol of the user packet being protected. In addition, a variable-length ESP trailer (consisting of the Padding, Pad Length, and Next Header fields), and optionally an ESP ICV field, is appended to the packet. Figure 6-19 illustrates ESP transport mode.

Figure 6-19. ESP Transport Mode

As you can see in Figure 6-19, if ESP authentication is configured, the initial ESP header (including the SPI and Sequence Number fields), TCP/UDP/Other Header, Payload, and ESP Trailer fields are all authenticated. Notice that the original header is not authenticated, unlike when using AH (see Figure 6-14). If ESP encryption is configured, the TCP/UDP/Other Header, Payload, and ESP Trailer fields are all encrypted.

ESP transport mode is usually used to protect user traffic between IPsec-enabled end hosts or other devices, although it may also be used to protect GRE, L2TP, or other tunnels between security gateways.

Figure 6-20 illustrates ESP tunnel mode. In ESP tunnel mode, a new IP header and ESP header are prepended to the user packet, and an ESP trailer and (if ESP authentication is configured) an ESP ICV are appended to the user packet. If ESP authentication is configured, the ESP header, the entire user packet, and the ESP trailer are authenticated.

Figure 6-20. ESP Tunnel Mode

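If ESP encryption is configured, the entire user packet and the ESP trailer are encrypted. The initial ESP header (the SPI and Sequence Number fields) is not encrypted, because the receiving gateway needs these fields to identify the SA and process the packet. Now it is time to see how ESP works in practice.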

Figure 6-21 shows an ESP tunnel mode packet captured in transit between the London and Paris gateways shown in Figure 6-16. Again, Host A sends a user packet to Host B, and this user packet is encapsulated in ESP by the London gateway and transmitted to the Paris gateway.

Figure 6-21. ESP Packet Capture

If you look just above highlighted line 2, you can see the new (outer) IP header, with source address 192.168.1.1 (London) and destination address 192.168.2.2 (Paris).

In highlighted line 2, you can see the ESP header. Looking just below highlighted line 2, you can see the fields in the (initial) ESP header:

The SPI field contains a value of 0xbfb55b99. This identifies the ESP SA on the Paris gateway and will enable the Paris gateway to correctly process (authenticate and/or decrypt) the packet.
The Sequence Number field contains a value of 934.

After the SPI and Sequence Number fields, Figure 6-21 shows 124 bytes of data.

If you are wondering what happened to the encapsulated user packet from Host A, as well as the Padding, Pad Length, and Next Header fields (collectively referred to as the ESP trailer), refer back to Figure 6-20. In this example, ESP encryption is configured, and so the encapsulated user packet and ESP trailer are all encrypted.

AH and ESP Together

It is possible to configure both AH and ESP protection for a single user traffic flow. In this case, AH and ESP headers are included, as shown in Figure 6-22.

Figure 6-22. AH/ESP Transport and Tunnel Modes

If you look closely at Figure 6-22 and compare it with Figures 6-19 and 6-20, you might notice the absence of the ESP ICV field at the end of each of the packets shown. The field is absent because it is included only if ESP authentication is enabled. If you are using AH authentication, there is no point also using ESP authentication (although it is possible to configure both AH and ESP authentication for a single traffic flow).

Figure 6-23 shows an AH/ESP tunnel mode packet captured between the London and Paris gateways shown in Figure 6-16.

Figure 6-23. AH/ESP Tunnel Mode Packet Capture

Highlighted line 2 shows the AH header. Directly below that are the AH header fields. Note, in particular, the Next Header field: this indicates the protocol in the following header (in this case, ESP).

After the AH header fields is the ESP header. One thing to note here is the SPI (0x73b348ee); if you compare this to the SPI in the AH header (0x0ca310ad), you can see that they differ. The SPIs differ because they correspond to different SAs. The AH header SPI will enable the Paris gateway to correctly authenticate the packet, and the ESP header SPI will enable the Paris gateway to correctly decrypt the packet.

Security Associations

An IPsec SA is unidirectional in nature and defines how traffic for a particular traffic flow is to be protected by IPsec. An IPsec SA is identified by an SPI and includes information such as security protocol, security protocol mode, cryptographic algorithms, and SA lifetime.

A particular traffic flow may be protected by one or more SAs. For example, if AH is specified for authentication and ESP is specified for encryption, traffic protection will involve two SAs. Because IPsec SAs are unidirectional, a minimum of two IPsec SAs are required to protect user traffic in both directions between a pair of IPsec VPN gateways.
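A minimal Python sketch of this bookkeeping follows. It models an SA as a record keyed by its SPI and shows that protecting one flow with both AH and ESP in both directions involves four SAs. The London-to-Paris SPIs are the ones seen in the AH/ESP capture above; the return-direction SPIs, algorithm names, and lifetimes are hypothetical, and keeping both directions in a single table is a simplification.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SecurityAssociation:
    spi: int                 # identifies the SA
    source: str              # an SA covers one direction only
    destination: str
    protocol: str            # "AH" or "ESP"
    mode: str                # "transport" or "tunnel"
    algorithm: str
    lifetime_seconds: int

def build_sa_table(sas: list[SecurityAssociation]) -> dict[int, SecurityAssociation]:
    """Index the SAs by SPI so that a received packet can be matched to its SA."""
    return {sa.spi: sa for sa in sas}

LONDON, PARIS = "192.168.1.1", "192.168.2.2"

sa_table = build_sa_table([
    # London to Paris: one SA per security protocol.
    SecurityAssociation(0x0CA310AD, LONDON, PARIS, "AH", "tunnel", "SHA-HMAC-96", 3600),
    SecurityAssociation(0x73B348EE, LONDON, PARIS, "ESP", "tunnel", "3DES", 3600),
    # Paris to London: separate SAs with their own (hypothetical) SPIs.
    SecurityAssociation(0x11111111, PARIS, LONDON, "AH", "tunnel", "SHA-HMAC-96", 3600),
    SecurityAssociation(0x22222222, PARIS, LONDON, "ESP", "tunnel", "3DES", 3600),
])

print(len(sa_table), "SAs protect the flow in both directions")
print(sa_table[0x73B348EE].protocol, "SA selected for SPI 0x73b348ee")
```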

IPsec Databases

IPsec defines three databases to ensure that IP traffic is correctly processed (with regard to IPsec):

The Security Policy Database (SPD): This database specifies traffic that should be protected by IPsec and traffic that should bypass IPsec. The SPD is consulted for all inbound and outbound traffic.
The Security Association Database (SAD or SADB): The SAD contains an entry for each IPsec SA, holding the information related to that SA, and interfaces with the SPD to ensure correct IPsec packet processing.
The Peer Authorization Database (PAD): The PAD provides a link between the Internet Key Exchange (IKE) protocol and the SPD. The PAD specifies the range of identities (for example, IP addresses) for which the IPsec device is authorized to negotiate IPsec SAs with a peer; it also specifies how to authenticate a peer.
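As a rough illustration of how an SPD might be consulted, the sketch below treats the SPD as an ordered list of entries and returns the action of the first match. The prefixes, the entry layout, and the default of discarding unmatched traffic are assumptions made for the example, not a description of any particular implementation.

```python
import ipaddress
from typing import NamedTuple


class SPDEntry(NamedTuple):
    src: str      # source prefix
    dst: str      # destination prefix
    action: str   # "PROTECT", "BYPASS", or "DISCARD"


# Hypothetical policy: protect site-to-site traffic, let other traffic bypass IPsec.
SPD = [
    SPDEntry("10.1.1.0/24", "10.2.2.0/24", "PROTECT"),
    SPDEntry("10.1.1.0/24", "0.0.0.0/0", "BYPASS"),
]


def spd_lookup(src: str, dst: str, spd=SPD, default: str = "DISCARD") -> str:
    """Return the action of the first SPD entry that matches the packet."""
    s, d = ipaddress.ip_address(src), ipaddress.ip_address(dst)
    for entry in spd:
        if s in ipaddress.ip_network(entry.src) and d in ipaddress.ip_network(entry.dst):
            return entry.action
    return default


print(spd_lookup("10.1.1.5", "10.2.2.9"))
print(spd_lookup("10.1.1.5", "192.0.2.1"))
```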

SA and Key Management Techniques

IPsec allows two methods for the management of IPsec SAs and keys:

Manual SA and key management: One way of managing IPsec SAs and keys is to manually configure SAs and keying material on IPsec peers. Manual configuration of IPsec SAs and keying material is analogous to the configuration of static routes (though much more involved), and just like the configuration of static routes, the manual configuration of IPsec SAs and keying material is not scalable.
Automated SA and key management through the IKE protocol: The IKE protocol allows IPsec peers to dynamically authenticate each other, generate keying material, and negotiate IPsec SAs.

There are two versions of IKE: IKE Version 1 (IKEv1) and IKE Version 2 (IKEv2). IKEv1 is defined in RFCs 2407, 2408, and 2409, and IKEv2 was originally defined in RFC 4306 (since obsoleted by RFC 5996 and, in turn, by RFC 7296). IKEv2 improves the efficiency and security of IKE and adds extra functionality to the base protocol specification.

IKEv1

IKEv1 is made up of elements of a number of protocols:

The Secure Key Exchange Mechanism (SKEME): Describes a versatile key exchange technique
The Oakley Key Determination Protocol: Describes a series of key exchange techniques (modes), as well as Perfect Forward Secrecy, identity protection, and authentication
The Internet Security Association and Key Management Protocol (ISAKMP): Describes a framework for authentication and key exchange

IKEv1 negotiation is divided into two phases and three modes. In phase 1, IPsec peers establish an IKE SA. This IKE SA is used to protect phase 2 negotiations, which are then used to negotiate IPsec SAs.

IKEv1 phase 1 can be negotiated using main mode or aggressive mode. IKEv1 phase 2, on the other hand, is negotiated using quick mode. It is worth noting that the IKE SA negotiated during phase 1 is bidirectional, but IPsec SAs negotiated during phase 2 are unidirectional.

IKE Phase 1 Main Mode Negotiation

The purpose of IKEv1 phase 1, then, is to establish an IKE SA between two IPsec peers. But how exactly is this accomplished? During IKE phase 1 main mode negotiation, the two IPsec peers exchange three pairs of messages, giving a total of six messages. The function of these messages is as follows:

First pair of messages (messages 1 and 2): These are used to negotiate IKE policy parameters, such as the hash algorithm, encryption algorithm, and method of authentication. These parameters are specified using the crypto isakmp policy priority command.
Second pair of messages (messages 3 and 4): These are used to exchange Diffie-Hellman public values and nonces (random numbers).

The Diffie-Hellman exchange allows the IPsec peers to agree on a shared secret key. The nonce values are used as keying material in the calculation of session keys on the IPsec peers.

The IPsec peers now generate the first of four session keys, called SKEYID. A further three session keys (SKEYID_d, SKEYID_a, and SKEYID_e) are then calculated using SKEYID. IKE phase 2 keys are derived from SKEYID_d. The IPsec peers use SKEYID_a and SKEYID_e, respectively, to authenticate and encrypt the remaining IKE phase 1 and phase 2 messages that they send to each other.
Third pair of messages (messages 5 and 6): These messages are used to exchange identities and authenticate the IPsec peers to each other.

Phase 1 is now complete, and an IKE SA has been established between the IPsec peers.
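The key generation step that follows messages 3 and 4 can be sketched as shown here. The sketch assumes preshared key authentication and HMAC-SHA1 as the pseudo-random function, and it follows the derivation formulas given in RFC 2409; the Diffie-Hellman group is a toy one chosen only to keep the example short (real groups use far larger, standardized primes), and the preshared key, nonces, and cookies are made-up values.

```python
import hashlib
import hmac
import secrets

# Toy Diffie-Hellman parameters (assumption; far too small for real use).
P = 2**127 - 1
G = 3


def prf(key: bytes, data: bytes) -> bytes:
    """Pseudo-random function; HMAC-SHA1 is assumed as the negotiated hash."""
    return hmac.new(key, data, hashlib.sha1).digest()


def dh_keypair():
    """Return a (private value, public value) pair."""
    private = secrets.randbelow(P - 3) + 2
    return private, pow(G, private, P)


def to_bytes(n: int) -> bytes:
    return n.to_bytes((n.bit_length() + 7) // 8 or 1, "big")


def derive_skeyids(psk, ni, nr, g_xy, cky_i, cky_r):
    """Derive SKEYID, SKEYID_d, SKEYID_a, SKEYID_e (preshared key form, RFC 2409)."""
    skeyid = prf(psk, ni + nr)
    tail = g_xy + cky_i + cky_r
    skeyid_d = prf(skeyid, tail + b"\x00")
    skeyid_a = prf(skeyid, skeyid_d + tail + b"\x01")
    skeyid_e = prf(skeyid, skeyid_a + tail + b"\x02")
    return skeyid, skeyid_d, skeyid_a, skeyid_e


# Messages 3 and 4: each peer sends its public value and a nonce.
i_priv, i_pub = dh_keypair()
r_priv, r_pub = dh_keypair()
ni, nr = secrets.token_bytes(16), secrets.token_bytes(16)
cky_i, cky_r = secrets.token_bytes(8), secrets.token_bytes(8)  # ISAKMP cookies

# Each peer combines its own private value with the other peer's public value.
shared_i = pow(r_pub, i_priv, P)
shared_r = pow(i_pub, r_priv, P)

psk = b"hypothetical-preshared-key"
keys_i = derive_skeyids(psk, ni, nr, to_bytes(shared_i), cky_i, cky_r)
keys_r = derive_skeyids(psk, ni, nr, to_bytes(shared_r), cky_i, cky_r)
print(shared_i == shared_r, keys_i == keys_r)
```

Both peers arrive at the same four keys without the shared secret or the keys themselves ever crossing the network.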

Figure 6-24 illustrates IKE main mode negotiation between IPsec VPN gateways.

Figure 6-24. IKE Main Mode Negotiation

Before finishing this section, it is worth taking a look at the three methods that IPsec peers can use to authenticate each other during IKE negotiation:

Preshared keys: A statically configured key that must be identical on peer IPsec gateways.
Encrypted nonces: As the name suggests, this involves the encryption of a nonce with the public key of the peer IPsec VPN gateway. Each peer must possess the other peer's public key prior to IKE negotiation.
Digital signatures (using digital certificates): A digital signature (a hash encrypted using a private key) of certain pieces of information is created and exchanged by the IPsec peers. Each peer then verifies the digital signature on the information using the public key of the other peer.
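The idea behind the digital signature method (a hash transformed with a private key and checked with the matching public key) can be illustrated with textbook RSA. This is a deliberately simplified sketch: the key is built from two small, well-known primes, there is no padding scheme, and the signed message is a made-up placeholder, so it should not be taken as how IKE implementations actually sign data.

```python
import hashlib

# Toy RSA key built from two Mersenne primes (assumption; real keys are
# generated randomly and are much larger). Requires Python 3.8+ for pow(e, -1, m).
P, Q = 2**61 - 1, 2**89 - 1
N = P * Q                              # public modulus
E = 65537                              # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))      # private exponent


def digest(message: bytes) -> int:
    """Hash the message and reduce it into the range of the toy modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message: bytes) -> int:
    """'Encrypt' the hash with the private key."""
    return pow(digest(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    """Recover the hash with the public key and compare it with a fresh hash."""
    return pow(signature, E, N) == digest(message)


signed_data = b"hypothetical IKE authentication data"
signature = sign(signed_data)
print(verify(signed_data, signature), verify(b"tampered data", signature))
```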

Each IPsec peer must not only possess the public key of the other peer but also be sure that the public key is, in fact, the correct public key and not the public key of an imposter. To ensure that the public keys can be trusted, each IPsec peer obtains a signed digital certificate from a certificate authority (CA). A digital certificate is an association between an identity (name) and a public key. A CA is a third party trusted by each IPsec peer.

Figure 6-25 illustrates IPsec VPN gateways enrolling with a CA and obtaining a digital certificate.

Figure 6-25. Enrolling with a CA and Obtaining a Digital Certificate

During IKE negotiation, IPsec peers exchange certificates and in this way obtain the public key of the other peer. Because the certificate is signed by a mutually trusted CA (or CA hierarchy), each IPsec peer can be sure that the public key that it has received from its peer during IKE negotiation is the real public key.

If a certificate becomes invalid for reasons such as key compromise, it is also possible to revoke that certificate. The CA can periodically publish a list of revoked certificates called a Certificate Revocation List (CRL) or make information about revoked certificates available via protocols such as the Online Certificate Status Protocol (OCSP).

IPsec VPN gateways verify the revocation status of peer gateways' certificates during IKE negotiation. If a peer's certificate is found to be invalid (revoked or otherwise invalid), IKE negotiation fails. For more information on digital certificate authentication, see Chapter 7, "Scaling and Optimizing IPsec VPNs."

IKE Phase 1 Aggressive Mode Negotiation

An alternative to main mode negotiation is aggressive mode negotiation. Aggressive mode negotiation consists of three messages rather than the six messages used in main mode.

Aggressive mode negotiation is quicker but less secure than main mode. It is less secure because the identities of the IPsec peers are exchanged unencrypted. As previously discussed, in main mode negotiation, identities are sent during the third exchange of messages (messages 5 and 6), which are encrypted using session key SKEYID_e.

The three messages exchanged during aggressive mode negotiation can be described as follows:

Message 1: This message is sent by the initiator and consists of all the information contained in the initiator's first two messages used for main mode negotiation (IKE policy proposals, Diffie-Hellman public value, and nonce) as well as the initiator's identity.

Note that the initiator is the IPsec peer that initiates IKE negotiation, and the responder is the other IPsec peer.
Message 2: This message is sent by the responder and consists of all the information contained in all three main mode messages sent by the responder (IKE policy acceptance, Diffie-Hellman public value, nonce, and responder's identity). This message also serves to authenticate the responder to the initiator.
Message 3: This message is sent by the initiator and serves to authenticate the initiator to the responder.
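The difference between the two phase 1 modes can be summarized by listing what each message carries. The payload labels below are informal names chosen for this sketch (they are not ISAKMP payload type names), and the keys_ready_after values simply encode the point at which both Diffie-Hellman public values have been exchanged.

```python
# (sender, payloads) for each message, in order.
MAIN_MODE = [
    ("initiator", ["policy proposals"]),
    ("responder", ["policy acceptance"]),
    ("initiator", ["DH public value", "nonce"]),
    ("responder", ["DH public value", "nonce"]),
    ("initiator", ["identity", "authentication"]),
    ("responder", ["identity", "authentication"]),
]
AGGRESSIVE_MODE = [
    ("initiator", ["policy proposals", "DH public value", "nonce", "identity"]),
    ("responder", ["policy acceptance", "DH public value", "nonce",
                   "identity", "authentication"]),
    ("initiator", ["authentication"]),
]


def identity_is_encrypted(exchange, keys_ready_after: int) -> bool:
    """True if the first identity is sent only after session keys can exist."""
    first = next(number for number, (_, payloads) in enumerate(exchange, start=1)
                 if "identity" in payloads)
    return first > keys_ready_after


# Session keys can be calculated once both DH public values have been exchanged:
# after message 4 in main mode, after message 2 in aggressive mode.
print(len(MAIN_MODE), identity_is_encrypted(MAIN_MODE, keys_ready_after=4))
print(len(AGGRESSIVE_MODE), identity_is_encrypted(AGGRESSIVE_MODE, keys_ready_after=2))
```

In aggressive mode the initiator's identity travels in message 1, before any key material exists, which is why it cannot be encrypted.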

Figure 6-26 illustrates IKE aggressive mode negotiation.

Figure 6-26. IKE Aggressive Mode Negotiation

IKE Phase 2 Quick Mode Negotiation

When IKE phase 1 negotiation is complete, phase 2 can begin. The purpose of IKE phase 2 negotiation is, as previously described, to establish IPsec SAs. These IPsec SAs are then used to protect user traffic as it transits the intervening network between the IPsec peers.

IKE phase 2 negotiation consists of three messages:

Message 1: This message is sent by the initiator and contains IPsec SA proposals such as encryption algorithm, hashing algorithm, and IPsec lifetime. IPsec proposals (transforms) are configured using the crypto ipsec transform-set command.
Message 2: This message serves to accept one of the IPsec proposals sent in message 1.
Message 3: This message serves as an acknowledgment of message 2.

Messages 1 and 2 also contain additional keying material (nonces) to be used for the IPsec SAs. Keying material can be used for the authentication key of an AH SA or for the authentication key and/or encryption key for an ESP SA. It is also worth noting that if Perfect Forward Secrecy (PFS) is enabled using the set pfs command, extra Diffie-Hellman public values are also exchanged in messages 1 and 2.

Usually, keying material used for IPsec SAs is (partially) derived from the Diffie-Hellman shared secret established during IKE phase 1, by way of SKEYID_d. If you want more security, however, you can enable PFS and thereby ensure that IPsec keying material is based on a fresh Diffie-Hellman exchange performed during phase 2.
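The effect of PFS on IPsec keying material can be sketched as follows. The derivation follows the KEYMAT formulas in RFC 2409, with HMAC-SHA1 assumed as the pseudo-random function; the SKEYID_d value, nonces, second SPI, phase 2 shared secret, and the 36-byte key length are placeholders invented for the example.

```python
import hashlib
import hmac

PROTO_IPSEC_AH, PROTO_IPSEC_ESP = 2, 3  # protocol IDs from the IPsec DOI (RFC 2407)


def prf(key: bytes, data: bytes) -> bytes:
    """Pseudo-random function; HMAC-SHA1 is assumed."""
    return hmac.new(key, data, hashlib.sha1).digest()


def ipsec_keymat(skeyid_d, protocol, spi, ni, nr, length, g_qm_xy=b""):
    """Derive keying material for one IPsec SA.

    g_qm_xy is the phase 2 Diffie-Hellman shared secret; it is present only
    when PFS is enabled.
    """
    seed = g_qm_xy + bytes([protocol]) + spi.to_bytes(4, "big") + ni + nr
    keymat, block = b"", b""
    while len(keymat) < length:          # K1, K2, ... until enough bytes exist
        block = prf(skeyid_d, block + seed)
        keymat += block
    return keymat[:length]


skeyid_d = hashlib.sha1(b"hypothetical SKEYID_d").digest()
ni, nr = b"phase-2-initiator-nonce", b"phase-2-responder-nonce"

without_pfs = ipsec_keymat(skeyid_d, PROTO_IPSEC_ESP, 0x73B348EE, ni, nr, 36)
with_pfs = ipsec_keymat(skeyid_d, PROTO_IPSEC_ESP, 0x73B348EE, ni, nr, 36,
                        g_qm_xy=b"hypothetical phase 2 DH secret")
other_direction = ipsec_keymat(skeyid_d, PROTO_IPSEC_ESP, 0x1A2B3C4D, ni, nr, 36)
print(len(without_pfs), without_pfs != with_pfs, without_pfs != other_direction)
```

Because the SPI is part of the input, the SA in each direction receives different keys even though both are derived from the same SKEYID_d and nonces.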

If the IPsec peers are gateways negotiating SAs on behalf of end hosts, messages 1 and 2 also contain (proxy) identities. These identities differ from those exchanged during phase 1. Phase 1 identities serve to identify the IPsec peers themselves. Phase 2 identities, on the other hand, describe the traffic to be protected by the IPsec SAs (defined using crypto access lists).

All IKE phase 2 messages are protected using the session keys SKEYID_e and SKEYID_a, which are generated during phase 1.

Figure 6-27 illustrates IKE phase 2 (quick mode) negotiation.

Figure 6-27. IKE Phase 2 (Quick Mode) Negotiation

IKEv2

The IKEv2 base specification includes all the functionality of IKEv1 as well as additional functionality that was tacked on to IKEv1, such as NAT traversal and legacy authentication. IKEv2 also includes improvements in overall efficiency and security.

IKEv2 preserves most of the features of IKEv1, including the two negotiation phases. In IKEv2 phase 1, the IPsec peers negotiate algorithms, establish a secret session key, authenticate each other, and establish an IKE SA, just as in IKEv1. In IKEv2, however, the first IPsec SA is also established during phase 1. In IKEv2, IPsec SAs associated with an IKE SA are known as child SAs.

Figure 6-28 illustrates a typical IKEv2 phase 1 negotiation sequence. In Figure 6-28, the initiator and responder exchange two pairs of IKEv2 Request and Response messages. The first pair of messages (Request 1/Response 1) is used to negotiate cryptographic algorithms such as the encryption algorithm and Diffie-Hellman group. The second pair of Request and Response messages (Request 2/Response 2) is used to exchange identities, authenticate the IPsec peers, and negotiate the first IPsec SA.

Figure 6-28. IKEv2 Phase 1 Negotiation

In IKEv2, phase 2 is responsible for negotiating any further child IPsec SAs that may be required. These additional child IPsec SAs are negotiated using additional pairs of IKE Request and Response messages. Additional IKEv2 Request and Response messages can also be used for sending informational messages. These informational messages can be for purposes such as informing an IPsec peer of the deletion of an IPsec SA.
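A quick way to see the efficiency gain is to count messages. The per-exchange counts below come from the descriptions in this section; the assumption that each additional IKEv2 child SA negotiation costs exactly one Request/Response pair, and the function names, are simplifications made for this sketch.

```python
IKEV1_PHASE1 = {"main": 6, "aggressive": 3}  # messages in each phase 1 mode
IKEV1_QUICK_MODE = 3                         # messages per phase 2 negotiation
IKEV2_INITIAL = 4                            # two Request/Response pairs, first child SA included
IKEV2_EXTRA_CHILD = 2                        # assumed: one pair per further child SA negotiation


def ikev1_messages(phase1_mode: str, ipsec_negotiations: int = 1) -> int:
    """Messages needed for phase 1 plus the given number of quick mode exchanges."""
    return IKEV1_PHASE1[phase1_mode] + IKEV1_QUICK_MODE * ipsec_negotiations


def ikev2_messages(child_sa_negotiations: int = 1) -> int:
    """Messages needed for the initial exchanges plus any further child SA negotiations."""
    return IKEV2_INITIAL + IKEV2_EXTRA_CHILD * (child_sa_negotiations - 1)


print(ikev1_messages("main"), ikev1_messages("aggressive"), ikev2_messages())
```

Under these assumptions, protecting a first traffic flow takes nine messages with IKEv1 main mode, six with aggressive mode, and four with IKEv2.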

Putting It All Together: IPsec Packet Processing

Now that you have seen all of the components, it is time to take a look at outbound and inbound user packet processing on an IPsec VPN gateway.

Outbound Processing

Figure 6-29 illustrates outbound processing of user packets from the inside interface to the outside interface of an IPsec VPN gateway. In Figure 6-29, the IPsec VPN gateway receives a user packet on its inside interface. The gateway checks the packet against the SPD to determine whether the packet should be discarded, should bypass IPsec and be forwarded, or should be protected by IPsec.

Figure 6-29. Outbound Processing of User Packets

The user packet is discarded if the SPD indicates that this should be the case. If the SPD indicates that the user packet should bypass IPsec, the packet is forwarded out of the outside interface.

If the SPD indicates that the user packet should be protected by IPsec, the SAD is consulted for an IPsec SA. If no corresponding IPsec SA exists in the SAD, IKE negotiation is initiated with the remote IPsec peer (assuming that automated SA and key management is enabled and that IKE negotiation is authorized by the PAD). If IKE negotiation is successful, one or more IPsec SAs are installed into the SAD.

The user packet is then encapsulated (protected) according to the SA in the SAD (using AH or ESP). If only one SA (AH or ESP only) corresponds to the user packet, the packet is forwarded on the outside interface.

If more than one SA corresponds to the user packet (for example, AH in addition to ESP), the SAD is again consulted, and the packet is again encapsulated (protected) according to the additional SA (or SAs).

After all IPsec SAs corresponding to the user packet have been processed, the packet is forwarded out of the outside interface to the peer IPsec gateway.
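The outbound decision sequence described above might be sketched as follows. Everything here is illustrative: the prefixes, the SPI numbering, the nested-tuple packet model, and the negotiate_with_ike stand-in (which simply installs SAs rather than running IKE) are assumptions, and PAD authorization is taken for granted.

```python
import ipaddress

# Hypothetical SPD: (source, destination, action, protocols applied innermost first).
SPD = [
    ("10.1.1.0/24", "10.2.2.0/24", "PROTECT", ("ESP", "AH")),
    ("10.1.1.0/24", "0.0.0.0/0", "BYPASS", ()),
]
SAD = {}  # (source prefix, destination prefix, protocol) -> outbound SA


def in_net(address: str, network: str) -> bool:
    return ipaddress.ip_address(address) in ipaddress.ip_network(network)


def negotiate_with_ike(src_net, dst_net, protocols):
    """Stand-in for IKE negotiation: install one outbound SA per protocol."""
    for protocol in protocols:
        SAD[(src_net, dst_net, protocol)] = {"spi": 0x1000 + len(SAD)}


def process_outbound(src: str, dst: str, payload: bytes):
    """Return the packet to forward on the outside interface, or None to discard."""
    policy = next((p for p in SPD if in_net(src, p[0]) and in_net(dst, p[1])), None)
    if policy is None or policy[2] == "DISCARD":
        return None
    src_net, dst_net, action, protocols = policy
    if action == "BYPASS":
        return payload
    # PROTECT: negotiate SAs if any are missing, then encapsulate once per SA.
    if any((src_net, dst_net, proto) not in SAD for proto in protocols):
        negotiate_with_ike(src_net, dst_net, protocols)
    packet = payload
    for proto in protocols:
        packet = (proto, SAD[(src_net, dst_net, proto)]["spi"], packet)
    return packet


protected = process_outbound("10.1.1.5", "10.2.2.9", b"user data")
print(protected)
```

Applying ESP first and AH second leaves the AH header outermost, which matches the header order seen in the AH/ESP packet capture earlier in this section.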

Inbound Processing

Figure 6-30 shows IPsec processing of packets received on the outside interface of an IPsec VPN gateway. As shown in Figure 6-30, the IPsec VPN gateway receives a packet on the outside interface.

Figure 6-30. Inbound Processing of Packets

If the packet is not IPsec protected, the packet is processed according to the SPD. If the SPD specifies that the packet should bypass IPsec, the packet is forwarded out of the inside interface. If the packet is an ICMP packet addressed to the gateway (for example, a ping packet), it is passed to the ICMP process. If the SPD does not specify that the packet should bypass IPsec, and the packet is not an ICMP packet, it is discarded.

If the packet received on the outside interface is an IPsec packet, IPsec protection (AH or ESP header) is removed by referencing the correct SA in the SAD using the SPI in the AH/ESP header.

If the packet is protected by more than one IPsec header (for example, ESP in addition to AH), the SAD is again consulted and the IPsec protection is removed.

After all IPsec protection has been removed, the user packet is forwarded out of the inside interface.
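Inbound processing can be sketched in the same spirit. The packet model (nested tuples of protocol, SPI, and inner packet, with plain bytes standing for an unprotected packet), the two boolean parameters, and the choice to discard packets whose SPI has no matching SA are assumptions made for this illustration; cryptographic verification and decryption are omitted entirely.

```python
# Hypothetical inbound SAD keyed by SPI, using the SPIs quoted earlier in this section.
SAD = {0x0CA310AD: "AH", 0x73B348EE: "ESP"}


def process_inbound(packet, spd_allows_bypass=False, icmp_to_gateway=False):
    """Return (decision, packet) for a packet received on the outside interface."""
    if not isinstance(packet, tuple):        # packet is not IPsec protected
        if spd_allows_bypass:
            return ("forward", packet)
        if icmp_to_gateway:
            return ("icmp", packet)
        return ("discard", None)
    # Remove one layer of protection per IPsec header, outermost first.
    while isinstance(packet, tuple):
        protocol, spi, inner = packet
        if SAD.get(spi) != protocol:         # no matching SA for this SPI
            return ("discard", None)
        packet = inner
    return ("forward", packet)


received = ("AH", 0x0CA310AD, ("ESP", 0x73B348EE, b"user data"))
print(process_inbound(received))
print(process_inbound(b"cleartext"))
```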
