IPSec Protocols | Upgrading and Repairing Networks (5th Edition)

As noted previously, IPSec is the emerging standard being adopted by a growing number of VPN vendors. IPSec was derived from concepts originally designed to secure communications in the next generation of the IP protocol, IPv6, which is gradually being rolled out.

Although Microsoft chooses to use L2TP and IPSec in combination as its VPN solution for Windows 2000 and Windows XP, many hardware and software vendors are sticking with a simple IPSec solution.

The good news is that if you decide on an all-IPSec solution, you can be virtually assured that equipment (or software) from different vendors will work together. If you have an all-Windows server environment, this might be of no concern. For those who operate multiprotocol networks, IPSec might be the best choice. As noted previously, IPSec is also the most widely supported VPN protocol on handheld devices.

IPSec is a standard defined in several Request for Comments (RFC) documents. IPSec is transparent to the end user and can traverse the Internet using standard IPv4 routers and other equipment without requiring any modification because it operates at the Network layer. IPSec is also flexible, allowing for the negotiation and use of many different encryption and authentication techniques.

The three main components of IPSec are the following:

Internet Key Exchange (IKE): This protocol builds on the framework defined in RFC 2408, "Internet Security Association and Key Management Protocol (ISAKMP)," and is itself specified in RFC 2409. It defines a method for the secure exchange of the initial encryption keys between the two endpoints of the VPN link.
Authentication Header (AH): This protocol, originally defined in RFC 1826, "IP Authentication Header," and revised in RFC 2402, inserts an additional header into a standard IPv4 packet. That header can be used to verify the integrity of the header information and payload as the packet makes its way through the Internet. AH does not encrypt the actual IP payload data, but instead provides a mechanism to determine whether the payload or header has been tampered with.
Encapsulating Security Payload (ESP): This protocol performs the actual encryption of the data carried in the IP packet so that it cannot be understood by anyone who might intercept your data stream.

Internet Key Exchange (IKE)

IKE defines the mechanism used by the endpoints of the VPN to establish a secure connection and exchange encryption keys and other information pertinent to a secure connection. IKE uses public-key techniques that were discussed in the preceding chapter. If you recall, the public key half of a key pair can be known by anyone, as long as the private-key half of the key pair remains a secret. Thus, each end of the connection can use the other end's public key to encrypt data, which can then be read only by the other end of the connection that holds the private key that can unlock the data.

IKE provides for the establishment of a security association (SA), which is the set of data that governs the particular connection. SAs are unidirectional; that is, each side negotiates an SA with the opposite end of the link. Think of it as a contract between the endpoints. The items that are negotiated by IKE for an SA include these:

The encryption algorithm to be used on the link: This can be DES (Data Encryption Standard), triple-DES, and so on.
The hash algorithm: Message Digest 5 (MD5) or Secure Hash Algorithm (SHA) is used to ensure the integrity of data transferred.
An authentication method: This is the method the endpoints use to prove their identities to each other, such as a preshared key or digital certificates.
A Diffie-Hellman group: Diffie-Hellman takes its name from the inventors of public-key cryptography. Each group specifies the size of the prime number used for the key exchange. Group 1 (a 768-bit prime) is considered easier to break than Group 2 (a 1024-bit prime), and so on. Both sides of the exchange must use the same Diffie-Hellman group, of course.

	Encryption is detailed in Chapter 47, "Encryption Technology."

Diffie-Hellman is a key-agreement method rather than an encryption method. Each endpoint generates a private (secret) value and derives a public value from it. The two endpoints exchange their public values, and each combines its own private value with the other side's public value to compute the same shared secret. Anyone can see the public values because working backward from them to the shared secret is computationally infeasible when the group is large enough.

Using this process, a master secret key is established so that further encryption can use symmetric encryption, which is much faster than public-key encryption, to protect data on the link.
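As a rough illustration of how two endpoints can arrive at the same secret without ever sending it across the link, here is a minimal Python sketch of a Diffie-Hellman exchange. The prime 23 and generator 5 are toy values chosen for readability, and the function names are hypothetical; real Diffie-Hellman groups use primes hundreds of digits long.

```python
import secrets

# Toy group parameters (assumed for illustration; far too small for real use).
PRIME = 23
GENERATOR = 5


def make_keypair(prime=PRIME, generator=GENERATOR):
    """Return (private_value, public_value) for one endpoint."""
    # Private value is chosen at random in the range [2, prime - 2].
    private_value = secrets.randbelow(prime - 3) + 2
    public_value = pow(generator, private_value, prime)
    return private_value, public_value


def shared_secret(own_private, peer_public, prime=PRIME):
    """Combine our private value with the peer's public value."""
    return pow(peer_public, own_private, prime)


# Each endpoint generates its own pair and sends only the public value.
a_private, a_public = make_keypair()
b_private, b_public = make_keypair()

# Both sides compute the same secret independently.
assert shared_secret(a_private, b_public) == shared_secret(b_private, a_public)
```

The shared value would then be fed into a key-derivation step to produce the symmetric keys; that step is omitted here.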

After both sides have authenticated themselves to the other side, negotiations take place to determine whether AH or ESP will be used, what hashing algorithm will be used, and what encryption algorithm will be used (if ESP is used).
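The negotiation can be pictured as the initiator offering an ordered list of proposals and the responder accepting the first one it also supports. The sketch below is a simplification under that assumption; the `Proposal` type and `negotiate` function are hypothetical names, not part of any IKE implementation.

```python
from typing import NamedTuple, Optional, Sequence


class Proposal(NamedTuple):
    """One candidate set of SA parameters (illustrative fields only)."""
    protocol: str              # "AH" or "ESP"
    hash_algorithm: str        # e.g. "MD5" or "SHA"
    encryption: Optional[str]  # e.g. "DES" or "3DES"; None when AH is used


def negotiate(offered: Sequence[Proposal],
              supported: Sequence[Proposal]) -> Optional[Proposal]:
    """Return the first offered proposal the responder supports, else None."""
    for proposal in offered:
        if proposal in supported:
            return proposal
    return None


# The initiator prefers ESP with triple-DES but will fall back to DES.
initiator_offers = [
    Proposal("ESP", "SHA", "3DES"),
    Proposal("ESP", "MD5", "DES"),
]
responder_supports = [
    Proposal("ESP", "MD5", "DES"),
    Proposal("AH", "SHA", None),
]

agreed = negotiate(initiator_offers, responder_supports)
print(agreed)
```

If no proposal matches, `negotiate` returns `None`, which corresponds to the two ends failing to establish an SA.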

The actual mechanics of this exchange are a little more complicated. The Oakley protocol (defined in RFC 2412) is used by IKE to define such things as the prime number groups that are used for the public-key generation, and to decide whether certificate-based authentication will be used. A security parameters index (SPI) value is used, along with an IP address and the security protocol, to uniquely identify a specific SA. Using IKE, the value for the SPI is a pseudorandomly generated number.
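To make the identification scheme concrete, the following sketch stores SAs in a dictionary keyed by the SPI, the destination IP address, and the security protocol. The class and function names are hypothetical, and the table is a plain in-memory dictionary rather than anything a real IPSec stack would use.

```python
import secrets
from dataclasses import dataclass


@dataclass(frozen=True)
class SecurityAssociation:
    spi: int
    destination: str     # destination IP address
    protocol: str        # "AH" or "ESP"
    encryption: str
    hash_algorithm: str


def new_spi() -> int:
    """Pick a pseudorandom 32-bit SPI, skipping the values 0 through 255."""
    return secrets.randbelow(2**32 - 256) + 256


# Hypothetical SA table, keyed by (SPI, destination address, protocol).
sa_table = {}


def add_sa(sa: SecurityAssociation) -> None:
    sa_table[(sa.spi, sa.destination, sa.protocol)] = sa


def lookup_sa(spi: int, destination: str, protocol: str):
    """Return the matching SA, or None if the packet matches no SA."""
    return sa_table.get((spi, destination, protocol))


sa = SecurityAssociation(new_spi(), "192.0.2.10", "ESP", "3DES", "SHA")
add_sa(sa)
assert lookup_sa(sa.spi, "192.0.2.10", "ESP") is sa
```

Because SAs are unidirectional, a two-way connection would need one entry for each direction.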

The Authentication Header (AH)

AH and ESP are the two basic IPSec protocols that are used after IKE has established an SA. AH provides a mechanism to ensure the integrity of the IP header and the payload of the IP packet that will be transported across an untrusted link, such as the Internet. Even so, AH cannot guarantee the integrity of the entire IP header, because some fields in the IP header (the time-to-live value, for example) are legitimately changed by routers as the packet passes through the network.

	For more information about fields that make up an IP header, see Chapter 24, "Overview of the TCP/IP Protocol Suite."

The AH is inserted directly after the IP header in an IPv4 packet and is composed of several important fields:

Next Header: This 8-bit field is used to identify the protocol that follows the header. If only AH is being used without ESP, this field typically contains the protocol number for TCP (6) because TCP carries most Internet traffic.
Length: This 8-bit field specifies the length of the AH in 32-bit words. In the RFC 2402 version of the header, the value stored is the number of 32-bit words minus 2.
Reserved: This 16-bit field is not currently used and must be zero-filled according to the standard.
Security Parameters Index (SPI): This 32-bit field contains a number used to identify the SA. A value of 0 indicates that no SA exists, whereas the numbers 1 through 255 are reserved by the IANA (Internet Assigned Numbers Authority).
Sequence Number: This 32-bit field is used as a counter to keep track of packets that belong to a particular SPI. The counter is incremented once for each packet sent. This is useful for preventing replay attacks, in which an attacker captures a valid packet and sends it again later.
Authentication Data: This is a variable-length field that contains the data used to authenticate the packet, namely the hash value calculated over the packet. If this field does not end on a 32-bit boundary, it's padded to adjust its length.
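The field layout above maps naturally onto Python's `struct` module. The sketch below packs and unpacks an AH under the assumption that the header follows the RFC 2402 layout, in which the length byte holds the total number of 32-bit words minus 2; the function names are hypothetical.

```python
import struct

# Next Header (8 bits), Length (8), Reserved (16), SPI (32), Sequence Number (32),
# all in network byte order.
AH_FIXED = struct.Struct("!BBHII")


def pack_ah(next_header: int, spi: int, sequence: int, auth_data: bytes) -> bytes:
    """Build an AH; auth_data is zero-padded to a 32-bit boundary."""
    auth_data += b"\x00" * (-len(auth_data) % 4)
    total_words = (AH_FIXED.size + len(auth_data)) // 4
    # Assumption: RFC 2402 convention, length field = total words minus 2.
    return AH_FIXED.pack(next_header, total_words - 2, 0, spi, sequence) + auth_data


def unpack_ah(data: bytes) -> dict:
    """Split an AH back into its fields."""
    next_header, length, reserved, spi, sequence = AH_FIXED.unpack_from(data)
    total_bytes = (length + 2) * 4
    return {
        "next_header": next_header,
        "length": length,
        "reserved": reserved,
        "spi": spi,
        "sequence": sequence,
        "auth_data": data[AH_FIXED.size:total_bytes],
    }


# Protocol number 6 is TCP; the 12 bytes stand in for a 96-bit hash value.
header = pack_ah(next_header=6, spi=0x1234, sequence=1, auth_data=b"\xaa" * 12)
fields = unpack_ah(header)
print(len(header), fields["length"], hex(fields["spi"]))
```

With 12 bytes of authentication data the header is 24 bytes, or six 32-bit words, so the length field holds 4.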

As mentioned earlier, the AH is used to provide an integrity check to determine whether the actual header or payload has been tampered with during transit. It does this by using a hashing algorithm to provide a digital signature for the packet. AH does not encrypt the payload data. If a packet is received and the AH indicates that the packet has been tampered with, the packet is discarded. MD5 and SHA are the two basic hashing algorithms typically used. It is beyond the scope of this book to discuss the details of these algorithms, but rest assured that they are complex formulas that take a variable amount of information and reduce it to a fixed-length unit of data. The hash value can be calculated at each end of the connection to determine whether anything in the packet has changed. Thus, AH provides a method for ensuring the integrity of the packet, but not for keeping its contents secret.
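The integrity check can be sketched with Python's standard `hmac` and `hashlib` modules. This example assumes a keyed hash (HMAC-SHA-1 truncated to 96 bits) and a shared key already agreed on by both ends; it hashes the raw bytes as given, whereas a real implementation would first zero out the header fields that routers are allowed to change. The function names are hypothetical.

```python
import hashlib
import hmac


def compute_icv(key: bytes, packet: bytes, icv_len: int = 12) -> bytes:
    """Reduce a packet of any size to a fixed-length keyed hash value."""
    return hmac.new(key, packet, hashlib.sha1).digest()[:icv_len]


def verify_icv(key: bytes, packet: bytes, received_icv: bytes) -> bool:
    """Recalculate the hash at the receiving end and compare."""
    expected = compute_icv(key, packet, icv_len=len(received_icv))
    return hmac.compare_digest(expected, received_icv)


key = b"k" * 20                      # stand-in for a key negotiated by IKE
packet = b"header and payload bytes"
icv = compute_icv(key, packet)

print(verify_icv(key, packet, icv))                        # untouched packet
print(verify_icv(key, b"header and PAYLOAD bytes", icv))   # altered in transit
```

A packet that fails the check would be discarded. Note that the payload is still sent in the clear; only its integrity is protected.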

AH can also be used in a Windows environment to ensure that only computers holding certificates issued by the administrator can communicate within the network. Because the administrator controls the distribution of certificates, rogue computers (those connected to the network without the administrator's permission) won't be able to use AH as long as computers are required to authenticate themselves to each other with certificates.

For a truly secure VPN connection, ESP must be used.

Encapsulating Security Payload (ESP)

ESP is used to encrypt the payload, or the actual IP packet that is carried in the data portion of the packet. It operates in two modes: transport and tunnel.

In transport mode, ESP provides protection for the payload and for headers created by upper-level protocols, such as TCP, that ride inside the IP packet. In this mode, nothing is done to protect the header information of the IP packet that serves as the workhorse to get the data from here to there. This is an efficient method for encrypting the contents of the IP packet when bandwidth constraints are important.

When operating in tunnel mode, ESP is used between two IPSec gateways (such as a set of routers or firewalls) and it protects the IP header information. The entire IP datagram, including the IP header and its payload (usually an upper-level protocol such as TCP or UDP, the User Datagram Protocol), is encrypted and encapsulated by the ESP protocol. New header information is added to the resulting packet that identifies the endpoints of the transfer (the two gateways), but the true source address, destination address, and other packet information carried inside the ESP packet is protected. At the destination gateway, this outer wrapper of information is removed, the contents of the packet are decrypted, and the original IP packet is sent out onto the network to which the gateway is attached.

When in tunnel mode, the ESP header information is inserted directly before the IP or other protocol datagram that is to be protected. The datagram being protected is encrypted (according to methods set up by the SA), and additional headers are added in clear text format so that the new IP datagram can be transported to the appropriate gateway. In other words, the original protocol datagram is encrypted, the ESP header is added, and, finally, a new IP datagram is created to transport this conglomeration to its destination gateway point.

At the receiving gateway, this outer IP header information is stripped off, and according to the parameters defined by the SA, the protected payload of the original datagram is decrypted.

When in transport mode, the ESP header information follows the other header information of an IP datagram. Usually this is an authentication header that has been inserted to protect the integrity of the packet. The upper-level (Transport layer) header information follows the ESP header information. Any information following the ESP header, including the Transport layer headers, is encrypted according to the method described by the SA, and the packet is sent on its way. Note that this method does not use a gateway, so the clear text IP header at the front of the packet contains the actual destination address of the encapsulated datagram. This is the main difference between transport mode and tunnel mode. However, ESP can be used, as just mentioned, in conjunction with AH to protect the integrity of the IP header information.

At the receiving end of the communication path, this clear text header information is saved, the contents of the encrypted packet are decrypted and reassembled with the correct IP header information, and the packet is sent on its way onto the network.
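The difference between the two modes comes down to which parts of the packet travel in clear text and which are encrypted. The sketch below simply lists the pieces in order for each mode; it performs no encryption, and the function name and labels are hypothetical.

```python
def esp_layout(mode: str, with_ah: bool = False):
    """Return (clear_text_parts, encrypted_parts) in on-the-wire order."""
    upper_layer = ["TCP/UDP header", "data"]
    ah = ["AH"] if with_ah else []

    if mode == "transport":
        # The original IP header stays in the clear and carries the
        # real destination address.
        clear = ["original IP header"] + ah + ["ESP header"]
        encrypted = upper_layer + ["ESP trailer"]
    elif mode == "tunnel":
        # A new outer header addresses the gateway; the original IP
        # header is hidden inside the encrypted portion.
        clear = ["new IP header (gateway)"] + ah + ["ESP header"]
        encrypted = ["original IP header"] + upper_layer + ["ESP trailer"]
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    return clear, encrypted


for mode in ("transport", "tunnel"):
    clear, encrypted = esp_layout(mode)
    print(mode, "| clear:", clear, "| encrypted:", encrypted)
```

Passing `with_ah=True` adds an authentication header ahead of the ESP header, matching the combined use of AH and ESP described above.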

ESP uses both a header and a trailer to encapsulate datagrams that it protects. The header consists of an SPI, like the one used by AH, to identify the security association, and a sequence number that identifies each packet so that duplicate (replayed) packets can be detected and rejected. The trailer consists of padding from 0 to 255 bytes to make sure that the datagram ends on a 32-bit boundary. This is followed by a field that specifies the length of the padding that was attached so that it can be removed by the receiver. Following this field is a Next Header field, which is used to identify the protocol that is enveloped as the payload.
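The padding arithmetic in the trailer is easy to get wrong, so a short sketch may help. It assumes 4-byte alignment and the default padding pattern from RFC 2406 (bytes counting up from 1); a block cipher such as DES would need `align=8` instead. The function names are hypothetical.

```python
import struct


def build_esp_trailer(payload_len: int, next_header: int, align: int = 4) -> bytes:
    """Return padding + Pad Length + Next Header for a payload of the given size."""
    # The payload, the padding, and the two 1-byte trailer fields together
    # must end on an alignment boundary.
    pad_len = -(payload_len + 2) % align
    padding = bytes(range(1, pad_len + 1))
    return padding + struct.pack("!BB", pad_len, next_header)


def strip_esp_trailer(plaintext: bytes):
    """Remove the trailer from decrypted data; return (payload, next_header)."""
    pad_len, next_header = plaintext[-2], plaintext[-1]
    payload = plaintext[:len(plaintext) - 2 - pad_len]
    return payload, next_header


payload = b"abcde"                                   # 5 bytes
trailer = build_esp_trailer(len(payload), next_header=6)
protected = payload + trailer                        # what would be encrypted

assert len(protected) % 4 == 0
assert strip_esp_trailer(protected) == (payload, 6)
```

Here the 5-byte payload plus the two trailer fields comes to 7 bytes, so a single padding byte brings the total to 8.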

Additionally, ESP can include an authentication trailer that contains data used to verify the identity of the sender and the integrity of the message. This Integrity Check Value (ICV) is calculated based on the ESP header information, as well as the payload and the ESP trailer. The layout of an ESP datagram is shown in Figure 46.1.

Figure 46.1. The format of an ESP datagram.

As you can see, the ICV attached to the end of the packet is not encrypted. Instead, it is a value calculated on the contents of the rest of the ESP-encapsulated packet. The receiving end of the VPN can recalculate this value to determine whether the contents of the ESP header, or its payload, have been compromised during transit.