One of the key security requirements for RTP is confidentiality, ensuring that only the intended receivers can decode your RTP packets. RTP content is kept confidential by encryption, either at the application level (encrypting either the entire RTP packet or just the payload section) or at the IP layer.

Application-level encryption has advantages for RTP, but also some disadvantages. The key advantage is that it allows header compression. If only the RTP payload is encrypted, then header compression will work normally, which is essential for some applications (for example, wireless telephony using RTP). If the RTP header is encrypted too, the operation of header compression is disrupted to some extent, but it is still possible to compress the UDP/IP headers.

The other advantage of application-level encryption is that it is simple to implement and deploy, requiring no changes to host operating systems or routers. Unfortunately, this is also a potential disadvantage because it spreads the burden of correct implementation to all applications. Encryption code is nontrivial, and care must be taken to ensure that security is not compromised through poor design or flaws in implementation.
Another potential disadvantage of application-level encryption is that it leaves some header fields unencrypted. In some cases, the lack of encryption might reveal sensitive information. For example, knowledge of the payload type field may allow an attacker to ascertain the values of parts of the encrypted payload data, perhaps because each frame starts with a payload header with a standard format. This should not be a problem, provided that an appropriate encryption algorithm is chosen, but it has the potential to compromise an already weak solution.

As an alternative, encryption can be performed at the IP layer, for example, using the IP security (IPsec) protocols. This approach has the advantage of being transparent to RTP, and of providing a single, presumably well-tested, suite of encryption code that can be trusted to be correct. The disadvantages of IP-layer encryption are that it disrupts the operation of RTP header compression and its deployment requires extensive changes to host operating systems.

Confidentiality Features in the RTP Specification

The RTP specification provides support for encryption of both RTP data packets (including headers) and RTCP packets.

All octets of RTP data packets, including the RTP header and the payload data, are encrypted. Implementations have a choice of the encryption schemes they support. Depending on the encryption algorithm used, it may be necessary to append padding octets to the payload before encryption can be performed. For example, DES encryption [56] operates on blocks of 64 bits, so payloads will need to be padded if they are not multiples of eight octets in length. Figure 13.1 illustrates the process. When padding is used, the P bit in the RTP header is set, and the last octet of the padding indicates the number of padding octets that have been appended to the payload.

Figure 13.1. Standard RTP Encryption of a Data Packet
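The padding rule can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names `pad_rtp_payload` and `strip_rtp_padding` are hypothetical, the filler octets are assumed to be zero (only the final count octet is described in the text), and alignment is computed on the payload alone, as in the description above. An implementation that encrypts the header as well would align the whole packet.

```python
from typing import Tuple


def pad_rtp_payload(payload: bytes, block_size: int = 8) -> Tuple[bytes, bool]:
    """Pad payload to a multiple of block_size; return (padded payload, P bit)."""
    remainder = len(payload) % block_size
    if remainder == 0:
        return payload, False  # already aligned: no padding, P bit stays clear
    pad_len = block_size - remainder
    # Assumption: filler octets are zero; the last octet carries the padding count.
    padding = bytes(pad_len - 1) + bytes([pad_len])
    return payload + padding, True


def strip_rtp_padding(padded: bytes, p_bit: bool) -> bytes:
    """Remove padding at the receiver, using the count in the final octet."""
    if not p_bit:
        return padded
    pad_len = padded[-1]
    if pad_len == 0 or pad_len > len(padded):
        raise ValueError("invalid padding count")
    return padded[:-pad_len]
```

For a 13-octet payload and the 8-octet DES block size, three padding octets are appended and the last of them holds the value 3.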
When RTCP packets are encrypted, a 32-bit random number is inserted before the first packet, as shown in Figure 13.2. This is done to prevent known plain-text attacks. RTCP packets have a standard format with many fixed octets; knowledge that these fixed octets exist makes a wily cracker's work easier because he knows part of what he is looking for in a decrypted packet. The cracker could employ brute-force key guessing, using the fixed octet values in the decryption attempt to determine when to stop.

Figure 13.2. Standard RTP Encryption of a Control Packet
The insertion of the prefix provides initialization for the cipher, which effectively prevents known plain-text attacks. No prefix is used with data packets because there are fewer fixed header fields: The synchronization source is randomly chosen, and the sequence number and timestamp fields have random offsets.

In some cases it is desirable to encrypt only part of the RTCP packets while sending other parts in the clear. The typical example would be to encrypt the SDES items, but leave reception quality reports unencrypted. We can do this by splitting a compound RTCP packet into two separate compound packets. The first includes the SR/RR packets; the second includes an empty RR packet (to satisfy the rule that all compound RTCP packets start with an SR or RR) and the SDES items. (For a review of RTCP packet formats, see Chapter 5, RTP Control Protocol.) Figure 13.3 illustrates the process.

Figure 13.3. Using Standard RTP Encryption to Partially Encrypt a Control Packet
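The splitting rule and the random prefix can be sketched as follows. This is an illustration under stated assumptions: packets are modeled as hypothetical `(packet type, body)` tuples rather than real serialized RTCP, the empty RR is shown with an empty body, and any packet that is not an SR or RR (not only SDES) is assumed to belong in the encrypted part. The function names are invented for the example.

```python
import secrets
from typing import List, Tuple

# RTCP packet type numbers defined by the RTP specification.
SR, RR, SDES = 200, 201, 202

# Illustrative model only: (packet type, opaque body).
RtcpPacket = Tuple[int, bytes]


def split_compound(
    compound: List[RtcpPacket],
) -> Tuple[List[RtcpPacket], List[RtcpPacket]]:
    """Split a compound packet into (sent in the clear, to be encrypted)."""
    clear = [p for p in compound if p[0] in (SR, RR)]
    secret = [p for p in compound if p[0] not in (SR, RR)]
    if secret:
        # Every compound packet must start with an SR or RR,
        # so the encrypted part leads with an empty RR.
        secret.insert(0, (RR, b""))
    return clear, secret


def add_encryption_prefix(serialized: bytes) -> bytes:
    """Prepend the 32-bit random prefix used before encrypting RTCP."""
    return secrets.token_bytes(4) + serialized
```

Only the second compound packet would be passed through `add_encryption_prefix` and then encrypted; the first is sent as is.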
The default encryption algorithm is the Data Encryption Standard (DES) in cipher block chaining mode [56]. When RTP was designed, DES provided an appropriate level of security. However, advances in processing capacity have rendered it weak, so it is recommended that implementations choose a stronger encryption algorithm where possible. Suitable strong encryption algorithms include Triple DES [57] and the Advanced Encryption Standard (AES) [58]. To maximize interoperability, all implementations that support encryption should support DES, despite its weakness.

The presence of encryption and the use of the correct key are confirmed by the receiver through header or payload validity checks, such as those described in the Packet Validation sections of Chapter 4, RTP Data Transfer Protocol, and Chapter 5, RTP Control Protocol.

The RTP specification does not define any mechanism for the exchange of encryption keys. Nevertheless, key exchange is an essential part of any system, and it must be performed during session initiation. Call setup protocols such as SIP [28] and RTSP [14] are expected to provide key exchange, in a form suitable for RTP.

Confidentiality Using the Secure RTP Profile

An alternative to the mechanisms in the RTP specification is provided by the Secure RTP (SRTP) profile [55]. This new profile, designed with the needs of wireless telephony in mind, provides confidentiality and authentication suitable for use with links that may have relatively high loss rate, and that require header compression for efficient operation. SRTP is a work in progress, with the details of the protocol still evolving at the time of this writing. After the specification is complete, readers should consult the final standard to ensure that the details described here are still accurate.

SRTP provides confidentiality of RTP data packets by encrypting just the payload section of the packet, as shown in Figure 13.4.

Figure 13.4. Secure RTP Encryption of a Data Packet
The RTP header, as well as any header extension, is sent without encryption. If the RTP payload format uses a payload header within the payload section of the RTP packet, that payload header will be encrypted along with the payload data. The authentication header is described in the section titled Authentication Using the Secure RTP Profile later in this chapter. The optional master key identifier may be used by the key management protocol, for the purpose of rekeying and identifying a particular master key within the cryptographic context.

When using SRTP, the sender and receiver are required to maintain a cryptographic context, comprising the encryption algorithm, the master and salting keys, a 32-bit rollover counter (which records how many times the 16-bit RTP sequence number has wrapped around), and the session key derivation rate. The receiver is also expected to maintain a record of the sequence number of the last packet received, as well as a replay list (when using authentication). The transport address of the RTP session, together with the SSRC, is used to determine which cryptographic context is used to encrypt or decrypt each packet.

The default encryption algorithm is the Advanced Encryption Standard in either counter mode or f8 mode [58, 59], with counter mode being mandatory to implement. Other algorithms may be defined in the future.

The encryption process consists of two steps:
In both steps, the packet index is the 32-bit extended RTP sequence number. The details of how the key stream is generated depend on the encryption algorithm and mode of operation.

If AES in counter mode is used, the key stream is generated in this way: A 128-bit integer is calculated as follows: $(2^{16} \times \text{packet index}) \oplus (\text{salting key} \times 2^{16}) \oplus (\text{SSRC} \times 2^{64})$. The integer is encrypted with the session key, resulting in the first output block of the key stream. The integer then is incremented modulo $2^{128}$, and the block is again encrypted with the session key. The result is the second output block of the key stream. The process repeats until the key stream is at least as long as the payload section of the packet to be encrypted. Figure 13.5 shows this key-stream generation process.

Figure 13.5. Key-Stream Generation for SRTP: AES in Counter Mode
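The counter-mode construction described above can be sketched in Python. This is a structural illustration only. The Python standard library has no AES, so `standin_block_encrypt` is a keyed-hash stand-in that is NOT AES and must not be used for real encryption; a real implementation would pass an AES block-encryption function in its place. All function names are hypothetical, and the salting key is assumed to be supplied as an integer.

```python
import hashlib
from typing import Callable

BLOCK = 16                 # AES block size in octets
MASK128 = (1 << 128) - 1   # keeps the counter within 128 bits


def initial_counter(packet_index: int, salting_key: int, ssrc: int) -> int:
    """(2^16 x packet index) XOR (salting key x 2^16) XOR (SSRC x 2^64)."""
    return ((packet_index << 16) ^ (salting_key << 16) ^ (ssrc << 64)) & MASK128


def standin_block_encrypt(key: bytes, block: bytes) -> bytes:
    # Stand-in only: a truncated keyed hash, NOT AES. It lets the sketch run
    # with the standard library; substitute real AES block encryption.
    return hashlib.sha256(key + block).digest()[:BLOCK]


def key_stream(
    session_key: bytes,
    packet_index: int,
    salting_key: int,
    ssrc: int,
    length: int,
    encrypt_block: Callable[[bytes, bytes], bytes] = standin_block_encrypt,
) -> bytes:
    """Generate at least `length` octets of key stream, then truncate."""
    counter = initial_counter(packet_index, salting_key, ssrc)
    out = bytearray()
    while len(out) < length:
        out += encrypt_block(session_key, counter.to_bytes(BLOCK, "big"))
        counter = (counter + 1) & MASK128  # increment modulo 2^128
    return bytes(out[:length])


def xor_payload(payload: bytes, stream: bytes) -> bytes:
    """Encrypt or decrypt: XOR the payload with the key stream."""
    return bytes(a ^ b for a, b in zip(payload, stream))
```

Because encryption is an XOR with the key stream, applying `xor_payload` a second time with the same key stream recovers the original payload.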
When implementing AES in counter mode, you must ensure that each packet is encrypted with a unique key stream (the presence of the packet index and SSRC in the key stream derivation function ensures this). If you accidentally encrypt two packets using the same key stream, the encryption becomes trivial to break: XORing the two encrypted packets together cancels the key stream and leaves the XOR of the two plain texts, from which the plain text is usually easy to recover (remember from the discussion of parity FEC in Chapter 9, Error Correction, that A XOR B XOR B = A).

If AES in f8 mode is used, the key stream is generated in this way: The XOR of the session key and a salting key is generated, and it is used to encrypt the initialization vector. If the salting key is less than 128 bits in length, it is padded with alternating zeros and ones (0x555...) to 128 bits. The result is known as the internal initialization vector. The first block of the key stream is generated as the XOR of the internal initialization vector and a 128-bit variable (j = 0), and the result is encrypted with the session key. The variable j is incremented, and the second block of the key stream is generated as the XOR of the internal initialization vector, the variable j, and the previous block of the key stream. The process repeats, with j incrementing each time, until the key stream is at least as long as the payload section of the packet to be encrypted. Figure 13.6 shows this key-stream generation process.

Figure 13.6. Key-Stream Generation for SRTP: AES in f8 Mode
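The earlier warning about key-stream reuse can be demonstrated in a few lines. The sketch below is illustrative only: the key stream is random bytes rather than real AES output, and the packet contents and the helper name `xor_bytes` are made up for the example.

```python
import secrets


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bitwise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


# One key stream, mistakenly used for two different packets.
key_stream = secrets.token_bytes(16)
packet_a = b"transfer $100 ok"
packet_b = b"meet me at noon!"

cipher_a = xor_bytes(packet_a, key_stream)
cipher_b = xor_bytes(packet_b, key_stream)

# The eavesdropper XORs the two ciphertexts: the key stream cancels out,
# leaving the XOR of the two plain texts.
leaked = xor_bytes(cipher_a, cipher_b)

# If one plain text is known or guessable, the other follows immediately.
recovered_b = xor_bytes(leaked, packet_a)
```

No key material is needed at any point in the attack, which is why each packet must get its own key stream.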
The default encryption algorithm and mode is AES in counter mode; use of AES f8 mode can be negotiated during session initiation.

SRTP also provides confidentiality of RTP control packets. The entire RTCP packet is encrypted, excluding the initial common header (the first 64 bits of the packet) and several additional fields that are added to the end of each RTCP packet, as shown in Figure 13.7. The additional fields are an SRTCP (Secure RTCP) index, a bit to indicate if the payload is encrypted (the E bit), an optional master key identifier, and an authentication header (described in the section titled Authentication Using the Secure RTP Profile later in this chapter).

Figure 13.7. Secure RTP Encryption of a Control Packet
Encryption of RTCP packets proceeds in much the same way as encryption of RTP data packets does, but using the SRTCP index in place of the extended RTP sequence number. The encryption prefix applied during standard RTCP encryption is not used with SRTP (the differences in encryption algorithm mean that the prefix offers no benefit). It is legal to split RTCP packets into encrypted and unencrypted packets, as can be done with standard RTCP encryption, with the E bit in the SRTCP packet indicating which packets are encrypted.

As with the RTP specification, the SRTP profile defines no mechanism for exchange of encryption keys. Keys must be exchanged via non-RTP means, for example, within SIP or RTSP. The master key identifier may be used to synchronize changes of master keys.

Confidentiality Using IP Security

In addition to the application-level security provided by standard RTP and Secure RTP, it is possible to use IP-level security [17, 110] with RTP. IPsec is implemented as part of the operating system network stack or in a gateway, not by applications. It provides security for all communications from a host, and it is not RTP-specific.

IP security (IPsec) has two modes of operation: transport mode and tunnel mode. Transport mode inserts an additional header between IP and the transport header, providing confidentiality of the TCP or UDP header and payload, but leaving the IP header untouched. Tunnel mode encapsulates the entire IP datagram inside a security header. Figure 13.8 illustrates the differences between the two modes of operation. IP security in tunnel mode is most commonly used to build virtual private networks, tunneling between two gateway routers to securely extend an intranet across the public Internet. Transport mode is used when end-to-end security between individual hosts is desired.

Figure 13.8. Transport Mode versus Tunnel Mode IPsec (Shaded Data Is Protected)
Both tunnel mode and transport mode support confidentiality and authentication of packets. Confidentiality is provided by a protocol known as the Encapsulating Security Payload (ESP) [21]. ESP comprises an additional header and trailer added to each packet. The header includes a security parameter index and sequence number; the trailer contains padding and an indication of the type of the encapsulated data (TCP or UDP if transport mode is used, IP-in-IP if tunnel mode is used). Encapsulated between the header and trailer is the protected data, including the remaining headers. Figure 13.9 shows the encapsulation process, header, and trailer.

Figure 13.9. An Encapsulating Security Payload Packet
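The ESP layout can be sketched with Python's `struct` module. This is a minimal sketch of the field layout before encryption, with no encryption or authentication data. The function names are hypothetical. Two details come from the ESP specification rather than from the description above and should be read as assumptions here: the trailer carries a one-octet pad length just before the next header octet, and the padding octets count up from 1. The header fields are packed as two 32-bit values.

```python
import struct
from typing import Tuple

UDP = 17  # IP protocol number for UDP, used as the "next header" value


def build_esp_plain(
    spi: int, seq: int, payload: bytes, next_header: int, block_size: int = 8
) -> bytes:
    """Lay out ESP header, payload, and trailer (unencrypted, unauthenticated)."""
    # The trailer adds pad-length and next-header octets, so pad until
    # payload + padding + 2 is a multiple of the cipher block size.
    pad_len = (-(len(payload) + 2)) % block_size
    padding = bytes(range(1, pad_len + 1))  # assumption: 1, 2, 3, ... filler
    header = struct.pack("!II", spi, seq)   # two 32-bit fields, network order
    trailer = padding + struct.pack("!BB", pad_len, next_header)
    return header + payload + trailer


def parse_esp_plain(packet: bytes) -> Tuple[int, int, bytes, int]:
    """Recover (SPI, sequence number, payload, next header) from the layout."""
    spi, seq = struct.unpack("!II", packet[:8])
    pad_len, next_header = struct.unpack("!BB", packet[-2:])
    payload = packet[8 : len(packet) - 2 - pad_len]
    return spi, seq, payload, next_header
```

In a real ESP packet everything after the 8-octet header (payload, padding, pad length, and next header) would be encrypted, which is why the receiver can read the next header field only after decryption.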
The security parameter index (SPI) and sequence number are 32-bit fields. The SPI is used to select the cryptographic context, and hence the decryption key to be used. The sequence number increments by one with each packet sent and is used to provide replay protection (see the section titled Replay Protection later in this chapter). Following these two header fields is the encapsulated payload: a UDP header followed by an RTP header and payload if transport mode is used; IP/UDP/RTP headers and payload if tunnel mode is used.

Following the payload data is padding, if required, and a "next header" field. This last field determines the type of the encapsulated header. Its name is somewhat misleading, given that the header to which this field refers is actually sent earlier in the packet. Finally, optional authentication data completes the packet (see the section titled Authentication Using IP Security later in this chapter).

ESP encrypts the protected data section of the packet, using a symmetric algorithm (DES is mandatory to implement; other algorithms may be negotiated). If ESP is used with RTP, the entire RTP header and payload will be encrypted, along with the UDP headers (and IP headers if tunnel mode is used).

It is not possible to use header compression with IP security in transport mode. If tunnel mode is used, the inner IP/UDP/RTP headers may be compressed before encryption and encapsulation. Doing so largely removes the bandwidth penalty due to the IPsec headers, but it does not achieve the efficiency gains expected of header compression. If bandwidth efficiency is a goal, application-level RTP encryption should be used.

IP security may also cause difficulty with some firewalls and Network Address Translation (NAT) devices. In particular, IP security hides the TCP or UDP headers, replacing them with an ESP header.
Firewalls are typically configured to block all unrecognized traffic, in many cases including IPsec (the firewall has to be configured to allow ESP [IP protocol 50], in addition to TCP and UDP). Related problems occur with NAT because translation of TCP or UDP port numbers is impossible if they are encrypted in an ESP packet. If firewalls and NAT boxes are present, application-level RTP encryption may be more successful.

The IP security protocol suite includes an extensive signaling protocol, the Internet Key Exchange (IKE), used to set up the necessary parameters and encryption keys. The details of IKE are beyond the scope of this book.

Other Considerations

Several RTP payload formats provide coupling between packets, for example, when interleaving or forward error correction is being used. Coupling between packets may affect the operation of encryption, restricting the times when it is possible to change the encryption key. Figure 13.10 shows an example of interleaving that illustrates this problem.

Figure 13.10. Interactions between Encryption and Interleaving
The confidentiality mechanisms available for RTP can use a range of encryption algorithms, but they define a "must implement" algorithm to ensure interoperability. In many cases the mandatory algorithm is the Data Encryption Standard (DES). Advances in computational power have made DES seem relatively weak (recent systems have demonstrated brute-force cracking of DES in less than 24 hours), so it is appropriate to negotiate a stronger algorithm if possible. Triple DES [57] and the recently announced Advanced Encryption Standard (AES) [58] are suitable possibilities.

The use of end-to-end encryption to ensure confidentiality in RTP is effective at preventing unauthorized access to content, whether that content is pay-per-view TV or private telephone conversations. This protection is generally desirable, but it does have some wider implications. In particular, there is an ongoing debate in some jurisdictions regarding the ability of law enforcement officials to wiretap communications. Widespread use of encryption as a confidentiality measure in RTP makes such wiretaps, along with other forms of eavesdropping, more difficult. In addition, some jurisdictions restrict the use, or distribution, of products that include encryption. You should understand the legal and regulatory issues regarding use of encryption in your jurisdiction before implementing the confidentiality measures described in this chapter.