Load Balancing RTSP Streaming Media | Optimizing Network Performance with Content Switching: Server, Firewall and Cache Load Balancing

The delivery of either real-time or delayed feed streaming media can place heavy compute demands on object servers. As we saw in Chapter 3, the delivery of streaming audio and video data requires not only a combination of TCP- and UDP-based delivery channels, but also potentially the interpretation and encoding of a live video or audio feed. The attraction of load balancing standard HTTP traffic is increased manyfold when you consider the horsepower required to deliver high-quality media streaming. However, as we've seen with other multichannel protocols such as FTP, the load balancing of RTSP has issues all its own.

Load Balancing RTSP at Layer 4 Only

Before we look at the subtleties of RTSP load balancing at Layer 7, let's look at the basic frame flow when we use Layer 4 only. Figure 6-18 shows the basic traffic flow for Layer 4 load balancing of RTSP and RTP traffic. If we consider the steps in turn, we'll see where the problem lies.

1. The client initiates a TCP-based RTSP connection to the VIP on the content switch. Where only Layer 4 load balancing is used, the VIP will be configured with a single service port of 554 for RTSP. Typically, no form of delayed binding is used and the connection is load balanced to a selected server.
2. The client and server exchange OPTIONS, DESCRIBE, and SETUP messages to agree on the RTP and RTCP delivery mechanisms and other variables.
3. The client issues a PLAY command to instruct the server to commence delivery of the UDP-based RTP and RTCP frames containing the data and control, respectively.
4. Here is where we see the potential problem. As the server initiates the UDP (or potentially TCP) based RTP delivery, the content switch has no understanding of this flow, nor any way to associate it with the RTSP connection established in step 1. As a result, the source IP addresses on the RTP and RTCP UDP streams from the server to the client are not translated to the VIP on the content switch.
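The steps above can be modeled with a toy session table. The following is only a sketch of the general idea, not any vendor's implementation: the class `Layer4Switch`, the `Flow` record, and all IP addresses and port numbers are hypothetical values chosen for illustration.

```python
from dataclasses import dataclass

# All addresses below are hypothetical, chosen only for illustration.
VIP = "203.0.113.10"        # virtual IP on the content switch
REAL_SERVER = "10.0.0.21"   # real address of the selected object server
CLIENT = "198.51.100.7"     # client address


@dataclass(frozen=True)
class Flow:
    proto: str
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


class Layer4Switch:
    """Toy content switch that only knows the sessions it load balanced."""

    def __init__(self):
        # Entries: (proto, client_ip, client_port, server_ip, server_port)
        self.sessions = set()

    def client_to_vip(self, flow):
        # Step 1: bind the client's RTSP connection to a real server.
        self.sessions.add(
            (flow.proto, flow.src_ip, flow.src_port, REAL_SERVER, flow.dst_port)
        )
        return Flow(flow.proto, flow.src_ip, flow.src_port, REAL_SERVER, flow.dst_port)

    def server_to_client(self, flow):
        key = (flow.proto, flow.dst_ip, flow.dst_port, flow.src_ip, flow.src_port)
        if key in self.sessions:
            # Known session: rewrite the source address back to the VIP.
            return Flow(flow.proto, VIP, flow.src_port, flow.dst_ip, flow.dst_port)
        # Unknown flow: forwarded untouched, so the real address is exposed.
        return flow


switch = Layer4Switch()
switch.client_to_vip(Flow("tcp", CLIENT, 40000, VIP, 554))

# The RTSP reply matches the session created in step 1.
rtsp_reply = switch.server_to_client(Flow("tcp", REAL_SERVER, 554, CLIENT, 40000))
# The RTP stream is a new UDP flow the switch knows nothing about.
rtp_stream = switch.server_to_client(Flow("udp", REAL_SERVER, 6023, CLIENT, 5067))

print(rtsp_reply.src_ip)  # the VIP
print(rtp_stream.src_ip)  # the real server address
```

In this model the RTSP reply leaves the switch with the VIP as its source, while the RTP stream keeps the real server address because no session entry exists for it.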

Figure 6-18. One of the main issues with load balancing RTSP at Layer 4 only is that the real IP addresses of the object servers are "bled" back to the client, as the corresponding RTSP and RTP streams are not associated in the content switch.


The net result of this type of configuration can be summarized as IP address bleeding, whereby the real source IP addresses of the object servers are revealed to the client on the UDP streams carrying the video and audio data. In many instances, this might not create a problem, as the RTSP and RTP specifications do not require that the source address of the RTP delivery be the same as the destination IP address for the RTSP connection. Where this might cause issues is in the following scenarios:

If the service is hosted by an ISP or hosting service that implements anti-spoofing policies on perimeter routers or firewalls, the real servers will need to be located in public, routable address space. In many implementations, the ISP or hosting service will filter out packets whose source IP addresses do not belong to the assigned address block. Locating the servers in routable address space solves this, although it will incur greater cost, as many ISPs charge for larger numbers of addresses.
Some stateful firewalls implemented in the transit networks between the client and the server may parse the RTSP connections to determine the expected RTP connections and open the corresponding UDP source and destination ports. If source IP address bleeding occurs, the firewall may reject the UDP-based RTP stream and force the server and client to negotiate down to the less efficient interleaved mechanism.
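The anti-spoofing scenario can be illustrated with a simple source address check. This is a minimal sketch assuming a filter that compares each packet's source address against the customer's assigned block; the block `203.0.113.0/28`, the function name `egress_permitted`, and the sample addresses are hypothetical.

```python
import ipaddress

# Hypothetical address block assigned to the hosted service by its ISP.
ASSIGNED_BLOCK = ipaddress.ip_network("203.0.113.0/28")


def egress_permitted(source_ip):
    """Return True if a packet with this source address passes the filter."""
    return ipaddress.ip_address(source_ip) in ASSIGNED_BLOCK


# RTSP replies are translated to the VIP, which lies inside the block.
print(egress_permitted("203.0.113.10"))  # True
# A bled RTP stream carries the real server address, which does not.
print(egress_permitted("10.0.0.21"))     # False
```

Under these assumptions the RTSP control connection works, but the RTP stream is silently dropped at the perimeter unless the real servers are also numbered from the assigned block.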

Therefore, our golden rule for implementing Layer 4 only server load balancing for RTSP as described is to locate the servers in public, routable address space, so that when source IP address bleeding occurs, it causes fewer problems.

Applications of Layer 7 RTSP Load Balancing

Adding Layer 7 intelligence to the load balancing of video and audio streaming services with RTSP has a number of applications. Some are less obvious and simply improve the overall operation while others can have tangible benefits in improving how different video and audio content types are delivered.

Tying RTSP, RTP and RTCP Channels Together

The first application we'll consider is a solution to the issue described in the preceding section. The correct source IP address translation will provide the client application with the illusion that the RTSP connection terminates at, and the consequent RTP and RTCP streams originate from, the VIP on the content switch. Without describing in detail the individual workings of specific vendor implementations, the basic traffic flow can be seen in Figure 6-19.

Figure 6-19. Once the content switch has parsed the RTSP data, it can create the relevant session entries to ensure the reverse NAT takes place for the RTP streams.


The basic stages of the traffic flow as shown in Figure 6-19 are as follows:

1. The client initiates a TCP handshake to the VIP on the content switch, which in turn performs a delayed binding with the client. If no parsing of the URL is required to decide on the object server (see Application 2), then the content switch will initiate a TCP connection to the object server.

2. The content switch parses the RTSP data, watching for the SETUP command from the client and the consequent RTSP 200 OK message from the server, and extracts the client_port and server_port arguments as shown here:

    C->S    SETUP rtsp://video.foocorp.com:554/streams/example.rm RTSP/1.0
            Cseq: 3
            Transport: rtp/udp;unicast;client_port=5067-5068

    S->C    RTSP/1.0 200 OK
            Cseq: 3
            Session: 12345678
            Transport: rtp/udp;client_port=5067-5068;server_port=6023-6024

3. Once the content switch has extracted the correct source and destination UDP port numbers, it can create a session entry for the UDP data flows from the server to the client, ensuring that the correct source IP address (and optionally source UDP port) translations occur.

4. For the RTP data flow, all packets are properly translated, leading the client to believe that the source of the UDP-based RTP streams is the same VIP to which it has established the RTSP connection.
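The extraction of the port arguments and the creation of the session entries can be sketched as follows. This is an illustrative approximation under stated assumptions, not a description of any vendor's parser: the function names `parse_transport` and `build_session_entries`, the dictionary layout of a session entry, and the IP addresses are all hypothetical. Only the Transport header value is taken from the exchange shown above.

```python
import re

# Matches one "client_port=..." or "server_port=..." parameter of a Transport header.
PORT_PARAM = re.compile(r"\s*(client_port|server_port)\s*=\s*(\d+)(?:-(\d+))?\s*")


def parse_transport(header_value):
    """Return a dict mapping 'client_port'/'server_port' to a (low, high) tuple."""
    ports = {}
    for param in header_value.split(";"):
        match = PORT_PARAM.fullmatch(param)
        if match:
            low = int(match.group(2))
            high = int(match.group(3)) if match.group(3) else low
            ports[match.group(1)] = (low, high)
    return ports


def build_session_entries(transport_value, client_ip, server_ip, vip):
    """Create one reverse NAT entry per server-to-client UDP flow."""
    ports = parse_transport(transport_value)
    c_low, c_high = ports["client_port"]
    s_low, s_high = ports["server_port"]
    entries = []
    # Pair the ports in order: the first pair carries RTP, the second RTCP.
    for s_port, c_port in zip(range(s_low, s_high + 1), range(c_low, c_high + 1)):
        entries.append({
            "match": ("udp", server_ip, s_port, client_ip, c_port),
            "rewrite_src_ip": vip,
        })
    return entries


# Transport value from the server's 200 OK reply in the exchange above.
reply = "rtp/udp;client_port=5067-5068;server_port=6023-6024"
entries = build_session_entries(reply, "198.51.100.7", "10.0.0.21", "203.0.113.10")
for entry in entries:
    print(entry)
```

With the example reply, two entries result: one for the flow from server port 6023 to client port 5067 and one for 6024 to 5068, each rewriting the source address to the VIP.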

While each vendor's implementation of Layer 7 RTSP parsing for this type of issue may differ slightly, the fundamental outcome remains the same: the object servers from which the RTP content will be delivered can now reside in RFC 1918 non-routable address space, and the content switch will ensure that none of these addresses appear to the outside world.

Parsing the URL of the RTSP Stream

As we saw earlier in the chapter for HTTP content, there are often many good reasons to separate different content types across backend object servers. In the case of RTSP-based video and audio streams, this reasoning is often accentuated, as different stream types will require different resource types. Live video streaming, for example, requires large amounts of CPU and memory to allow the object server to convert the live video and audio into the correct compression and format for delivery to the client, as well as access to the live feed itself via directly attached cameras, satellite feeds, or TV-type input. Static, or prerecorded, content might typically require less CPU and memory, as the compression and encoding will already have taken place during the recording and positioning of the content. This type of service might more typically require NFS or SAN access to large data stores containing the prerecorded movies, programs, and clips. Other requirements may be to separate free from pay-on-demand content or authenticated from nonauthenticated content, or even to use the URL to distinguish between paying customers in a shared hosting environment. Whatever the requirement, the ability to parse the URL being passed within the RTSP stream can be a very powerful addition to a media streaming implementation. Figure 6-20 shows an example of separating live and prerecorded streams on the object servers using a content switch.

Figure 6-20. A content switch parsing the DESCRIBE commands of an RTSP stream offers the ability to separate different content types such as live and prerecorded streams.
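A URL-based server selection of this kind can be sketched as a prefix match on the path of the RTSP request. The example below is a simplified illustration only: the `/live/` prefix, the pool addresses, and the function name `select_pool` are hypothetical, and a real content switch would typically apply configured rules rather than a hard-coded table.

```python
from urllib.parse import urlsplit

# Hypothetical server pools keyed by URL path prefix.
POOLS = {
    "/live/": ["10.0.1.11", "10.0.1.12"],     # encoders attached to live feeds
    "/streams/": ["10.0.2.21", "10.0.2.22"],  # servers with access to stored content
}
DEFAULT_PREFIX = "/streams/"


def select_pool(request_line):
    """Pick a server pool from the URL in an RTSP request line."""
    method, url, version = request_line.split()
    path = urlsplit(url).path
    for prefix, servers in POOLS.items():
        if path.startswith(prefix):
            return prefix, servers
    # No rule matched: fall back to the prerecorded content pool.
    return DEFAULT_PREFIX, POOLS[DEFAULT_PREFIX]


live = select_pool("DESCRIBE rtsp://video.foocorp.com:554/live/news RTSP/1.0")
stored = select_pool("DESCRIBE rtsp://video.foocorp.com:554/streams/example.rm RTSP/1.0")
print(live)
print(stored)
```

Because the decision depends on the URL, the switch must hold the client connection with delayed binding until the first request carrying the URL has arrived, and only then open the connection to a server in the chosen pool.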


In the same way that FTP requires intelligent parsing and translation of IP addresses and TCP ports, and HTTP offers the ability to segregate content types, Layer 7 RTSP load balancing combines both of these requirements and capabilities. With RTSP being an open, RFC-defined protocol, many content switch vendors now offer support for the mechanisms described previously.