16.2. Scalability and Load Balancing

A FlashCom application designed to run on one server usually must be redesigned and rewritten to run across multiple servers. In other words, there is no automatic clustering technology available for FlashCom Servers. The primary reason each application must be designed for multiple servers is that only one instance can directly update resources such as shared objects. Any clustering technology would have to provide a way to share instance state across machines.

Designing an application to scale across multiple servers can be relatively simple or quite complex, depending on the type of application. Let's look at the three general categories of applications: media-on-demand, live one-way broadcasting, and n-way communication applications.

16.2.1. Media-on-Demand

Media-on-demand applications provide access to content that is uploaded to one or more FlashCom Servers or to a filesystem accessible to each server. The content is prepared in advance and may be video, audio, and/or data delivered by playing prerecorded streams. Applications such as a library of video tutorials, promotional video on a high-traffic site, or synchronized video/slide presentations are all examples. Media-on-demand applications are the simplest to build. Content such as .swf and .flv files can be uploaded either separately to each server or to one common high-speed storage system connected to each FlashCom Server by an internal high-speed network.

Figure 16-4 shows one possible configuration for on-demand media distribution.

Figure 16-4. Media-on-demand distribution servers and file server

The major challenge in building an on-demand application is determining which clients connect to which server. Since the same content is available on every server (Chapter 12 shows how to use ColdFusion to automatically mirror streams across multiple servers), the goal is to balance the load on each server by controlling which server each client connects to. Ideally, the relative load on every server should be almost identical. For example, if several servers are used (some with four CPUs and 4 GB RAM and some with two CPUs and 2 GB RAM), the percentage of memory and CPU cycles used on each machine should be the same. More clients would be serviced by the more powerful machines so that each machine is utilized at a similar proportion of its capacity to serve streams. There are a number of ways to assign clients to servers, such as allowing the client to pick a server at random, having a web application direct a client to a specific server, or using a network load-balancing device.

We explore each of these options in turn in the subsequent sections.

16.2.1.1 Client-side server selection

Possibly the simplest way to select a server is to have the client select a random number within a range of integers and compose a server name from it. For example, if five servers named fcs1.ryerson.ca through fcs5.ryerson.ca are available, then a server can be selected by getting a random number between 1 and 5:

 serverName = "fcs" + Math.floor(Math.random(  )* 5 + 1) + ".ryerson.ca"; 

The Math.random( ) method produces pseudorandom numbers that are evenly distributed. If each server has the same capacity, a large number of clients will be reasonably well distributed across servers. If a server is down, each client can randomly select a name from the remaining servers and try again. If the distribution servers have different network, CPU, or memory capacity, a capacity rating can be assigned to each server and a weighting system used to select a server.

One problem with generating server names in code is that the code must be updated whenever new servers are brought online or retired. A simple enhancement is to have the client load an XML file containing the name and capacity of every server in production. Example 16-5 lists such a simple serverList.xml file.

Example 16-5. The serverList.xml file
<serverList>
  <server name="fcs1.ryerson.ca" capacity="20" />
  <server name="fcs2.ryerson.ca" capacity="20" />
  <server name="fcs3.ryerson.ca" capacity="10" />
  <server name="fcs4.ryerson.ca" capacity="10" />
  <server name="fcs5.ryerson.ca" capacity="15" />
</serverList>

Each capacity value is an integer greater than 0. To randomly select a server weighted by the capacity values, the capacities are summed and a pseudorandom number between 0 and the total capacity minus 1 is selected. Table 16-1 shows one way a capacity range can be calculated for each server based on its capacity value and the cumulative sum of capacities.

Table 16-1. Weighting server selection by capacity

Server name        Capacity   Sum   Sum - 1   Range
fcs1.ryerson.ca    20         20    19        0-19
fcs2.ryerson.ca    20         40    39        20-39
fcs3.ryerson.ca    10         50    49        40-49
fcs4.ryerson.ca    10         60    59        50-59
fcs5.ryerson.ca    15         75    74        60-74


A pseudorandom number between 0 and 74, inclusive, can be used to select a server. When a large number of clients connect, the load will be distributed in close proportion to the capacity of each server. Example 16-6 shows one way to select a server name from a serverList.xml file.

Example 16-6. Sample code to output the name of a server selected from the serverList.xml file
request = new XML();
request.ignoreWhite = true;
request.onLoad = function (success) {
  if (success) {
    var capacityArray = [];
    var sumCapacity = 0;
    var serverName;
    // Get an array of <server> tags.
    var servers = this.firstChild.childNodes;
    var len = servers.length;
    // Get the capacity of each server and build an array of sum capacities.
    for (var i = 0; i < len; i++) {
      var serverInfo = servers[i].attributes;
      sumCapacity += parseInt(serverInfo.capacity);
      capacityArray.push(sumCapacity - 1);
    }
    // Get a random number from 0 to sumCapacity - 1.
    var selector = Math.floor(Math.random() * sumCapacity);
    // Find the selector in the capacity array.
    for (var i = 0; i < len; i++) {
      if (selector <= capacityArray[i]) {
        serverName = servers[i].attributes.name;
        break;
      }
    }
  }
  else {
    // Hardcoded fallback: assume we have servers numbered 1 to 5.
    serverName = "fcs" + Math.floor(Math.random() * 5 + 1) + ".ryerson.ca";
  }
  // Trace out the server name selected.
  trace("Server name: " + serverName);
};
request.load("serverList.xml");

16.2.1.2 Software server selection

Server-side software can perform something similar to the calculation made in the client-side code in Example 16-6. Each client can request a server name from a web application. The web application can provide a name using code similar to Example 16-6 or, in more sophisticated systems, can look at the actual load on each server in real time and ask each client to connect to the least-busy server. The load on the server can be determined by calling the Admin Service's getServerStats( ) method, which returns CPU and memory usage information, as described in Chapter 10. Alternatively, a non-FlashCom application may be available to monitor server load.
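For example, a small monitoring script could poll each server's Admin Service and hand out the name of the least-busy one. The following Server-Side ActionScript sketch polls a single server; the hostname, the "admin"/"secret" account, and the default administration port of 1111 are all placeholder assumptions:

// Connect to one server's Admin Service (port 1111 by default).
// "admin" and "secret" stand in for a real administrator account.
nc = new NetConnection();
nc.onStatus = function (info) {
  if (info.code == "NetConnection.Connect.Success") {
    // getServerStats() returns CPU and memory usage information,
    // as described in Chapter 10.
    this.call("getServerStats", new StatsResult("fcs1.ryerson.ca"));
  }
};
function StatsResult (host) {
  this.host = host;
}
StatsResult.prototype.onResult = function (stats) {
  // A real monitor would record these figures for every server and
  // direct each new client to the least-busy machine.
  trace("Stats for " + this.host + ": " + stats);
};
nc.connect("rtmp://fcs1.ryerson.ca:1111/admin", "admin", "secret");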

16.2.1.3 Network-level server selection

Load-balancing devices that can provide efficient server selection on a network are available from various vendors. A load-balancing device can be configured to listen at one IP address and redirect traffic to one of many servers on other IP addresses. For example, a DNS entry for fcs.mydomain.ca might resolve to one IP address. The load balancer is set up to receive IP traffic for that address and is programmed with information about the servers at the other addresses. Figure 16-5 provides a simple illustration of a load balancer and the servers connected to it.

Figure 16-5. Load-balanced servers

Every client attempts to connect to the address xxx.xxx.xxx.100. Each client's connect request is redirected to a server at another address. Load balancers provide different options for determining which server to redirect to and can perform regular health checks on servers so that, if one is not available, clients will be redirected to the other servers. One option is to have the load balancer assign each client connection request to the server with the fewest current connections. Another is to hash the client's IP address or to use weighted hashing. Finally, a weighted percentage can also be used.

Regardless of the method used, media-on-demand applications are unaffected if a client connects to one server to play a stream and later to another server to play another stream. That is, because there is no application state, it doesn't matter whether the physical server providing the media changes during the application's lifetime (provided it doesn't change in the middle of playing a given data stream, which would require the second server to pick up where the first server left off).

16.2.2. One-Way Live Delivery

One-way delivery of live audio, video, and/or data is similar to media-on-demand applications except that the same live stream and supporting shared objects must be made available across multiple servers in real time.

A simple example of a one-way live application running on several servers is a live video streaming application that uses a broadcast tree of servers. The root server collects the incoming live video streams and sends them to other servers, which then send the streams on to each client. It is relatively simple to build out a broadcast tree to handle any number of clients.

Figure 16-6 illustrates a simple broadcast tree that could be used to deliver live audio and video streams to more clients than could normally be handled by a single server.

For example, if testing shows that each distribution server can handle only 1200 streams, for safety you might decide to allow no more than 900 streams per server. If a live event requires the delivery of a total of 2700 streams, then three leaf servers are needed. Figure 16-6 illustrates this scenario; each client plays only one of the three streams in the illustration.

Figure 16-6. A broadcast tree with one root server and three leaf servers

The root server in Figure 16-6 serves only nine streams. In this example, a root/leaf scheme can be scaled out to 900 leaf servers before an intermediate layer of servers would have to be added in between the root and leaf servers.

The primary difference between one-way live delivery and media-on-demand is that each instance on a leaf server must connect to a root server instance. The leaf instance creates proxies of any shared objects or streams that must be made available to its clients. A broadcast tree can be described by levels, starting from the source clients that connect to the root server and working down the tree until reaching the destination clients. The software running at each level of the tree has different responsibilities. Let's look at each level from top to bottom.

16.2.2.1 Source clients

The source clients are designed to be used by the camera/microphone operators and allow the operator to log into the root server application. The client is designed to provide convenient controls for selecting camera and microphone sources and for publishing one or more streams at a controlled bit rate.
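For example, a bare-bones source client might capture and publish a live stream along these lines; the application URI, stream name, and login variables are placeholders:

// Minimal source client (client-side ActionScript).
nc = new NetConnection();
nc.onStatus = function (info) {
  if (info.code == "NetConnection.Connect.Success") {
    ns = new NetStream(nc);
    ns.attachVideo(Camera.get());
    ns.attachAudio(Microphone.get());
    // Publish a live stream; a real client would also let the
    // operator pick devices and set Camera/Microphone quality.
    ns.publish("event1", "live");
  }
};
// username and password are collected from the operator.
nc.connect("rtmp://fcsroot.ryerson.ca/broadcast", username, password);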

16.2.2.2 Root server application instance

The root server application allows connections from source clients, who must authenticate to the server, and from intermediate or leaf FlashCom Servers. Streams and shared objects in the root application are available to intermediate or leaf server applications.
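A skeletal main.asc for the root application might accept both kinds of connections like this; isKnownLeafAddress( ) and isValidOperator( ) are hypothetical stand-ins for real address and credential checks:

// Root application instance (Server-Side ActionScript).
application.onConnect = function (client, username, password) {
  if (isKnownLeafAddress(client.ip)) {
    // Accept intermediate and leaf servers; identifying them by
    // IP address is just one option (a shared secret also works).
    this.acceptConnection(client);
  } else if (isValidOperator(username, password)) {
    // Accept authenticated source clients.
    this.acceptConnection(client);
  } else {
    this.rejectConnection(client);
  }
};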

16.2.2.3 Intermediate server instances

Intermediate server instances are used to increase the height of the broadcast tree when extremely large numbers of streams must be served. They connect to the root server or another intermediate server one level higher in the tree and allow connections from leaf or lower-level intermediate servers. They proxy higher-level resources so that they are available lower down in the broadcast tree.

16.2.2.4 Leaf server application instances

Leaf server instances connect to the server above them in the broadcast tree. In smaller trees, they connect to the root. Once connected, the leaf application creates a proxy of every shared object and stream that must be made available to the destination clients. If authentication is required before streams can be viewed, the leaf instance implements user/viewer authentication.
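In Server-Side ActionScript, a leaf instance might proxy one live stream and one shared object from the root along these lines; the URI and the resource names "event1" and "slides" are placeholders:

// Leaf application instance: proxy resources from the root.
application.onAppStart = function () {
  this.nc = new NetConnection();
  this.nc.onStatus = function (info) {
    if (info.code == "NetConnection.Connect.Success") {
      // Republish the root's live stream under the same name so
      // destination clients can play it from this leaf.
      application.proxyStream = Stream.get("event1");
      application.proxyStream.play("event1", -1, -1, true, this);
      // Create a proxied shared object backed by the root instance.
      application.slides_so = SharedObject.get("slides", false, this);
    }
  };
  this.nc.connect("rtmp://fcsroot.ryerson.ca/broadcast");
};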

16.2.2.5 Destination clients

The destination clients are designed to view the stream or streams delivered via the leaf servers in the broadcast tree. They are often simple viewer movies that allow the user to select one of a number of streams for viewing.
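A matching viewer can be as simple as this client-side sketch, which plays the proxied stream from whichever leaf server the client was assigned; my_video is assumed to be a Video object on the Stage:

// Minimal destination client (client-side ActionScript).
nc = new NetConnection();
nc.onStatus = function (info) {
  if (info.code == "NetConnection.Connect.Success") {
    ns = new NetStream(nc);
    my_video.attachVideo(ns);
    // A start value of -1 requests the live stream.
    ns.play("event1", -1);
  }
};
nc.connect("rtmp://fcs1.ryerson.ca/broadcast");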

The same server selection options that work with on-demand applications are available for one-way live applications. Because the servers are interchangeable, a client can connect to any server at any time and will be able to function correctly.

16.2.3. n-Way Live Communications

Applications that are not one-way are the most challenging to scale and require different approaches to server selection. n-way applications can consume greater server and network resources than one-way applications and require extra work to manage instances correctly.

Figure 16-7 shows 12 clients connected to a single server. In a one-way application, the server may have to send only one stream to each client, for a total of 12 streams. However, in an n-way video conference application, each client may send 1 stream and receive 11 streams, one for each of the other clients. The server must therefore send or receive 12 streams per client, for a total of 144 streams. That's 12 times more than a one-way application!

Figure 16-7. Twelve clients participating in a video conference; each client publishes 1 stream and subscribes to 11 streams

Figure 16-8 shows the same clients connected to one of three distribution or leaf servers that are in turn connected together by one master or root server.

Figure 16-8. A video conference spread across three leaf servers and one root server

Again, each client publishes 1 stream and receives 11 streams, one for every other client in the conference. Each leaf server must send four streams to the root server, one for each client connected to the leaf server. It must receive eight streams from the root server, one for every client connected to a remote leaf server. So, each leaf server must handle:

  • Four streams arriving from its clients

  • Eight streams arriving from the top-level server

  • Four streams going out to the top-level server

  • Forty-four streams going to its clients (11 each)

In total, each leaf server must handle 60 streams. That is a 58% reduction from the single-server scenario of 144 streams.
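To generalize the arithmetic (a back-of-the-envelope count, assuming every client receives every other client's stream): with n clients in the conference and c of them connected to a given leaf server, each leaf handles

    c + (n - c) + c + c(n - 1) = n(1 + c) streams

With n = 12 and c = 4, that is 12 × 5 = 60 streams, versus n² = 12 × 12 = 144 streams when a single server carries the whole conference.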

To make the arrangement illustrated in Figure 16-8 work, the root server normally contains the master instance for the application, and the leaf servers host instances that proxy the resources of the master instance. The arrangement is similar to the application illustrated in Figure 16-2, in which separate lobbies connected to a master instance to share its resources.

Unlike a one-way broadcast tree, which can be increased in size indefinitely, an n-way tree is limited because the number of streams that must traverse the root server increases with the number of clients. Of course, clients involved in an n-way application will likely run out of bandwidth from trying to receive so many streams before the number of streams exceeds the server's capacity. In practice, n-way applications rarely send more than six simultaneous video streams to each client.

Let's return to the application illustrated in Figure 16-2 and assume that an application has to maintain lobby and room instances spread across multiple servers. Whenever a client connects to a lobby, a slot in the master instance's activeUsers shared object is updated with the client's location and time of arrival. Every client has an ActiveChatUsers component that shows every user connected to the application. As already mentioned, this scheme is scalable only up to a point. A user will not want to scroll through thousands of usernames to find someone. Worse, each added slot in the activeUsers shared object takes up memory in the master instance and in every proxied shared object in the leaf instances. At some point, memory consumption will incapacitate the servers.

The solution is to redesign the application so that it shows only a subset of clients to each user while still maintaining information about which clients are connected. The master instance can be replaced with a database server: lobby instances on each leaf server poll the database for the friend-list information of the clients connected to them. Polling, which is not an ideal solution, must be used in FlashCom 1.5.2 and earlier because Flash Remoting is a request/response protocol.
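A rough Server-Side ActionScript sketch of this polling approach follows; the gateway URL, the "presence" service, the getFriendsOnline( ) method, and the five-second interval are all assumptions:

// Lobby instance: poll a database-backed Flash Remoting service.
load("netservices.asc");
application.onAppStart = function () {
  this.gateway = NetServices.createGatewayConnection(
    "http://www.ryerson.ca/flashservices/gateway");
  // Results arrive as getFriendsOnline_Result() on this object.
  this.presence = this.gateway.getService("presence", this);
  // Poll every five seconds; pushing updates would be preferable,
  // but Flash Remoting is request/response only.
  this.pollID = setInterval(function () {
    application.presence.getFriendsOnline();
  }, 5000);
};
application.getFriendsOnline_Result = function (list) {
  // Publish the latest list in a shared object that this lobby's
  // connected clients subscribe to.
  var users_so = SharedObject.get("activeUsers", false);
  users_so.setProperty("friends", list);
};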

Another complexity in the lobbies/rooms application is that while a client could, in theory, connect to any lobby on any server, it must connect to the correct instance on one particular server in order to participate in a room. Rooms are single instances and are not necessarily spread across servers; in that case, each client must connect directly to a particular leaf server.

By default, hardware load-balancing devices usually hide the IP addresses of the real servers they are load balancing. For applications in which clients must connect directly to an individual host, the IP address of each server must be directly reachable. If your hardware load balancer does not support direct access, software-based load balancing should be implemented instead.

16.2.4. Redundancy and Failover

Servers, firewalls, packet shapers, and networks are all capable of failing while your clients are using them. Few clients are really happy about coming back later when the problem is fixed. To increase the probability that a client can continue working, redundant servers, firewalls, and network connections can be provided so that the client can reconnect, or fail over, to the redundant hardware.

Providing redundancy and failover for n-way FlashCom applications presents some special challenges. In a one-way application, if a client loses a connection to a server and cannot reestablish it, it simply tries to connect to another server. If it is playing a recorded stream, it can play the same stream from another server and seek to where it left off. If a live stream is being played, it just begins playing the stream at its current location as soon as the connection is established. Similar strategies can be used with conference rooms, but they require a little more care.

When a conference room is allocated on a single leaf server, a fallback RTMP address on another FlashCom Server can also be allocated. If the instance goes down for some reason, clients connected to it can automatically attempt to connect to the alternate instance. If each FlashCom Server is connected to a centralized filesystem and has an identical FlashCom installation on it with the same drive mappings (or mount points), the alternate instance can read the same files, including the shared objects and streams that were available to the original. However, under no circumstances can two instances be allowed to access the same shared object files or update the same stream files at the same time. In other words, either all the clients are disconnected from the original instance and connected to the alternate instance, or none should be moved.

When an instance or entire server goes down, the client's NetConnection object will eventually receive an information object with a code value of "NetConnection.Connect.Closed". Unfortunately, that code can also mean that the client's network connection was dropped temporarily and that many other clients are still happily connected to the instance. So when a client loses a connection to an instance before a meeting is over, it should attempt to contact the same instance again before trying the alternative server. To avoid any possibility of clients ending up in separate instances, the alternative instance can attempt to contact the original instance and refuse the client access if the latter is still available. This level of interinstance coordination requires careful planning and coding.
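On the client, the retry-then-fail-over logic might look something like the following sketch; the URIs, the retry limit of three, and the absence of a delay between attempts are all simplifying assumptions:

// Client-side reconnect logic for a conference room.
var primaryURI   = "rtmp://fcs1.ryerson.ca/rooms/room27";
var alternateURI = "rtmp://fcs2.ryerson.ca/rooms/room27";
var retries = 0;
nc = new NetConnection();
nc.onStatus = function (info) {
  if (info.code == "NetConnection.Connect.Success") {
    retries = 0;
  } else if (info.code == "NetConnection.Connect.Closed" ||
             info.code == "NetConnection.Connect.Failed") {
    if (retries < 3) {
      // The drop may be a temporary network problem, so try the
      // original instance again before failing over.
      retries++;
      this.connect(primaryURI);
    } else {
      this.connect(alternateURI);
    }
  }
};
nc.connect(primaryURI);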

When multiple lobbies are using one master instance, failure of a lobby may be easy to handle. The client simply reconnects to another lobby on another server. However, if a master instance goes down, an alternative must be available that can start up and take over. The new master will have to re-create the state of the old master by asking each lobby to update it. Keeping a master instance and another instance in sync so that the other can take over as the master without having to rebuild its state is difficult because a proxied shared object cannot simply be promoted to become a master shared object. One scheme is to use persistent shared objects that are frequently flushed to disk by the master instance. The shared objects are stored on a separate storage file server. If a master server fails, a new master is allocated and reads the persistent shared object from the storage file server. Another scheme is to replicate updates by sending every shared object slot change from the master instance to a shadow copy of the master instance. If the master instance fails, the shadow copy becomes the master instance.
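A fragment of the first scheme might look like this in Server-Side ActionScript, assuming the shared object files live on the shared storage server and that a ten-second flush interval is acceptable:

// Master instance: flush persistent shared objects periodically so
// a replacement master can read recent state from shared storage.
application.onAppStart = function () {
  this.users_so = SharedObject.get("activeUsers", true);
  this.flushID = setInterval(function () {
    application.users_so.flush();
  }, 10000);
};
application.onAppStop = function () {
  clearInterval(this.flushID);
  this.users_so.flush();
};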


