Section 15.4. Node-Level Multicast Algorithms | Computer and Communication Networks (paperback)

15.4. Node-Level Multicast Algorithms

Multicasting techniques can also be used at the router level. To implement a multicast connection, a binary tree is normally constructed with the source switch port at the root and the destination switch ports at the leaves . Internal nodes act as relay points that receive packets and make copies. A number of such multicast methods are used. One is a tree-based multicast algorithm using a separate copy network. The Boolean splitting multicast algorithm is used for multistage switches. The third technique the packet recirculation multicast algorithm . The fourth is multicasting in three-dimensional switches .

15.4.1. Tree-Based Multicast Algorithm

Imagine that a multicast tree algorithm must be applied on a multistage switch fabric, as shown in the expansion switch in Figure 15.10. Multicasting is implemented as a tree structure. A source generates a packet and sends it out to the first crossbar switch. The packet may have a field that specifies how many copies of this packet are to be made. All copies may be destined for other crossbars, in which more copies of the packet are made, and some packets may move to their destinations.

Figure 15.10. Tree-based multicasting in an expansion switch

For implementing a tree-based multicast algorithm , the copying tasks in multistage switch fabrics typically involve a copy switch and a routing switch. An example of such systems consists of two Bene networks for constructing the copying and routing networks. The following algorithm presents the multicast steps within a copy network with n inputs or outputs constructed with d x d switch element:

Define:

k

=

log _d n

r

=

Number of stages

j

=

Stage number, j ˆˆ{1, 2, ..., r }

F

=

Global number of copies, F ˆˆ{1, 2, ..., n }

F _j

=

Number of copies remaining when packet arrives at stage j

f _j

=

Local number of copies made at stage j

Begin Tree-Based Multicast Algorithm

In the copy network shown in Figure 15.11, packets are replicated as designated by the initial global number of copies , F , given in the routing-control field of the packet header. The global number of copies refers to the total number of packet copies requested . Let F _j be the remaining number of copies of a packet when it arrives at stage j , and let f _j be the number of copies made locally at stage j . First, the algorithm initializes F _j to F and f _j to 1. The copying method is such that the replication of the packets takes place stage by stage within the network in order to distribute the traffic as evenly as possible. The routing is either point to point or multipoint connection.

Figure 15.11. Tree-based multicasting through a three-stage copy network followed by a routing network

Consider a copy network with k stages. When multipoint connections are requested, a packet is routed randomly in the first k - 1 stages. In the last k stages, the packet is copied . The operation computes a new number of copies for a packet at stage j , with the local number of copies The algorithm is set up to generate the final number of copies that appear on the outputs to be the smallest multiple of 2 greater than or equal to F .

This technique reduces the hardware complexity in the controller. If the final number of copies that appear on the outputs is more than requested, the unnecessary packet copies can be easily thrown away. This task is completed by the broadcast translated circuit (BTC), shown in Figure 15.11. BTCs receive all the copies of a packet. A central monitoring mechanism for all BTCs compares the number of requested copies read from the packet's header and the number of copies made in the copy network. This mechanism then decides to remove unused copies, based on this comparison. For the routing network, routing is similar to that in a point-to-point copy network.

Example.

Find all numbers of copies in every stage of a B _16,4 switching network in which the global number of copies of an incoming packet is F = 5 and copies of the packet are to be routed to output port numbers 1, 6, 9, 12, and 14.

Solution.

The copy network has r = 2 log _d n - 1 = 3 stages and k = 2 log _d n . For stages 1 j r - k = 1 or at stage j = 1, we have F ₁ = 5 and the local number of copies f ₁ = 1, so the packet gets distributed. For stages ( r - k ) + 1 = 2 j r = 3, starting at stage j = 2, F ₂ = [ F ₁ /f ₁ ] = 5, and f ₂ = [ F ₂ /d ^k - j ] = 2; thus, two copies are made at this stage and guided to two switches at the last stage ( j = 3). At the third stage, F ₃ = [ F ₂ /f ₂ ] = 3, and the local number of copies f ₃ = [ F ₃ /d ^k -j ] = 3. Therefore, for each of these switches, three copies of the packet are made. Note that the sum of these copies (6) has exceeded the requested global number of copies (5), so the one additional copy is thrown away by the corresponding BTC (see Figure 15.11).

15.4.2. Boolean Splitting Multicast Algorithm

The Boolean splitting multicast algorithm is a method of copying packets in any space-division switch fabric with n inputs or outputs constructed with 2 x 2 switch elements and therefore with log ₂ n stages. Consider a Banyan network in which switching nodes replicate packets based on 2-bit header information. The following algorithm presents multicast steps within a copy network.

Define:

k = log _d n

r = Number of stages

j = Stage number, j ˆˆ{1, 2, ..., r }

f _j = Local number of copies made at stage j

a(j) : = Lowest requested output address at stage j : a _k a _k _- ₁ ... a ₁

A(j) = Highest requested output address at stage j : A _k A _k _- ₁ ... A ₁

Begin Boolean Splitting Multicast Algorithm

For Stages 1 j r - k

F _j

=

F

f _j

=

1

For Stages ( r - k ) + 1 j r

Sort the output addresses

Compare a _j with A _j

If a _j = A _j = 0

Send the packet on crossbar output 0 (link 0).

For the packet forwarded to link 0

a _j ₊ ₁ = a _j

A _j ₊ ₁ = A _k ... A _k _- _(j _- ₁ ₎ 01 ... 1

If a _j = A _j = 1

Send the packet on crossbar output 1 (link 1),

For the packet forwarded to link 1

a _j ₊ ₁ = a _k ... a _k _- _(j _- ₁ ₎ 10 ... 0

A _j ₊ ₁ = A _j

If a _j = 0 and A _j = 1,or a _j = 1 and A _j = 0,

Replicate the packet.

Send one copy of the packet carrying upper half of addresses on link 0.

Send one copy of the packet carrying lower half of addresses on link 1.

The switch fabric replicates packets according to a Boolean interval-splitting algorithm, based on the address interval in the new header. An output port address of the switch fabric that receives a copy of the packet is represented by two k -bit binary numbers. Suppose that a packet arrives at an arbitrary crossbar at stage j - 1 from a previous crossbar j . If we represent the minimum number by a k -bit address a _k ... a ₁ and the maximum number by a k -bit address A _k ... A ₁ , we can make the following observations: If a _j = A _j = 0 or a _j = A _j = 1, the packet is sent out on crossbar output 0 or 1, respectively. If a _j = 0 and A _j = 1, the network replicates the packet and sends both packets out on both links 0 and 1.

If replication takes place, the packet header needs to be modified by splitting the original address interval into two subintervals, which can be expressed by the two following main recursions: a _j ₊ ₁ = a _j and A _j ₊ ₁ = A _k ... A _k _- _(j _- ₁ ₎ 01 ... 1 for the packet sent out on Link 0, and a _j ₊ ₁ = a _k ... a _k _- _(j _- ₁ ₎ 10 ... 0, and A _j ₊ ₁ = A _j for the packet sent out on Link 1. This modification is a routine operation, meaning that it is independent of the header contents and can be considered built-in hardware.

Example.

Show the detail of the Boolean splitting multicast algorithm applied to copy a packet in a Banyan network Y _16,2 . Find all numbers of copies in every stage of a switching fabric, where the global number of copies of an incoming packet is F = 6, and copies of the packet are to be routed to output port numbers 5, 6, 7, 8, 9, and 10.

Solution.

The switch fabric has k = log ₁₆ 2 = 4 stages. At stage j = 1, we have F ₁ = 6 a( 1) = 0101 and A( 1) = 1010; consequently, a ₁ = 0 is compared with A ₁ = 1. Since these two bits are different, 0101 to 0111 are assigned to link 0, and 1000 to 1010 are assigned to link 1. If we continue the multicast process, the results of this example will be as shown in Figure 15.12.

Figure 15.12. The Boolean splitting multicast algorithm in a multistage switch

15.4.3. Packet Recirculation Multicast Algorithm

Packet recirculation ( recycling ) is another feasible method for constructing large switching fabrics for broadband switching applications. To implement a multicast connection, a binary tree is constructed with the source port at its root and the destination switch port at its leaves. This technique can be used for almost all kinds of space-division switch fabrics.

If a switch fabric is constructed by a multistage interconnected network, internal crossbars can act as relay points to accept packets and recycle them back into the switch fabric after packet relabeling. New labels carry the information about the destination pair, identifying the next two switch ports to which they are to be sent. Figure 15.13 illustrates the recirculation technique. A multicast table provides an output port/IP address pair. This pair is added to the packet header; thus, two additional bits indicate whether the pair is to be recirculated again. An IPP buffer holds packets received from the input link and forwards them to the switching network. An OPP buffer holds packets waiting to be transmitted on outgoing links. A recirculation buffer provides buffering to the packets that need to be recycled.

Figure 15.13. Recirculation multicasting

Packet recirculation and copying within the switch fabric is straightforward. Figure 15.14 illustrates packet recirculation and copying. Suppose that a is a source that generates packets to be delivered to output ports b , c , d , and e and that x and y are relay points. Suppose that a packet entering at input port a must have several copies. For such a packet, there may be entries of the type ( x , j ) and ( e , k ) in the multicast table. This can be interpreted as making two copies of the packet: one to be recirculated back to port x on address j ; the other one, to port e with address k . The multicast table entry at x may now have entries ( b , n ) and ( y , m ), as seen. Typically, each packet header has 2 bits to indicate whether a packet needs to be recirculated.

Figure 15.14. Within the switch fabric of routers: (a) packet recirculation and (b) copying

In summary, an end point that is added needs to have a parent. A good candidate is the one that does not have heavy recirculation traffic. Like the parent of the new end point, the intermediate point also is new, so it has to be put into the multicast table entry of another node, which becomes the grandparent of the new end point, whose child will now become a sibling of the new end point.

To remove a connection, we can consider two cases. We should examine whether the output to be removed has no grandparent but its sibling has children; in that case, we replace the parent's multicast table entry with the sibling's children. If the output to be removed has no grandparent and its sibling has no children, we simply drop the output to be removed from its parent's multicast table entry, and the connection reverts to a simple point-to-point connection. Note that since packets must be resequenced when they exit from the system, the resequencing buffer at the output port processor (OPP) of a router must be dimensioned to delay packets long enough so that slow packets have a chance to catch up with fast packets.

15.4.4. Multicasting in Three-Dimensional Switches

This switch uses a Clos network as a building block module. An n -port Clos network is a nonblocking network if k 2 d - 1, where d and k are the sizes of the first-stage crossbar switch elements. The three-dimensional structure of the Clos network consists of m parallel planes of the same type and same size Clos network in Figure 15.15. An incoming packet can be demultiplexed among m planes. The demultiplexer does the work of distributing the input packets. In other words, the three-dimensional switching system accepts packets coming from different planes and time multiplexes them onto the same output port. The multicast algorithm for each plane is described as follows .

Figure 15.15. Three-dimensional multiplexed Clos network

At the first stage, no routing decision needs to be made. The first-stage switch elements maintain a table that indicates utilization of the center-stage arrays. This table is created on the basis of back-pressure information obtained from the center-stage arrays. A center-stage element least used at any particular time gets the highest priority. In fact, this table can be shared by all the first-stage arrays. Once a multicast packet arrives at the center-stage array, the switch looks at the first g bits of every address, where g = log ₂ n/d . The packet is then split according to the number of distinct k bits found. The maximum number of new packets that can be created is n/d . All addresses with the same first g bits are sent to the corresponding switch element in the last stage. Also, since four distinct addresses are identified by the first 2 bits in our example, the packet is split four ways.

The final stage (stage 3) looks at the remaining l bits, where l = log ₂ n , in each address of the multicast packet received by the last-stage array. The packet is again split according to the number of different sets of l bits found. The maximum number of packets at this stage is l . The packets are then sent to the appropriate output port corresponding to the last l bits. If a packet is split, the header has to be modified. The new header contains the addresses with the same initial g bits.

A multicast packet on a particular input port requesting F copies of a packet can distribute the request equally between the parallel planes. This reduces the load on any particular plane, leading to a reduction in the use of buffers. An analysis of a three-dimensional Clos network is based on the blocking probability P _b (i) for input number i out of n :

Equation 15.1

where is the offered load per input, and F _max is the maximum number of copies requested by any packet. For better performance, a priority is assigned to each plane on the basis of usage. Therefore, planes that are being used less, as determined by the back-pressure information, are given higher priorities. The distribution of multicast packets among planes is done by using the following algorithm.

Begin Multicast Algorithm in 3-D Switches

If F m : Send a packet to the first F available planes.
Each plane makes one copy of the packet and routes it to the appropriate output port, depending on the header information. Thus, we get the required F copies.
If F > m : Find , the number of copies equally distributed among the planes, Q , and the copies left over, R .
Choose the highest-priority plane. For the first R planes, add 1 to the number of copies desired to be made by that plane, which is simply the value of Q . That is, the remaining R copies are further divided among the least-busy planes (or the highest-priority planes). The remaining m - R planes now have to make Q copies.
The first Q (or Q + 1 for those planes that take care of the extra requests ) addresses form the address space and add the new header to it, which has information about the destination plane, number of copies, and so on. Repeat this for all addresses.

A distribution circuit at the input has the processing ability to take the packet header and modify it according to the algorithm for distribution of packets to the planes. Some fields are added on top of the original packet. In fact, an original packet is encapsulated into this new packet with the new packet header.