Link State Information | Microsoft Exchange Server 2003 Administrators Companion (Pro-Administrators Companion)

The link state protocol is a binary protocol that greatly improves message routing in Exchange 2003 over Exchange 5.5. Message routing in Exchange 5.5 is based on the GWART, which is a table that keeps track of each available connector in the Exchange 5.5 organization, along with the cumulative cost of using those connectors. The GWART is limited in that it contains only next-hop information. It doesn’t monitor whether the link is actually up or down. The GWART is located in the Site Addressing object and is referenced by the MTA when determining messaging paths to a destination server.

The link state protocol operates over TCP port 691 within the routing group. In each routing group, one server is configured as the routing group master (RGM). The RGM receives link state information and propagates this information to the other servers in the routing group, including the BHS. When one BHS connects to another BHS in a different routing group, the exchange of link state information occurs over TCP port 25, using SMTP. The RGM keeps track of which servers are up and which are down and propagates that information to the RGM in every other routing group.

Link State Algorithm

The link state algorithm is new to Exchange Server 2003, although it has been around for many years. First developed by Edsger Dijkstra in 1959, it forms the foundation of the Open Shortest Path First (OSPF) protocol, used extensively by routers today. Although Exchange Server 2003 incorporates routes and costs, it also relies heavily on link state information to route messages between routing groups.

The link state algorithm propagates the state of the messaging system almost in real time to all servers in the organization. There are several advantages to this:

Each Exchange server can make the best routing decision before sending a message downstream where a link might be down.
Message “ping-pong” is eliminated because alternate route information is propagated to each Exchange Server 2003.
Message looping is eliminated.

Given the extensibility of this protocol, it is possible that future versions will be able to interact with network routers to achieve even greater routing capabilities. Networks that collaborate in this way are known as directory-enabled networks.

Link State Concepts

Link state information is rather important when an organization has multiple routing groups with multiple paths between the groups. The RGM maintains the link state information, sending it to and receiving it from the RGMs in other routing groups. The RGM is not necessarily the same server as the BHS, which is the server that you designate to send and receive messages across a given connector to another BHS. The RGM, by default, is the first server installed into the routing group. You can change this in the ESM by right-clicking a server that is not the RGM and selecting Set As Master. You can also manually configure one server to perform both roles.

The RGM ensures that all the servers in its routing group have correct link state information about the availability of the messaging connectors and servers in other groups. In addition, it ensures that other groups have correct information about its servers.

Link state information is propagated among the servers within a group over SMTP. Between groups, however, link state information is replicated from RGM to RGM over TCP port 25. There are only two states for any given link: up or down. The link state information does not include any connection information, such as whether a link is in a retry state. This information is known only to the server involved in the message transfer.

Note

Connectors such as the Lotus cc:Mail Connector, the Microsoft Mail Connector, and other EDK-based connectors will always display their link state information in Exchange System as up, even if the link is unavailable.

Link state information is held in memory, not on disk. If the RGM goes down or needs to be rebooted, it will need to replicate in all current link state information from other RGMs in the organization. Since the routing group information is held in the naming partition of Active Directory, the definitions of connectors and costs are also held in Active Directory. The link state protocol references each connector by its globally unique identifier (GUID).

Note

When a BHS determines that a link is unavailable, it marks the link as down. It then sends this information to all the servers in its own routing group (over TCP port 691) and to the bridgehead servers in the other routing groups (over TCP port 25). If you are performing a trace for link state data, look for the X Link2state command verb, which denotes this type of data. The information is sent in chunks labeled “first chunk,” “second chunk,” and so on, up to “last chunk.”

Figures 3-12 and 3-13 illustrate what you’ll see in Network Monitor. Figure 3-12 shows the link between Folsom and Minneapolis as DOWN. After you restore the link, you can see that Figure 3-13 shows the link as being UP. However, the X- Link2state command doesn’t appear in the description of the packet. You’ll need to read the data in each packet to find the X-Link2state command packets.

click to expand
Figure 3-12: Trace showing link state as DOWN.

click to expand
Figure 3-13: Trace showing link state as UP.

How Link State Information Works

Let’s look at how link state information works and what happens to messages in the event of a failure. Figure 3-14 shows the topology for a network consisting of five routing groups and indicates the connector cost for each connector. We will assume that all connectors are RGCs.

click to expand
Figure 3-14: Routing topology for link state example.

Failure of a Single Link

Normally, a message sent from a server in RG1 to a server in RG5 will pass through RG2 and RG4 because it is the route with the least cost. Let’s assume, however, that there is a link failure between RG2 and RG4. In this type of single-line failure, the link state protocol causes the routing processes to carry on as follows:

The BHS in RG1 sends the message to the BHS in RG2.
The BHS in RG2 attempts to open an SMTP connection to the BHS in RG4. If RG4 contains more than one BHS, the BHS in RG2 attempts to open a connection to each BHS in sequential order.
The BHS in RG2 is unable to contact any of the BHSs in RG4 because the physical link is down. Therefore, the BHS in RG2 places the connection into a glitch-retry state. The BHS waits for 60 seconds and then attempts to retransfer the message to the BHS in RG4.
After three unsuccessful attempts to connect to RG4, the BHS in RG2 marks the link as down, updates the link state information on the RGM in RG2 over TCP port 691, and makes a call to reroute the message that is sitting in the SMTP out queue.
The RGM, upon receiving the notification that the link is down, immediately floods this data to all other Exchange Server 2003s in the routing group.
The BHS in RG2 recalculates an alternate route to RG5 via RG1, RG3, and RG4.
Before rerouting the message back to RG1, the BHS in RG2 sends the information about the down link to the BHS in RG1. This communication occurs on TCP port 25 and consists of an EHLO command and a X-Link2state command. (For more information about these and other SMTP commands, see Chapter 20.)
The BHS in RG1 immediately connects to the RGM in RG1 over TCP port 691 and transfers the information about the down link.
The RGM in RG1 immediately floods this data to all other Exchange Server 2003s in the routing group.
Using the new link state information, the BHS in RG1 calculates the best route to RG5, through RG3 and RG4.
Before routing the message to RG3, the BHS in RG1 propagates the link state information to the BHS in RG3. This process continues until all the routing groups know of the down link between RG2 and RG4.

It is highly likely that subsequent messages will be sent to RG5 from RG1. When this happens, the messages will be routed through the alternate route (RG1-RG3-RG4-RG5) because each server in the organization knows that the primary route is down.

The BHS in RG2 will continue to try to contact the BHS in RG4 every 60 seconds, even if no messages are awaiting transfer. This value is not configurable. When a link becomes operational again, the new link status is replicated to all the other Exchange servers in the organization. The BHS transmits the “up” status to the local RGM, which in turn floods the Exchange Server 2003 servers in the local routing group. Then, similar to the way in which the “down” message was propagated, the “up” message is sent to the rest of the Exchange organization.

Tip

As Figure 3-15 shows, by default the SMTP virtual server will check the status of the link after 10 minutes for the first retry interval, after 10 minutes for the second retry interval, after 10 minutes for the third retry interval, and then every 15 minutes for each subsequent check. Reduce these intervals if the link is passing mission-critical information between two routing groups. Each retry interval can be as short as 1 minute.

click to expand
Figure 3-15: Default retry intervals for SMTP virtual server.

Note

In both this example and the next example on multiple link failures, link state information might appear to be transmitted only immediately ahead of a user-initiated message. This is not the case. Link state information is always transmitted immediately, independently of whether other messages need to be transmitted. We mention that the link state information is transmitted ahead of a user’s message to indicate the importance that Exchange Server 2003 places on replicating link state information to all its servers. There is no necessary connection or correlation between the transfer of a user-initiated message and the sending of a link state message to another routing group master.

Failure of Multiple Links

If more than one link fails at any given time, the link state protocol ensures that the message doesn’t bounce back and forth between routing groups in a continual attempt to find an open message path. Let’s look again at the example of a server in RG1 attempting to send a message to a server in RG5. This time, however, let’s assume that both the link between RG2 and RG4 and the one between RG3 and RG 4 is unavailable. RG1 sends a message to RG2, and RG2 returns the message to RG1, as it did in our single-link-failure scenario. Here are the steps that the link state protocol will then perform:

The BHS in RG1 opens a connection to the BHS in RG3. Before sending the message, however, it propagates the down status of the link between RG2 and RG4 to the BHS in RG3. The BHS in RG3 forwards this information to the RGM, which floods the other servers in the routing group with this information.
Then the BHS in RG1 sends the message to the BHS in RG3. The BHS in RG3 sees that the message is intended for RG5, attempts to open a connection to the BHS server in RG4, and fails. The BHS server marks the link as being in the glitch-retry state and retries three times at 60- second intervals.
If it cannot establish a connection, it marks the link as down and notifies the RGM, which in turn floods the other servers in the routing group with this new information.
The BHS in RG3 attempts to calculate a new route for the message, given the new information. With both the link between RG2 and RG4 and the link between RG3 and RG4 in the down state, the cost of routing a message to RG5 becomes Infinite.
Once the cost has been calculated as Infinite, the message remains in the queue of the BHS in RG3, which makes calls to routing, based on the schedule you have configured on the Delivery tab in the property sheet for the SMTP virtual server, to see whether either link has become available.
When a link becomes available, the message is rerouted as appropriate. If the message stays in the queue for more than 48 hours, it is returned to the sender in RG1 as a non-delivery report (NDR).

If additional messages are sent to RG5 from RG1 while both links are down, the messages remain queued at the BHS in RG1 until one of the links becomes available and a fully functioning route can be established. This is the best place for the messages to remain queued.

Note

The 48-hour time period is the default length of time for messages to sit in a queue before an NDR is generated and sent back to the user in Exchange Server 2003. You can configure this value on the Delivery tab of the property sheet for the SMTP virtual server.

Unchanging Link State

If no alternate path exists for a link to another routing group and the physical link goes down, Exchange 2003 will not change the link’s state to DOWN. This functionality is a change from Exchange 2000 and was implemented to enhance performance by reducing unnecessary link state propagation traffic on your network.

Hence, if you have two routing groups connected by a single physical link and that link goes down, the link state table will continue to show the link as UP and will allow e-mail to be queued for that link. Once the link becomes available, mail will flow normally between the two sites.

Oscillating Link State Information

Sometimes, a physical link can go up and down during maintenance or repair operations. In Exchange 2000, each change in the physical link triggered a change in the link state information, which led to the flooding of link state information multiple times in the Exchange environment.

This tendency to flood has been resolved in Exchange 2003. If a link is oscillating between up and down, Exchange 2003 will leave the link in the UP state. Microsoft feels that leaving an oscillating connector up is better than continually changing the link state, which floods the network with link state packets at each change. This reduces the amount of link state traffic that is replicated between the servers.

Failure of a Routing Group Master

If the RGM goes offline, a new master is not automatically nominated. Therefore, the link state information held by other servers in the routing group could, over time, become increasingly antiquated. If the RGM will be down for a period of time, it is very important that a new RGM be manually configured. This action will ensure that other servers in the routing group have up-to-date link state information.

To manually configure a server to become the RGM, navigate to the routing group in which the server resides, highlight the Members folder for the server, right-click the server in the details pane, and choose Set As Master. Figure 3-16 illustrates this process.

click to expand
Figure 3-16: Selecting a routing group master in Exchange System Manager.