Section 7-4. Managing Firewall Failover

By nature, firewall failover is a feature that takes action automatically, based on whether two firewalls are operational and connected. You might want to monitor or troubleshoot the failover mechanism on a failover pair so that you can verify its operation. There might also be occasions when you need to manually force the failover action between the peers. The following sections cover these topics.

Displaying Information About Failover

When you connect to a firewall remotely, it isn't always apparent which unit is the active one. Because the active unit configuration is replicated to the standby unit, the command-line prompt (and the underlying host name) is identical on both units. This can make interacting with the correct firewall very difficult.

After you connect to a firewall, use the show failover command to determine the state of that unit, as shown in the following example:

 Firewall# show failover
 Failover On
 Cable status: Normal
 Reconnect timeout 0:00:00
 Poll frequency 15 seconds
         This host: Primary  Active
                 Active time: 2421015 (sec)
                 Interface stateful (192.168.199.1): Normal
                 Interface dmz2 (127.0.0.1): Link Down (Shutdown)
                 Interface outside (192.168.1.1): Normal
                 Interface inside (192.168.254.1): Normal
         Other host: Secondary  Standby
                 Active time: 0 (sec)
                 Interface stateful (192.168.199.2): Normal
                 Interface dmz2 (0.0.0.0): Link Down (Shutdown)
                 Interface outside (192.168.1.2): Normal
                 Interface inside (192.168.254.2): Normal
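If you collect this output with a script (for example, over an SSH session), a short parser can tell you which unit you have reached. The following Python sketch is only an illustration: the function name unit_roles is hypothetical, the sample text is abbreviated from the preceding example, and it assumes the output is captured with its original line breaks. The pattern also tolerates an optional hyphen between the unit and its state, on the assumption that some software releases print one.

```python
import re

# Abbreviated from the example output; real output varies by platform and release.
SAMPLE = """\
Failover On
Cable status: Normal
Reconnect timeout 0:00:00
Poll frequency 15 seconds
        This host: Primary  Active
                Active time: 2421015 (sec)
        Other host: Secondary  Standby
                Active time: 0 (sec)
"""

# Matches lines such as "This host: Primary  Active".
# The optional "-" between unit and state is an assumption about other releases.
HOST_LINE = re.compile(
    r"^\s*(This|Other) host:\s+(Primary|Secondary)\s+(?:-\s+)?(.+?)\s*$"
)


def unit_roles(show_failover_text):
    """Return {'this': (unit, state), 'other': (unit, state)} for active-standby output."""
    roles = {}
    for line in show_failover_text.splitlines():
        match = HOST_LINE.match(line)
        if match:
            which, unit, state = match.groups()
            roles[which.lower()] = (unit, state)
    return roles


if __name__ == "__main__":
    roles = unit_roles(SAMPLE)
    print("Connected to:", roles["this"])
    print("Peer is:", roles["other"])
```

A script could refuse to send configuration commands unless the state reported for "this" is Active.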

Remember that you should make configuration changes only on the active unit, because those changes are replicated in only one direction: from active to standby. Active-active failover takes this one step further: configuration changes to the system execution space or the admin context must be made on the firewall unit that is active for failover group 1. If you attempt to configure the standby unit, the standby firewall displays a warning that the configurations are no longer synchronized.

In the case of active-active failover, this gets a little more complicated. Now, a firewall can be either the primary or secondary unit, but it can be active in some contexts while being standby in others. You can find out which failover group the firewall is active in by using the show failover command in the system execution space, as shown in the following example:

 Firewall# show failover
 Failover On
 Cable status: N/A - LAN-based failover enabled
 Failover unit Primary
 Failover LAN Interface: Failover Ethernet2 (up)
 Unit Poll frequency 3 seconds, holdtime 9 seconds
 Interface Poll frequency 15 seconds
 Interface Policy 1
 Monitored Interfaces 3 of 250 maximum
 Group 1 last failover at: 13:10:46 EST Dec 9 2004
 Group 2 last failover at: 13:10:04 EST Dec 9 2004
   This host:    Primary
   Group 1       State:          Active
                 Active time:    149706 (sec)
   Group 2       State:          Standby Ready
                 Active time:    121650 (sec)
 [output omitted]
   Other host:   Secondary
   Group 1       State:          Standby Ready
                 Active time:    120936 (sec)
   Group 2       State:          Active
                 Active time:    148995 (sec)
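With active-active failover, the per-group states are easier to work with once they are collected into a small table. The following Python sketch shows one way to do that. The function name group_states is hypothetical, the sample text is abbreviated from the preceding example, and the parser assumes that each "Group n State:" line follows the "This host" or "Other host" line it belongs to.

```python
import re

# Abbreviated from the example output; real output varies by release.
SAMPLE = """\
Group 1 last failover at: 13:10:46 EST Dec 9 2004
Group 2 last failover at: 13:10:04 EST Dec 9 2004
  This host:    Primary
  Group 1       State:          Active
                Active time:    149706 (sec)
  Group 2       State:          Standby Ready
                Active time:    121650 (sec)
  Other host:   Secondary
  Group 1       State:          Standby Ready
                Active time:    120936 (sec)
  Group 2       State:          Active
                Active time:    148995 (sec)
"""


def group_states(show_failover_text):
    """Return {'this': {group: state}, 'other': {group: state}}."""
    result = {}
    current = None  # which host section is being read
    for line in show_failover_text.splitlines():
        host = re.match(r"^\s*(This|Other) host:", line)
        if host:
            current = host.group(1).lower()
            result[current] = {}
            continue
        group = re.match(r"^\s*Group (\d+)\s+State:\s+(.+?)\s*$", line)
        if group and current is not None:
            result[current][int(group.group(1))] = group.group(2)
    return result


if __name__ == "__main__":
    states = group_states(SAMPLE)
    active_here = [g for g, s in states["this"].items() if s == "Active"]
    print("This unit is active for failover group(s):", active_here)
```

Because configuration changes to the system execution space or the admin context belong on the unit that is active for failover group 1, a script could check that group 1 is Active on "this" before making them.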

If you can't enable failover, check the status of your firewall license with the show activation-key or show version command. The following example shows the results for a PIX Firewall running 7.0:

 Firewall# show activation-key
 Serial Number:  801021134
 Running Activation Key: 0x7411c36d 0x639a94fa 0xa3f0b034 0x913c0374 0x3f3632ba
 License Features for this Platform:
 Maximum Physical Interfaces : 6
 Maximum VLANs               : 25
 Inside Hosts                : Unlimited
 Failover                    : Active/Active
 VPN-DES                     : Enabled
 VPN-3DES-AES                : Enabled
 Cut-through Proxy           : Enabled
 Guards                      : Enabled
 URL-filtering               : Enabled
 Security Contexts           : 5
 GTP/GPRS                    : Enabled
 VPN Peers                   : Unlimited
 This machine has an Unrestricted (UR) license.
 The flash activation key is the SAME as the running key.
 Firewall#

In the example, the firewall has an "Unrestricted" license, which allows any type of standalone or failover operation, including "Active/Active" mode.
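If you audit many firewalls, you can pull the failover entitlement out of the license listing automatically. The following Python sketch is one possible approach. The function name failover_license is hypothetical, the sample text is abbreviated from the preceding example, and the sketch assumes the feature appears on a line of the form "Failover : value".

```python
import re

# Abbreviated from the example output above.
SAMPLE = """\
License Features for this Platform:
Maximum Physical Interfaces : 6
Maximum VLANs               : 25
Failover                    : Active/Active
Security Contexts           : 5
This machine has an Unrestricted (UR) license.
"""


def failover_license(activation_key_text):
    """Return the value on the 'Failover' license line, or None if it is absent."""
    for line in activation_key_text.splitlines():
        match = re.match(r"^\s*Failover\s*:\s*(.+?)\s*$", line)
        if match:
            return match.group(1)
    return None


if __name__ == "__main__":
    print("Failover license:", failover_license(SAMPLE))
```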

Displaying the Current Failover Status

You can use the following command to display a summary of the current failover status:

 Firewall# show failover 

The output from this command displays the configured failover state (on or off), along with failover cable status, the last failover date and time, the failover roles (primary or secondary) for both units, the firewall role (active or standby) for both units, the status of each configured interface, and the statistics for the stateful failover link (if configured).

PIX 7.x also presents this information for each failover group (1 and 2). Within each group, the status of each of the security contexts and its allocated interfaces is shown. For example, the system execution space on the primary firewall has the following output. Notice that the State lines give an at-a-glance snapshot of every state and role involved in failover:

 Firewall# show failover
 Failover On
 Cable status: N/A - LAN-based failover enabled
 Failover unit Primary
 Failover LAN Interface: Failover Ethernet2 (up)
 Unit Poll frequency 3 seconds, holdtime 9 seconds
 Interface Poll frequency 15 seconds
 Interface Policy 1
 Monitored Interfaces 3 of 250 maximum
 Group 1 last failover at: 13:11:02 EST Dec 7 2004
 Group 2 last failover at: 15:01:04 EST Dec 7 2004
   This host:    Primary
   Group 1       State:          Active
                 Active time:    7536 (sec)
   Group 2       State:          Standby Ready
                 Active time:    663 (sec)
                 admin Interface outside (192.168.93.138): Normal
                 CustomerA Interface outside (192.168.93.139): Normal
                 CustomerA Interface inside (192.168.200.10): Normal   (Not-Monitored)
                 CustomerB Interface outside (192.168.93.143): Normal
                 CustomerB Interface inside (192.168.220.11): Normal   (Not-Monitored)
   Other host:   Secondary
   Group 1       State:          Standby Ready
                 Active time:    0 (sec)
   Group 2       State:          Active
                 Active time:    6879 (sec)
                 admin Interface outside (128.163.93.141): Normal
                 CustomerA Interface outside (128.163.93.142): Normal
                 CustomerA Interface inside (192.168.200.11): Normal   (Not-Monitored)
                 CustomerB Interface outside (128.163.93.140): Normal
                 CustomerB Interface inside (192.168.220.10): Normal   (Not-Monitored)
 Stateful Failover Logical Update Statistics
         Link : Failover Ethernet2 (up)
         Stateful Obj    xmit       xerr       rcv        rerr
         General         135508407  7          53412868   0
         sys cmd         266210     0          266207     0
         up time         14         0          0          0
         RPC services    0          0          0          0
         TCP conn        123228648  0          47758798   0
         UDP conn        663934     0          448445     0
         ARP tbl         6          0          0          0
         Xlate_Timeout   617643     0          556745     0
         Logical Update Queue Information
                         Cur     Max     Total
         Recv Q:         0       35      7519538
         Xmit Q:         0       1       18562497
 Firewall#

The Stateful Failover Logical Update Statistics represent the number of connection or table synchronization update messages that the firewall has transmitted and received. The Logical Update Queue Information shows the number of stateful update messages that have been queued as they have been transmitted to or received from the failover peer. Nonzero values mean that more updates have been queued than could be processed. A large value might indicate that the stateful failover bandwidth needs to be increased, usually by choosing a faster interface.
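To watch the queue depths over time, you could extract the Recv Q and Xmit Q rows with a script. The following Python sketch is an illustration only. The function names update_queues and queues_above are hypothetical, and the threshold of 10 is an arbitrary value chosen for the demonstration, because what counts as a "large" value depends on your environment.

```python
import re

# Queue section copied from the example output above.
SAMPLE = """\
        Logical Update Queue Information
                        Cur     Max     Total
        Recv Q:         0       35      7519538
        Xmit Q:         0       1       18562497
"""

QUEUE_LINE = re.compile(r"^\s*(Recv|Xmit) Q:\s+(\d+)\s+(\d+)\s+(\d+)\s*$")


def update_queues(show_failover_text):
    """Return {'Recv': {...}, 'Xmit': {...}} with cur, max, and total as integers."""
    queues = {}
    for line in show_failover_text.splitlines():
        match = QUEUE_LINE.match(line)
        if match:
            name, cur, peak, total = match.groups()
            queues[name] = {"cur": int(cur), "max": int(peak), "total": int(total)}
    return queues


def queues_above(queues, max_threshold):
    """List the queues whose Max depth exceeds a threshold that you choose."""
    return [name for name, q in queues.items() if q["max"] > max_threshold]


if __name__ == "__main__":
    queues = update_queues(SAMPLE)
    # 10 is an arbitrary demonstration threshold, not a recommended value.
    print("Queues worth a closer look:", queues_above(queues, 10))
```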

To see the failover status information for just one failover group, you can use the following command:

 Firewall# show failover group {1 | 2} 

On a PIX 7.x Firewall, you can also get a quick summary of the failover status with the following command:

 Firewall# show failover state 

In the following example, the firewall is shown to be the primary unit with the active role, and the other peer is the secondary in standby. The configurations are synchronized, and the interface MAC addresses have been set according to the primary and secondary burned-in addresses. If one of the units had failed, a reason would be shown:

 Firewall# show failover state
 ====My State===
 Primary | Active |
 ====Other State===
 Secondary | Standby |
 ====Configuration State===
         Sync Done
 ====Communication State===
         Mac set
 =========Failed Reason==============
 My Fail Reason:
 Other Fail Reason:
 Firewall#

Displaying the LAN-Based Failover Interface Status

An FWSM or a firewall running PIX 7.x can't display LAN-based failover interface statistics. However, a firewall running PIX 6.x will display this information if you use the following command:

 Firewall# show failover lan [detail] 

For example, in the following output, the LAN-based failover interface is called lan-fo. It uses 192.168.1.1 and 192.168.1.2 on the two peers:

 Firewall# show failover lan
 LAN-based Failover is Active
         interface lan-fo (192.168.1.1): Normal, peer (192.168.1.2): Normal
 Firewall#

You can see much more detail about the interface activity by adding the detail keyword, as shown in the following example. Notice that statistics are kept for the number of failover messages sent, received, dropped, and so on, as well as the response times for message exchanges with the failover peer (the Cmd Response Time History entries):

 Firewall# show failover lan detail
 LAN-based Failover is Active
 This PIX is Primary
 Command Interface is lan-fo
 My Command Interface IP is 192.168.198.1
 Peer Command Interface IP is 192.168.198.2
 My interface status is Normal
 Peer interface status is Normal
 Peer interface down time is 0x0
 Total cmd msgs sent: 107856, rcvd: 107845, dropped: 1, retrans: 8, send_err: 0
 Total secure msgs sent: 147375, rcvd: 147301
 bad_signature: 0, bad_authen: 0, bad_hdr: 0, bad_osversion: 0, bad_length: 0
 Total failed retx lck cnt: 0
 Total/Cur/Max of 52719:0:3 msgs on retransQ, 52718 ack msgs
 Cur/Max of 0:7 msgs on txq
 Cur/Max of 0:34 msgs on rxq
 Number of blk allocation failure: 0, cmd failure: 0, Flapping: 0
 Current cmd window: 3, Slow cmd Ifc cnt: 0
 Cmd Link down: 17, down and up: 0, Window Limit: 17266
 Number of fmsg allocation failure: 0, duplicate msgs: 0
 Cmd Response Time History stat:
 < 100ms:         52681
 100 - 250ms:     12
 250 - 500ms:     13
 500 - 750ms:     12
 750 - 1000ms:    0
 1000 - 2000ms:   4
 2000 - 4000ms:   1
 > 4000ms:        3
 Cmd Response Retry History stat:
 Retry 0 = 52719, 1 = 4, 2 = 1, 3 = 1, 4 = 1
 [output truncated]

Displaying a History of Failover State Changes

A firewall running PIX 7.x or FWSM 2.x keeps a running history of each time its failover state changes. Although the history events aren't recorded with a timestamp, the sequence of events can still be useful information. For example, if failover didn't come up correctly, you could trace through the history to see the sequence of state changes and the cause for each. You can see the history with the following command:

 Firewall# show failover history 

For example, the following output shows the failover state change history for a firewall running in multiple-context mode. Failover groups 0 (for system execution space failover), 1, and 2 are listed, because failover operates independently in each group. This sequence of state changes occurred as failover was configured for the first time. During the No Active unit found changes, the secondary peer had not yet been configured for failover.

 Firewall# show failover history
 ==========================================================================
 Group   From State               To State                 Reason
 ==========================================================================
     0   Active Applying Config   Active Config Applied    No Active unit found
     0   Active Config Applied    Active                   No Active unit found
     1   Disabled                 Negotiation              Failover state check
     2   Disabled                 Negotiation              Failover state check
     2   Negotiation              Cold Standby             Detected an Active mate
     1   Negotiation              Just Active              No Active unit found
     1   Just Active              Active Drain             No Active unit found
     1   Active Drain             Active Applying Config   No Active unit found
     1   Active Applying Config   Active Config Applied    No Active unit found
     1   Active Config Applied    Active                   No Active unit found
     2   Cold Standby             Sync Config              Detected an Active mate
     2   Sync Config              Sync File System         Detected an Active mate
     2   Sync File System         Bulk Sync                Detected an Active mate
     2   Bulk Sync                Standby Ready            Detected an Active mate
     2   Standby Ready            Just Active              Set by the CI config cmd
     2   Just Active              Active Drain             Set by the CI config cmd
     2   Active Drain             Active Applying Config   Set by the CI config cmd
     2   Active Applying Config   Active Config Applied    Set by the CI config cmd
     2   Active Config Applied    Active                   Set by the CI config cmd
     2   Active                   Standby Ready            Set by the CI config cmd
 ==========================================================================
 Firewall#
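Because the history interleaves the events of all failover groups, it can help to regroup the rows by group number before tracing a problem. The following Python sketch shows one way to do so. The function name history_by_group is hypothetical, the sample rows are a subset of the preceding example, and the parser assumes that the columns are separated by at least two spaces while the state names and reasons contain only single spaces.

```python
import re

# A subset of the rows from the example output above.
SAMPLE = """\
Group   From State               To State                 Reason
    1   Disabled                 Negotiation              Failover state check
    2   Disabled                 Negotiation              Failover state check
    2   Negotiation              Cold Standby             Detected an Active mate
    1   Negotiation              Just Active              No Active unit found
    1   Just Active              Active Drain             No Active unit found
"""


def history_by_group(history_text):
    """Return {group: [(from_state, to_state, reason), ...]} in the order listed."""
    by_group = {}
    for line in history_text.splitlines():
        # Columns are assumed to be separated by two or more spaces.
        fields = re.split(r"\s{2,}", line.strip())
        if len(fields) == 4 and fields[0].isdigit():
            group, from_state, to_state, reason = fields
            by_group.setdefault(int(group), []).append((from_state, to_state, reason))
    return by_group


if __name__ == "__main__":
    for group, events in sorted(history_by_group(SAMPLE).items()):
        print("Group", group)
        for from_state, to_state, reason in events:
            print("   ", from_state, "->", to_state, "(" + reason + ")")
```

The last entry in each list is the most recent state change recorded for that group.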

Debugging Failover Activity

Table 7-1 summarizes some of the commands you can use to generate debugging information about firewall failover operation.

Table 7-1. debug Commands Relevant to Firewall Failover Operation

Command                     Display Function
--------------------------  ----------------------------------------------------------
debug fover cable           Failover cable status
debug fover {rx | tx}       Failover messages parsed or sent (serial cable only)
debug fover {rxip | txip}   Failover hello messages received or sent on all interfaces
debug fover fmsg            Stateful failover memory activity
debug fover {get | put}     Stateful failover packets received from or sent to the other unit (not available in PIX 7.x)
debug fover sync            Configuration command replication
debug fover switch          Health monitoring activity
debug fover ifc             Interface health polling


TIP

Commands using the debug keyword produce real-time output for troubleshooting purposes. To see these messages, you must first enable logging output to the firewall console (logging console), to a Telnet or SSH session (logging monitor), to a logging buffer (logging buffered), or to a Syslog server (logging host). The debug output also must be sent to the Syslog destination with the logging debug-trace configuration command. See Chapter 9, "Firewall Logging," for more information.


Monitoring Stateful Failover

As soon as stateful failover is enabled, you should make sure your stateful failover interface isn't being overrun with stateful information packets. In other words, verify that the stateful interface bandwidth is sufficient for the load. Otherwise, information about some active connections will not be passed from the active to the standby firewall. If a failover occurs, these unknown connections are terminated.

To do this in PIX 6.x and 7.x (single-context mode), you can make a quick manual estimate by using the show traffic command. Unfortunately, this command shows only cumulative values collected since the traffic counters were last cleared. For the packets-per-second and bytes-per-second values, a running average is computed since the counters were last cleared.

However, you can issue the clear traffic command on the active firewall to clear the counters, wait 10 seconds, and issue the show traffic command. You should do this during a peak load time so that you see a snapshot of the busiest stateful information exchange. The following example shows how this is done:

 Firewall# clear traffic
 Firewall# show traffic
 stateful:
         received (in 9.050 secs):
                 3 packets       395 bytes
                 0 pkts/sec      43 bytes/sec
         transmitted (in 9.050 secs):
                 84 packets      98682 bytes
                 9 pkts/sec      10904 bytes/sec
 [output deleted]
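The rates in this output are simply the byte counts divided by the elapsed time, so you can reproduce them and convert them to bits per second for comparison with the interface speed. The following Python sketch works through the arithmetic with the numbers from the example. The function name average_rate is hypothetical, and the 100-Mbps link speed is an assumed value used only for illustration.

```python
def average_rate(byte_count, seconds):
    """Average bytes per second over the measurement window."""
    return byte_count / seconds


# Counters taken from the example output above.
ELAPSED_SECS = 9.050
RX_BYTES = 395
TX_BYTES = 98682

# Assumed speed of the stateful failover interface (not from the example).
LINK_BITS_PER_SEC = 100_000_000

if __name__ == "__main__":
    rx_rate = average_rate(RX_BYTES, ELAPSED_SECS)
    tx_rate = average_rate(TX_BYTES, ELAPSED_SECS)
    tx_bits = tx_rate * 8  # 8 bits per byte
    print("received   :", int(rx_rate), "bytes/sec")
    print("transmitted:", int(tx_rate), "bytes/sec")
    print("transmit load: %.4f%% of the assumed link" % (100 * tx_bits / LINK_BITS_PER_SEC))
```

The truncated results match the 43 and 10904 bytes/sec values that the firewall displays in the example.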

In PIX 7.x multiple-context mode (active-active failover), things get a little more difficult. The interface used for stateful failover is defined and configured only in the system execution space, where there is no show traffic command. (That command is available in each security context; however, the stateful failover interface is not!)

To gauge the stateful failover interface usage, you can use the show interface command instead. Issue that command and note the number of bytes shown. (This is a cumulative total, not a bytes-per-second rate.) Then, wait 10 seconds and issue the command again. Note the new byte count, subtract the two, and divide by 10. This gives you an estimate of the bytes per second being sent and received over the stateful interface.
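The subtraction and division can be written as a small helper. The following Python sketch is a minimal illustration. The function name estimate_rate is hypothetical, and the two byte counts are invented values standing in for the numbers you would read from two successive show interface displays.

```python
def estimate_rate(first_bytes, second_bytes, interval_seconds=10):
    """Estimate bytes per second from two cumulative byte counts."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    if second_bytes < first_bytes:
        # The counter was cleared or wrapped between samples; take new samples.
        raise ValueError("second sample is smaller than the first")
    return (second_bytes - first_bytes) / interval_seconds


if __name__ == "__main__":
    # Invented sample values: byte counts noted 10 seconds apart.
    first_sample = 1_250_000
    second_sample = 1_359_040
    rate = estimate_rate(first_sample, second_sample, 10)
    print("About", rate, "bytes/sec, or", rate * 8, "bits/sec")
```

Repeat the calculation separately for the input and output byte counts if you want the received and transmitted rates individually.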

You can also use PIX Device Manager (PDM) or Adaptive Security Device Manager (ASDM) to generate statistics or a utilization graph of a stateful LAN interface. Running the graph over a period of time shows you the maximum bit rate that has been used to transfer stateful information. Figure 7-9 shows a sample PDM graph.

Figure 7-9. Using PDM or ASDM to Gauge Stateful Failover Traffic


Finally, the firewall performance itself affects the stateful failover operation. As stateful messages are generated, they are put into 256-byte memory blocks and placed in a queue before being sent to the failover peer. If the firewall cannot generate and send the stateful messages as fast as they are needed, more memory blocks are used. Although the firewall can allocate more 256-byte blocks as needed, the supply of these blocks can be exhausted in an extreme case.

You can use the show blocks command as a gauge of the stateful failover performance. Over time, the 256-byte block "CNT" value should remain above 0. If it continues to hover around 0, the active firewall cannot keep the connection state information synchronized with the standby firewall. Most likely, a higher-performance firewall is needed.
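If you poll show blocks regularly, a script can pick out the 256-byte row and warn when its CNT value approaches 0. The following Python sketch is an illustration under stated assumptions: the function names block_counts and low_on_256 are hypothetical, the sample values are invented, the four-column SIZE, MAX, LOW, CNT layout is assumed and should be checked against the output of your own firewall, and the warning floor of 5 blocks is arbitrary.

```python
# Invented sample values; the four-column layout is an assumption.
SAMPLE = """\
  SIZE    MAX    LOW    CNT
     4   1600   1597   1600
    80    400    399    400
   256    500    495    499
  1550   1444   1170   1188
"""


def block_counts(show_blocks_text):
    """Return {block_size: {'max': .., 'low': .., 'cnt': ..}} for each numeric row."""
    table = {}
    for line in show_blocks_text.splitlines():
        fields = line.split()
        if len(fields) == 4 and all(field.isdigit() for field in fields):
            size, maximum, low, cnt = map(int, fields)
            table[size] = {"max": maximum, "low": low, "cnt": cnt}
    return table


def low_on_256(table, floor=5):
    """True if the 256-byte CNT value is at or below an arbitrary floor."""
    return table[256]["cnt"] <= floor


if __name__ == "__main__":
    table = block_counts(SAMPLE)
    print("256-byte blocks available:", table[256]["cnt"])
    print("Needs attention:", low_on_256(table))
```

A single low reading is not conclusive; the concern is a CNT value that stays near 0 across repeated samples.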

Manually Intervening in Failover

When the firewalls in a failover pair detect a failure and take action, they do not automatically revert to their original failover roles. For example, if the primary firewall is active and then fails, it is marked as failed, and the secondary firewall takes over the active role. After the primary unit is repaired and returned to service, it doesn't automatically reclaim the active role (unless it has been configured to pre-empt active control).

You might occasionally find that you need to manually intervene in the failover process to force a role change or to reset a failover condition. The commands discussed in the following sections should be used from configuration mode in PIX 6.x and in the system execution space in multiple-context mode in PIX 7.x.

Forcing a Role Change

Ordinarily, the firewalls fail over to each other automatically, without any intervention. However, they do not automatically fail back to their original roles. If for some reason you need to force one unit to become active again, you can use the following privileged EXEC command:

 Firewall# [no] failover active [group {1 | 2}] 

You can also force a unit into standby mode with the no failover active command.

For PIX 7.x with active-active failover, you can specify the failover group (1 or 2) that will become active. For example, suppose the secondary firewall should be standby for failover group 1 and active for failover group 2. After a failure, it ends up in standby mode for both failover groups, as shown in the following output:

 Firewall# show failover
 Failover On
 Cable status: N/A - LAN-based failover enabled
 Failover unit Primary
 Failover LAN Interface: Failover Ethernet2 (up)
 Unit Poll frequency 3 seconds, holdtime 9 seconds
 Interface Poll frequency 15 seconds
 Interface Policy 2
 Monitored Interfaces 3 of 250 maximum
 Group 1 last failover at: 10:29:18 EST Jan 30 2005
 Group 2 last failover at: 16:18:28 EST Mar 9 2005
   This host:    Secondary
   Group 1       State:          Standby Ready
                 Active time:    3311601 (sec)
   Group 2       State:          Standby Ready
                 Active time:    3304092 (sec)

To restore the secondary unit to the active role for failover group 2, you could take two different approaches:

  • Force the primary unit (currently active) into the standby role by using the no failover active group 2 command

  • Force the secondary unit (currently standby) into the active role by using the failover active group 2 command

Resetting a Failed Firewall Unit

If a firewall has been marked as failed but has been repaired or its connectivity restored, you might have to manually "unfail" it or reset its failover role. You can use the following privileged EXEC command:

 Firewall# failover reset [group {1 | 2}] 

You can use this command on either the active or failed unit. If it is issued on the active unit, the command is replicated to the failed unit, and only that unit's state is reset. In PIX 7.x, you can add the group keyword and failover group number for the firewall role to be reset.

Reloading a Hung Standby Unit

Sometimes, an active and standby firewall can communicate over a failover connection but cannot synchronize their failover operation. In this case, you can manually force the standby unit to reload and reinitialize its failover role with the following PIX 7.x command:

 Firewall# failover reload-standby 

After the reload, it should resynchronize with the active unit.

    Cisco ASA and PIX Firewall Handbook
    CCNP BCMSN Exam Certification Guide (3rd Edition)
    ISBN: 1587051583
    EAN: 2147483647
    Year: 2003
    Pages: 120
    Authors: David Hucaby
