Locating the Problem and Gathering Details Using CTC


It is human nature to attempt to correct an issue immediately after it is discovered, sometimes by connecting and reconnecting cabling, and other times by making configuration change upon configuration change, in an effort to solve the issue quickly. Sometimes this approach works, but often it doesn't; worse, configuration changes can create an even bigger problem.

The first step in troubleshooting is to identify the issue and gather as much detail as possible before making any changes. Not only is this approach useful to you, but if you need to call for additional support, you also now have documented detail that might help solve your issue more quickly.

CTC is a great tool for gathering all kinds of detailed information that will help you determine the potential causes of your ONS 15454 network problems or issues. The following tools and approaches can help you gather data during the initial stages of troubleshooting:

  • CTC Alarms tab

  • CTC Conditions tab

  • CTC History tab

  • CTC Performance tab

  • Other data

Alarms Tab

While in CTC, you can be looking at any one of three views:

  • Card view: Visibility and access to every port on a single card on a single node.

  • Node view: Visibility and access to an entire node.

  • Network view: Visibility to the entire network (every node on the interconnected network).

Each of the three CTC views has an Alarms tab, which you select by clicking it. On it, you can see the alarms for that particular view (for example, selecting the Alarms tab in Card view displays all alarms for all ports on the card you select, and clicking the Alarms tab in Network view displays all alarms for all cards on all nodes on the network).

In Figure 13-5, you can see a snapshot of the Alarms tab while in Network view. You can quickly identify which two nodes are in alarm by just looking at the network diagram. In the top-left pane, you can see how many and which types of alarms this network has. For detailed alarm information, take a look at the bottom pane. In this pane, you can see the type and description of each alarm, along with the port, card, node, and other details.

Figure 13-5. CTC Network View, Alarm Tab


Figure 13-6 provides a snapshot of the Alarms tab while in Node view. Notice that the card in alarm is yellow (representing a minor alarm on this card). This view also provides details on each of the alarms in its bottom pane.

Figure 13-6. CTC Node View, Alarm Tab


If you need further detail on any given alarm displayed by CTC, go to the ONS 15454 Troubleshooting Guide and see the "Alarm Troubleshooting" chapter. All alarms are explained in the guide.

As you begin to gather details for the problem or issue that you are troubleshooting, take a snapshot of the alarms for every node (in alarm) on the network. The Export feature can also be used to retrieve alarms. This can be found on CTC's menu bar by selecting File, Export. Be sure to name this file so that it is easily recognizable.

Conditions Tab

The Conditions tab (available in all three CTC views) provides details of retrieved fault conditions. A condition is a status or fault detected by either hardware or software. The Conditions tab displays all current events, including related alarms and conditions that might have been superseded by a more critical related alarm.

The Conditions tab enables you to take a deeper, more detailed look at activity on the node or network. In Figure 13-7, you can view the additional details provided by the Conditions tab (Network view), compared to the details provided by just looking at the Alarms tab (see Figure 13-5) on the same network.

Figure 13-7. CTC Network View, Conditions Tab


History Tab

Another useful source of information is the History tab. The historical alarm data can be retrieved from all three CTC views (Network, Node, Card) and provides you with both alarming and nonalarming activity (for example, threshold crossing alerts, ring switch, and so on).

As with most other screens in CTC, you can export this data to a file. This information might become very useful during troubleshooting.

The ONS 15454 can store up to 640 critical alarms, 640 major alarms, 640 minor alarms, and 640 condition messages. As these limits are reached, the oldest events in each category are discarded to allow new events to be recorded.
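The discard-oldest behavior can be pictured as one fixed-size buffer per category. The following Python sketch only illustrates that idea; it is not the node's actual implementation, and the class and method names are hypothetical.

```python
from collections import deque

# Per-category capacity stated in the text.
HISTORY_LIMIT = 640
CATEGORIES = ("critical", "major", "minor", "condition")


class AlarmHistory:
    """Illustrative model (hypothetical): one bounded buffer per category."""

    def __init__(self, limit=HISTORY_LIMIT):
        # A deque with maxlen drops its oldest entry when a new one
        # is appended to a full buffer.
        self._buffers = {name: deque(maxlen=limit) for name in CATEGORIES}

    def record(self, category, message):
        self._buffers[category].append(message)

    def events(self, category):
        return list(self._buffers[category])


history = AlarmHistory()
for i in range(645):
    history.record("minor", f"event-{i}")

minor_events = history.events("minor")
# 645 events were recorded, so the five oldest have been discarded.
print(len(minor_events), minor_events[0], minor_events[-1])
```

Because each category has its own buffer, a burst of minor alarms in this model never pushes critical alarms out of the history.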

Figure 13-8 displays the History tab information from Network view. Notice that the historical data for both alarming and nonalarming events is stored.

Figure 13-8. CTC Network View, History Tab


Performance Tab

You can retrieve detailed performance-monitoring (PM) data from the various service cards. Because this is information that is specific to the service ports, it can be retrieved only from Card view. The performance-monitoring values vary depending upon the type of card being monitored.

While looking at this screen, you might see errors, but an alarm might not be reported. This is because the errors/values are below the predetermined threshold crossing alerts (TCA) values; therefore, they do not raise a flag.

Values are recorded for both near and far ends (depending on the type of service), and in 15-minute and 24-hour intervals. This information can be used as a proactive tool to identify potential future issues.

Figure 13-9 displays the PMs for a four-port OC3 card. Details are shown for Port 1. In the figure, notice the values for all PMs on that port. For example, the Errored Seconds Section (ES-S) value for each 15-minute interval is 900. Because a 15-minute interval contains 900 seconds, this means that the port has seen errors during every second of every interval.
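The arithmetic behind that reading is simple enough to sketch. The helper below is a hypothetical illustration, not a CTC function; it converts an errored-seconds count into the fraction of the interval that was errored.

```python
INTERVAL_MINUTES = 15
SECONDS_PER_MINUTE = 60


def errored_fraction(es_count, interval_minutes=INTERVAL_MINUTES):
    """Return the fraction of seconds in the interval that were errored."""
    total_seconds = interval_minutes * SECONDS_PER_MINUTE
    return es_count / total_seconds


interval_seconds = INTERVAL_MINUTES * SECONDS_PER_MINUTE
print(interval_seconds)        # 900 seconds in a 15-minute interval
print(errored_fraction(900))   # 1.0, every second was errored
```

A value well below 1.0 that grows from one interval to the next is the kind of trend that makes PM data useful as a proactive tool.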

Figure 13-9. CTC Card View, Performance Tab


Other Data and Items to Check

Any available information related to this node or network facilitates troubleshooting. For example, a detailed ring map displaying nodes, fibers, LGX ports, IP addresses, pass-through locations, and so on can help determine the root cause of an issue by identifying all potential points of failure.

Diagnostics File

If your issue appears to be complex and you will be calling Cisco TAC for support, you can download a diagnostics file from the node. This file can be interpreted only by TAC and provides you with no useful information.

You can download the diagnostics file from Node View > Maintenance > Diagnostic.

Database Backup

Many companies have an automated mechanism for routinely backing up the complete database of each node. Other companies either don't automatically back up their nodes' databases or don't perform this task often enough to keep a current copy. The risk involved with not having a current backup copy of every node's database can be significant. If a customer site is flooded or damaged by fire and requires complete hardware replacement, it could take days to reconfigure that node (including potentially hundreds of circuits) from the time the new hardware is installed. Having a current copy of a node's database enables you to completely restore that node in a matter of minutes.

It is also a good practice to back up the database of the nodes you are about to start making configuration changes to. This ensures that, if you need to restore a node to its original state, you have that database backup readily available. It is unlikely that you will ever need that backup, but if you did need it, it would save you a considerable amount of time and a few headaches.

Backing up a node's database is safe and simple, and it takes only a few seconds. The most common issue with backing up a node's database is that many technicians forget where on their PC or laptop's hard drive they saved the file.

Backing up the node's database is performed from Node View > Maintenance > Database, Backup (at which point you select where on your hard drive you want to save this file).

Tip

If you are backing up multiple nodes, use a naming convention that will easily identify each node, time, and date of backup.
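One possible naming convention is sketched below in Python. The function name, the name layout, and the .db extension are assumptions made for illustration; follow your own company's convention if one exists.

```python
from datetime import datetime


def backup_filename(node_name, when=None):
    """Build a backup file name from the node name, date, and time.

    The layout (node_YYYY-MM-DD_HHMM.db) is a hypothetical convention.
    """
    when = when or datetime.now()
    # Replace spaces so the name is easy to handle on any file system.
    safe_name = node_name.strip().replace(" ", "-")
    return f"{safe_name}_{when:%Y-%m-%d_%H%M}.db"


example = backup_filename("Node A", datetime(2004, 6, 15, 14, 30))
print(example)  # Node-A_2004-06-15_1430.db
```

Putting the date in year-month-day order means that an alphabetical file listing also sorts each node's backups chronologically.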


Note

Do not restore the database to a node unless this is required. Restoring the database to a node reboots the TCCs. This causes a temporary connectivity loss to the node through CTC.

If you are restoring the database to multiple nodes, wait 1 minute after the TCC reboot has completed on one node before proceeding to the next node.


Card Light Emitting Diodes

The light emitting diodes (LEDs) on each of the cards (both common and service cards) provide useful and instant information. Take a few minutes to familiarize yourself with the various LEDs. These LEDs are marked and are self-explanatory. For specific detail on each LED on each card, see the ONS 15454 Reference Guide.

As an example of the LEDs' usefulness, you might be waiting for a protection card that is currently active to revert and become the standby card once again. Also, you can easily see whether you have a signal fail on a service card by quickly looking at the LEDs. Don't overlook this simple but useful feature.

Cabling

As an estimate, based on personal experience, issues related to cabling and fiber top the list of most common troubles. These issues include improperly labeled cables and fibers; dirty, crimped, or cut cable or fibers; incorrect type of fibers or cables; lack of or inadequate attenuation; air gaps; and cable length vs. Line Build-Out (LBO) settings, just to name a few.

Always check the physical medium because it is often at fault. Checking the Signal Fail (SF) LEDs on the service cards can often indicate a potential cabling/fiber issue.

A common straight CAT5 cable is required for an end user to log on directly from a PC to the RJ45 jack on the TCC card of an ONS 15454.

If it does not affect traffic, light readings should be taken to and from optic cards to ensure that the transmitting and receiving light levels are within specification.

Power

Warning

Do not attempt to work with live power unless you are qualified to do so. When working with live power, always use proper tools and eye protection.


The ONS 15454 has an operating range of -40.5 to -56.7 VDC (-48 VDC nominal). There are two power feeds (A and B) to the ONS 15454; you can easily monitor these by visually checking the dedicated power-monitoring LEDs on each of the TCC cards.

Note

Only the TCC2 and later versions of the TCC cards provide power-monitoring LEDs on their faceplate. Older TCC and TCC+ cards do not provide this feature, and nodes with these older cards require the Alarm Interface Controller (AIC) to monitor A and B power.


Although maximum draw is about 22 amps at nominal voltage, 30-amp fuses are recommended (assuming a maximum draw at low voltage while still providing a buffer).
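The reasoning behind the fuse size can be checked with simple arithmetic. The sketch below assumes, for illustration only, that the shelf behaves as a constant-power load, so the current rises as the supply voltage falls. Voltages are written as magnitudes.

```python
NOMINAL_VOLTS = 48.0    # magnitude of the nominal supply voltage
LOW_VOLTS = 40.5        # magnitude of the low end of the operating range
MAX_DRAW_AMPS = 22.0    # approximate maximum draw at nominal voltage
FUSE_AMPS = 30.0        # recommended fuse rating

# Assumption: constant-power load, so P = V * I stays fixed.
power_watts = NOMINAL_VOLTS * MAX_DRAW_AMPS
low_voltage_amps = power_watts / LOW_VOLTS
headroom_amps = FUSE_AMPS - low_voltage_amps

print(round(power_watts), round(low_voltage_amps, 1), round(headroom_amps, 1))
```

Under that assumption, the draw at the low end of the range is roughly 26 amps, which leaves a buffer of a few amps below the 30-amp fuse rating.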

Both the TCC2 and TCC2P controller cards support alarming of extreme voltage thresholds and also provide the capability to select lesser thresholds at which to report alarms. It is not uncommon to see these lesser thresholds crossed (and therefore alarming) at customer sites where certain brands of rectifiers are used. These power-monitoring thresholds can easily be modified (of course, following your company's guidelines) by going to Node View > Provisioning > General > Power Monitor.

Confirm that the node is properly grounded. Taking a power reading of both A and B power is necessary when troubleshooting a power issue. Also note the thresholds currently configured on the node you are investigating, and compare them to the power levels coming into the 15454; a mismatch could be the cause for your power-related alarms.
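When recording readings for both feeds, it can help to compare them against the operating range in a consistent way. The Python sketch below is a minimal illustration; the function name and the sample readings are hypothetical, and the limits are the operating range quoted earlier in this section, not the alarm thresholds provisioned on any particular node.

```python
# Operating range from the text, expressed as magnitudes in volts DC.
MIN_VOLTS = 40.5
MAX_VOLTS = 56.7


def feed_status(reading_volts):
    """Classify a power reading as 'low', 'high', or 'ok'."""
    # Feeds are typically read as negative values, so compare magnitudes.
    magnitude = abs(reading_volts)
    if magnitude < MIN_VOLTS:
        return "low"
    if magnitude > MAX_VOLTS:
        return "high"
    return "ok"


readings = {"A": -52.1, "B": -39.8}  # hypothetical meter readings
report = {feed: feed_status(volts) for feed, volts in readings.items()}
print(report)  # {'A': 'ok', 'B': 'low'}
```

The same comparison can be repeated with the thresholds configured on the node in place of MIN_VOLTS and MAX_VOLTS to spot a mismatch.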

Connectivity

CTC has a very user-friendly interface, which provides you with great detail. This enables you to quickly narrow the potential root cause for the issue you are troubleshooting. However, such a great tool will not do you any good if you cannot connect to the ONS 15454.

Confirm the Internet Protocol (IP) address of the node you are trying to connect to. Confirm that your PC is properly configured to communicate to the node you are trying to reach.

If your PC is physically connected to the ONS 15454, ensure that you have a working straight CAT5 cable without broken tabs.

If you are connecting through a network, make sure that you have the appropriate rights to gain access to the network, and confirm that any firewalls between you and the ONS 15454 network are not preventing you from accessing the network. Also ensure that the Gateway Network Element (GNE) is properly wired (if wire-wrap pins in the back of the ONS 15454 are used, confirm that they were wired correctly).
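A quick way to separate a cabling, IP, or firewall problem from a CTC problem is to test basic TCP reachability from your PC. The Python sketch below is a generic check, not a Cisco tool. The function name is hypothetical, and the default port of 80 assumes that the node is reached over HTTP; adjust it to match your network.

```python
import socket


def can_reach(host, port=80, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and attempts the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable networks.
        return False
```

For example, can_reach("192.0.2.10") returns False if nothing answers at that address (the address shown is a documentation placeholder; substitute the node's real IP). A False result points toward cabling, PC configuration, or a firewall rather than CTC itself.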

Data Gathering Checklist

By simply gathering the data described in the previous section, you probably already have identified the root cause of the problem.

If the issue is not obvious and you are still troubleshooting, gather the items in the following checklist before you start making any configuration changes or requesting technical support. Having this information ready by the time you call technical support will significantly reduce your time on the phone. Use this troubleshooting checklist:

  • Refer to the Cisco Troubleshooting Guide provided in the documentation for further detail, at http://www.cisco.com/univercd/cc/td/doc/product/ong/15400/index.htm.

  • Create an onsite event log with the date, time, event history, and any previously completed troubleshooting.

  • Verify that there is power to the bay.

  • Perform a database backup for each of the nodes possibly affected.

  • Save an alarm log file to the hard drive.

  • Record the working software version.

  • Record the backup software version, if applicable.

  • Print a copy of the Network view.

  • Print a copy of the circuits.

  • Record the ring configuration (unidirectional path-switched ring [UPSR]/bidirectional line-switched ring [BLSR]).

  • Verify Sync configuration and wiring to the Building Integrated Timing Supply (BITS) clock.

  • Verify and record any alarms on the node or network.

  • Verify that there are no cards with Fail LED(s) on or blinking.

  • Verify that all the fiber-optic cables are properly plugged into the cards.

  • Verify that the fiber-optic cables are routed properly, to avoid micro bends and pinches.

  • Verify that there are no identifiable loose cables.

  • Verify that all the cables have been visually inspected for physical damage.

  • Verify that all the optical power levels have been checked, to ensure that they are within the specified range.

  • Verify that the air filter to the node is clear of debris.

  • Verify that all the cards are properly seated in the chassis.

  • Verify that there are no bent pins on the card and backplane.

  • Verify that the correct card is plugged into the correct slot, to ensure proper seating of the card.

  • Verify that there are no blown fuses in the bay.

  • Verify that the voltage level to the chassis is within the proper range.

  • Verify that the shelf is properly grounded.

  • Verify that the fibers are good with a fault finder.

  • Verify that the circuit(s) have not been deleted.

  • Note all nodes and interfaces where bit errors are being recorded.

Troubleshooting Tools

In addition to a comprehensive set of detailed real-time status and monitoring information, the ONS 15454 provides the following troubleshooting mechanisms to further aid in finding the root cause of a problem.

Loopbacks

CTC allows for software loopbacks at all ONS 15454 electrical cards, OC-N cards, G-Series Ethernet cards, muxponder cards, transponder cards, and Fibre Channel cards.

Loopbacks are useful not only for testing newly created circuits, but also for determining the source of a network failure.

Because CTC allows for software loopbacks, these can be created either locally or remotely.

Note

To create loopbacks in the ONS 15454, the port (where the software loopback will be created) must be changed to an Out-of-Service Maintenance service state.


Caution

Before patching a hard loopback to an optical interface (fiber jumper from Tx to Rx on same port), always check the specifications for the Tx and the Rx ports. If the transmit port puts out a higher light level than the receiver's maximum receive level, you must use an attenuator to lower the light level to the allowable Rx range. Irreparable damage to the Rx port in a given card can result if this caution is ignored.

As an example, the OC-192 LR card puts out a significantly higher light level than the allowable Rx range in the sensitive receiver. If a hard loopback is used in the optical port of this card, an attenuator is necessary.
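The attenuator check amounts to a subtraction in decibels. The Python sketch below is a minimal illustration; the function name, the optional margin, and the sample light levels are hypothetical, so always take the real figures from the specifications of the card you are testing.

```python
def required_attenuation_db(tx_max_dbm, rx_max_dbm, margin_db=0.0):
    """Return the minimum attenuation (dB) needed for a hard loopback.

    tx_max_dbm: highest light level the Tx port can put out.
    rx_max_dbm: highest light level the Rx port is allowed to receive.
    margin_db:  optional extra safety margin (an assumption, not a spec value).
    """
    excess = tx_max_dbm - rx_max_dbm + margin_db
    # No attenuator is needed when the Tx level is already below the limit.
    return max(0.0, excess)


# Hypothetical levels for illustration only.
print(required_attenuation_db(tx_max_dbm=3.0, rx_max_dbm=-8.0, margin_db=2.0))  # 13.0
```

This sketch covers only the upper limit. The attenuated signal must also remain above the receiver's minimum level, so avoid choosing an attenuator far larger than required.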


CTC provides three types of software loopbacks (more detail on loopbacks can be found in the Cisco ONS 15454 Troubleshooting Guide):

  • Facility loopback: A facility loopback takes the signal coming in the Rx port of an I/O card and transmits it back out the Tx port of that same interface. Depending on the location of your test set and the type of interface tested, a facility loopback can eliminate (or identify) issues with cabling, fibers, Electrical Interface Assembly (EIA), and the interface's Tx and Rx ports.

  • Terminal loopback: With a terminal loopback created at a given interface, a signal coming through the cross-connect cards is looped back toward the same cross-connect cards that it came from. A terminal loopback tests the entire path (from where the test signal is generated), including cross-connect cards, all the way to the port where the terminal loopback was performed. However, a terminal loopback does not test the physical ports (Tx and Rx) of that interface where the terminal loopback was performed.

  • Cross-connect loopback: Given an OC-N signal, a cross-connect (XC) loopback allows looping back through a given STS (or STS-Nc) circuit without affecting other traffic on the optical port. As the name implies, the looping of the specified STS takes place at the cross-connect cards. As an example, using an XC loopback, you can loop the third STS of an OC-12 signal without affecting the remaining 11 STSs on that OC-12 signal.

STS Around the Ring

This feature allows the creation of a circuit from an originating node, having that circuit traverse every span/node around the ring and then go back to the same originating node. It provides a simple method to test all spans on a given ring. Stringent service providers commonly use this feature during ring turn-up to test all spans around the ring.
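The path such a circuit takes can be pictured with a short Python sketch. This is only a model of the idea, not how CTC routes the circuit; the function name and node names are hypothetical.

```python
def ring_path(nodes, origin):
    """Return the node sequence for a circuit that leaves origin,
    visits every node once in ring order, and returns to origin."""
    start = nodes.index(origin)
    # Rotate the ring so that it begins at the originating node.
    ordered = nodes[start:] + nodes[:start]
    return ordered + [origin]


ring = ["A", "B", "C", "D"]           # hypothetical four-node ring
path = ring_path(ring, "C")
spans = list(zip(path, path[1:]))     # each adjacent pair is one span

print(path)   # ['C', 'D', 'A', 'B', 'C']
print(spans)  # [('C', 'D'), ('D', 'A'), ('A', 'B'), ('B', 'C')]
```

A four-node ring has four spans, and the model crosses each of them exactly once, which is why a single test circuit is enough to exercise the whole ring.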

Monitor Circuit

You can create a monitor circuit to monitor a specific span of a given circuit carrying traffic. CTC enables you to identify a destination port (for this monitoring circuit) on a different node. This tool enables you to monitor spans on a circuit remotely.

Test Access

You can create nonintrusive test access points (TAPs) to monitor for errors on a circuit. Third-party broadband test units can monitor these test access points. Details on creating and deleting test access points can be found in the Cisco ONS SONET TL1 Command Guide. You can view test access information in CTC (Node view > Maintenance > Test Access).




Building Multiservice Transport Networks
ISBN: 1587052202
EAN: 2147483647
Year: 2004
Pages: 140
