In this chapter, you learned what configuring events involves. We discussed why you would want to configure events and which types of triggers are used to generate them. We then covered how to determine which events you want configured on your network and which events your devices may already have configured. Finally, we discussed how to configure the events you need in addition to the events built into your devices.
Next, you need to process all these events and determine what to do with them. Chapter 6, "Event and Fault Management," covers this topic in detail.
RFC 1213, "Management Information Base for Network Management of TCP/IP-based Internets: MIB-II"
RFC 1757, "Remote Network Monitoring Management Information Base"
RFC 1903, "Textual Conventions for Version 2 of the Simple Network Management Protocol (SNMPv2)"
RFC 2021, "Remote Network Monitoring Management Information Base Version 2 using SMIv2"
RFC 2233, "The Interfaces Group MIB Using SMIv2"
RFC 2515, "Definitions of Managed Objects for ATM Management"
RFC 2579, "Textual Conventions for SMIv2"
Chapter 6. Event and Fault Management
Your event management system (EMS) is the place where everything comes together and where the network lets you know when it needs attention. If your EMS is configured and working properly and if your thresholds are properly set, you can relax and wait for the system to report problems to you. If you don't want to
The problem is that the network will produce many more events than you'll want to deal with directly. Your EMS therefore needs to process all these events and report to you only when there's a problem that needs your attention. To do so, your EMS must have the knowledge to determine which events require which type of action, if any. This chapter will assist you in ensuring that your EMS can perform this function successfully.
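As a minimal sketch of what "knowing which events require which action" can look like, the following Python example maps events to actions through a small rule table. The event kinds, the severity scale, the action names, and the fallback thresholds are all hypothetical choices made for illustration; they are not taken from any particular EMS product.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    DISCARD = "discard"  # noise; nobody needs to see it
    LOG = "log"          # keep for later review, raise no alert
    NOTIFY = "notify"    # needs attention from the operations team


@dataclass(frozen=True)
class Event:
    source: str    # device that generated the event
    kind: str      # hypothetical event type name, e.g. "linkDown"
    severity: int  # hypothetical scale: 0 = informational ... 5 = critical


# Hypothetical rule table: event kind -> action.
# A real EMS would load its rules from its own configuration.
RULES = {
    "linkDown": Action.NOTIFY,
    "deviceUnreachable": Action.NOTIFY,
    "configChange": Action.LOG,
    "linkUp": Action.LOG,
}


def decide(event: Event) -> Action:
    """Return the action for an event; unknown kinds fall back on severity."""
    if event.kind in RULES:
        return RULES[event.kind]
    if event.severity >= 4:   # assumed threshold for "someone must look"
        return Action.NOTIFY
    if event.severity >= 1:   # assumed threshold for "worth keeping"
        return Action.LOG
    return Action.DISCARD
```

For example, `decide(Event("rtr1", "linkDown", 3))` yields `Action.NOTIFY` because the rule table lists that kind explicitly, whereas an unlisted informational event with severity 0 is discarded. Keeping the rules in a table rather than in code makes it easier to tune what reaches you as you learn which events matter on your network.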
Your EMS should be the single point to which all events are delivered and the single place where anything interested in faults goes to find them. So, for example, when your availability monitor discovers devices it can't contact, or regains contact with devices, it should deliver this information as events to your EMS. The network devices themselves will also discover issues that you'll want to process through your EMS. Your EMS is responsible for determining when there are faults and for distributing these faults to your team for repair and to your network health displays. The EMS is also responsible for logging low-priority faults. Finally, it needs to record faults and their time to resolution, and deliver the faults to reporting systems, so that you can determine your network reliability or uptime. These relationships are shown in Figure 6-1.
Figure 6-1. Event and Fault Management System
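To make the recording role concrete, the sketch below shows one way an EMS might open a fault when an availability monitor reports a device as unreachable, close it when contact is regained, and derive time to resolution and uptime from the record. The class names, the event kinds `"unreachable"` and `"reachable"`, and the use of plain numeric timestamps in seconds are assumptions made for this illustration only.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Fault:
    device: str
    opened_at: float                     # seconds, on any consistent clock
    resolved_at: Optional[float] = None  # None while the fault is still open

    @property
    def time_to_resolution(self) -> Optional[float]:
        """Seconds from detection to resolution, or None if still open."""
        if self.resolved_at is None:
            return None
        return self.resolved_at - self.opened_at


class FaultLog:
    """Hypothetical fault record fed by availability-monitor events."""

    def __init__(self) -> None:
        self.faults: List[Fault] = []       # every fault ever recorded
        self._open: Dict[str, Fault] = {}   # device -> currently open fault

    def handle(self, device: str, kind: str, timestamp: float) -> None:
        if kind == "unreachable" and device not in self._open:
            # First report of an outage opens a fault; repeats are ignored.
            fault = Fault(device, timestamp)
            self._open[device] = fault
            self.faults.append(fault)
        elif kind == "reachable" and device in self._open:
            # Regained contact closes the open fault for that device.
            self._open.pop(device).resolved_at = timestamp

    def uptime_percent(self, device: str, start: float, end: float) -> float:
        """Percentage of the window [start, end] with no recorded fault."""
        down = 0.0
        for fault in self.faults:
            if fault.device != device:
                continue
            # An unresolved fault counts as down until the end of the window.
            fault_end = fault.resolved_at if fault.resolved_at is not None else end
            overlap = min(fault_end, end) - max(fault.opened_at, start)
            down += max(0.0, overlap)
        return 100.0 * (1.0 - down / (end - start))
```

A reporting system could call `uptime_percent` for each device over a month-long window, while the per-fault `time_to_resolution` values feed a mean-time-to-repair figure. Ignoring repeated `"unreachable"` events for a device that already has an open fault is a simple form of duplicate suppression; a production EMS would likely need more than this.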
For clarity of discussion, this chapter treats your EMS as if it were separate and distinct from the rest of your NMS. We make this distinction regardless of whether your EMS is a separate product, a set of scripts, or an integrated part of a commercial NMS.
This chapter covers the following: