Most IT organizations try to anticipate and avert system problems before calls come into the help desk. The larger the organization and the more servers it runs, the more likely it is to have one or more display windows whose colored icons represent the health of the components behind the various services provided to end users. These health monitors usually come with two functions:
What is unique about Linux on the mainframe is that, because of virtualization, z/VM already holds very useful data about both the individual guests and the aggregate usage of the real and virtual hardware resources. z/VM needs this data in order to manage the guests, so sharing it with a health monitoring tool costs no significant additional cycles. To the extent that most service outages show up in hardware-level performance metrics (for example, the CPU is no longer busy or the outbound LAN traffic has gone to zero), the operations staff can use this readily available data as a first-level indicator of potential problems. While some other servers do have rudimentary hardware monitoring capabilities, it is usually quite expensive to tie that information into the health monitor.
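To make the idea of a first-level indicator concrete, the following Python fragment is a minimal sketch of how per-guest hardware metrics might be mapped to the colored icons of a health monitor. It does not call any z/VM interface: the GuestSample record, the guest names, the threshold values, and the first_level_status function are all hypothetical, and the samples are assumed to have already been collected from whatever monitor data the installation makes available.

```python
from dataclasses import dataclass


@dataclass
class GuestSample:
    """One hypothetical sample of hardware-level metrics for a guest."""
    guest: str                   # guest name (illustrative only)
    cpu_busy_pct: float          # virtual CPU utilization, 0-100
    lan_out_bytes_per_s: float   # outbound LAN traffic


def first_level_status(sample, cpu_floor=0.5, lan_floor=1.0):
    """Map a sample to an icon color.

    The floors are assumed values; a real installation would tune
    them per workload.
    """
    cpu_idle = sample.cpu_busy_pct < cpu_floor
    lan_silent = sample.lan_out_bytes_per_s < lan_floor
    if cpu_idle and lan_silent:
        return "red"      # both indicators suggest the service stopped
    if cpu_idle or lan_silent:
        return "yellow"   # one indicator looks suspicious
    return "green"


# Made-up samples for three hypothetical guests.
samples = [
    GuestSample("LINUX01", 42.0, 15000.0),
    GuestSample("LINUX02", 12.0, 0.0),
    GuestSample("LINUX03", 0.0, 0.0),
]

for s in samples:
    print(s.guest, first_level_status(s))
```

A check like this only flags candidates for attention. A guest that is legitimately idle would also show as red, so the result is best treated as a prompt for the operations staff to look closer, not as proof of an outage.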
Most organizations also include two other crucial functions that move health monitoring into true health management:
A tool that includes all four functions of health management becomes a single point of control for the system with all of its disparate images. Some installations have built their Linux systems' health management tool on top of z/VM because z/VM covers all four functions quite well.
In the following section, we will expand on the "control" aspect of health management.