Defining Migration-Specific Management
The new IT environment must be integrated into the existing IT management infrastructure. The approach you take, and the effort it involves, will depend on the organization's current operational capability. An IT organization with a mature set of management processes, supported by an appropriate tools infrastructure, will be able to integrate the new environment with a reasonable degree of effort. A less mature organization will need to consider using the migration effort as the catalyst for improving its operational capability, which will require significantly more effort. In this case, the organization might consider improving operational capabilities in a separate project.
It has been our experience that very few organizations will be able to address operational readiness issues by taking a "green field" approach. Even immature organizations will have some existing practices and technology that must be accounted for. A generalized strategy for addressing operational readiness should include the following steps:
Assessing the current IT management infrastructure
Addressing the critical gaps
Extending the infrastructure to account for the migration
The following sections describe each of these steps in more detail. These steps should be focused on the requirements and benefits case defined during the justification activities and the architect phase.
Assessing the Current IT Management Infrastructure
As a first step to ensuring that they can support the migrated environment, organizations undertaking a migration must develop a realistic understanding of their operational capability. You can begin this process by reviewing the available standards and frameworks, looking for opportunities to leverage them as much as possible. You should adopt a management framework and a maturity model as the basis for both the assessment effort and the improvement activities. A management framework describes what needs to be in place. A maturity model defines the evolutionary path for getting there.
In previous sections, we described both a management framework (the STMF) and a maturity model for operational capability. Other models are available, including the Capability Maturity Model and its derivatives from the Software Engineering Institute (http://www.sei.cmu.edu), the IT Infrastructure Library or ITIL (http://www.itsmf.com), and Control Objectives for Information and Related Technology or COBIT (http://www.isaca.org).
Assess People Requirements
When assessing the human aspects of a management solution, focus on the skills that must be developed to manage and control the new environment. New skills will have to be acquired, so the training and development processes that build the appropriate competences deserve particular attention.
Introducing migrated technology also requires that you evaluate the current workforce and determine what is immediately needed to provide appropriate staffing levels. To promote the desired behavior of personnel, compensation plans and employee performance management might need to be aligned to include the right measures of success.
When significant changes are introduced, it is essential that you promote communication and coordination between workgroups to limit the resistance to change.
To be proactive, evaluate the following processes and bring them to an appropriate level of maturity:
Obviously, the ability to introduce changes into the environment with minimal risk and at an acceptable cost is key to the successful migration of any technology. If the change management process is implemented successfully, this can be achieved. The following diagram describes how such a process can be defined. It is ITIL based and incorporated into the STMF.
Figure 8-7. Change Management Process Flow
Naturally, implementation management has close ties with change management because projects of this kind typically introduce significant change. A solid development and deployment process is key to a smooth transition into production.
For execution management (often referred to as operations management) to become more proactive in accepting and introducing changes, you will need to start addressing the following key areas:
Ensure that jobs are scheduled appropriately, to avoid exceeding available system resource levels.
Enable the quick and safe introduction of new and different technologies while ensuring that minor and daily activities are performed in a secure and nondisruptive way.
Look at the available resources and try to predict future needs or changes to production control activities. By doing this, you will significantly reduce outages that result from insufficient resources.
Introduce technologies and procedures to limit the impact of an unexpected outage. Sun Cluster software is one example: it can facilitate automatic failover or disaster recovery procedures that limit the downtime resulting from a major failure.
The ability to manage problems and incidents quickly and consistently is essential to avoid solving the same issue more than once. The focus of problem management, when shifting from reactive to proactive, must move to root cause analysis to enable the most effective solutions for existing issues. The SunTone Management Framework has leveraged many of the ITIL best practices here. The following figure shows how such a process could look.
Figure 8-8. Problem Management Process Flow
In moving to a more mature level of operational capability, an organization shifts from technology that is focused on monitoring the lower portions of the E-stack to technology that extends the monitoring coverage and facilitates proactive management of the IT environment. In applying the tools solutions model described previously, we would focus on the following items at each layer:
Element and resource management.
Both monitoring and measurement capability for the application infrastructure should be in place. This enables organizations to collect performance data and proactively monitor for critical conditions in all portions of the E-stack, from the facilities to the application infrastructure. Additionally, a robust security and backup capability should be deployed, and basic mechanisms for provisioning new systems and distributing simple packages (such as patches) should be available.
Event and information management.
In most organizations, the lower levels of the management tools architecture will be organized along silos of expertise. For example, the systems, database, and network expertise centers will each have their own sets of tools to monitor and measure their portions of the environment. As the organization moves to a more mature level of operational capability, it becomes necessary to consolidate and correlate information across these silos of expertise. This function is performed at the event and information layer. A common event management console and an associated event model for the organization are key components of this effort. This common event management platform serves as a key integration point in the management tools architecture. Along with the common event management platform, technology to support the analysis of performance data and IT management reporting should also be in place.
Service level managers.
Although organizations at this level of operational maturity do not focus on service management, the tool assessment phase is when they need to consider managing the end-user experience. To this end, technology to assess the availability and performance of applications and supporting services should be applied. Synthetic transaction generators can test how a web-based application is performing, or can test the availability of common network services such as DNS or IMAP.
Process workflow managers.
Other points of integration and cross-domain correlation are the tools used to manage the execution of IT management processes. At this level, the process workflow technology is expanded from providing simple trouble ticket functionality to supporting the automation of problem (incident and root cause), change, and asset management. Additionally, integration of these systems with other portions of the management tools infrastructure is realized.
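The synthetic transactions mentioned under service level managers can be illustrated with a minimal sketch. It is not taken from any particular product: the names `run_synthetic_check` and `CheckResult`, and the idea of passing the scripted transaction in as a callable, are assumptions made for illustration. A real probe would replace the callable with an actual HTTP, DNS, or IMAP exchange.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass
class CheckResult:
    """Outcome of one synthetic transaction (hypothetical structure)."""
    name: str
    available: bool
    latency_s: float
    within_target: bool


def run_synthetic_check(name: str,
                        transaction: Callable[[], bool],
                        target_latency_s: float) -> CheckResult:
    """Run one scripted transaction and record availability and latency.

    `transaction` stands in for a scripted user action; it returns True
    when the service answered correctly and may raise on failure.
    """
    start = time.perf_counter()
    try:
        available = bool(transaction())
    except Exception:
        # Any failure in the scripted transaction counts as "unavailable".
        available = False
    latency_s = time.perf_counter() - start
    within_target = available and latency_s <= target_latency_s
    return CheckResult(name, available, latency_s, within_target)


if __name__ == "__main__":
    # Stand-in transaction; a real one would contact the service under test.
    result = run_synthetic_check("web-login", lambda: True, target_latency_s=2.0)
    print(result)
```

Recording latency against a target, and not only success or failure, is what lets the same check feed both availability and performance reporting.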
To determine the current state of IT operational capability, you should conduct some type of audit. Audits fall into one of two categories:
Compliance audits determine the degree to which the process, tools, and skills meet the needs of the organization. Part of a compliance audit would be the identification of critical processes, tools, and staff to support management of the IT environment.
Effectiveness audits assess how well the organization is executing its management processes.
Compliance and effectiveness audits should both be conducted as part of any assessment activity. Our experience has been that the existence of a well-documented process does not always mean that the process actually meets the needs of the organization or that it is being followed by the organization. Depending on the expertise of the staff involved in such an effort, organizations might want to consider using an external agency to conduct the assessment.
Addressing Critical Gaps
Detailing the means to address all possible shortcomings in the management infrastructure is beyond the scope of this book. However, we provide the following important rules to help you understand the scope of this task.
Improving operational capability is an organization-wide effort.
Moving up the maturity scale requires the application of resources (time, skills, and money) and the cooperation of the entire organization. Efforts to improve operational capability require senior management commitment.
Improving operational capability is an evolutionary, not revolutionary, activity.
Various maturity models for IT operations can communicate the goals of capability improvement activities and define an incremental approach to realizing those goals. Experience has shown that few, if any, organizations are capable of realizing the entire management framework in one big effort. Organizations should focus on incremental activities with a quick return that are conducted within the context of a well-defined strategy. The "big bang" approach to building operational capability is strongly discouraged.
IT management is a process issue, not a tools issue.
Many organizations respond to issues of IT management by acquiring technology to manage the environment without considering the processes needed to operate the environment. It is our experience that focusing on the tools portion of the framework results in poorly executed implementations that do not meet the needs of the organization. The initial focus of efforts associated with improving operational capability should be on the definition and implementation of the processes to be used in managing the environment. The process architecture should drive the tools and skills architectures.
To be successful, organizations must measure progress.
We saw that improving operational capability requires organization-wide commitment. As part of meeting that commitment, you must use meaningful metrics to measure baseline capability before starting improvement efforts. Then, periodically evaluate the effect of the improvement efforts. Examples of metrics include cost data, availability data, ratio of supported systems to head count, and mean time to repair (MTTR). Investments in operational capability should be justified by corresponding improvements in key performance metrics.
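To make the metrics named above concrete, the following sketch computes mean time to repair, availability, and the ratio of supported systems to head count from a simple incident log. The function names and the incident representation are assumptions for illustration, and the sketch assumes that incidents do not overlap and fall entirely inside the reporting period.

```python
from datetime import datetime, timedelta
from typing import List, Tuple

# (outage start, service restored); a hypothetical incident record.
Incident = Tuple[datetime, datetime]


def total_downtime(incidents: List[Incident]) -> timedelta:
    """Sum the duration of all incidents."""
    return sum((end - start for start, end in incidents), timedelta(0))


def mean_time_to_repair(incidents: List[Incident]) -> timedelta:
    """Average time from outage start to service restoration."""
    if not incidents:
        return timedelta(0)
    return total_downtime(incidents) / len(incidents)


def availability(incidents: List[Incident],
                 period_start: datetime,
                 period_end: datetime) -> float:
    """Fraction of the reporting period during which service was up."""
    period = period_end - period_start
    return 1.0 - total_downtime(incidents) / period


def systems_per_admin(systems: int, head_count: int) -> float:
    """Ratio of supported systems to administrative head count."""
    return systems / head_count


if __name__ == "__main__":
    log = [
        (datetime(2024, 1, 1, 0, 0), datetime(2024, 1, 1, 2, 0)),
        (datetime(2024, 1, 10, 0, 0), datetime(2024, 1, 10, 4, 0)),
    ]
    print(mean_time_to_repair(log))
    print(availability(log, datetime(2024, 1, 1), datetime(2024, 1, 11)))
    print(systems_per_admin(120, 8))
```

Running the same calculation before the improvement effort starts and at regular intervals afterward gives the baseline and trend that the rule calls for.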
Established continuous improvement methodologies (like Six-Sigma and Sun-Sigma) are great tools for enabling these projects of change.
Selecting Tools for Managing the Migrated Environment
The tools solutions model described above implies the application of sound systems-design principles, including modular design, separation of function, and well-defined interfaces. As a result, the management tools architecture should be loosely coupled with the corresponding managed environment. The degree of dependency between the two architectures decreases the farther up the management framework you go. In a properly architected management infrastructure, most of the impact resulting from the introduction of Sun technology will be seen in the instrumentation and element management layers. Most of the Sun technology that is available for managing migrated platforms is focused on these two layers of the tools model. Sun works with a number of industry partners to provide solutions for other portions of the tools framework. The following figure shows some of these tools and how they map to the SunTone Management Framework. This sample is not exhaustive; however, it provides a good overview of the available solutions. The sections that follow explain each of these tools. You are encouraged to consult available vendor documentation for specific information about the tools mentioned.
The following tools can be used to manage various aspects of the newly migrated environment:
Solstice Enterprise Agents (SEA).
Software that provides basic SNMPv1 functionality for the Solaris OS. The agent uses a master-subagent architecture that supports both MIB II and additional subagents. Other SNMP agents can be installed and run on a Solaris system as subagents. SEA controllers access the SNMP UDP ports and direct SNMP requests to the proper subagent. A desktop management interface (DMI) is also provided.
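The master-subagent arrangement can be pictured as a dispatcher that routes each request to whichever subagent has registered the matching part of the OID tree. The sketch below is an illustration only, not the SEA implementation: the `MasterAgent` class, the subagent names, and the longest-prefix rule are assumptions. The subtrees used in the example are the standard MIB-II subtree (1.3.6.1.2.1) and Sun's enterprise subtree (1.3.6.1.4.1.42).

```python
from typing import Dict, Optional, Tuple

Oid = Tuple[int, ...]


def parse_oid(text: str) -> Oid:
    """Convert a dotted OID string such as '1.3.6.1.2.1' to a tuple."""
    return tuple(int(part) for part in text.strip(".").split("."))


class MasterAgent:
    """Hypothetical dispatcher: maps registered OID subtrees to subagents."""

    def __init__(self) -> None:
        self._subtrees: Dict[Oid, str] = {}

    def register(self, subtree: str, subagent: str) -> None:
        """Record that `subagent` answers for everything under `subtree`."""
        self._subtrees[parse_oid(subtree)] = subagent

    def route(self, oid: str) -> Optional[str]:
        """Return the subagent with the longest registered matching prefix."""
        target = parse_oid(oid)
        best_prefix: Optional[Oid] = None
        for subtree in self._subtrees:
            if target[:len(subtree)] == subtree:
                if best_prefix is None or len(subtree) > len(best_prefix):
                    best_prefix = subtree
        return self._subtrees[best_prefix] if best_prefix is not None else None


if __name__ == "__main__":
    master = MasterAgent()
    master.register("1.3.6.1.2.1", "mib2-subagent")
    master.register("1.3.6.1.4.1.42", "sun-subagent")
    print(master.route("1.3.6.1.2.1.1.1.0"))
```

Because routing depends only on registered subtrees, a new subagent can be added without changing the master agent or the existing subagents.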
Solaris Web-Based Enterprise Management Services.
Sun also provides an implementation of the Distributed Management Task Force (DMTF) Web-Based Enterprise Management (WBEM) standard. This standard defines a common information model (CIM) that provides a consistent, vendor-independent way to identify managed objects. FIGURE 8-10 on page 156 shows the architecture of Solaris WBEM. For more information, visit the DMTF Web site at http://www.dmtf.org or refer to Sun documentation.
Figure 8-10. Solaris Web-Based Enterprise Management Architecture
Sun knowledge modules for BMC Patrol.
BMC Software is a provider of systems management technology and is a Sun partner. Sun has developed extensions to the BMC Patrol agent that allow Patrol customers to manage the Sun environment. These extensions use standard Patrol Knowledge Module (KM) design, which enables them to plug into the Patrol agent. KMs are available for a variety of Sun platforms and software including enterprise class servers, Sun Cluster software, the Sun ONE Messaging Server, the Sun ONE Portal Server, and the Sun ONE Application Server. For more information, visit the BMC Software Web site at http://www.bmc.com
Sun Management Center
Sun Management Center (SunMC) is the primary technology platform for the management of Sun products. As seen in FIGURE 8-9 on page 154, SunMC provides functionality in a number of different areas within the framework.
Figure 8-9. Tools Mapped to the SunTone Management Framework
As shown in FIGURE 8-10 on page 156, SunMC has a three-tier architecture consisting of the following components:
A console layer that is the user interface for the system.
A server layer that provides management services to management applications.
An agent layer that resides on the managed systems and executes management actions on behalf of the server.
This agent is intelligent because many of the management functions such as sampling, threshold comparison, and alarm generation are carried out by the agent. The agent is extensible, using the APIs and developer facilities for creation of additional agent management modules.
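The division of labor described here, in which the agent samples, compares against thresholds, and raises alarms locally, can be sketched as follows. This is a generic illustration and does not reflect the SunMC agent API; the `Threshold` class, the severity names, and the rule of reporting only on state changes are assumptions.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class Threshold:
    """Hypothetical two-level threshold for a sampled metric."""
    warning: float
    critical: float

    def classify(self, value: float) -> str:
        """Map a sample to a severity state."""
        if value >= self.critical:
            return "critical"
        if value >= self.warning:
            return "warning"
        return "ok"


def generate_alarms(samples: Iterable[float],
                    threshold: Threshold) -> List[Tuple[int, str, float]]:
    """Return (sample index, new state, value) for each state change.

    Reporting only transitions keeps the agent from flooding the
    server with one alarm per sample while a condition persists.
    """
    alarms: List[Tuple[int, str, float]] = []
    state = "ok"
    for index, value in enumerate(samples):
        new_state = threshold.classify(value)
        if new_state != state:
            alarms.append((index, new_state, value))
            state = new_state
    return alarms


if __name__ == "__main__":
    cpu_busy = [40.0, 85.0, 90.0, 97.0, 96.0, 50.0]
    for alarm in generate_alarms(cpu_busy, Threshold(warning=80.0, critical=95.0)):
        print(alarm)
```

Evaluating thresholds on the managed system means that only state changes need to cross the network to the server layer.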
The core SunMC functionality includes monitoring Solaris hardware and software components. The visibility SunMC provides into the hardware layer of the Sun environment is a key element of this structure. The Sun Management Center's core functionality can be extended using available product add-ons as shown in FIGURE 8-11 on page 157. Examples include the following:
SunMC Change Manager.
SunMC Change Manager software supports the deployment of integrated software stacks to managed systems. It uses Flash archives as the basis of a package and allows for the provisioning of multiple systems.
Performance Reporting Manager.
This package stores collected performance data and provides tools to analyze and generate reports using this data.
Service Availability Manager.
This package provides service level monitoring through test transactions that are generated against core network services. Supported protocols include HTTP, LDAP, DNS, IMAP, and SMTP.
Hardware Diagnostic Suite (HDS).
HDS provides a facility to automate the testing of SPARC hardware. HDS tests and reports on field-replaceable units.
Figure 8-11. Layers of SunMC
Additional management functionality to monitor other Sun and non-Sun software is available from third-party vendors that have extended SunMC capabilities. For example, Halcyon Monitoring Solutions offers the PrimeAlert product line, which extends the management capabilities of SunMC. PrimeAlert modules are also available for a variety of applications including Oracle, Sybase, VERITAS, BEA WebLogic, and the Sun ONE software stack. For more information, visit the Halcyon Web site at http://www.halcyoninc.com.
SunMC is designed to coexist with the major systems management frameworks. Integration facilities are available for CA Unicenter, Tivoli, and BMC. A probe to integrate SunMC with Micromuse by using Omnibus is also available.
Solaris Management Console GUI
Solaris Management Console is a GUI for administration tools used in the Solaris environment. It provides a common look and feel, access control, and application launch points for a variety of applications used to manage Sun systems. Administrative functions that are supported include systems status, account management, storage management, and management of projects and tasks for Solaris Resource Manager software. Solaris Management Console is a two-tier application with a Java client and a server layer. This application is the primary replacement for the AdminSuite tool set.
Solaris Resource Manager Software
Solaris Resource Manager software is an example of a control application that enables system resources like CPU, memory, and network bandwidth to be allocated among applications. Solaris Resource Manager software supports the definition of resource usage groups, constraints on resource use, and accounting of use. Use of this tool enables multiple applications to coexist on a single instance of the Solaris OS, with each application being able to count on specific levels of CPU, memory, and network resources. Solaris Resource Manager software was available as an add-on application before the release of version 9 of the Solaris OS. Starting with version 9, the software is bundled with the operating system. This enhanced version of the product also includes functionality from Solaris Bandwidth Manager.
Solaris Bandwidth Manager Software
Solaris Bandwidth Manager software is another control application that enables the management of available network resources. The application enables the allocation of IP traffic to specific traffic classes. These allocation schemes can be based on a number of factors, including IP address, source or destination port, protocol, and type of service (TOS). Incoming and outgoing IP packets are assigned to specific classes. Classes have a guaranteed minimum and a maximum bandwidth assigned. Solaris Bandwidth Manager also provides a number of interfaces that can be used to export collected performance information for processing by accounting and billing applications.
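Assigning packets to traffic classes can be sketched as a first-match rule table. This is a conceptual illustration, not Solaris Bandwidth Manager configuration syntax: the `TrafficClass` and `Packet` structures, the field names, the `min_kbps` and `max_kbps` values, and the first-match policy are all assumptions. The bandwidth limits are carried as data only; enforcing them would be the job of a scheduler that is outside this sketch.

```python
import ipaddress
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Packet:
    """Minimal, hypothetical view of an IP packet header."""
    src_ip: str
    dst_port: int
    protocol: str  # "tcp" or "udp"


@dataclass(frozen=True)
class TrafficClass:
    """A class with bandwidth limits and optional match criteria."""
    name: str
    min_kbps: int
    max_kbps: int
    network: Optional[str] = None   # source network in CIDR form
    dst_port: Optional[int] = None
    protocol: Optional[str] = None

    def matches(self, packet: Packet) -> bool:
        """True when every criterion that is set agrees with the packet."""
        if self.network is not None:
            source = ipaddress.ip_address(packet.src_ip)
            if source not in ipaddress.ip_network(self.network):
                return False
        if self.dst_port is not None and packet.dst_port != self.dst_port:
            return False
        if self.protocol is not None and packet.protocol != self.protocol:
            return False
        return True


def classify(packet: Packet, classes: List[TrafficClass],
             default: str = "default") -> str:
    """Return the name of the first class that matches the packet."""
    for traffic_class in classes:
        if traffic_class.matches(packet):
            return traffic_class.name
    return default


if __name__ == "__main__":
    classes = [
        TrafficClass("web", 2000, 8000, dst_port=80, protocol="tcp"),
        TrafficClass("backoffice", 500, 1000, network="10.1.0.0/16"),
    ]
    print(classify(Packet("192.0.2.7", 80, "tcp"), classes))
```

With a first-match table, the order of the classes matters, so more specific classes should be listed before more general ones.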
Automated Dynamic Reconfiguration
Solaris Resource Manager and Solaris Bandwidth Manager software allocate resources within a single instance of the Solaris OS. Automated dynamic reconfiguration (ADR) allows hardware resources to be allocated between multiple OS images on platforms running more than one Solaris domain. The idea is to reallocate system boards with CPU or memory from domains that are of a lower priority or not under heavy load to a higher-priority domain under a heavy workload. ADR provides a high-level command set for dynamic reconfiguration that can be used within scripts. Additionally, this functionality is exposed through a CIM/WBEM interface in the Sun Fire 15K/12K servers. In addition to using scripting to control reallocation of system boards, you can also use agent technology from the framework vendors to integrate this functionality into the management framework. BMC provides this, using the Patrol for Sun ADR product. Patrol for ADR is a KM that uses the exposed ADR interfaces (CLI on the Sun Enterprise 10000 server, CIM/WBEM on the Sun Fire 12K and Sun Fire 15K servers) to control allocation of system boards according to CPU use. FIGURE 8-12 on page 160 shows the use of the Patrol ADR KM.
Figure 8-12. Patrol Automated Dynamic Reconfiguration Knowledge Module
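The reallocation policy described for ADR, moving a system board from a lower-priority or lightly loaded domain to a higher-priority domain under heavy load, can be sketched as a planning function. This sketch only decides which move to make; it does not call any ADR command or CIM/WBEM interface. The `Domain` structure, the load thresholds, and the rule that a donor must keep at least one board are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Domain:
    """Hypothetical summary of one Solaris domain on a partitioned server."""
    name: str
    priority: int     # higher number means more important
    cpu_load: float   # utilization between 0.0 and 1.0
    boards: int       # system boards currently assigned


def plan_board_move(domains: List[Domain],
                    high_load: float = 0.85,
                    low_load: float = 0.40) -> Optional[Tuple[str, str]]:
    """Return (donor, recipient) domain names, or None if no move applies."""
    # The recipient is the most important domain that is under heavy load.
    needy = [d for d in domains if d.cpu_load >= high_load]
    if not needy:
        return None
    recipient = max(needy, key=lambda d: (d.priority, d.cpu_load))

    # A donor is of lower priority or lightly loaded, and keeps one board.
    donors = [
        d for d in domains
        if d is not recipient
        and d.boards > 1
        and (d.priority < recipient.priority or d.cpu_load <= low_load)
    ]
    if not donors:
        return None
    donor = min(donors, key=lambda d: (d.priority, d.cpu_load))
    return donor.name, recipient.name


if __name__ == "__main__":
    domains = [
        Domain("oltp", priority=3, cpu_load=0.92, boards=4),
        Domain("batch", priority=1, cpu_load=0.30, boards=3),
        Domain("test", priority=2, cpu_load=0.20, boards=1),
    ]
    print(plan_board_move(domains))
```

Keeping the decision separate from the mechanism that carries it out mirrors the loose coupling discussed earlier: the same plan could be executed by a script or by a framework agent.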