Service Life Cycle Measurements

You can gather data to help show the quality and increase the acceptance of changes, and you can work through measurable requirements that define the desired future state. However, you must not focus only on the production environment.

Implementing a service is rarely a one-time process. The service is deployed and debugged in a development environment before it is put into a test environment. After it has been load tested and certified to be ready, it is deployed into the production environment. In addition to the changes in the lower layers (for instance, operating system changes, patches, or new, upgraded, or expanded hardware), the service itself often changes (for instance, patches, configuration and tuning changes, and changes that compensate for or take advantage of activity in the lower layers of the stack). You should gather metrics on the number and frequency of services and changes, the time spent at each stage of the life cycle, and the activities within each stage. The N1 Grid software enables, and businesses require, measurements that can be used throughout the enterprise.
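As a rough illustration of the stage metrics described above, the following Python sketch totals the time a service has spent in each stage and counts how often it entered each one. The record layout, the service name, and the dates are assumptions made for this example; they are not part of the N1 Grid software.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Hypothetical transition log: (service, stage entered, timestamp).
# Stage names follow the text; the record layout is an assumption.
TRANSITIONS = [
    ("web-store", "development", datetime(2003, 1, 6)),
    ("web-store", "test", datetime(2003, 1, 20)),
    ("web-store", "production", datetime(2003, 2, 3)),
    ("web-store", "test", datetime(2003, 3, 3)),  # a patch re-enters test
    ("web-store", "production", datetime(2003, 3, 10)),
]


def time_in_stage(transitions, now):
    """Sum the time each service has spent in each stage up to `now`."""
    by_service = defaultdict(list)
    for service, stage, entered in transitions:
        by_service[service].append((entered, stage))
    totals = defaultdict(timedelta)
    for service, events in by_service.items():
        events.sort()
        # A stage ends when the next one is entered; the last one runs to `now`.
        ends = [entered for entered, _ in events[1:]] + [now]
        for (entered, stage), left in zip(events, ends):
            totals[(service, stage)] += left - entered
    return dict(totals)


def stage_entries(transitions):
    """Count how many times each service entered each stage."""
    counts = defaultdict(int)
    for service, stage, _ in transitions:
        counts[(service, stage)] += 1
    return dict(counts)
```

A repeated entry into the test stage, as in the sample data, is one simple indicator of how often a service changes after its first production deployment.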

The N1 Grid service life cycle model encompasses a number of high-level stages that are presented in this section using UML state diagrams (FIGURE 5-6). A state represents a specific phase in the overall life cycle, and a transition between states represents a shift in the phase of the life cycle of a service. The transitions between phases in the life cycle are critically important for the following reasons:

  • A service can be in only one phase of its life cycle at a time. Knowing the current phase imparts a great deal of information regarding where the service is, where it has been, and what its potential is.

  • A transition is the only way for a service to migrate between life cycle phases.

  • The requirements of a specific transition help you understand the nature of any specific phase and its relationship with other phases.

  • Transitions represent a contractual obligation with respect to the specificity of the information required to effect a transition between phases. The outputs of a phase are consumed as the input of the next phase in the life cycle.

  • A preflighting mechanism can be prescribed in which the requisite information needed to effect an outgoing transition can be synthesized from templates and default values. This enables the service designer to perform what-if analysis of the service and, potentially, have various portions of the specification automatically supplied.

Figure 5-6. Example Service Lifecycle Phases
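The properties listed above (one phase at a time, transitions as the only way to move, required inputs as a contract, and preflighting from templates and defaults) can be sketched as a small state machine. This is a minimal sketch only: the phase names, the required fields, and the default values are hypothetical and are not taken from FIGURE 5-6 or from any N1 Grid product.

```python
# Hypothetical phases and transition contracts (illustrative names only).
# Each allowed transition lists the information required to effect it.
TRANSITIONS = {
    ("development", "test"): {"build_id", "test_plan"},
    ("test", "production"): {"build_id", "load_test_report", "rollback_plan"},
    ("production", "retired"): {"decommission_approval"},
}

# Template defaults used for preflighting (assumed values).
DEFAULTS = {
    "rollback_plan": "redeploy-previous-build",
    "test_plan": "standard-regression",
}


def preflight(current, target, supplied):
    """Return (merged inputs, still-missing fields) for a proposed transition."""
    required = TRANSITIONS.get((current, target))
    if required is None:
        raise ValueError(f"no transition from {current!r} to {target!r}")
    # Start from defaults, then let explicitly supplied values override them.
    merged = {key: DEFAULTS[key] for key in required if key in DEFAULTS}
    merged.update({k: v for k, v in supplied.items() if k in required})
    missing = required - set(merged)
    return merged, missing


class Service:
    def __init__(self, name, phase="development"):
        self.name = name
        self.phase = phase  # a service is in exactly one phase at a time

    def transition(self, target, supplied):
        """Move to `target` only if the transition contract is satisfied."""
        merged, missing = preflight(self.phase, target, supplied)
        if missing:
            raise ValueError(f"missing inputs: {sorted(missing)}")
        self.phase = target
        return merged  # outputs of this phase feed the next phase
```

Calling preflight without changing the service's phase is what makes what-if analysis possible: the designer can see which inputs would be supplied from defaults and which are still missing.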

Testing Structure to Prove SLRs

Each SLR and requirement is accompanied by the method used to show that it has been satisfied and the owner to whom it must be shown. Although many requirements are more binary in nature (the architecture is either an ISP with web services or it is not), many SLRs require testing harnesses and data gathering and roll-up activities (for instance, it supports 8000 web connections per second under this distribution of use cases) that must be included when considering the project effort and cost.

The data from a load test (for example, it only reached 7000 web connections per second) or the operational maturity analysis might result in changes to the original architecture. This iterative activity is a natural part of most complex architectural design methodologies.
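To make the comparison of a measured result against an SLR concrete, here is a small Python sketch. The SLR class, its field names, and the roll_up function are hypothetical constructs for this example; only the figures 8000 and 7000 come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SLR:
    name: str
    target: float
    owner: str  # the party to whom the result must be shown
    higher_is_better: bool = True

    def satisfied_by(self, measured):
        """Compare a measurement with the target in the right direction."""
        if self.higher_is_better:
            return measured >= self.target
        return measured <= self.target


def roll_up(slrs, measurements):
    """Return one report row per SLR; SLRs without data are flagged."""
    report = []
    for slr in slrs:
        measured = measurements.get(slr.name)
        if measured is None:
            status = "not measured"
        elif slr.satisfied_by(measured):
            status = "met"
        else:
            status = "missed"
        report.append((slr.name, slr.owner, slr.target, measured, status))
    return report
```

With a target of 8000 web connections per second and a load test result of 7000, the row is reported as missed, which is the kind of result that can send the design back for another iteration.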

The collection and presentation of this data takes time and disk space, but it is the only way to unambiguously demonstrate that requirements have been met. Make sure you are collecting data for all of your needs:

  • Data collected to measure successful delivery on a requirement, as specified after negotiation with the owner of a requirement (for example, being asked to measure a user load maximum or a particular cluster failover time or an SNMP trap arrival time to the central console)

  • Data collected to help you distinguish between possible architecture solutions and arrive at the maximal business solution (for example, articulating various scalability mechanisms to allow an economic model to also be included in the final decision)

  • Data useful in a tollgate to allow transition to the next architectural methodology phase (for example, output showing successful demonstration of the use cases that prove the Jini architecture will work)

Operational Maturity Measurements

This section describes measuring the people and process aspects of your organization's operational capability. The ideas, definitions, and methodology are based on Sun's Operations Management Capabilities Model (OMCM), which can describe the current state of an organization's realization of the SunTone Management Framework. The tools aspects of the OMCM have been previously introduced. The authors would like to again thank Michael Moore and Edward Wustenhoff for allowing us to include an overview of their "Operations Management Capabilities Model" white paper.

The different levels of the OMCM are categorizations of an organization's service delivery capabilities. A degree of implementation description is used to characterize the extent to which an individual component of capability has been realized by an organization. There are five potential scores for degree of implementation:

  1. Adhoc

  2. Emerging

  3. Functional

  4. Effective

  5. Optimized

Though this scoring mechanism creates a consistent terminology and simplifies the application of the model to real situations, the definition of each characteristic applies only to the component being analyzed. For example, the characteristics of a functional IT operational process are described differently than the characteristics of a functional monitoring infrastructure (for example, the characteristics that make a monitoring process optimized are very different than those that make asset management optimized).

The characteristics adhoc, emerging, functional, effective and optimized are fully defined and described for each management practice and subpractice in "Operations Management Capabilities Model." After being described, the degree of implementation for a management process can be mapped to the OMCM levels. This mapping allows the creation of a capabilities profile that describes the degree of implementation for every component at a given OMCM level. You use this profile to determine an organization's OMCM level.
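The following Python sketch shows one way a capabilities profile could be applied to an assessment. The five degrees of implementation come from the text; everything else is an assumption for illustration, including the component names, the minimum degrees, the number of levels shown, and the rule that a level is reached only when every component minimum at that level and all lower levels is met. The actual profile is defined in the OMCM white paper.

```python
from enum import IntEnum


class Degree(IntEnum):
    ADHOC = 1
    EMERGING = 2
    FUNCTIONAL = 3
    EFFECTIVE = 4
    OPTIMIZED = 5


# Hypothetical capabilities profile: the minimum degree of implementation
# required for each component at each OMCM level (values are invented).
PROFILE = {
    1: {"incident management": Degree.EMERGING, "staffing": Degree.ADHOC},
    2: {"incident management": Degree.FUNCTIONAL, "staffing": Degree.EMERGING},
    3: {"incident management": Degree.EFFECTIVE, "staffing": Degree.FUNCTIONAL},
}


def omcm_level(assessment, profile=PROFILE):
    """Return the highest consecutive level whose minimums are all met (0 if none)."""
    achieved = 0
    for level in sorted(profile):
        minimums = profile[level]
        # Components that were not assessed are treated as adhoc.
        if all(assessment.get(c, Degree.ADHOC) >= d for c, d in minimums.items()):
            achieved = level
        else:
            break
    return achieved
```

Because the weakest component limits the result, an organization with effective incident management but only emerging staffing practices would stop at level 2 under this invented profile.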

People Management

A major part of delivering IT services is managing the organizations that have responsibility for executing the various IT management processes. People management describes a set of practices necessary to ensure that the IT infrastructure is staffed in an appropriate fashion and that people have the necessary skill sets. The people management practice should be a process-oriented improvement model in which the IT organization is matured through the institutionalization of different workforce management processes. The more integrated into this organization these activities become, the more effective and efficient the organization will be.

The OMCM measures capabilities that describe the degree to which people management practices have been implemented within an organization. The measuring is performed by evaluating the subpractices that comprise the five people management practices:

  • Organization

  • Skills development

  • Resourcing

  • Knowledge management

  • Workforce management

The individual people management practices and their subpractices are described below. The characteristics adhoc, emerging, functional, effective, and optimized are fully described for each practice and subpractice in the OMCM white paper.


Organization

Organizing IT services refers to activities that are related to the design of the organization's structure. These include items such as identifying organizational groups, developing specific roles and responsibilities for each group, and describing the interfaces among groups.

The following practices are part of the organizing activity grouping:

  • Communication and coordination

    This practice is focused on the establishment and maintenance of information sharing within the organization.

  • Workgroup development

    This practice is focused on identifying and creating collections of individuals working together in support of specific objectives using a common, repeatable methodology.

  • Workforce planning

    This practice aligns the IT organization with the business goals and objectives of the larger organization.

  • Participatory culture

    This practice is focused on ensuring that decision making is performed in a structured manner and executed at the appropriate levels of the organization.

  • Empowered workgroups

    This practice creates workgroups that have the responsibility and authority to determine how to most effectively conduct their operations.

  • Competency integration

    This practice integrates different workforce competencies to improve the efficiency of activities that have dependencies across areas of competency.

  • Organizational performance alignment

    This practice is focused on assessing how the aggregated performance of the various workgroups within the organization impacts business performance.

Skills Development

Skills development is the set of activities that helps individuals acquire the knowledge and practical abilities necessary to perform current jobs or prepare them for future assignments:

  • Training and development

    This practice closes the gaps between individual skills and the requirements of their current position.

  • Career development

    This practice provides individuals with the opportunity to meet their career goals and objectives and is focused on continuously improving the ability of the workforce to execute the required competency-based processes.

  • Mentoring

    This practice facilitates the transmission of experience and knowledge throughout the organization.


Resourcing

Resourcing is the set of activities necessary to acquire the individuals to meet the goals of the organization. This includes activities to identify required skill sets, determine how many individuals of each type are required, develop a timeline for acquiring them, and identify sources to fill the requirements. The following practices support resourcing activities:

  • Staffing

    This practice matches work to individuals, including processes to recruit, select, and transition individuals into specific roles.

  • Competency analysis

    This practice analyzes the business activities of the organization and develops the complete inventory of competencies needed to support them.

  • Organizational capability management

    This practice manages workgroup capability to perform the competency-based processes that they are expected to use.

  • Continuous capability improvement

    This practice provides the basis for supporting workgroup efforts to continuously improve the performance of their competency-based practices.

Knowledge Management

Knowledge management is the set of activities related to the capture, documentation, maintenance, and dissemination of organizational learning. Knowledge management activities enable the creation and maintenance of competency-based practices. Through the execution of knowledge management, organizations can take successful solutions and institutionalize them for reuse. This set of practices ensures that organizations take effective processes and make them repeatable:

  • Competency-based practices

    This practice develops workforce competencies used to align the staffing, compensation, and other resourcing practices with the competency-development goals of the organization.

  • Competency-based assets

    This practice captures the lessons learned and artifacts developed during the execution of competency-based processes, including the activities necessary to capture knowledge and disseminate this knowledge so it becomes an integral part of the organization.

  • Continuous workforce innovation

    This practice drives activities necessary to set policies for workforce improvement, measure the performance of the organization against the goals, and facilitate workforce process improvement through identification of opportunities and implementation of new approaches.

Workforce Management

Workforce management is the set of activities performed to control and support individuals as they perform their tasks. This includes management of individual performance and compensation and activities necessary to provide the workforce with the infrastructure to successfully perform their job functions:

  • Work environment

    This practice ensures an appropriate physical working environment so that individuals perform their job functions in an effective and efficient manner.

  • Staff performance management

    This practice identifies metrics against which individual and workgroup performance can be measured. Mechanisms for rewarding superior performance are identified and formalized to reinforce the appropriate behaviors.

  • Compensation

    This practice provides financial rewards to individuals in proportion to their contributions to the organization.

  • Quantitative performance management

    This practice is focused on the continuous performance improvement of critical competency-based processes, which involves identifying the priority processes, developing metrics that are descriptive of the effectiveness and efficiency of these processes, and applying a process improvement methodology to performance.

Process Management

Business process management is required to support the business service life cycles: the existence and management of processes for creating, deploying, and managing IT and business services. The OMCM measures capabilities that describe the degree to which each process management practice has been implemented within an organization. The measuring is performed by evaluating the IT service subpractices that comprise the six process management practices:

  • Create

  • Implement

  • Deliver

  • Improve

  • Control

  • Protect

The individual process management practices and their subpractices are described below. The characteristics adhoc, emerging, functional, effective and optimized are fully described for each practice and subpractice in "Operations Management Capabilities Model."

Creating IT Services

This category describes all processes related to the creation of new services, which includes activities necessary to identify, quantify, architect, and design IT services:

  • Service level management

    This process involves the planning, coordinating, drafting, agreeing, monitoring, and reporting on SLAs, and the ongoing review of service achievements to ensure that the required and cost-justifiable service quality is maintained and gradually improved. SLAs provide the basis for managing the relationship between the provider and the IT customer.

  • Availability management

    This process manages key components of the predictability and availability of the IT services. Availability requirements heavily influence service architecture design.

Implementing IT Services

This category describes all aspects relating to the physical realization of the IT service as it is defined and created in the previous category. It addresses all aspects that ensure proper rollout of a new or updated service.

The degree of implementation is assessed by analyzing the ITIL release management process. This process protects the live environment (or IT service delivery environment) and its services through the use of formal procedures and checks. Release management works closely with the change management and configuration management processes.

Delivering IT Services

Delivering IT services is the most visible part of an IT organization's activities. This category addresses activities for the proper delivery and ongoing operation of the IT services. It is often referred to as "IT operations" or "data center operations." To assess the degree of implementation of this category, the OMCM looks at the following ITIL-defined processes:

  • Capacity management

    This process ensures that the capacity of the IT infrastructure matches the evolving demands of the business in the most cost-effective and timely manner.

  • Incident management

    This process addresses activities associated with the occurrence of service disruptions. The primary goal of the incident management process is to restore normal service operation as quickly as possible and to minimize the adverse impact on business operations.

  • Service desk

    To meet both customer and business objectives, many organizations have implemented a central point of contact for handling customer, user, and related issues. The service desk is customer-facing and focused on improving service to and on behalf of the business.

Improving IT Services

This category addresses all activities surrounding the measurement and optimization of IT service activities with the goal of continuously improving service levels. To assess the level of operational capability in this category, the OMCM looks at the following processes:

  • Problem management

    This process minimizes the adverse impact of incidents and problems on the business and prevents recurrence of incidents related to these errors.

    To achieve this goal, problem management seeks to get to the root cause of incidents and to then initiate actions to improve or correct the situation. The problem management process should have both reactive and proactive aspects. The reactive aspect should be concerned with solving problems in response to one or more incidents. Proactive problem management should be concerned with identifying and solving problems and known errors before incidents occur in the first place.

  • Continuous process improvement

    Although ITIL recognizes the need for continuous process improvement, it has not defined a separate discipline to address this important aspect. The SunTone Management Framework (STMF) uses the processes as defined by Sun today; however, any Six Sigma-based approach will most likely have enough rigor and commitment to address this area.

Controlling IT Services

This category addresses activities to deliver the IT service within the constraints identified by the governing body, including the processes that facilitate the IT governing activities. Examples of governing functions are financial controls, audit, and alignment with business objectives.

To assess the level of operational capability in this category, the OMCM looks at the following processes:

  • IT financial management

    This process reflects activities that control the monetary aspects of the business. It supports the organization in planning and executing its business objectives and requires consistent application throughout the organization to achieve maximum efficiency and minimum conflict.

  • Configuration management

    The configuration management process provides a logical model of the infrastructure or a service by identifying, controlling, maintaining, and verifying the versions of configuration items in existence.

  • Change management

    This process standardizes methods and procedures for efficient and prompt handling of all changes to minimize the impact of change-related incidents on service quality. The basic concepts of change management are principally process related and managerial, rather than technical (whereas incident management is primarily technical, with a strong emphasis on the mechanical nature of some of the processes).

Protecting IT Services

This category addresses all activities that ensure that IT services are still available under extraordinary conditions such as catastrophic failures, security breaches, or unexpected heavy loads. As businesses depend more and more on IT services, this area becomes more and more important to address.

To assess the degree of implementation of this category, the OMCM looks at the following ITIL-defined processes:

  • IT service continuity management

    This process supports the overall business continuity management process by ensuring that the required IT technical and services facilities (including computer systems, networks, applications, telecommunications, technical support, and service desk) can be recovered within required, and agreed, business time constraints.

  • Security management

    ITIL defines security management as the process of managing a defined level of security on information and IT services. Included is the reaction to security incidents. Security management is more than physical security and password disciplines. It includes data integrity, confidentiality, and availability. Security management is not an isolated process. It is part of IT and business. The relationship between security management and the other ITIL processes is such that each process has the obligation to perform the required security tasks wherever possible. These tasks in each ITIL process should address the security aspects in their specific area, but the point of control of these tasks is centralized by the security management process. Security management is governed by a corporate policy that drives budget, focus, and management direction. Within ITIL practices, this information is normally found in the service level agreements.

Policy Measurements

This chapter has discussed metrics and useful data for provisioning and observability. Policy combines those functions into a feedback loop that anticipates, corrects, and improves your business. There are many opportunities to observe and react to events in your data center. Examples of measurements to consider include:

  • Deployable service and container relationships

    As discussed in Chapter 4, a policy can use information describing the capabilities of containers and the requirements of a deployable service to perform matches, check for health, and provide information to other entities interested in this type of N1 Grid architecture relationship.

  • Infrastructure

    Changes to business services affect the infrastructure upon which they run. Instrumenting infrastructure components and providing policies to assist with the scalability, security, and tuning of infrastructure components is a necessary step in moving to the mobile, adaptable data center that the N1 Grid enables.

  • Business continuity

    Now that the N1 Grid software makes configuration and reconfiguration quick and effortless, measuring for the necessary triggers to kick off business continuity can inform the N1 Grid software policy and accompany a revisiting of the business continuity initiatives. Disaster recovery sites can be filled with resources doing useful business until the trigger comes to inform the N1 Grid software that it is time to configure the site for business continuity.

Planning for a policy can start immediately, solidifying the measurements to guide the N1 Grid software policy. It is worthwhile to organize a policy model, begin populating the information model, and start to connect the business and system viewpoints. The N1 Grid vision, architecture, and products can help inform this effort.
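As one illustration of the first measurement above, a policy that matches deployable services to containers needs both sets of data in a comparable form. The sketch below assumes a simple dictionary representation in which numeric requirements are minimums and all other requirements must match exactly. The attribute names and container names are invented for this example and do not reflect the N1 Grid information model.

```python
def matches(requirements, capabilities):
    """True if a container's capabilities cover every service requirement."""
    for key, needed in requirements.items():
        offered = capabilities.get(key)
        if offered is None:
            return False
        if isinstance(needed, (int, float)):
            # Numeric requirement: the container must offer at least this much.
            if offered < needed:
                return False
        elif offered != needed:
            # Non-numeric requirement: the values must match exactly.
            return False
    return True


def candidate_containers(requirements, containers):
    """Return the names of all containers that can host the service, sorted."""
    return sorted(
        name for name, caps in containers.items() if matches(requirements, caps)
    )
```

The same comparison can be rerun whenever measurements change, which is how observed data can feed back into a placement or health-check policy.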

Security Measurements

Just because the N1 Grid software flawlessly automates the installation of your approved and hardened golden services does not automatically mean that someone cannot log onto one of your servers and make undesired changes. N1 Grid solutions require the same vigilance, defense in depth, and change control that your environment deserves today, but they offer several additional capabilities that make some of that work easier.

Many N1 Grid software products support security processes by making it possible to compare the current environment against the approved and hardened golden image that is expected to be running in a particular location. Clearly, it is beneficial to be able to easily identify unexpected deltas to your builds. The roles used by the N1 Grid software can also segment access and entitlements. They divide not only along lines of operation (for instance, only certain roles can create services and only certain roles can install them), but also along access and control within business groups or within services in a given group. You are best placed to decide how to parse out the identity and entitlements needed to control and run your business, but the N1 Grid software can help you get value and safety from a strong identity infrastructure.
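The idea of comparing a running environment against a golden image can be sketched generically with content digests. This is not how any particular N1 Grid product performs the comparison; the function names and the in-memory file representation are assumptions for this example.

```python
import hashlib


def manifest(files):
    """Map each path to the SHA-256 digest of its content (given as bytes)."""
    return {path: hashlib.sha256(data).hexdigest() for path, data in files.items()}


def deltas(golden, current):
    """Compare two manifests and report added, removed, and modified paths."""
    golden_paths, current_paths = set(golden), set(current)
    return {
        "added": sorted(current_paths - golden_paths),
        "removed": sorted(golden_paths - current_paths),
        "modified": sorted(
            path
            for path in golden_paths & current_paths
            if golden[path] != current[path]
        ),
    }
```

An empty result in all three categories means the environment still matches the approved build; anything else is an unexpected delta to investigate.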

Mobility is another area where policy, security, and a common information model combine to provide safety and efficiency. Having identity and rules to guide what entities can (and cannot) reside in common locations or in particular combinations speeds up deployment and the reaction to a business need or an observed and reported situation. A secure environment runs on data. Information must be collected and presented to enable command and control of who takes an action in the N1 Grid operating environment, when, and under what conditions. Your security officer should have clear requirements for the confidentiality, integrity, and availability of the data in your data center and a clear picture of the security architecture and security operations that combine to minimize risk. Sun's layered security architecture approach elevates security to a systemic quality that must be viewed holistically. Data comes from many individual layers. FIGURE 5-7 shows an example of a possible layered security architecture.

Figure 5-7. Layered Security Architecture Example

Each layer represents a required area of discussion, action, and instrumentation, but equally important are the connections between layers and the capability of the stack as a whole. Because security is a systemic quality, its CTQs must be identified, measured, and rolled up appropriately for reporting.

Building N1 Grid Solutions Preparing, Architecting, and Implementing Service-Centric Data Centers
Year: 2003
Pages: 144