7.4. Developing Your Project Methodology
Once executive management has accepted your proposal and approved the funding, you're ready to begin the project. In this section, we provide a high-level outline of the project's phases, and along the way we discuss some best practices and lessons learned. This section is far from a complete methodology; however, we discuss the essential steps at a high level to help your project succeed. Many of the lessons learned come from actual Callisma customer engagements. Our focus is primarily on the project goal of reducing the number of hardware platforms within the production environment. Server consolidation projects should generally contain the following high-level phases:
7.4.1. Establishing the Project
The first phase of server consolidation is to establish the project. Server consolidation projects require a highly collaborative effort between many departments within the systems community and lines of business, including the server team, applications team, capacity planning, security, operations, the line of business (LOB) or application owners, and so on. Some organizations take a noncollaborative project approach: they develop the architecture and implement the new technology components, hoping that the lines of business adopt the technology. As a result, the adoption of the new technology tends to be lower. Instead, Callisma recommends that customers develop a comprehensive plan and engage the lines of business early in the process. Callisma is happy to assist customers in either approach; however, based upon our experience we estimate that the collaborative approach yields a higher ratio of consolidation. The core project team should not only develop a collaborative forum to work with the various departments within the systems community, but it should proactively work with the business to ensure processes are communicated and schedules are maintained. Throughout this chapter, we'll offer best practices and lessons learned that will help you envision how this takes place, what project efficiencies are gained, and how this builds cooperation within an organization.
Server consolidation requires involvement from many departments within an organization. So one of the first steps in the project is to identify your stakeholders and the various teams that will take part in your project. You'll need to define roles and responsibilities, and you'll need to assess resource requirements. We've listed some of the key roles here. Of course, staff titles and roles within your specific organization may vary, and in some cases, it can be one person or a department of people representing the roles. The key roles include the following:
The number of departments and resources for a server consolidation project can be alarming. How can you organize all these groups? How do you build a consensus among them that will enable large infrastructure changes in addition to their day-to-day task of meeting the needs of the business and end-customer community? The following section details some best practices to aid you in this task.

7.4.1.1. Best Practices
As mentioned already in this chapter, a collaborative environment is one of the first keys to success. This can take on many forms, from a server consolidation Web site that acts as a centralized repository, to a regularly scheduled project meeting, to monthly or quarterly technology briefings. All can be utilized to build collaboration and coalition between various departments within the organization. The following are some of the Callisma best practices that have been utilized to develop a coalition and "buy in" for our server consolidation projects:
7.4.1.2. Addressing Typical Challenges
Figure 7-2 was taken from our Server Virtualization Lessons Learned presentation. It represents some of the typical challenges that have been identified in the planning phase of Callisma projects. Many of these challenges are best addressed very early in the project. We'll now walk through some of these challenges, and you will see where failing to engage with other members of the systems community and even the LOBs can result in problems when it comes time for the implementation phase of the project.

Figure 7-2. Typical Challenges of the Planning Phase

7.4.1.2.1. Challenge No. 1: Lack of Coordination among Stakeholders and Poor Communal Understanding of Process
For larger, more complex, or even midsize organizations, determining the level of involvement and assessing resource requirements can sometimes be difficult. Without a good understanding of the server consolidation or remediation process, it is difficult to determine the person-hours needed or to make a budget assessment in order to map the existing organizational roles and responsibilities to the project needs. Many organizations have well-documented processes for deployment of a new application with a new server; however, these processes may not apply completely in a virtualized environment or may have never been used in a server consolidation project. It is Callisma's experience that many times organizational processes and responsibilities have to be adjusted in order to support the needs of the new consolidated or virtualized environment. It is for these reasons that it is essential to document the server consolidation remediation process and procedures formally and assess them against the existing applications, server deployment process, and the current organizational roles and responsibilities. The greatest architecture or technology design will not fix operational procedures or gaps related to project steps.
If the current project is spawning many ad-hoc and one-off meetings, then this is a sign that the overall process needs to be assessed, defined, and communicated.

7.4.1.2.2. Challenge No. 2: Operational Gaps
Another issue Callisma has routinely found is operational readiness gaps. While the planning and design of the architecture may have been properly executed, gaps may exist in the operational processes necessary to support the new architecture. For example, server run books may need to be re-evaluated. In many cases, the run books may still point to the server's physical specifications for processor, disk, and RAM. This changes in a virtualized environment where the physical processor, disk, and RAM are allocated across the various virtual machines running on the server. Run book documentation and operational procedures need to identify these differences. Disaster recovery, backup, and maintenance of the servers can all change when moving from a physical server to a virtual machine. Typically, a server run book addressed one server with one application. In the new environment, the physical server may run several virtual machines and the virtual machines may run many applications with different owners, crossing several LOBs, and so on. These may all affect the server run book methodology. Another example of a technology gap is that the server consolidation process may heavily impact the desktops that utilize these applications. Migration strategies to address ini files, registry settings, and server drive mappings will need remediation. Other gaps include criteria for when to use virtualization and when not to. These are all good areas for what Callisma calls "Server Consolidation Readiness" where many of these challenges are identified and remediation is applied to make the technology "implementation ready."
This is just the tip of the iceberg; all we're saying is that many factors need to be considered and criteria developed to guarantee completeness of the system architecture and to ensure readiness for deployment.

7.4.1.2.3. Challenge No. 3: Significant Readiness Variance between LOBs
One of the lessons learned from numerous engagements is that the level of commitment for the project will vary across organizations and LOBs. A key enabler is the building of a collaborative environment to improve efficiencies during the project. One of the best methods for this is an applications owner forum. Ultimately, the business owners or LOB should "buy in" to the overall strategy and plan. This forum is leveraged for collaboration and sharing of lessons learned as well as for best practices that can ultimately be shared by peers. This can be invaluable when driving tasks, in addition to attaining and, more importantly, maintaining project schedules. Another area that shows significant variance is the application migration schedules. In some cases, it's difficult for a particular LOB not only to identify its applications but also to develop a resource plan and schedule that can be adhered to. The core project team needs to be prepared to assist LOBs that are challenged, whether by supporting data-gathering efforts, communicating the remediation process, or even assisting with their resource constraints. In one particular customer case, the LOBs were eight months into the program and still lacked a credible schedule. In less than six weeks, the server consolidation readiness phase was able to deliver the following project benefits:
Finally, over the life cycle of the project, it is notable how "integrated" the teams become as they collaboratively build the server consolidation process. Teams develop a detailed understanding of each other's roles within the organization.

7.4.1.2.4. Challenge No. 4: A Complete, Accurate, and Centralized Repository of Information
Rare is an organization that has a complete and accurate repository of the applications and services running on all its servers. This is very challenging in larger environments where developers or application owners have possibly moved on and are no longer with the company. Other factors contributing to incomplete data include non-existent or inaccurate run books. Because of fast growth and constrained resources, additional services or applications may have been added through service packs or rushed needs to support the lines of business. Having the data in a centralized repository is even more uncommon. In many cases, there's no corporatewide database of all the applications. In other cases, only some of the LOB managers formally track their applications in a repository or a database. There are two important components to consider here: one is to determine the process by which the data is gathered, and the other is to determine what information should be gathered. Server scripts, reporting tools, or existing operations management tools have been used to attain the information. Callisma utilizes server consolidation databases to track and update server data as it's collected. An application repository database in Microsoft Access or SQL Server can provide the necessary functionality. One of the benefits of utilizing the Callisma application database is that it's been used in several engagements to capture a comprehensive list of common services or application signatures.
The services or application signatures typically make up 99 percent of the commercially available applications found on most servers in customer environments today. More about the server consolidation database later.

7.4.1.2.5. Challenge No. 5: Resource Constraints
Finally, in the planning phase, schedules will need to be defined and resources allocated to the project. Many times, customers lack the necessary resources or have too many strategic initiatives, and additional staff is required for the project. A collaborative team for server consolidation draws resources from all groups within the systems community. Callisma has provided customers with additional staff in the following areas:
7.4.2. Gathering Data and Application/Server Inventory
The data discovery or inventory phase of the project includes interviewing stakeholders and gathering information about the current processes, current and future state requirements, architecture, methods, and tools. Data for the project can be gathered in several ways: through one-on-one or group interviews, third-party tools, and custom scripts. One of the best sources of information is structured interviews with individual stakeholders. These interviews involve IT staff, business units, and the customer's various business partners. This is a good way to leverage consultants who act as an independent third party. In some cases, this provides a more open forum in which the stakeholder can feel confident that issues or concerns can be raised anonymously.

7.4.2.1. Structured Interviews
Discovery is enabled through interviews, customized questionnaires, and facilitated sessions, during which stakeholders brainstorm potential barriers and issues, and discuss potential alternative solutions or processes. Callisma recommends starting with the systems community or IT. The "Establishing the Project" section documents several roles and responsibilities. Team and one-on-one interviews with identified stakeholders, developers, users, and managers help identify current issues and barriers to application consolidation. This is an excellent opportunity to solicit and identify potential current and future state requirements. Identification of potential or perceived issues or barriers within the organization is an area where it may be best to use a third party or consulting partner. A consulting partner can provide an independent or department-neutral resource to allow the stakeholders to be more open and feel more comfortable about sharing potential issues within the organization. The duration of this task depends on the size of the organization, as the roles can be filled by one or many people.
A structured interview process should be followed to capture as many of the project requirements as possible. This gives the stakeholders an opportunity to voice their individual requirements and concerns for the project. This has been a great enabler for increasing efficiencies within our customer environments. The goal is not to fix all the problems within a particular environment, but to capture the requirements and issues for prioritization in a later phase. Planning for this activity involves developing a list of standard questions prior to these meetings. The following provides some high-level tasks that are completed in this phase:
Survey and interview data should be communicated back to the team to facilitate information sharing and aid in consensus building. These interviews are generally conducted after surveys are completed to clarify initial survey results.

7.4.2.2. Application Inventory
An essential component of a project is the actual application or server inventory. It is common to find that a majority of enterprises today lack a detailed and complete list of services and applications running in the environment. For this reason, Callisma uses an application database as the central repository for information. This repository typically houses the following information:
As the data is captured and sources of information are identified, the application repository can be populated. Many times, automated imports of data such as performance data for servers can be routinely scheduled. This is discussed in more detail in the section titled "Application Repository" later in this chapter.

7.4.2.3. Process Documentation
Another key element to streamlining and gaining efficiency in the server consolidation process is a complete understanding of the current processes and technologies used for provisioning servers, rolling out new applications, testing, certification, and deployment within the customer environment. The operational process and procedures should be documented for ongoing support and maintenance of the existing environment. Moving from a non-virtualized environment to a VMware environment changes many aspects from an operational perspective. Process improvements to achieve specific business benefits may have an impact in the following areas:
Roles and responsibilities are defined and the current processes to deploy new services are documented in order to feed the assessment phase discussed later in this chapter. This allows the project team to identify opportunities for improved efficiencies and automation to achieve business and project goals. Eventually, this will define the overall process for the project that's communicated to the team. This process addresses many of the typical challenges that were identified in the "Establishing the Project" section of this chapter.

7.4.2.4. Application Repository
Once the applications group has lists of approved corporate applications, the application owners have their list of applications, operations possibly has tools that can provide the services running on the server, and, lastly, capacity planning has performance data on the servers, now what? In most cases, there are many good sources for server, application, and performance data located in various parts of the organization. The application repository brings this data together into one resource for analysis and tracking. Having all the information in one dashboard prevents the internal customer teams from being overburdened by potentially repetitive tasks and numerous ad-hoc meetings. This maximizes the productive time of internal resources and reduces the overall project time and resource effort in the project. This repository can be a tremendous asset to the project team members. Having not only the server, services, and application data but also a repository to track the status of servers through the project is a useful tool. Program and project data such as migration dates and status of source and target servers can improve efficiencies of the planning team. Figure 7-3 provides an example of an actual server consolidation database that was used in a project.

Figure 7-3. A Server Consolidation Database

Information mined from various sources and formats during the discovery phase was organized into a single repository, which was then used to track and report on the various applications to be consolidated. In many cases, automated feeds to the database can keep the information up-to-date during the course of the project. If there were eight applications on a server to migrate off during various periods of the project, the team should not have to request capacity planning data eight times. This is one quick example of efficiency. A regularly scheduled task is a better way to update the performance metrics of the servers in the database. A data repository provided the business unit owners a location to store information regarding their applications, versions, classifications, and other information. Project teams can then use this information for planning migrations, determining the complexity of the applications, tracking risk, and developing project schedules for the application migration groups.

7.4.2.5. Buy-In and Coalition
The data-gathering phase is where the team is really getting out and working with the different representatives from the systems community and the various LOBs. This is the phase to foster the relationships necessary to build a coalition. So, in addition to collecting information, one of the major objectives of the data-gathering phase is building stakeholder consensus. Facilitated planning sessions, combined with interviewing in a group forum at the appropriate times, will encourage both individual and group participation.

7.4.2.6. Assessment
After the data-gathering phase, the next logical step in the project is assessment. The purpose of the assessment phase is to review the data gathered in the inventory stage and to identify the current state and future state requirements necessary to migrate the application to the consolidated or virtualized environment.
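As a rough illustration of how inventory data in an application repository might be reviewed during assessment, the following sketch uses Python's built-in sqlite3 module as a stand-in for the Access or SQL Server database mentioned earlier. The table names, column names, and sample rows are hypothetical and chosen only to mirror the kinds of data this chapter describes (servers, applications, versions, classifications, owners, and migration status); they are not Callisma's actual schema.

```python
import sqlite3


def build_repository(conn):
    """Create a minimal, hypothetical repository schema."""
    conn.executescript("""
        CREATE TABLE server (
            name             TEXT PRIMARY KEY,
            avg_cpu_pct      REAL,   -- fed by capacity planning data
            migration_status TEXT    -- e.g. 'pending', 'migrated'
        );
        CREATE TABLE application (
            name           TEXT,
            version        TEXT,
            classification TEXT,
            owner_lob      TEXT,
            server_name    TEXT REFERENCES server(name)
        );
    """)


def load_sample(conn):
    """Insert made-up rows for illustration only."""
    conn.executemany(
        "INSERT INTO server VALUES (?, ?, ?)",
        [("srv01", 4.0, "pending"), ("srv02", 35.0, "migrated")],
    )
    conn.executemany(
        "INSERT INTO application VALUES (?, ?, ?, ?, ?)",
        [
            ("payroll", "2.1", "mission critical", "Finance", "srv01"),
            ("intranet", "1.0", "standard", "HR", "srv01"),
            ("reports", "3.4", "standard", "Finance", "srv02"),
        ],
    )


def apps_per_server(conn):
    """Return (server, migration status, application count) per server."""
    return conn.execute("""
        SELECT s.name, s.migration_status, COUNT(a.name)
        FROM server s
        LEFT JOIN application a ON a.server_name = s.name
        GROUP BY s.name
        ORDER BY s.name
    """).fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    build_repository(conn)
    load_sample(conn)
    for row in apps_per_server(conn):
        print(row)
```

Keeping server status and application ownership in one place means a single query can answer questions that would otherwise require separate requests to several teams.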
When all the information has been gathered in the discovery phase, a readiness assessment should be performed to ensure that all the documentation for the current state and potential gaps have been identified. The team should develop an enhanced process that can be used across all business units. The assessment phase begins with what Callisma calls "Readiness." Readiness helps ensure that the people, process, and technology components of the project are ready for deployment. Some of the activities of this phase are discussed in the next section.

7.4.2.7. Application Readiness
The purpose of the application readiness phase is to identify gaps, issues, and risks, and to recognize the tasks necessary during the design and testing phases of a virtualization or a server consolidation project. This is where the use of a consulting partner can have value in providing an independent assessment of the current project documentation. Standards documentation is required by an architecture group to verify current standards needed for the migration. The project team will have collected all the existing documentation, architectures, and standards documents from the systems community and the various business units and partners. The project team assesses all existing documentation, processes, and current and future state requirements to identify major program and project components. The following list provides a high-level summary of some of the documents analyzed:
In addition to these elements, the following application design elements are analyzed:
Generally, the following output should be generated during the assessment phase:
7.4.2.8. Rationalization
One of the exercises of the assessment phase is working with the LOB and application owners in the area of rationalization. Rationalization is not only a process of reducing the number of servers in the environment but also of reducing the number of applications. This can also include reducing the number of legacy platforms and the infrastructure that supports them. Analysts such as the META Group say the cost savings from rationalization far exceed those of server consolidation. This should provide IT with an additional motivation to work with the various LOBs and application owners to determine if there are legacy applications that can be retired. It would be difficult to reduce the number of LOB applications without working with the actual LOB contacts. Tools can be leveraged to identify opportunities. For example, in one case with our Server Consolidation Database, Callisma was able to identify one particular set of servers within the LOB that was running at very low utilization. In fact, the NIC I/O was so low that the project team thought there was an anomaly with the data collection process. By working with the LOB, the team was able to remove 35 legacy DB2 servers from the environment completely. That ratio is 35 to 0, a very good return on investment.

7.4.2.9. Technical Leadership to Provide Guidance and Structure
The third-party professional services provider or lead architects should provide technical leadership throughout each phase of the project. Lessons learned and best practices should be shared by the provider. From time to time subteams will need to be developed to streamline a process, further the development, or ensure that end-state requirements are met. Consultants, IT architects, or business analysts should be heavily involved in the technical leadership and guidance of these subteams. In some cases, the use of consultants can provide independence or neutrality to facilitate current and future state requirements.
The requirements can then be fed back into the core team, and subteams may be formed to address complex issues or barriers. Subteams can fast track potential issues and streamline the overall approach to the application server consolidation strategy. This ensures that members of the systems community are making the best use of their time.

7.4.2.10. Assessing Processes, Roles, and Responsibilities
Pure server consolidation won't achieve a good return on investment by itself. The project team should focus heavily on the development of automation that enables operational efficiencies during the project. The process model assessment must identify the key deliverables and the task owners for driving increased operational efficiencies. A transition to new consolidated or virtualized environments may require changes in processes and operational procedures. The certification, packaging, server, or application provisioning process models may all require changes in the new state environment. Chapter 15 of this book covers the many options available in the areas of backup and restoration. Along with process changes, organizational roles and responsibilities may also need to be clearly defined to achieve these operational efficiencies.

7.4.3. Technology Design
The technology design phase is based upon the specific business and technical requirements that were identified in the planning and assessment phases. In many cases, specific technologies such as applications, web servers, databases, or file servers were identified as candidates for server virtualization. The technology design phase will typically involve a documented design, the procurement of software/hardware, test/certification, and training or knowledge transfer. One of the most common reasons for a virtualization or server consolidation project involves a technology refresh or replacement of aging or out-of-lease hardware.
Because all servers will be replaced, it will provide the organization an excellent opportunity for rightsizing servers by selecting new hardware and performing testing and performance analysis. In many cases, the project team has taken this as an opportunity to reevaluate systems' configuration and documentation. It's not uncommon for subteams to focus on initiatives such as new run book templates, server login scripts, server directory and share structures, and the overall strategy for storage and backup. All aspects of the server environment must be addressed prior to deployment. The project repository provides an excellent source to disseminate new project-related documentation and communicate any new templates available to the team. It's very important that the technology design phase include an operational readiness phase, too. The operational readiness phase will help ensure that the design is ready to be installed. The technology design phase varies greatly between projects. In the following list, we've identified just a few of the technology options that may be available to your project:
7.4.3.1. Testing and Validating
Testing is a critical step in ensuring that the technologies selected meet or exceed the requirements identified in the planning phase. Functional specifications, performance, reliability, and scalability are just some of the criteria that should be utilized to ensure your solution meets the intended needs. Test plans that document expected results should be developed ahead of time. Any anomalies should be documented and worked out in a test environment.

7.4.3.1.1. Modeling
Application modeling can be one of the best tools to ensure that you do not exceed thresholds in your consolidated or virtualized environment. Modeling can, however, be a time-consuming process, so you may want to model only those applications that have been identified as mission critical. Modeling is also particularly useful when selecting a specific class of service to consolidate. In the planning stage, we identified all the applications and loaded them into our server consolidation database. The server consolidation database is the tool that gives you the opportunity to look at the total number of servers based upon class. Basically, we have a view into the total number of servers broken down by file, print, SQL, web, DB2, e-mail, and so on. We can look for opportunities in large populations and identify this as a class of server for virtualization or consolidation and model that specific class of service. One example of an excellent tool for modeling is ISM's PerfMan for VMware. PerfMan for VMware provides the ability to group physical server metrics together and understand what will happen when virtualized images are running under the same ESX Server, as well as identify possible contentions that may occur based on the user-defined definitions for each virtual machine. As virtual machines compete for resources, significant wait times can occur.
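The idea of grouping physical server metrics to anticipate contention can be sketched in a few lines of Python. This is only a simplified illustration under stated assumptions, not how PerfMan or any other modeling product works: it assumes you have an hourly CPU demand profile (in MHz) of equal length for each candidate server, sums the profiles hour by hour, and compares the combined peak against a host capacity with a headroom factor. The function names, the 0.8 headroom default, and the sample figures are all hypothetical, and the sketch does not model scheduling wait times.

```python
def combined_peak(profiles):
    """Sum hourly CPU demand across servers.

    profiles maps a server name to a list of hourly CPU demand in MHz;
    all lists are assumed to cover the same hours.
    Returns (peak combined demand, index of the peak hour).
    """
    totals = [sum(hour) for hour in zip(*profiles.values())]
    peak = max(totals)
    return peak, totals.index(peak)


def fits_on_host(profiles, host_capacity_mhz, headroom=0.8):
    """True if the combined peak stays within the usable share of the host."""
    peak, _ = combined_peak(profiles)
    return peak <= host_capacity_mhz * headroom


if __name__ == "__main__":
    # Made-up profiles for three candidate servers over four hours.
    sample = {
        "file_srv": [200, 800, 300, 100],
        "print_srv": [100, 600, 200, 100],
        "sql_srv": [300, 400, 1500, 200],
    }
    print(combined_peak(sample))
    print(fits_on_host(sample, host_capacity_mhz=3000))
```

Summing by hour rather than adding each server's individual peak matters because workloads that peak at different times can share a host more comfortably than their separate peaks suggest.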
Understanding priorities and conflicts up front improves the chances of your project's success and removes a level of uncertainty.

7.4.3.2. Capacity Planning
Capacity planning is leveraged throughout the life cycle of the project. It can be utilized to assist in developing a business plan around underutilized corporate assets, to provide data for grouping applications or making consolidation decisions, and to enable a proactive capacity planning environment to manage a shared pool of virtualized machines. Earlier in this chapter, we discussed a customer account that had 4-percent utilization, on average, across 3000 servers. Consolidation or virtualization of servers does not always translate to better use of company assets. Capacity planning and the organizational process to enable it are the only way to ensure that corporate assets are utilized to their best potential. A robust capacity planning process is essential in all phases of the project. Evaluation of your current processes and tools for capacity planning may become just one of the operational readiness tasks needed to ensure the success of your server consolidation and virtualization project.

7.4.3.2.1. Trending
It's not just about having server statistics such as utilization for CPU, memory, and network and disk I/O; it's also important to have historical data. Historical data provides more planning data when determining future sizing requirements for CPU, memory, and network and disk I/O. Microsoft Performance Monitor can give a snapshot of the performance data on a given server; however, it's more important that a tool provide historical data to make critical planning decisions. Historical data as part of your overall capacity planning process will provide a better approach to determine future growth patterns necessary for sizing decisions. Although ESX Server Virtual Machines are separate servers, they all utilize the server hardware resources.
So capacity planning does not necessarily become easier; in fact, it can sometimes become more complex. Ongoing management of VMware ESX systems requires operational processes to be reviewed and defined to ensure appropriate capacity planning disciplines are followed. As workloads increase on systems over time, new bottlenecks may occur. To be successful, capacity planning requires historical data to determine growth patterns over long periods of time: not just weeks or months but sometimes years.

7.4.3.3. Migration
A migration phase in your project allows time for the technical team to develop various migration strategies to the new environment. There are many different tools and strategies that can be used in developing the plan to get to the new environment. VMware provides tools such as P2V Assistant. Callisma frequently finds that the VMware P2V Assistant migration tool may not address all of the migration scenarios you are considering in your project. For these reasons, other technologies and solutions need to be considered. One example of another vendor's tools is PlateSpin's PowerP2V and PowerRecon. PowerP2V and PowerRecon are excellent tools when considering both reporting (data gathering) and migration tools for your project. In addition to vendor tools, a complementary set of custom scripts may also need to be developed to automate manual- or process-intensive tasks that are prone to human error.

7.4.3.3.1. Migration Phase Readiness
The Migration phase should detail some or all of the following outputs:
7.4.3.3.2. Piloting and Full-Scale Implementation
Piloting and implementation begin when all technology components and organization processes have been documented and signed off on. Through the use of best practices and lessons learned that have been shared in this chapter, you should have already assessed, documented, and communicated the end-to-end server consolidation process to all team members. There should be an executable schedule, and all the organizational processes to support the project should be in place in order to successfully track and migrate to the new environment.

7.4.3.3.3. Migration Schedule
The project manager should complete a detailed migration schedule that leverages data gathered from the planning phase, including sunset dates, earliest migration dates, and source and target mappings. By working with the various LOBs, the blackout dates and application sunset dates should also have been collected. Blackout dates may be holidays or specific times of the year when any changes to the production environment can have a significant impact on the business. The migration schedule also identifies time and resource requirements for all tasks in the project. Potential site-access conflicts, as well as critical events and milestones, should be discussed and reviewed by the team prior to pilot or full-scale implementation.

7.4.3.3.4. Pilot Process Model with Stakeholders
Prior to full-scale implementation, the new end-to-end process should be exercised with the stakeholders. Execution of the process should not involve a large number of ad hoc meetings; if it does, a specific process (or procedure) may need to be adjusted prior to full-scale implementation.

7.4.3.3.5. Production Verification
Production verification includes the end-user environment, performance and capacity assessment, security configuration validation, backup and recovery, and verification that the consolidated services are working.
The production verification provides further assurance that the solution, as tested, is working in the production environment. Deviations from expected results should be evaluated and addressed immediately. The impact of such deviations should be assessed in terms of the technology, process, and organizational implications. End user, management, and operations review and sign-off follow verification. This sign-off is the milestone necessary to proceed with production turnover and full-scale deployment. The formality at this juncture ensures clear communication of implementation and verification results. The implementation becomes a logistical task managed and tracked by the program or project manager.

7.4.3.3.6. Full-Scale Implementation
The implementation plan is executed by first notifying the team. Notification ensures that team members receive timely communications as to change approval. The environment's formal change management process should be strictly followed. The production turnover is the process of transferring control from the project and implementation team to ongoing server consolidation or internal operations teams. These teams should have received the necessary training and may have participated in several aspects of the implementation.
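The scheduling constraints described in the Migration Schedule section (earliest migration dates, blackout dates, and application sunset dates) lend themselves to a small worked example. The sketch below is a minimal illustration under assumed rules, not a prescribed method: each application is given the first non-blackout day on or after its earliest migration date, and an application whose sunset date falls on or before that day is marked as not needing migration. The function names, the sample applications, and the dates are all hypothetical.

```python
from datetime import date, timedelta


def next_allowed_date(earliest, blackout_dates):
    """Return the first day on or after 'earliest' that is not blacked out."""
    day = earliest
    while day in blackout_dates:
        day += timedelta(days=1)
    return day


def build_schedule(apps, blackout_dates):
    """Map each application to a migration date, or None if it is retired first.

    apps is a list of (name, earliest_migration_date, sunset_date_or_None).
    """
    plan = {}
    for name, earliest, sunset in apps:
        day = next_allowed_date(earliest, blackout_dates)
        if sunset is not None and sunset <= day:
            plan[name] = None  # application sunsets before it could migrate
        else:
            plan[name] = day
    return plan


if __name__ == "__main__":
    # Made-up blackout window and applications for illustration.
    blackout = {date(2024, 12, 24), date(2024, 12, 25), date(2024, 12, 26)}
    apps = [
        ("payroll", date(2024, 12, 24), None),
        ("legacy_reports", date(2024, 12, 20), date(2024, 12, 15)),
        ("intranet", date(2024, 12, 10), None),
    ]
    for name, day in build_schedule(apps, blackout).items():
        print(name, day)
```

A real schedule would also account for resource availability and source-to-target mappings; the point here is only that blackout and sunset dates collected from the LOBs can be applied mechanically once they are recorded.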