16.3 Tasks that are unique to Linux on the mainframe

The mainframe environment offers a number of opportunities to the system administrator for becoming more efficient in dealing with numerous Linux images, thus lowering operational costs.

16.3.1 System layout

In a Linux-on-the-mainframe environment, the system administrator might have to manage hundreds of Linux images. One way of making this multitude manageable is to curb the variety by using a small number of standard images, as ISPCompany does. The aim is to cover most needs with these standard images and to have as few unique images for special purposes as possible. While standard images can also be used in distributed server farms, they are particularly advantageous in a Linux-on-the-mainframe environment.

When a new image is created from a standard image, the binaries from the golden image are typically copied over. With Linux on the mainframe, you can share code for standard images more directly.

Linux allows you to map parts of its file systems to individual disks, and z/VM can provide these disks in the form of read-only shareable z/VM minidisks. Because Linux images on the mainframe can share disks, instances of the same standard image can actually use the same physical file system, or parts of it, for their kernel and application code. Considering the multitude of images, code sharing can amount to substantial disk storage savings.

Code sharing requires careful planning from the outset. There are always data that make an image unique and, therefore, not all files can be shared. Code sharing puts some constraints on how freely you can make alterations to the derived images. The system administrator needs to judge what is the most advantageous strategy for sharing code in an individual installation.

For the system administrator, code sharing can simplify maintaining the Linux images that share the code. In a distributed environment with no disk sharing, a code fix has to be applied to every instance of a standard image. If disks are shared and a fix is applied to a shared file system, all images that share the file systems inherit the fix.

Code changes are usually applied to a copy of the shared code. An image then can pick up the change by being booted from the changed copy. This approach saves you from having to simultaneously take down all Linux images that share the code. The penalty is that you have to keep track of which Linux image runs from which copy.
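The bookkeeping this requires can be quite small. The following is a minimal sketch, not a tool described in this book: the class name, the image names, and the copy labels are all hypothetical. It records which copy each image was last booted from and reports when an old copy is no longer in use.

```python
class CopyRegistry:
    """Tracks which copy of the shared code each Linux image was booted from."""

    def __init__(self):
        # image name -> label of the shared copy it last booted from
        self._copy_of = {}

    def record_boot(self, image, copy_label):
        """Note that `image` has been (re)booted from `copy_label`."""
        self._copy_of[image] = copy_label

    def images_on(self, copy_label):
        """Return the images still running from `copy_label`, sorted by name."""
        return sorted(i for i, c in self._copy_of.items() if c == copy_label)

    def retirable(self, copy_label):
        """A copy can be retired once no image runs from it any more."""
        return not self.images_on(copy_label)


registry = CopyRegistry()
registry.record_boot("web01", "v1")
registry.record_boot("web02", "v1")
registry.record_boot("web01", "v2")   # web01 picks up the fix at its next boot
still_on_v1 = registry.images_on("v1")
```

In practice the same information could live in a flat file or a database; the point is only that every boot from a copy is recorded somewhere central.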

While booting a Linux image, there is a lot of I/O from the system disks. If a disk is shared by many images, booting the images might have to be spaced over some time to avoid contention. If time is critical, more than a single disk might be desirable. Typically, there would be at least two copies of the shared system libraries on disk, with one disk as a backup and a further copy on tape.
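As an illustration of spacing boots over time, the sketch below assigns images to waves so that no disk copy serves more than a fixed number of booting images at once. The function name, the per-disk limit, and the assumed boot duration are hypothetical planning inputs, not values from z/VM or from this book.

```python
def stagger_boots(images, disk_copies, per_disk_limit, boot_seconds):
    """Return {image: (disk_copy, start_offset_in_seconds)}.

    Images are started in waves. Each wave places at most `per_disk_limit`
    booting images on each disk copy, and the next wave starts only after
    the assumed boot duration `boot_seconds` has passed.
    """
    wave_size = len(disk_copies) * per_disk_limit
    schedule = {}
    for n, image in enumerate(images):
        wave, slot = divmod(n, wave_size)
        disk = disk_copies[slot % len(disk_copies)]   # round-robin over copies
        schedule[image] = (disk, wave * boot_seconds)
    return schedule


images = ["img0", "img1", "img2", "img3", "img4"]
plan = stagger_boots(images, ["A", "B"], per_disk_limit=1, boot_seconds=60)
```

Adding a second disk copy halves the number of waves in this model, which is the trade-off the paragraph above describes: more disk space in exchange for a shorter total boot time.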

Sharing code allows rapid creation of a new instance of a standard image. All that is needed for a new Linux image is a new guest definition in the z/VM directory. Only a small amount of data that includes the image's unique parameters (for example, TCP/IP parameters) needs to be copied for the new image.

ISPCompany, for example, physically shares most of the code for each of its standard images. Each standard image is used by tens of clients. To quickly create a new image for a client, ISPCompany uses a scripted procedure that makes the required entry in the z/VM directory, points to the shared file system, inserts the unique parameters on the image-specific disks, and starts the new image.
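The first step of such a procedure, generating the directory entry, might look like the following sketch. It is not ISPCompany's actual script. USER, LINK, and MDISK are z/VM directory statements, but the guest names, storage sizes, device addresses, and volume label used here are made-up placeholders, and a real entry normally needs further statements. Check the exact operands against the z/VM planning documentation before using anything like this.

```python
def build_directory_entry(guest, password, shared_owner, shared_disks, private_disk):
    """Return the lines of a z/VM directory entry for a new Linux guest.

    shared_disks: virtual device addresses of the read-only disks owned by
                  `shared_owner` (the guest that holds the standard image)
    private_disk: (vdev, start_cylinder, cylinder_count, volser) for the
                  small image-specific disk
    All values are illustrative assumptions.
    """
    vdev, start, count, volser = private_disk
    lines = [f"USER {guest} {password} 256M 1G G"]
    # Shared code is linked read-only (RR), so every clone uses the same disks.
    for addr in shared_disks:
        lines.append(f"LINK {shared_owner} {addr} {addr} RR")
    # The image-specific disk is the only storage allocated per clone.
    lines.append(f"MDISK {vdev} 3390 {start} {count} {volser} MR")
    return lines


entry = build_directory_entry(
    guest="LNXC042",
    password="XXXXXXXX",
    shared_owner="LNXGOLD",
    shared_disks=["0200", "0201"],
    private_disk=("0300", 1, 50, "VOL001"),
)
```

The remaining steps (writing the unique parameters to the image-specific disk and starting the guest) depend on the installation and are left out of the sketch.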

16.3.2 Handling a diversity of Linux images

Using standard images can reduce the number of diverse Linux images that system administrators have to manage. Even with z/VM, for large server farms, it can be a challenge to keep track of which golden image is running where, and there are always a number of images with special requirements. Does attending to these Linux images mean a proportional growth in the number of system administrators, each specializing in a number of variations of Linux images? The answer is no, but you will probably need tools to make your system administrator more productive in change management tasks.

Depending on your requirements, there might not be a single tool that covers your tasks sufficiently. We will use an example to show how a combination of two tools can allow a company to implement robust Linux change management that supports both central control and image owner control. Aduva and Linuxcare are two young companies that are pioneering Linux change management and already offer products. Each represents a different approach to software change management.

Linuxcare is based on a push model where the system administrator uses a central control point to apply common changes to groups of similar images. The push model puts the responsibility for changes into the hands of its most skilled staff.

Aduva uses a pull model where image owners can decide when to apply specific changes to their individual images. The pull model gives a degree of freedom and responsibility to image owners.

Aduva approach

Aduva assumes that a primary challenge of putting together a Linux image is handling the dependencies among the Linux components. Aduva provides a central OnStage server and an OnStage agent on each of the controlled Linux images. The OnStage server has a software component repository and an associated knowledge database (KnowledgeBase). The software component repository contains certified software modules, and the knowledge database describes the dependencies between these modules. Users interact with OnStage either from the OnStage agent or through an OnStage console. Environments where the OnStage server is remote from the Linux images have an OnStage local rules lab that has the local rules and interacts with the remote OnStage server (Figure 16-2).

Figure 16-2. OnStage structure

The OnStage agent is the component that adds, changes, or removes software components of a Linux image. It communicates with the OnStage server to draw on the information in the knowledge database and assesses what a modification means for the other components on the Linux image, that is, which other components must be upgraded, downgraded, installed, and so on, to support the proposed modification.
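To make the idea of dependency assessment concrete, here is a minimal sketch of the simplest case: working out what else must be installed, and in which order, before a requested component can be added. This is a generic depth-first traversal, not Aduva's algorithm. The function name, the dependency table, and the component names are hypothetical, and upgrades and downgrades are not modeled.

```python
def install_plan(requested, depends_on, installed):
    """Return the components to install, dependencies first.

    depends_on: {component: [components it requires]}, standing in for the
                knowledge database
    installed:  components already present on the image
    """
    plan = []
    seen = set(installed)

    def visit(component, path=()):
        if component in seen:
            return
        if component in path:
            chain = " -> ".join(path + (component,))
            raise ValueError("dependency cycle: " + chain)
        for dep in depends_on.get(component, ()):
            visit(dep, path + (component,))
        seen.add(component)
        plan.append(component)

    visit(requested)
    return plan


rules = {
    "webapp": ["httpd", "libssl"],
    "httpd": ["libssl", "libc"],
    "libssl": ["libc"],
}
plan = install_plan("webapp", rules, installed={"libc"})
```

Here `plan` lists `libssl` before `httpd` and `httpd` before `webapp`, and omits `libc` because the image already has it.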

Changes can be initiated from the OnStage console or an OnStage agent. From the console, the system administrator can enforce changes (for example, apply security fixes) for a set of Linux images. From an OnStage agent, the owner of a Linux image can request changes for that particular image. The content of the OnStage component repository gives the image owner a well-defined degree of freedom to make changes.

See http://www.aduva.com/ for more information on Aduva OnStage.

Linuxcare approach

Linuxcare Levanta uses a golden image model in which a customized Linux image is built and thoroughly tested before it is copied to the locations where it is needed. Fixes or additions are first applied to a copy of the golden image and tested. When the tests are successful, the change is propagated to all images that descend from the particular golden image.
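The test-before-propagate rule of the golden image model can be sketched in a few lines. This is an illustration of the general idea only, not Levanta's implementation: images are modeled as plain dictionaries of package versions, and the function name and the test callback are hypothetical.

```python
def push_change(golden, descendants, change, passes_tests):
    """Apply `change` to a copy of the golden image and propagate it only
    if the copy passes its tests.

    golden:       {package: version} for the golden image
    descendants:  list of {package: version} dicts derived from it
    passes_tests: callable that returns True if a candidate image is good
    Returns (current_golden_image, propagated_flag).
    """
    candidate = dict(golden)          # work on a copy, never on the original
    candidate.update(change)
    if not passes_tests(candidate):
        return golden, False          # nothing in production is touched
    for image in descendants:
        image.update(change)
    return candidate, True


golden = {"openssl": "1.0"}
clones = [dict(golden), dict(golden)]
not_marked_bad = lambda image: image["openssl"] != "bad"

golden_v2, ok = push_change(golden, clones, {"openssl": "1.1"}, not_marked_bad)
golden_v3, ok_bad = push_change(golden_v2, clones, {"openssl": "bad"}, not_marked_bad)
```

The second call shows the protective side of the model: the failing change is rejected, and both the golden image and its descendants stay at the last tested level.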

The Levanta components run on a z/VM that also contains the managed Linux images. The central component is the Linuxcare Configuration Manager (LCM). LCM communicates with an agent, the Linuxcare Instance Manager (LIM), that runs on each managed Linux image. A file server gives the LCM write access to each image's system disks. A CP agent plugs into z/VM, for example, to boot Linux instances, or to create or make changes to the directory entries of the Linux guests (Figure 16-3).

Figure 16-3. Levanta structure

In the Linuxcare model, all changes to the Linux images are made by the system administrator who ensures that each change is first tested. The administrator interacts with the LCM from a Linux, CMS, or Web-based user interface.

See http://www.linuxcare.com/products/index.epl for more information on Linuxcare Levanta.

A mixed approach

The two approaches can be complementary. The Linuxcare model focuses on scenarios where the risk of putting a malfunctioning Linux image into production must be minimized. The Aduva model caters to scenarios where a degree of control and responsibility for changes is delegated to image owners who are not highly experienced Linux administrators.

A company might want to manage both risk scenarios with tools. It is possible to combine both approaches and use Linuxcare Levanta to manage a set of well-tested standard images as a stable basis for further customization, which can be performed by the IT staff or the image owner using the OnStage agent and console. The IT staff can control the list of OnStage-managed components that is available to the image owners, and it is at the discretion of the IT division how much customization is permissible before renewed testing and management through Levanta are required.

Figure 16-4 illustrates how OnStage and Levanta can interleave.

Figure 16-4. Using both OnStage and Levanta

An initial standard image is created with the help of OnStage. OnStage could also be used to upgrade an existing standard image even to the extent of migrating to a new release of a distribution. The standard image is then tested and verified by the IT division. When testing is completed, Levanta is used to deploy and maintain multiple copies of the standard image. Individual Linux images that need to be adapted for special purposes can then be further customized with OnStage by either the IT division or the individual image owners.

ISPCompany change management practices

Within its outsourcing section, ISPCompany has different business segments with different needs. ISPCompany uses both OnStage and Levanta.

The majority of ISPCompany's outsourcing clients use the standard images under a standard contract. ISPCompany uses Levanta to build, distribute, and maintain clones of these standard images. Clients with special needs can either have full control of and responsibility for their images, or they can have ISPCompany experts build the images according to their specifications and have them maintained under a special SLA.

ISPCompany uses the Aduva model to cater to clients who want some control of their Linux images without losing ISPCompany maintenance support. The OnStage software component repository defines a set of program packages that clients can use to modify their own images. Changes made through OnStage are covered within the support contract.

StoreCompany change management practices

StoreCompany's IT division provides a number of Linux images to various departments. It uses OnStage and Levanta in combination to optimize its change management.

Because the requirements of the various departments differ significantly, the standard image approach is not applicable. Instead of creating a standard image, the IT division uses Levanta to build a Linux image that constitutes the common denominator of all requirements and provides copies of this image to the departments. The departments can then build on the tested foundation of the Levanta-built image and make their alterations from an OnStage software component repository.

The expertise of the IT division is used for building the basic system and for defining the possible alterations through the OnStage software component repository. The IT division delegates making small, relatively safe alterations to the image owners. This saves time for the image owners as they can apply the changes without first consulting the IT division. It also saves time for the system administrators because they do not have to attend to every change in individual images.

StoreCompany uses OnStage and Levanta as tools that provide the means to safely delegate workload from the system administrator to others with less systems management skill.

16.3.3 Accounting

Depending on your accounting objectives, Linux on the mainframe offers different accounting points and opportunities to accomplish the task. There are two basic reasons why you might want to do accounting: cost recovery or behavior modification. There are tools for Linux on the mainframe to accomplish either.

For cost recovery, you typically provide an infrastructure that can cope with a workload with the terms laid down in an SLA. Billing is then based on the costs associated with the provided resources: hardware cost, floor space, power, license fees, and so on. In a distributed server farm, billing is usually based on the number and types of the machines provided and the software that runs on them. In a Linux-on-the-mainframe environment, the equivalent of machines would be the number of images and the resources (for example, processing power) that have been allocated to them.

In cost-recovery-driven accounting, billing is based on the resources provided, regardless of how these resources are being used. Accounting done to encourage a change in resource-consumption patterns is usually motivated by a resource shortage and an attempt to discourage usage of this critical resource. For example, if a company is running short of CPU power and there is no budget to acquire more CPU power, an IT division can charge more for CPU usage and, thus, encourage economic use of CPU resources.
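The difference between the two billing styles shows up clearly in a small worked example. The figures, group names, and function names below are invented for illustration. The first function splits a fixed infrastructure cost by allocated share, regardless of use. The second charges for measured consumption at a rate the IT division can raise to discourage use.

```python
def cost_recovery_bill(total_cost, allocated_shares):
    """Split a fixed cost in proportion to the resources allocated."""
    total_shares = sum(allocated_shares.values())
    return {group: total_cost * share / total_shares
            for group, share in allocated_shares.items()}


def usage_bill(cpu_hours_used, rate_per_cpu_hour):
    """Charge each group for the CPU hours it actually consumed."""
    return {group: hours * rate_per_cpu_hour
            for group, hours in cpu_hours_used.items()}


# Group A was allocated three shares and group B one share of the machine.
fixed = cost_recovery_bill(12000, {"A": 3, "B": 1})

# Measured use tells a different story: B consumed five times as much as A.
metered = usage_bill({"A": 10, "B": 50}, rate_per_cpu_hour=2.5)
```

Under cost recovery, A pays 9000 and B pays 3000 no matter how the images are used. Under usage-based billing, B's heavier consumption makes it the larger payer, which is the lever behavior modification relies on.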

If the goal of accounting is behavior modification, you must gather data on the critical resource and on the accounting group (for example, user, user group, or application) that consumes the resource. The accounting point has to be chosen so that the resource consumption can be measured at the granularity of the consuming group. Accounting as an agent for behavior modification is controversial because users tend to find workarounds that lead to other resource shortages. If achieving behavior modification through accounting makes sense in your environment, Linux on the mainframe allows you to do it.

The resource consumption of a Linux application can conveniently be measured from z/VM if the rule of having a single application per Linux image is adhered to and the image is used by only one accounting group.

z/VM cannot measure resource consumption at the level of individual Linux users. For accounting at the user level there are tools you can install on Linux (see 25.5, "System administrator tools").

Accounting can be difficult where an application acts as a server that consumes resources on behalf of different accounting groups. Measurements would then show that the application is using resources but not indicate on behalf of which accounting group. Linux on the mainframe offers a unique solution for this accounting problem. On the mainframe, it is possible to set up a separate Linux image with an instance of the application for each accounting group without having to buy additional hardware. The resources consumed on a zSeries machine by the sum of these instances are not significantly more than the resources used by a single Linux image that does all the work.
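Once each image belongs to exactly one accounting group, attributing consumption becomes a simple aggregation of per-image measurements. The sketch below assumes the measurements have already been extracted into (image, CPU seconds) pairs. That record layout, the image names, and the group names are hypothetical and do not reflect the format of z/VM accounting records.

```python
from collections import defaultdict


def cpu_by_group(records, image_to_group):
    """Sum per-image CPU seconds into totals per accounting group.

    records:        iterable of (image_name, cpu_seconds) pairs
    image_to_group: {image_name: accounting_group}, one group per image
    """
    totals = defaultdict(float)
    for image, cpu_seconds in records:
        totals[image_to_group[image]] += cpu_seconds
    return dict(totals)


owners = {"DB-SALES": "sales", "DB-HR": "hr", "WEB-SALES": "sales"}
samples = [("DB-SALES", 120.0), ("DB-HR", 30.0),
           ("WEB-SALES", 45.5), ("DB-SALES", 10.0)]
totals = cpu_by_group(samples, owners)
```

With a single shared database image, the two `DB-` rows would collapse into one and the split between sales and HR would be lost. Running one instance per group is what keeps the rows separate.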

16.3.4 Debugging

For its Linux guests, z/VM offers an answer to a well-known dilemma in debugging.

On the one hand, debugging uncovers the underlying reasons for system failures and thus provides the basis for corrective action to avoid similar failures. On the other hand, the demands on availability in many production environments simply do not allow for real-time debugging or even for taking a dump. Immediate reIPLs are often the rule, and this means loss of the data for debugging.

z/VM offers the opportunity for hot standbys as backup images that consume almost no resources. On failure, a failover to the backup image occurs. The failed image remains in place and can now be analyzed without adding to the downtime that users experience.

Part of the system administrator's work is calling maintenance personnel (vendor support) with problems. Techniques for finding the root of problems are described in Chapter 22, "Debugging and Dump Analysis."