CSM provides several components that give the administrator tools to manage the cluster. This section describes some of the most important CSM components. For the most part, these are the commands used to perform a full cluster installation and to begin managing the cluster.
The notion of groups is very important in a cluster environment. CSM has been designed to give the cluster administrator flexible functions for managing groups of nodes. Groups are used often by several components of CSM, such as the Configuration File Manager and the distributed shell.
The main node and group commands are the following:
The definenode command predefines a node; the node is inserted into the cluster database when its installation is completed. This command does not install the node.
The installnode command installs the nodes that have been predefined. At the end of a successful installation process, the node is inserted into the cluster database.
The lsnode command lists the managed node definitions in the IBM Cluster Systems Management database.
The chnode command changes a node definition in the cluster database.
The rmnode command removes a node definition from the cluster database; CSM no longer manages that node.
The nodegrp command is used to create or delete static or dynamic groups.
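A typical node lifecycle with these commands can be sketched as follows. The hostname is illustrative and most command options are omitted for brevity; consult the CSM man pages for the exact syntax of each command.

```shell
definenode node001    # predefine node001; it enters the database once installed
installnode node001   # install the predefined node
lsnode                # list the managed node definitions
chnode node001        # change an attribute of the node definition (options omitted)
rmnode node001        # remove the definition; CSM stops managing node001
nodegrp               # create, delete, or list node groups (options omitted)
```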
CSM uses specific facilities to control nodes. Hardware control is primarily based on the Advanced System Management (ASM) processor, located on the motherboard of Model 335 and Model 345 machines. CSM also offers console access via MRV In-Reach Terminal Server (MRV) or Equinox Serial Provider (ESP) hardware.
The ASM processor monitors the machine itself for conditions such as disk failures, CPU or motherboard overheating, and fan failures. It is also used to control the machine's power status and can be used to start, stop, or reboot a node.
The MRV or ESP hardware provides the cluster administrator with the capability to monitor all of the nodes as if each had its own individual console.
There are two commands available to control the nodes:
The rpower command gives the administrator the capability to power on, power off, or restart a node or a group of nodes. It uses the ASM processor to control the machine and can work with or without a running operating system on the node(s).
The rconsole command can be used with a node or a group. It opens an xterm window on the management server for each selected node, with the font size adapted to the number of consoles opened. There is also an option to view a single console in the current terminal session, for administrators working outside an X Window System environment.
This command uses the MRV In-Reach or Equinox Serial Provider hardware to connect to the node through its serial port.
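As a hedged sketch of these two commands (the node name is illustrative, and the -n flag is an assumption based on common CSM usage that should be checked against the rpower and rconsole man pages), an administrator might reboot a node and then watch it come back up on its console:

```shell
rpower -n node001 reboot   # power-cycle node001 through its ASM processor
rconsole -n node001        # open an xterm console for node001 via MRV/ESP
```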
To allow an administrator to manage a large number of nodes concurrently, CSM provides a distributed shell, which executes the same command on multiple machines from a single console. This could be accomplished in a variety of ways, but if the command generates output, it can be very difficult for the administrator to interpret all of the returned data and associate errors with individual nodes. CSM's distributed shell facility therefore helps to organize and display the command output in a way that makes the results easy to see.
The dsh command can be used with nodes or groups and can be run from any node, assuming that the appropriate security definitions have been implemented.
The dsh command output is not grouped node by node. Use the dshbak command to format dsh results into more readable output when multiple nodes are involved.
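For example (node names are illustrative; -a runs the command on all nodes, while the -n flag for naming specific nodes is an assumption to verify against the dsh man page):

```shell
dsh -a uptime                 # run uptime on every node in the cluster
dsh -n node001,node002 date   # run date on two specific nodes only
```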
The dshbak command is actually a filter rather than a stand-alone tool. This command works closely with the dsh command to format the results in a readable way. The syntax is:
dsh -a date | dshbak
Output from the dsh command is captured by dshbak and displayed on a per-node basis.
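The grouping that dshbak performs can be illustrated with a small self-contained sketch built from standard tools. This is not the real dshbak, only a model of its behavior on dsh-style "hostname: text" lines:

```shell
# Sample dsh-style output: each line is prefixed with the node it came from.
printf '%s\n' \
  'node1: Linux' \
  'node2: Linux' \
  'node1: 2.4.18 kernel' |
sort -t: -k1,1 -s |                      # bring each node's lines together
awk -F': ' '
  $1 != last { print "HOST:", $1; last = $1 }   # one header per node
  { sub(/^[^:]*: /, ""); print }                # strip the node prefix
'
```

The real dshbak additionally draws separators between hosts and can collapse identical output from multiple nodes; this sketch only shows the per-node grouping idea.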
CSM also installs the Distributed Command Execution Manager (DCEM) graphical user interface, which provides a variety of services for a network of distributed machines. DCEM allows you to construct command specifications for executing on multiple target machines, providing real-time status as commands are executed. You can enter the command definition, run-time options, and selected hosts and groups for a command specification, and you have the option of saving this command specification to use in the future. It allows the administrator to define dynamic groups so that the targets of a specific command can be programmatically determined when the command is executed. For a complete description of the DCEM functions, refer to the IBM Cluster Systems Management for Linux: Administration Guide, SA22-7873.
The Configuration File Manager is used by CSM to replicate common files across multiple nodes. For example, common files like the /etc/hosts or the /etc/passwd file often need to be identical (or close to identical) across all nodes in a cluster. When a change is made, these changes need to be duplicated. CFM makes this process much simpler.
Using CFM, each node can periodically ask the management server whether files have been updated and, if so, download them. The administrator can also force the nodes to perform a replication immediately.
CFM allows you to define multiple instances of the same file, which can be assigned to different parts of the cluster. For example, the compute nodes may need one version of the /etc/hosts file, while the storage nodes, which are visible from outside the cluster, may require a different version.
CFM also handles the case where a node is unavailable, ensuring that the files are synchronized once the node returns to active status.
The following commands are the most common for managing CFM:
The cfm command starts the CFM service. It should not be run from the command line; it is started automatically on the management server during system boot.
The cfmupdatenode command makes a node synchronize with the management server immediately, without waiting for the next scheduled update.
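As a sketch of the typical workflow (the /cfmroot path is the standard CFM repository on the management server; the -a flag on cfmupdatenode is an assumption to verify against the man page):

```shell
# Place the master copy under /cfmroot; CFM delivers it to the same
# relative path (/etc/hosts) on the managed nodes.
cp /etc/hosts /cfmroot/etc/hosts

# Push the change to all nodes now, instead of waiting for the next
# scheduled update.
cfmupdatenode -a
```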