Rebooting Nodes for Preventative Maintenance


In this case study, we'll examine one method of performing preventative maintenance on the cluster. The cluster was built as described in Part III of this book, using ldirectord in conjunction with the Heartbeat program to make the LVS Director a highly available server (a primary and a backup server with no single point of failure). When built this way, a cluster node can be removed for maintenance in any one of the following ways:

  • Modifying the contents of the health check file (htdocs/.healthcheck.html was used in the examples in Chapter 15) on the cluster node. ldirectord will then automatically remove the node from the cluster (a sketch of this method follows the list).

  • Killing the daemon responsible for the port that ldirectord is monitoring on the cluster node. For example, if ldirectord is monitoring port 80 (the http port), simply stopping the httpd daemon on the cluster node will cause ldirectord to remove it from the cluster.

  • Modifying the ldirectord configuration[7] file on the primary Director and removing the cluster node from it. If you have set the autoreload=yes option in this file, the change will be applied automatically by ldirectord (see the configuration excerpt following this list).

  • Modifying the IPVS table on the Director using ipvsadm commands. If this method is used, the ldirectord configuration file should also be modified to match the changes made with the ipvsadm command.
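
For example, the first method requires only one command on the cluster node. A minimal sketch, assuming a document root of /var/www/htdocs (the path on your system may differ): overwriting the file causes ldirectord's check to stop matching its expected receive string, so the node is removed.

 # Make the health check fail by changing the file's contents
 # (path is an assumption; adjust for your document root).
 echo "OFFLINE" > /var/www/htdocs/.healthcheck.html

For the third method, commenting out the node's real= line in the ldirectord configuration file on the primary Director has the same effect. The excerpt below is hypothetical; the VIP and real server addresses match the telnet cluster described later in this case study, and 10.11.11.12 is a second node shown only for illustration:

 autoreload = yes
 virtual=209.100.100.100:23
         #real=10.11.11.11:23 gate
         real=10.11.11.12:23 gate
         checktype=connect
         scheduler=wrr
         protocol=tcp

With autoreload=yes set, ldirectord notices the edit and removes the commented-out node (10.11.11.11) from the IPVS table without needing a restart.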

Note 

In this case study we will use the fourth method.

Using ipvsadm Commands to Remove a Cluster Node

There are two methods for removing a cluster node from the IPVS table using ipvsadm commands. The first is to set the weight of the cluster node to 0 (assuming you are using a weighted scheduling algorithm), and the second is to simply remove the server from the table.
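
The second method uses ipvsadm's delete option. A minimal sketch, using the VIP and cluster node addresses from this case study:

 # Remove the real server entry from the virtual service outright
 # ipvsadm -d -t 209.100.100.100:23 -r 10.11.11.11:23

Because this removes the real server entry outright rather than quiescing it, the weight-0 method described next is usually the gentler choice.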

Changing the Weight of a Cluster Node to 0

By setting the weight of a cluster node, or real server, to 0, you deny future clients access to the node while leaving all existing connections unaffected. The purpose of this method is to remove nodes for maintenance without affecting users. Unfortunately, it does not guarantee an unloaded node: if users reconnect to the cluster before their connection template entries expire (see Chapter 14), they will be sent back to the same node, especially if you are using a large TCP session timeout value.
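
You can display the timeout values currently in effect on the Director with the following command; the three values reported are for TCP, TCP FIN-wait, and UDP:

 # ipvsadm -L --timeout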

Disabling Telnet Access to One of the Cluster Nodes

In this case study, clients connect to the Linux Enterprise Cluster using only telnet. To remove a node from the cluster, the cluster administrator therefore needs only the following command to edit the IPVS table on the LVS Director:

 # ipvsadm -e -t 209.100.100.100:23 -r 10.11.11.11:23 -g -w 0

In this command, 209.100.100.100 is the VIP, 10.11.11.11 is the cluster node to remove, and :23 is the telnet port.
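
To confirm the change, list the IPVS table; the Weight column for the 10.11.11.11 entry should now read 0:

 # ipvsadm -L -n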

New users that connect to the cluster (the VIP) will not be able to access the 10.11.11.11 cluster node until its weight is set back to a number greater than 0. The hypothetical organization in this case study changes shifts at 5:00 p.m. each day. When all of the users log out at 5:00 p.m. and the new shift comes on duty, the system administrator can safely reboot the cluster node without affecting any of the users.
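
Before rebooting, the administrator can verify that no connections to the node remain by listing the IPVS connection table (the -n flag keeps the output numeric):

 # ipvsadm -L -c -n | grep 10.11.11.11

When this command returns nothing, no connection or template entries for the node remain, and it is safe to proceed.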

When ldirectord running on the Director sees the cluster node come back online after the reboot, it will automatically add it back into the IPVS table and make it available for the end-users.
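
If you need to return the node to service manually instead of waiting for ldirectord (for example, if ldirectord is not running), the weight can be restored by hand. A sketch, assuming the node's normal weight is 1:

 # ipvsadm -e -t 209.100.100.100:23 -r 10.11.11.11:23 -g -w 1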

[7]Usually located in the /etc/ha.d/conf directory.


