Managing a MySQL Cluster is likely to seem strange to you if you are used to installing a database server, configuring my.cnf, and forgetting about it (as many database administrators are). With MySQL Cluster, you need to proactively monitor your cluster to make sure that it remains healthy and well configured.
The following sections cover some tricks for managing a successful cluster.
The Management Console and Scripting
appendixB, "Management Commands," provides a complete list of the commands you can use in the management console, and it is well worth experimenting with some of the less commonly used ones if you have the time.
To issue commands to the management console, you can either enter the console first and then issue the commands, as you have been doing so far, or you can issue them all on the command line with the -e flag, as in the following example:
[root@localhost mysql-cluster]# ndb_mgm -e SHOW
Connected to Management Server at: localhost:1186
Cluster Configuration
---------------------
[ndbd(NDB)]     2 node(s)
id=2    @192.168.254.21  (Version: 5.0.13, Nodegroup: 0, Master)
id=3    @192.168.254.22  (Version: 5.0.13, Nodegroup: 0)

[ndb_mgmd(MGM)] 1 node(s)
id=1    @192.168.254.20  (Version: 5.0.13)

[mysqld(API)]   2 node(s)
id=4    @192.168.254.21  (Version: 5.0.13)
id=5    @192.168.254.22  (Version: 5.0.13)
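The same trick works with any management console command. For example, a quick way to check the state of every storage node from a shell prompt is the ALL STATUS command; the second line below shows the same command run against a remote management node (the connect string is only an example, so substitute the address of your own management node):

ndb_mgm -e "ALL STATUS"
ndb_mgm -c 192.168.254.20:1186 -e "ALL STATUS"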
If you wanted to, you could put the following line in your crontab to email you the status of your cluster every day at 1 a.m.:
00 1 * * * /usr/bin/ndb_mgm -e show | mail -s "cluster status" youremail@yourdomain.tld
However, the trick of passing commands to ndb_mgm on the command line is more useful when it comes to running things regularly. For example, to run a backup every hour at half past the hour, you could enter the following line in your crontab:
30 * * * * /usr/bin/ndb_mgm -e "START BACKUP" | mail -s "backup results" youremail@yourdomain.tld
You could also write a simple script such as the following that emails you if any node disconnects for some reason:
Note
If you have spare SQL nodes (that is, nodes that are often not connected) for backup restorations and other uses, you will have to modify this script to exclude them.
#!/bin/bash
# check.sh - checks that all cluster nodes are connected
/usr/bin/ndb_mgm -e SHOW | grep -v grep | grep "not connected" > /dev/null
if [ $? = 0 ]; then
        /usr/bin/ndb_mgm -e SHOW | mail -s "CLUSTER ERROR: node down" youremail@yourdomain.tld
fi
You will then get an email containing the following output if a node fails:
From: root
To: youremail@yourdomain.tld
Subject: CLUSTER ERROR: node down

Connected to Management Server at: localhost:1186
Cluster Configuration
---------------------
[ndbd(NDB)]     2 node(s)
id=2 (not connected, accepting connect from 192.168.254.21)
id=3    @192.168.254.22  (Version: 5.0.13, Nodegroup: 0, Master)

[ndb_mgmd(MGM)] 1 node(s)
id=1    @192.168.254.20  (Version: 5.0.13)

[mysqld(API)]   2 node(s)
id=4    @192.168.254.21  (Version: 5.0.13)
id=5    @192.168.254.22  (Version: 5.0.13)
You can see that Node 2 has indeed died. You can run this script from cron every 5 minutes by putting the following line in crontab:
*/5 * * * * /root/scripts/check.sh >> /dev/null
As mentioned previously, however, this script will send false alerts if you have extra SQL nodes that are normally not connected. The way to exclude these is to note their node IDs and filter them out with grep -v. For example, you might want to do this if your cluster output looks like the following when everything is working:
Cluster Configuration
---------------------
[ndbd(NDB)]     2 node(s)
id=2    @192.168.254.21  (Version: 5.0.13, Nodegroup: 0, Master)
id=3    @192.168.254.22  (Version: 5.0.13, Nodegroup: 0)

[ndb_mgmd(MGM)] 1 node(s)
id=1    @192.168.254.20  (Version: 5.0.13)

[mysqld(API)]   3 node(s)
id=4    @192.168.254.21  (Version: 5.0.13)
id=5    @192.168.254.22  (Version: 5.0.13)
id=6 (not connected, accepting connect from any host)
In other words, you want to exclude SQL node ID 6 from the alert system if it happens to be down. You can do this by using the following command:
/usr/bin/ndb_mgm -e SHOW | grep -v grep | grep -v id=6 | grep "not connected" > /dev/null
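Putting this together, a modified check.sh that ignores the spare SQL node might look like the following. (This is only a sketch; the excluded node ID and the email address are examples that you would replace with your own.)

#!/bin/bash
# check.sh - alert if any node other than the spare SQL node (id=6) is down
EXCLUDE="id=6"
/usr/bin/ndb_mgm -e SHOW | grep -v grep | grep -v "$EXCLUDE" | grep "not connected" > /dev/null
if [ $? = 0 ]; then
        /usr/bin/ndb_mgm -e SHOW | mail -s "CLUSTER ERROR: node down" youremail@yourdomain.tld
fi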
Doing a Rolling Cluster Restart
You will often find it useful to complete a rolling cluster restart; for example, after you have changed DataMemory or after you have upgraded your cluster nodes. The manual method for doing restarts is covered in Chapter 2, "Configuration." You have probably noticed that this is a time-consuming process that requires you to log in to each node. (Imagine how long this would take for clusters with more than 10 servers!) This section shows how to create a script that does this for you. The script needs to run on the management node, and it uses SSH to reach the other nodes; we strongly suggest that you set up key-based authentication so that the management node can log in to the storage nodes as root without a password. (Otherwise, you will have to enter the root password many times during the execution of the script.) You can find a guide to doing this at this book's website, www.mysql-cluster.com.
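If you want a head start before consulting that guide, key-based authentication boils down to something like the following, run as root on the management node. (This is only a sketch; the IP addresses are the storage nodes used throughout this chapter, and if ssh-copy-id is not available on your system, you can instead append the contents of ~/.ssh/id_rsa.pub to /root/.ssh/authorized_keys on each storage node by hand.)

# Generate a key pair on the management node (accept the defaults)
ssh-keygen -t rsa

# Copy the public key to each storage node so root can log in without a password
ssh-copy-id root@192.168.254.21
ssh-copy-id root@192.168.254.22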
Rather than just giving you the script in one piece, we have split it into sections and explain what each one does so that you can adapt it to your needs. Even if you are new to bash scripting, you should be able to make the minor modifications needed to suit your particular setup.
Note
You can download a complete version of this script from www.mysql-cluster.com.
To start with, you define some variables for the whole script:
#!/bin/sh
#
# Restart all nodes with the --initial parameter
# Check that each node comes back up before restarting the next node
# Restart the management daemon first.
#
# **NOTE**
# This script will only work if all storage nodes are
# currently "up" and working.

# Define variables
export CLUSTER_PATH="/var/lib/mysql-cluster"
# DataDir on the storage nodes; the script also starts ndb_mgmd from this
# directory on the management node
export NDB_MGMD="/usr/sbin/ndb_mgmd"
# Path to the management daemon on the management node (the node you run the script from)
export NDB_MGM="/usr/bin/ndb_mgm"
# Path to the management client on the management node
export NDBD="/usr/sbin/ndbd"
# Path to the storage daemon on the storage nodes
export RESTART_MYSQL="/etc/rc.d/init.d/mysql restart"
# Path to the mysql init script on the SQL nodes; on some distros this is
# different, e.g. /etc/init.d/mysqld restart
export STORAGE_NODES="192.168.254.21 192.168.254.22"
# List of storage nodes (IP addresses separated by spaces)
export SQL_NODES="192.168.254.21 192.168.254.22"
# List of SQL nodes, as per the list of storage nodes
Now that you have defined your variables, the script must restart the management daemon. First, it checks whether the daemon is running and, if it is, kills it. Then it starts the daemon, waits a few seconds, and checks that it is still alive:
#
# PART 1: Restart the management daemon
#

# Stop ndb_mgmd
ps -ef | grep ndb_mgmd | grep -v grep > /dev/null
if [ $? -eq 0 ]; then
        echo "stopping ndb_mgmd on management node"
        pkill -9 ndb_mgmd
        echo "ndb_mgmd stopped"
        sleep 2;
fi;

# Start ndb_mgmd
echo "starting ndb_mgmd from directory $CLUSTER_PATH";
cd $CLUSTER_PATH
$NDB_MGMD
echo -e "\nndb_mgmd started; checking that it is still alive in 2 seconds";
sleep 2;

# Check that ndb_mgmd is running
ps -ef | grep -v grep | grep ndb_mgmd > /dev/null
if [ $? = 1 ]; then
        # Means that ndb_mgmd was not found in the process list
        echo "ndb_mgmd is not running; aborting restart.";
        exit;
fi;
echo "ndb_mgmd is still running; stage 1 success!";
Now the script needs to restart the storage nodes. It is critical that only one storage node is killed at any one time, so the script waits until each node reaches the "started" state before moving on to the next node in the list. You must make sure that all storage nodes are started before you run this script; otherwise, you could potentially cause a cluster crash. Here is the next part of the script:
#
# PART 2: Restart each storage node
#
for each in $STORAGE_NODES; do
        # Check whether ndbd is running; stop it nicely if it is,
        # and if it fails to stop, kill it
        echo -e "\nChecking if ndbd is running on storage node $each"
        ssh -t $each 'ps -ef | grep -v grep | grep ndbd' > /dev/null
        if [ $? -eq 0 ]; then
                echo "ndbd is already running on host $each; stopping process nicely"
                # Extract the node ID for this host from the ndb_mgm SHOW output
                export NUM=`$NDB_MGM -e "show" | grep -m 1 $each | awk '/id=/ { print $1 }' | awk 'BEGIN { FS = "=" } ; { print $2 }'`
                echo "stopping node ID $NUM"
                $NDB_MGM -e "$NUM STOP"
                sleep 10;
                echo "node given 10 seconds to die nicely, now killing process just in case it is still alive"
                ssh -t $each pkill -9 ndbd
                echo "ndbd stopped on host $each"
                sleep 1;
        fi;
        echo "Now starting ndbd on host $each"
        ssh -t $each $NDBD --initial
        echo "ndbd started on host $each"
        # Now, check that the node comes back up; otherwise, we risk a cluster
        # crash when we repeat this loop and kill another node
        echo -e "\nchecking that ndbd on host $each has completely started"
        echo "waiting for node on host $each to completely start"
        while true; do
                $NDB_MGM -e show | grep "@$each (Version: [[:digit:]][.][[:digit:]][.][[:digit:]][[:digit:]], Nodegroup:" > /dev/null
                if [ $? = 1 ]; then
                        echo "Waiting....";
                        sleep 3;
                else
                        echo "Node started OK";
                        sleep 5;
                        break;
                fi
        done
done
As a final action, the script logs in to each SQL node to restart it. This is optional but often a good idea after a complete cluster restart:
#
# PART 3: Restart each SQL node
#
for each in $SQL_NODES; do
        echo -e "\nRestarting SQL node $each"
        ssh -t $each $RESTART_MYSQL
done
To complete the process, and for information only, the script prints the new status of the cluster:
echo -e "\nCluster status post-restart:"
$NDB_MGM -e show
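Assuming that you have assembled the preceding parts into a single file on the management node, say /root/scripts/rolling_restart.sh (the path is only an example), you make the script executable and run it as root:

chmod +x /root/scripts/rolling_restart.sh
/root/scripts/rolling_restart.sh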
Issuing a SQL Command to Each SQL Node
You will find that you often need to issue a SQL command to all your SQL nodes; for example, when adding a user (assuming that you have not set up replication for the mysql database) or when creating a database (which must be done on each SQL node before that node will "see" the database).
The following script will help you here. It uses SSH to log in to each SQL node and runs the query locally on that node, rather than connecting to MySQL over the network, which means it also works on SQL nodes that allow SQL access only from localhost. The only requirement is that the MySQL root password must be the same on each SQL node:
#!/bin/sh
#
# Issue a SQL command to each SQL node in the cluster
#

# Define variables
export ROOT_PASSWORD="mypass"
export DATABASE="mydb"
export SQL_QUERY="SELECT 1"
export SQL_NODES="192.168.254.21 192.168.254.22"

for each in $SQL_NODES; do
        echo -e "\nIssuing command to SQL node $each"
        ssh -t $each "echo '$SQL_QUERY' | /usr/bin/mysql -u root -p$ROOT_PASSWORD $DATABASE"
done
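For example, to create a database on every SQL node, as mentioned earlier, you could change the variables at the top of the script along these lines. (The database name world is only an example; the mysql database is used here simply as an existing database to connect to while the CREATE DATABASE statement runs.)

export ROOT_PASSWORD="mypass"
export DATABASE="mysql"
export SQL_QUERY="CREATE DATABASE world"
export SQL_NODES="192.168.254.21 192.168.254.22"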