MySQL Cluster is an inherently insecure system: It has no security whatsoever built into it. If you have followed along with the examples so far in this book, anyone with even the most basic knowledge could connect to your cluster and steal or sabotage your data.
The first important thing to do in implementing security with MySQL Cluster is to separate any traffic within your cluster from any traffic coming in from outside. You can do this by using firewalls, which control what packets can pass through them on the basis of some simple rules, or you can move MySQL Cluster traffic onto a network that is physically separate from your public network and therefore secure.
MySQL Cluster requires a significant number of ports to be open for communication between nodes, which is why we suggest that you use a totally separate network for cluster traffic. It is then easy to prevent the forwarding of traffic from one network to the other, and you have then achieved the first goal.
You should be aware that currently there is no checking of the source IP address for ndb_mgm clients, so anyone who has access to port 1186 on the management daemon can download a copy of ndb_mgm, connect to your management daemon, and do just about anything (for example, shut down the cluster). Clearly, this is not acceptable. Three different network designs eliminate this risk. The most preferable is also the most expensive; the least expensive relies purely on software protection; and a third option is a mix of the other two.
Figure 4.1 shows the best possible scenario for your cluster, with separate public and private networks.
Figure 4.1. An ideal network for both performance and security.
The network in Figure 4.1 has two completely separate networks: the private network (solid box) and the public network (dashed box). For performance reasons, we suggest that you use a gigabit switch for the private network because cluster nodes can transfer enormous amounts of data between each other during the normal course of operations. You can see that there is now no way for someone sitting on the public network to access the data that is traveling between cluster nodes on the private network without going through either of the application servers. In other words, internal cluster traffic (for example, from Storage Node 1 to Storage Node 2 or from App Server 1 to Storage Node 2) is completely segregated, which is excellent.
Notice in Figure 4.1 how the local network is using local addresses. We suggest that you use local addresses for your cluster nodes if they are on a separate network (we have done so in this book). This is primarily for clarity rather than security. You can choose from one of the following three sets of reserved IP addresses:
10.0.0.0 - 10.255.255.255
172.16.0.0 - 172.31.255.255
192.168.0.0 - 192.168.255.255
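As a quick sanity check when assigning addresses, the sketch below tests whether an IPv4 address falls inside one of the three reserved private ranges (10.0.0.0/8, 172.16.0.0/12, and 192.168.0.0/16). The function name is_private_ipv4 is hypothetical and chosen only for illustration; it is a minimal sketch, not part of MySQL Cluster or any standard tool.

```shell
# Succeeds (exit status 0) when the IPv4 address in $1 lies in one of the
# three reserved private ranges; fails (exit status 1) otherwise.
# is_private_ipv4 is a hypothetical helper name used for illustration.
is_private_ipv4() {
    _addr=$1
    _old_ifs=$IFS
    IFS=.
    set -f                  # no filename expansion while splitting
    set -- $_addr           # split the address into octets $1..$4
    set +f                  # note: this re-enables globbing unconditionally
    IFS=$_old_ifs

    [ "$#" -eq 4 ] || return 1
    for _octet in "$@"; do
        case $_octet in
            ''|*[!0-9]*) return 1 ;;    # empty or non-numeric octet
        esac
        [ "$_octet" -le 255 ] 2>/dev/null || return 1
    done

    [ "$1" -eq 10 ] && return 0                                         # 10.0.0.0/8
    [ "$1" -eq 172 ] && [ "$2" -ge 16 ] && [ "$2" -le 31 ] && return 0  # 172.16.0.0/12
    [ "$1" -eq 192 ] && [ "$2" -eq 168 ] && return 0                    # 192.168.0.0/16
    return 1
}
```

For example, `is_private_ipv4 10.0.0.5 && echo private` prints "private", whereas the same test on 172.32.0.1 prints nothing because that address is just outside the second range.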
If the application servers are set so that they do not allow any forwarding (which they will be by default), and if they are themselves not hacked, your cluster data is safe from sniffing and interference. Sniffing and interference are by no means the only threats to your cluster, but they are the major threats to which MySQL Cluster is far more vulnerable than a standard MySQL server, because of the volume of plain-text information moved between nodes and the complete lack of any form of authentication or security.
This means you still need to secure your application servers against hackers, as you normally would. (Such securing is beyond the scope of this book. However, we do cover setting up software firewalls, in the context of what you have to allow for the cluster to continue to work, in the section "Software Firewalls," later in this chapter.) You can be more restrictive with your firewall rules. For example, if your application servers are only running a web service, you can block all traffic coming in on eth0 except to 80 (and 443, if you are running SSL, and 22 if you need SSH).
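The web-server example above can be sketched directly in iptables terms. The helper below only prints the commands so that they can be reviewed before anyone runs them as root; print_public_rules is a hypothetical name, and the interface and port list are taken from the example in the text (eth0 with ports 80, 443, and 22). It is a minimal sketch that assumes the public interface carries nothing else you need.

```shell
# Print (do not run) iptables commands that drop all inbound traffic on a
# public interface except the listed TCP ports.
# $1 = interface name; remaining arguments = TCP ports to leave open.
# print_public_rules is a hypothetical helper name used for illustration.
print_public_rules() {
    _iface=$1
    shift
    # Keep replies to connections that are already established.
    echo "iptables -A INPUT -i $_iface -m state --state ESTABLISHED,RELATED -j ACCEPT"
    for _port in "$@"; do
        echo "iptables -A INPUT -i $_iface -p tcp --dport $_port -j ACCEPT"
    done
    # Everything else arriving on the public interface is dropped.
    echo "iptables -A INPUT -i $_iface -j DROP"
}

# Web server with HTTP, HTTPS, and SSH reachable from outside.
print_public_rules eth0 80 443 22
```

Printing the rules first is a deliberate choice: a mistake in a live ruleset can cut off your own SSH session, so it is safer to read the generated commands before applying them.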
If you are unable to set up two separate networks, you should make sure your network is partitioned as far as possible, ideally with all your cluster nodes plugged in to a switch, which is then plugged in to a router to keep all internal traffic in your network away from any other computers in the network. If you can get a hardware firewall, you can implement the second-best setup (in terms of both security and performance), as shown in Figure 4.2. (It is recommended even more strongly in this case that your switch be a gigabit switch.)
Figure 4.2. The second-best network setup in terms of security and performance.
The setup shown in Figure 4.2 requires the hardware firewall to follow a simple rule that says something like this: "Deny all traffic except to the two application servers on the ports that whatever application requires for clients to connect to it."
This may seem unclear, so let's consider an example: "Ports that whatever application requires" would be 80 for HTTP, 443 for SSL, 3306 for MySQL client access, and so on. In other words, you should allow only the ports that are actually required for your end users to connect to the servers, and this depends on what application your end users are connecting to.
Such a firewall should also filter egress traffic, again denying all traffic except traffic that is specifically required for the application to run.
If you do not have access to a hardware firewall, then security can become very difficult, and you will have to be very careful with your software firewalls. To start with, you should be very careful with your network partitioning to make sure that malicious elements can't either sniff data traveling between nodes in your cluster or, worse, modify it with spoofed packets. Many routers incorporate features that can block both of these forms of attack; it is certainly worth making sure that yours are configured to do so and that ideally your cluster is on a dedicated network segment.
In a typical cluster, you will be left with two "zones" of nodes, as shown in Figure 4.3.
Figure 4.3. Two "zones" of nodes that require protection with software firewalls.
Figure 4.3 shows the storage and management nodes in one zone (dotted box). These nodes need to be able to communicate with each other and communicate with the SQL nodes. The figure also shows a zone that contains the application servers (solid box). These nodes need to be able to communicate with the users of your application and communicate with the storage nodes.
It is essential that the users of the application are not able to communicate with the storage and management nodes directly. Otherwise, malicious traffic could bring down your cluster. You therefore need to be very careful with your software firewalls, and you need to block everything that is not necessary as discussed in the next section.
This section covers the most restrictive firewall ruleset possible for the sample cluster shown in Figure 4.3. It explains what rules you want, and then it shows the exact configuration required for an APF firewall.
For the application nodes, you want rules that operate like this:
It is possible to define exactly what ports each node will actually use, but there seems little point.
We assume at this stage that you are actually running your application on the application servers in Figure 4.3 (that is, that they are running SQL nodes and the application is connecting to localhost). This is the most standard way of doing it, but you can of course put your application on a separate set of servers and have them connect to a MySQL server on 3306 running on the application servers in Figure 4.3.
For the storage nodes, you want rules that work like this:
Linux is blessed with many excellent software firewalls. The ones most commonly used are based on a package called iptables. iptables is itself a development of an older and in its own time very popular system called ipchains (which itself was based on a package called ipfw for BSD).
However, iptables is rather complex to configure directly, and we suggest that you use a program called APF to configure your firewall, particularly if you are new to Linux software firewalls, because APF produces the correct iptables rules for you based on a simple-to-understand configuration file. APF also has several modules that make it an excellent choice as a firewall, including brute force detection (which is essential for any Internet-facing server with SSH or any other interface enabled) as well as advanced denial of service mitigation. APF is released under the General Public License and is completely free.
Installing and Configuring an IPTables-Based Firewall
IPTables is installed by default on almost all Linux distributions. It operates as a set of rules, and every time the Linux kernel receives or wants to send a packet, it runs through the rules until it finds one that matches the packet and defines what to do with it. If it does not find a rule that matches the packet, the fate of the packet is determined by the chain's default policy, which a restrictive firewall typically sets to DROP.
APF has a very simple configuration file that sets the global options, and then there are two extra files: one that specifies traffic that you certainly want to get rid of (drop) and one that specifies traffic that you certainly want to allow.
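The first-match behavior described above can be illustrated with a toy model. The rule format below ("PROTO PORT ACTION", with * matching any port) is invented for this sketch and is not iptables syntax; the names RULES, DEFAULT_POLICY, and decide are likewise hypothetical.

```shell
# Toy model of first-match rule evaluation (NOT real iptables syntax).
# Each rule is "PROTO PORT ACTION"; '*' as PORT matches any port.
RULES='tcp 22 ACCEPT
tcp 80 ACCEPT
tcp * DROP'
DEFAULT_POLICY=DROP

# Print the action for a packet with protocol $1 and destination port $2.
decide() {
    while read -r _proto _port _action; do
        if [ "$_proto" = "$1" ] && { [ "$_port" = "$2" ] || [ "$_port" = "*" ]; }; then
            echo "$_action"       # the first matching rule wins
            return 0
        fi
    done <<EOF
$RULES
EOF
    echo "$DEFAULT_POLICY"        # no rule matched: fall back to the policy
}
```

With these rules, `decide tcp 80` stops at the second rule, while `decide tcp 3306` falls through to the catch-all third rule. A UDP packet matches nothing and is handled by the default policy. Rule order matters: moving the catch-all rule to the top would hide the two ACCEPT rules behind it.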
The following sections show how to set up servers to disallow all traffic except SSH on all nodes from a certain IP address and web traffic on port 80 to the application nodes.
Downloading and Installing APF
APF is released as a tar package that you must untar and install. However, it does not require compiling, and the installation process is very simple, as shown here:
[user@host] su -
Enter Password:
[root@host] cd /usr/src/
[root@host] wget http://www.rfxnetworks.com/downloads/apf-current.tar.gz
--12:49:19-- http://www.rfxnetworks.com/downloads/apf-current.tar.gz
           => 'apf-current.tar.gz'
Resolving www.rfxnetworks.com... 188.8.131.52
Connecting to www.rfxnetworks.com[184.108.40.206]:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 82,187 [application/x-gzip]
100%[====================================>] 82,187 57.39K/s
12:49:22 (57.30 KB/s) - 'apf-current.tar.gz' saved [82,187/82,187]
[root@host] tar -xvzf apf-current.tar.gz
apf-0.9.5-1/
(output from tar command here)
[root@host] cd apf*
[root@host] ./install.sh
Installing APF 0.9.5-1: Completed.
Installation Details:
  Install path: /etc/apf/
  Config path: /etc/apf/conf.apf
  Executable path: /usr/local/sbin/apf
  AntiDos install path: /etc/apf/ad/
  AntiDos config path: /etc/apf/ad/conf.antidos
  DShield Client Parser: /etc/apf/extras/dshield/
Other Details:
  Listening TCP ports: 111,631,1024,1186,3306
  Listening UDP ports: 68,111,631,804,1024
  Note: These ports are not auto-configured; they are simply presented for information purposes. You must manually configure all port options.
[root@host]
Congratulations! You have installed APF. However, by default, APF will disable itself every 5 minutes (this is in case you deny access to SSH from your IP address by mistake and lock yourself out). You also need to configure it in a bit more detail, as described in the following section.
You now need to configure the main APF configuration file: /etc/apf/conf.apf. This file contains settings such as what ports are open to all source IP addresses and whether to filter egress (outgoing) traffic.
The following lines are the ones you are likely to be interested in:
Finally, you should change the line right at the top of the file that by default is DEVM="1" and set it to DEVM="0". Otherwise, your firewall will disable itself every 5 minutes.
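The DEVM change can be made in an editor, or scripted along the lines of the sketch below. The function name disable_apf_devel_mode is hypothetical, and the sketch assumes the line reads exactly DEVM="1" at the start of a line, as described above; if your copy of the file spells the setting differently, the substitution simply does nothing.

```shell
# Switch APF out of development mode in the config file given as $1
# (on the installation shown earlier this would be /etc/apf/conf.apf).
# disable_apf_devel_mode is a hypothetical helper name used for illustration.
disable_apf_devel_mode() {
    _conf=$1
    _tmp="$_conf.tmp.$$"
    # Rewrite only a line that starts with DEVM="1"; copy everything else as is.
    sed 's/^DEVM="1"/DEVM="0"/' "$_conf" > "$_tmp" || return 1
    # Write back into the original file so its owner and mode are preserved.
    cat "$_tmp" > "$_conf" && rm -f "$_tmp"
}
```

Run as root, `disable_apf_devel_mode /etc/apf/conf.apf` followed by a restart of APF should leave the firewall permanently enabled. Make sure you have confirmed SSH access through the new rules first, because the 5-minute safety net is then gone.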
You should now configure the allow_hosts.rules file that sets out exceptions to the global rules listed in the main file for some IP addresses.
You have two choices with this file:
This chapter covers how to do only the first option because it is perfectly secure and a lot easier and less likely to cause confusion. However, there is very good documentation for APF included with APF as well as excellent commenting of the allow_hosts.rules file, and if you want to lock everything down that little bit further, you can work out how to do it very quickly.
To allow all traffic from and to cluster nodes, in the file /etc/apf/allow_hosts.rules on all nodes, you simply put a list of the IP addresses of the nodes that make up your cluster, one per line. You then need to restart APF (by using service apf restart).
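Because the same list has to go onto every node, it can help to script the step. The sketch below appends addresses to a rules file, one per line, and skips any that are already present; add_cluster_hosts is a hypothetical helper name, and the 10.0.0.x addresses in the usage note are placeholders rather than values from your cluster.

```shell
# Append each address given after $1 to the rules file $1, one per line,
# skipping addresses that are already listed.
# add_cluster_hosts is a hypothetical helper name used for illustration.
add_cluster_hosts() {
    _rules=$1
    shift
    for _ip in "$@"; do
        # -x: match the whole line; -F: treat the address as a literal string.
        grep -qxF -- "$_ip" "$_rules" 2>/dev/null || echo "$_ip" >> "$_rules"
    done
}
```

On each node you might run, as root, `add_cluster_hosts /etc/apf/allow_hosts.rules 10.0.0.1 10.0.0.2 10.0.0.3 10.0.0.4` and then `service apf restart`. Running the helper twice is harmless because existing entries are not duplicated.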
Standard Security Procedures
On any clean installation of MySQL, you should always ensure that MySQL is running as the unique and clearly defined underprivileged user mysql. This is the default behavior in most cases (in fact, it is quite difficult to run MySQL as root, but sometimes it can run as other users). To check your installation, you run the following command when MySQL is running:
[root@host]# ps aux | grep mysql
root 3749 0.0 0.4 5280 1060 ? S 00:02 0:00 /bin/sh /usr/bin/mysqld_safe --pid-file=/var/run/mysqld/mysqld.pid
mysql 3778 0.0 1.1 124868 2820 ? Sl 00:02 0:00 /usr/libexec/mysqld --defaults-file=/etc/my.cnf --basedir=/usr --datadir=/var/lib/mysql --user=mysql --pid-file=/var/run/mysqld/mysqld.pid --skip-locking --socket=/var/lib/mysql/mysql.sock
Notice that the /usr/libexec/mysqld daemon is running as the mysql user. If it is not (that is, if it is running as root), you need to follow these steps to make it run as a nonprivileged user:
chown -R mysql:mysql /var/lib/mysql
chown -R mysql:mysql /var/lib/mysql-cluster
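The visual check of the ps listing can also be automated. The sketch below reads ps aux style output and prints the PID of any mysqld daemon owned by root. It assumes the usual ps aux column layout, with the user in column 1, the PID in column 2, and the command in column 11; root_owned_mysqld is a hypothetical name.

```shell
# Read `ps aux`-style lines on stdin and print the PID of every mysqld
# daemon owned by root. The mysqld_safe wrapper is ignored because, as in
# the listing above, it is started through /bin/sh and normally runs as root.
# root_owned_mysqld is a hypothetical helper name used for illustration.
root_owned_mysqld() {
    awk '$1 == "root" && $11 ~ /(^|\/)mysqld$/ { print $2 }'
}
```

Typical use is `ps aux | root_owned_mysqld`. Empty output means that no mysqld daemon is running as root; any PID printed identifies a process that should be reconfigured to run as the mysql user.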
The other thing you should always do is to set a root password. This is so simple, yet many people fail to do it. After the installation process completes, you have a root user with no password. To change it, you run these commands:
[user@host]# mysql -u root
mysql> UPDATE mysql.user SET Password=PASSWORD('NEWPASSWORD')
    -> WHERE User='root';
mysql> FLUSH PRIVILEGES;
If MySQL processes on your application servers are listening only on localhost, you can further increase security without using firewalls by adding skip-networking to the [mysqld] section of /etc/my.cnf to stop MySQL from responding to TCP/IP requests.
You must ensure that all local scripts connect using a UNIX socket and not via a TCP/IP connection to 127.0.0.1. This can sometimes cause problems, although most of the time, anything that tries to connect to localhost will do so via a socket if possible because that is much more efficient.
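To confirm that the option is set where MySQL will read it, a small check such as the following sketch can be run against the configuration file. It looks for a bare skip-networking line (or skip_networking, which MySQL treats as the same option name) inside the [mysqld] section only; has_skip_networking is a hypothetical name, and the sketch does not follow include directives or handle a value written after the option.

```shell
# Succeed if the my.cnf-style file $1 has skip-networking inside [mysqld].
# has_skip_networking is a hypothetical helper name used for illustration.
has_skip_networking() {
    awk '
        # Any section header switches the "inside [mysqld]" flag on or off.
        /^[ \t]*\[/ { in_mysqld = ($0 ~ /^[ \t]*\[mysqld\][ \t]*$/) }
        # Only count the option when it appears within [mysqld].
        in_mysqld && /^[ \t]*skip[-_]networking[ \t]*$/ { found = 1 }
        END { exit (found ? 0 : 1) }
    ' "$1"
}
```

For example, `has_skip_networking /etc/my.cnf && echo "TCP/IP disabled"` prints the message only when the option sits in the right section; the same line under [client] is ignored.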
An Important Note on Permissions Tables
All the tables in the mysql database (which contain all the information that MySQL requires, including the list of users and permissions) must not be converted to NDB format and must therefore not be clustered. MyISAM is an excellent format for these tables because they are likely to be read a lot and hardly ever written to. Converting them to any other storage engine is likely to result in instability, and converting them to NDB guarantees instability.
If you have a large cluster and need to simplify management of it, you can use replication to maintain the mysql database (although you should make sure you set up replication to replicate only the mysql database, not the clustered databases, or you will get a mess). There are plans to allow the system tables to be switched into the NDB type in a future release of MySQL Cluster.