29.7. The Cluster Storage Engine | MySQL 5.0 Certification Study Guide

The NDBCluster storage engine originally appeared in MySQL 4.1. Using the cluster engine is complex, and for the purposes of MySQL 5 certification you are not expected to know the details of how to set up and use NDBCluster. You are, however, expected to know the general properties of the cluster engine as compared to other storage engines.

In the literature, you will see the two terms "NDB Cluster" (or just "NDB") and "MySQL Cluster." NDB Cluster refers to the cluster technology and is thus specific to the storage engine itself, whereas MySQL Cluster refers to a group of one or more MySQL servers that work as a "front end" to the NDB Cluster engine. That is, a MySQL Cluster consists of a group of one or more server hosts, each of which is usually running multiple processes that include MySQL servers, NDB management processes, and NDB database storage nodes. Cluster processes are also referred to as "cluster nodes" or just "nodes."

The cluster engine does not run internally in MySQL Server, but is, instead, one or more separate processes running outside MySQL Server (perhaps even on different server hosts). In effect, MySQL Server provides the SQL interface to the cluster processes. From the perspective of the server, however, NDBCluster is just another storage engine, like the MyISAM and the InnoDB engines.
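Because the server treats NDBCluster like any other storage engine, selecting it comes down to the ENGINE table option. The following is a minimal sketch, assuming a MySQL server built with cluster support and connected to a running cluster; the table name city_demo and its columns are hypothetical and used only for illustration.

```sql
-- List the storage engines this server knows about. The ndbcluster row
-- indicates whether cluster support is available and enabled.
SHOW ENGINES;

-- Hypothetical table: the only cluster-specific part is the ENGINE option.
CREATE TABLE city_demo (
    id   INT NOT NULL PRIMARY KEY,
    name VARCHAR(50) NOT NULL
) ENGINE = NDBCLUSTER;

-- Ordinary SQL statements work against the table; the MySQL server acts
-- as the SQL interface to the cluster processes that hold the data.
INSERT INTO city_demo (id, name) VALUES (1, 'Stockholm');
SELECT id, name FROM city_demo WHERE id = 1;

-- Confirm which engine the table was created with.
SHOW CREATE TABLE city_demo;
```

If the server has no cluster support, CREATE TABLE may fall back to the default engine with a warning, so checking SHOW CREATE TABLE afterward is a reasonable habit.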

NDB Cluster consists of several database processes (nodes) running on one or more physical server hosts. It manages one or more in-memory databases in a shared-nothing system. In-memory means that all the information in each database is kept in the RAM of the machines making up the cluster. (Updates are written to disk so that they are not lost if problems occur.) Shared-nothing means that the cluster is set up in such a way that no hardware components (such as disks) are shared between any two nodes.

The NDB cluster engine is a transactional storage engine, like the InnoDB storage engine.
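As a rough illustration of what "transactional" means in practice, the sketch below groups two updates into one transaction and then rolls them back. It assumes a server with cluster support connected to a running cluster; the table account_demo and its contents are hypothetical.

```sql
-- Hypothetical table stored in the cluster engine.
CREATE TABLE account_demo (
    id      INT NOT NULL PRIMARY KEY,
    balance INT NOT NULL
) ENGINE = NDBCLUSTER;

INSERT INTO account_demo (id, balance) VALUES (1, 500), (2, 500);

-- Group two related changes so they succeed or fail together.
START TRANSACTION;
UPDATE account_demo SET balance = balance - 100 WHERE id = 1;
UPDATE account_demo SET balance = balance + 100 WHERE id = 2;

-- Undo both changes. With a transactional engine the rows return to
-- their values from before START TRANSACTION.
ROLLBACK;

SELECT id, balance FROM account_demo ORDER BY id;
```

Running the same statements against a MyISAM table would leave both updates in place, because MyISAM cannot roll changes back. Replacing ROLLBACK with COMMIT makes the changes permanent.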

The following list describes the main reasons to consider using MySQL Cluster:

High availability: All records are available on several nodes. If one node fails (for example, because the server host stops working), the same data can be retrieved from another node. Spreading copies of the data across multiple nodes also makes it possible to have replicas of the data in two or more widely distributed locations.
Scalability: If the load becomes too high for the current set of nodes, extra nodes can be added and the system will reconfigure itself to make data available on more nodes, reducing the load on each individual node.
High performance: All records are stored in memory, making data retrieval extremely fast. This does not mean that information is lost if the cluster is shut down (as is the case for tables created with the MEMORY storage engine). All updates are written to disk, and are available when the cluster is restarted.
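The contrast with the MEMORY engine drawn in the last item can be sketched as follows. This assumes a server with cluster support and a running cluster; the table names session_mem and session_ndb are hypothetical.

```sql
-- Two hypothetical tables with the same structure but different engines.
CREATE TABLE session_mem (
    id   INT NOT NULL PRIMARY KEY,
    note VARCHAR(50)
) ENGINE = MEMORY;

CREATE TABLE session_ndb (
    id   INT NOT NULL PRIMARY KEY,
    note VARCHAR(50)
) ENGINE = NDBCLUSTER;

INSERT INTO session_mem (id, note) VALUES (1, 'kept in server RAM only');
INSERT INTO session_ndb (id, note) VALUES (1, 'kept in RAM, also written to disk');

-- Run these again after a restart. The MEMORY table keeps its definition
-- but loses its rows when the MySQL server restarts. The cluster table's
-- rows are expected to be available again after the cluster restarts,
-- because its updates were written to disk.
SELECT COUNT(*) FROM session_mem;
SELECT COUNT(*) FROM session_ndb;
```

Both tables serve reads from memory; the difference lies only in whether the contents survive a shutdown.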