Clustering Windows Server 2003


Before you can install SQL Server 2005 clustering, you must first install Windows Server 2003 clustering services. Once it is successfully installed and tested, then you can install SQL Server 2005 clustering. In this section, we take a step-by-step approach to installing and configuring Windows 2003 clustering.

Before Installing Windows 2003 Clustering

Before you install Windows 2003 clustering, you need to perform a series of important steps. This is especially important if you didn't build the cluster nodes, as you want to ensure everything is working correctly before you begin the actual cluster installation. Once they are complete, then you can install Windows 2003 clustering. Here are the steps you must take:

  1. Double check to ensure that all the nodes are working properly and are configured identically (hardware, software, and drivers).

  2. Check to see that each node can see the data and Quorum drives on the shared array or SAN. Remember, only one node can be on at a time until Windows 2003 clustering is installed.

  3. Verify that none of the nodes have been configured as a Domain Controller.

  4. Check to verify that all drives are NTFS and are not compressed.

  5. Ensure that the public and private networks are properly installed and configured.

  6. Ping each node in the public and private networks to ensure that you have good network connections. Also, ping the Domain Controller and DNS server to verify that they are available.

  7. Verify that you have disabled NetBIOS for all private network cards.

  8. Verify that there are no network shares on any of the shared drives.

  9. If you intend to use SQL Server encryption, install the server certificate with the fully qualified DNS name of the virtual server on all nodes in the cluster.

  10. Check all of the error logs to ensure there are no nasty surprises. If there are, resolve them before proceeding with the cluster installation.

  11. Add the SQL Server and Clustering service accounts to the Local Administrators group of all the nodes in the cluster.

  12. Check to verify that no anti-virus software has been installed on the nodes. Anti-virus software can reduce the availability of clusters and must not be installed on them. If you want to check for possible viruses on a cluster, you can always install the software on a non-node and then run scans on the cluster nodes remotely.

  13. Check to verify that the Windows Cryptographic Service Provider is enabled on each of the nodes.

  14. Check to verify that the Windows Task Scheduler service is running on each of the nodes.

  15. If you intend to run SQL Server 2005 Reporting Services, you must then install IIS 6.0 and ASP.NET 2.0 on each node of the cluster.

These are a lot of things you must check, but each of these is important. If skipped, any one of these steps could prevent your cluster from installing or working properly.
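As a rough illustration, several of the checks above can be recorded as data and evaluated mechanically. The Python sketch below assumes you have already gathered each node's facts by hand or with your own inventory tooling. The NodeFacts fields, the preinstall_problems function, and the node name SQL2005B are hypothetical names chosen for this example; they are not part of Windows or SQL Server.

```python
from dataclasses import dataclass, field


@dataclass
class NodeFacts:
    """Facts collected manually for one node (all field names are hypothetical)."""
    name: str
    is_domain_controller: bool = False
    drive_filesystems: dict = field(default_factory=dict)   # e.g. {"Q:": "NTFS"}
    compressed_drives: list = field(default_factory=list)
    netbios_on_private_nic: bool = False
    shares_on_shared_drives: list = field(default_factory=list)
    antivirus_installed: bool = False
    task_scheduler_running: bool = True


def preinstall_problems(node):
    """Return human-readable problems; an empty list means none were found."""
    problems = []
    if node.is_domain_controller:
        problems.append(f"{node.name}: configured as a Domain Controller")
    for drive, fs in sorted(node.drive_filesystems.items()):
        if fs.upper() != "NTFS":
            problems.append(f"{node.name}: drive {drive} is {fs}, not NTFS")
    for drive in node.compressed_drives:
        problems.append(f"{node.name}: drive {drive} is compressed")
    if node.netbios_on_private_nic:
        problems.append(f"{node.name}: NetBIOS still enabled on a private network card")
    for share in node.shares_on_shared_drives:
        problems.append(f"{node.name}: network share {share} on a shared drive")
    if node.antivirus_installed:
        problems.append(f"{node.name}: anti-virus software is installed")
    if not node.task_scheduler_running:
        problems.append(f"{node.name}: Task Scheduler service is not running")
    return problems


if __name__ == "__main__":
    node = NodeFacts(name="SQL2005B", drive_filesystems={"F:": "FAT32"})
    for line in preinstall_problems(node):
        print(line)
```

The sketch covers only the checks that reduce to a yes/no fact; items such as reviewing the error logs still need a person.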

Installing Windows Server 2003 Clustering

Now that all of your nodes and your shared array or SAN are ready, you can install Windows 2003 clustering. In this section, we take a look at the process, from beginning to end.

To begin, you must start the Microsoft Windows 2003 Clustering Wizard from one of the nodes. While it doesn't make any difference to the software which physical node is used to begin the installation, we generally select one of the physical nodes to be the primary (active) node, and start working there. This way, you won't potentially get confused when installing the software.

If you are using a SCSI shared array (and this applies to many SAN shared arrays as well), be sure that the second physical node of your cluster is turned off when you install cluster services on the first physical node. This is because Windows 2003 doesn't know how to deal with a shared disk until cluster services is installed. Once you have installed cluster services on the first physical node, you can turn on the second physical node, boot it, and then proceed with installing cluster services on the second node.

Installing the First Cluster Node

To begin your installation of Windows Server 2003 clustering, open Cluster Administrator. If this is the first cluster, you will be presented with the dialog shown in Figure 20-1.

Figure 20-1

From the Action drop-down box, select Create New Cluster and click OK. This brings up the New Server Cluster Wizard.

Click Next to begin the Wizard.

The next steps seem easy because of the nature of the Wizard, but if you choose the wrong options, they can have negative consequences down the line. Because of this, it is important that you carefully think through each of your responses. Ideally, you already have made these choices during your planning stage.

The first choice you must make is the domain the cluster will be in, as shown in Figure 20-2. If you have a single domain, this is an easy choice. If you have more than one domain, select the domain that all of your cluster nodes reside in.

Figure 20-2

The second choice is the name you will assign the virtual cluster, as shown in Figure 20-3. This is the name of the virtual cluster, not the name of the virtual SQL Server. About the only time you will use this name is when you connect to the cluster with Cluster Administrator. SQL Server 2005 clients will not connect to the cluster using this virtual name. Once you enter the information, click Next to proceed.

Figure 20-3

Now you have to tell the Wizard the physical name of the node you want to install clustering on. Assuming that you are running the Cluster Wizard on the primary node of your cluster, the computer name you see in Figure 20-3 will be the name of the physical node you are installing on. If you are installing from one node but want to install clustering on a different node, you can, but it just gets confusing if you do. It is much easier to install on the same node.

Notice the Advanced button in Figure 20-3. If you click it, you see a window that looks like Figure 20-4.

Figure 20-4

The Advanced Configuration Option screen allows you to choose between a Typical and an Advanced Configuration. In almost all cases, the Typical configuration will work fine, and that is the option we use during this example. The Advanced configuration option is only needed for complex SAN configuration, and is beyond the scope of this book.

So click Cancel and the original screen returns, as shown in Figure 20-3. Enter the correct physical node, if need be, and click Next.

This next step is very important. The Cluster Wizard verifies that everything is in place before it begins the actual installation of the cluster service on the node. As you can see in Figure 20-5, the Wizard goes through many steps, and if you did all of your preparation correctly, when the testing is done, you will see a green bar under Tasks Completed, and you will be ready to proceed. But if you have not done all the preliminary steps properly, you may see yellow or red icons next to one or more of the many tested steps, and a green or red bar under Tasks Completed.

Figure 20-5

Ideally, you will want to see a screen like Figure 20-6, with a green bar and no yellow icons next to the test steps. In some cases, you may see yellow warning icons next to one or more of the test steps, but still see a green bar at the bottom. While the green bar indicates that you can proceed, it does not mean that the cluster will complete successfully or be configured like you want it to be. If you see any yellow warning icons, you can drill down into them and see exactly what the warning is. Read each warning very carefully. If the warning is something unimportant to you, it can be ignored. But in most cases, the yellow warnings need to be addressed. This may mean you have to abort the cluster service installation at this time, fix the problem, and install it after you correct the problem.

Figure 20-6

If you get any red warning icons next to any of the test steps, then you will also get a red bar at the bottom, which means that you have a major problem that needs to be corrected before you can proceed any further. Drill down to see the message and act accordingly. Most likely, you will have to abort the installation, fix the issue, and then try the installation again.
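The icon and bar rules described above can be summarized in a few lines of Python. This is only a sketch of the decision logic as the text states it (any red icon gives a red bar and blocks the install; yellow icons leave the bar green but still need review). The function name summarize_analysis and the step names in the usage example are hypothetical.

```python
def summarize_analysis(step_icons):
    """Map per-step icons to (bar_color, can_proceed, steps_to_review).

    step_icons: dict of step name -> "green", "yellow", or "red".
    """
    allowed = {"green", "yellow", "red"}
    unknown = set(step_icons.values()) - allowed
    if unknown:
        raise ValueError(f"unknown icon value(s): {sorted(unknown)}")

    reds = [step for step, icon in step_icons.items() if icon == "red"]
    yellows = [step for step, icon in step_icons.items() if icon == "yellow"]

    # Any red icon produces a red bar; otherwise the bar is green,
    # even when yellow warnings are present.
    bar_color = "red" if reds else "green"
    can_proceed = not reds

    # Red steps must be fixed; yellow steps should be read carefully.
    return bar_color, can_proceed, reds + yellows


if __name__ == "__main__":
    print(summarize_analysis({"check storage": "yellow", "check network": "green"}))
```

Note that can_proceed only mirrors whether the Wizard lets you click Next; as the text warns, a green bar with yellow warnings does not guarantee a correctly configured cluster.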

Assuming that the installation is green and you are ready to proceed, click Next.

The next step is to enter the IP address of our virtual cluster. This is the IP address for the cluster, not the virtual SQL Server. The IP address must be on the same subnet as all of the nodes in the cluster. Click Next.
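Before typing the address into the Wizard, you can sanity-check the same-subnet requirement with Python's standard ipaddress module. This is a minimal sketch; the function name same_subnet and the sample addresses are made up for illustration, and it assumes you know each node's public address and prefix length.

```python
import ipaddress


def same_subnet(cluster_ip, node_interfaces):
    """Return True if cluster_ip lies inside the network of every node interface.

    node_interfaces: strings in address/prefix form, e.g. "192.168.1.11/24".
    """
    if not node_interfaces:
        raise ValueError("at least one node interface is required")
    ip = ipaddress.ip_address(cluster_ip)
    return all(
        ip in ipaddress.ip_interface(iface).network
        for iface in node_interfaces
    )


if __name__ == "__main__":
    nodes = ["192.168.1.11/24", "192.168.1.12/24"]   # hypothetical node addresses
    print(same_subnet("192.168.1.20", nodes))
    print(same_subnet("192.168.2.20", nodes))
```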

In the Cluster Service Account screen, shown in Figure 20-6, you enter the name of the domain account you want to use as the cluster service account. You will also enter the account's password and the name of the domain where the account was created. This account should have already been created in your domain and added to all of the cluster nodes in the Local Administrators Group. Click Next.

The next Cluster Wizard screen is the Proposed Cluster Configuration, shown in Figure 20-7. But before you click Next, be sure to click on the Quorum button and check which drive the Cluster Wizard has selected for the Quorum. In this case, Drive Q: has been chosen, which is correct. Most of the time, the Cluster Wizard will select the correct drive for the Quorum, but not always. This is why it is important to check this screen to see if the correct drive was chosen. Because we named our Quorum drive Q:, it is very easy to determine that the correct drive was chosen by the Cluster Wizard. That is why we earlier suggested that you name the Quorum drive Q:.

Figure 20-7

Assuming everything is OK, click OK to accept the Quorum drive, and then click Next. At this time the Cluster Wizard will reanalyze the cluster, again looking for any potential problems. If none are found, click Next; then click Finish to complete the installation of Windows Server 2003 clustering on the first node.

Installing the Second Node of Your Cluster

Once you have installed the first node of your cluster, it is time to install the second node. Like the first node, the second node is installed from Cluster Administrator. Because the cluster already exists, you are just adding the second node to the currently existing cluster. You can install the second node from either the first node or the second node. We suggest you do it from the second node so that you don't get confused.

To install the second node, bring up Cluster Administrator. If you are doing this from the second node, you will get the same screen as you saw when you installed the first node (see Figure 20-1). From here, select Add Nodes to Cluster. This brings up the Add Nodes Wizard, which is very similar to the Create Cluster Wizard you just ran, except it has fewer options.

As the Wizard proceeds, you will enter the name of the physical node to add to the current cluster, after which a series of tests will be automatically run to verify that the node is ready to be clustered. As before, if you run into any problems - yellow or red warnings - you should correct them first before continuing. Once all problems have been corrected, you are then asked to enter the password for the cluster service account (to prove that you have permission to add a node to the cluster) and the node is added to the cluster.

Verifying the Nodes with Cluster Administrator

Once you have successfully installed the two nodes of your cluster, it is a good idea to view the nodes from Cluster Administrator. When you bring up Cluster Administrator for the first time after creating a cluster, you may have to tell Cluster Administrator to Open a Connection to Cluster, and type in the name of the virtual cluster you just created. Once you have done this, the next time you open Cluster Administrator, it will automatically open this cluster for you by default.

After opening up Cluster Administrator, you will see a screen very similar to Figure 20-8.

Figure 20-8

Notice that two resource groups have been created for you: Cluster Group and Group 0. The Cluster Group includes three cluster resources: the Cluster IP Address, the Cluster Name, and the Quorum drive. These were all automatically created for you by the Cluster Wizard. We will talk more about Group 0 a little later.

When you look next to each cluster resource, the State for each resource should be online. If not, your cluster may have a problem that needs fixing. As a quick troubleshooting technique, if any of the resources are not online, right-click the resource and choose Bring Online. In some cases, this will bring the resource online and you will not experience any more problems. But if this does not work, you need to begin troubleshooting your cluster.

Also, next to each resource is listed the Owner of the resource. All the resources in a resource group will always have the same owner. Essentially, the owner is the physical node where the cluster resources are currently running. In Figure 20-8, the physical node they are running on is SQL2005A, which is the first node in the two-node cluster. If a failover occurs, all of the resources in the resource group will then change to the other node in your cluster.
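The ownership rule just described (every resource in a group shares one owner, and a failover moves the whole group) can be modeled in a few lines. The Python sketch below is a toy model, not the cluster service's actual data structures. The ResourceGroup class and the second node name SQL2005B are assumptions made for illustration, and the resource names follow the description of the Cluster Group above.

```python
from dataclasses import dataclass, field


@dataclass
class ResourceGroup:
    """Toy model of a resource group with a single owning node."""
    name: str
    owner: str
    resources: list = field(default_factory=list)

    def resource_owners(self):
        # Every resource reports the group's owner; resources never
        # have owners of their own in this model.
        return {resource: self.owner for resource in self.resources}

    def fail_over(self, nodes):
        """Move the whole group to the first node that is not the current owner."""
        candidates = [node for node in nodes if node != self.owner]
        if not candidates:
            raise ValueError("no other node available to take ownership")
        self.owner = candidates[0]
        return self.owner


if __name__ == "__main__":
    group = ResourceGroup(
        "Cluster Group", "SQL2005A",
        ["Cluster IP Address", "Cluster Name", "Disk Q:"],
    )
    group.fail_over(["SQL2005A", "SQL2005B"])
    print(group.resource_owners())
```

Storing the owner on the group rather than on each resource is what makes it impossible for resources in one group to end up split across nodes.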

Configuring Windows Server 2003 for Clustering

Before you install SQL Server clustering, there is one small step you need to perform, and that is to prepare a resource group for the SQL Server resources that will be created when SQL Server is installed.

Most likely, when you created the cluster, as above, you will see a Resource Group named Group 0. This resource group was created when the cluster was created, and it most likely includes the shared resource for your SQL Server databases to use. See Figure 20-9.

Figure 20-9

In the example, Disk F:, the shared array for SQL Server, is in Group 0. If you like, you can leave the resource group by this name, but it is not very informative. We suggest that you rename Group 0 to SQL Server Group. You can do this by right-clicking on Group 0 and selecting Rename.

In some cases, the Cluster Wizard may put the SQL Server shared disk array in the Cluster Group resource group and not create a Group 0. If this is the case, you will need to create a new resource group and then move the SQL Server shared disk array from the Cluster Group to the newly created SQL Server Resource group.

Here's how you create a new resource group using Cluster Administrator:

  1. Start Cluster Administrator.

  2. Select File > New > Group. This starts the New Group Wizard.

  3. For the Name of the group, enter "SQL Server Group" (without the quotation marks). Optionally, you can also enter a description of this group. Click Next.

  4. Now, you must select which nodes of your cluster will be running SQL Server. This, of course, will be all of your nodes. The nodes are listed on the left side of the Wizard. Control-Click each of the nodes on the left and then select Add. This will move the selected nodes from the left side of the Wizard to the right side. Click Finish.

The new SQL Server Group resource group has now been created.

Now that the group has been created, it must be brought online. From Cluster Administrator, right-click the SQL Server resource group (it will have a red dot next to it) and select Bring Online. The red dot next to the resource group name goes away, and the SQL Server Group resource group is now online and ready for use.

Now, your next step is to move any disk resources from the Cluster Group (except the Quorum drive) to the SQL Server Group. This is a simple matter of dragging and dropping the disk resources from the Cluster Group to the SQL Server Group. Once you have done this, you are ready for the next step.
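The rule for this step (move every disk except the Quorum drive) is easy to get wrong by dragging one disk too many, so here it is as a small Python sketch. It works on a plain dictionary of group names to resource names and does not touch a real cluster. The function name move_data_disks is hypothetical, and the assumption that disk resources are named "Disk X:" is taken from the drive names used in this section.

```python
def move_data_disks(groups, source="Cluster Group",
                    target="SQL Server Group", quorum="Disk Q:"):
    """Move every disk resource except the quorum from source to target.

    groups: dict mapping group name -> list of resource names (modified in place).
    Returns the list of resource names that were moved.
    """
    if source not in groups or target not in groups:
        raise KeyError("both the source and target groups must already exist")

    # Assumption: disk resources are the ones whose names start with "Disk ".
    moved = [
        resource for resource in groups[source]
        if resource.startswith("Disk ") and resource != quorum
    ]
    groups[source] = [r for r in groups[source] if r not in moved]
    groups[target].extend(moved)
    return moved


if __name__ == "__main__":
    groups = {
        "Cluster Group": ["Cluster IP Address", "Cluster Name", "Disk Q:", "Disk F:"],
        "SQL Server Group": [],
    }
    print(move_data_disks(groups))
    print(groups)
```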

Test, Test, and Test Again

Once you have installed Windows 2003 clustering on your nodes, you need to thoroughly test the installation before beginning the SQL Server 2005 cluster install. If you don't, and problems arise later with Windows 2003 clustering, you may have to remove SQL Server 2005 clustering to fix it, so you might as well identify any potential problems and resolve them now.

The following is a series of tests you can perform to verify that your Windows 2003 cluster is working properly. After you perform each test, verify that you get the expected results (a successful failover), and also be sure you check the Windows event log files for any possible problems. If you find a problem during one test, resolve it before proceeding to the next test. Once you have performed all of these tests successfully, you are ready to continue with the cluster installation.

Preparing for the Tests

Before you begin, identify a workstation that has Cluster Administrator on it, and use this copy of Cluster Administrator for interacting with your cluster during testing. You will get a better test using a remote copy of Cluster Administrator than trying to use a copy running on one of the cluster nodes.

Move Groups Between Nodes

The easiest test to perform is to use Cluster Administrator to manually move the Cluster Group and SQL Server resource groups from the active node to a passive node, and then back again. To do this, right-click the Cluster Group and then select Move Group. Once the group has been successfully moved from the active node to a passive node, use the same procedure above to move the group back to the original node. The moves should be fairly quick and uneventful. Use Cluster Administrator to watch the failover and failback, and check the Event Logs for possible problems. After moving the groups, all of the resources in each group should be in the online state. If not, you have a problem that needs to be identified and corrected.

Manually Initiate a Failover in Cluster Administrator

This test is also performed from Cluster Administrator. Select any of the resources found in the Cluster Group resource group (not the cluster group itself), right-click on it, and select Initiate Failure. Because the cluster service always tries to recover up to three times from a failure, if it can, you will have to select this option four times before a test failover is initiated. Watch the failover from Cluster Administrator. After the failover, fail back using the same procedure, again watching the activity from Cluster Administrator. Check the Event Logs for possible problems. After this test, all of the resources in each group should be in the online state. If not, you have a problem that needs to be identified and corrected.
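To see why the fourth Initiate Failure is the one that triggers the failover, consider the following Python simulation. It models only the count described above (three recovery attempts, then a failover); it does not model any timing or other settings of the real cluster service. The class name SimulatedResource and the node name SQL2005B are hypothetical.

```python
class SimulatedResource:
    """Toy model: restart in place up to restart_limit times, then fail over."""

    def __init__(self, owner, other_node, restart_limit=3):
        self.owner = owner
        self.other_node = other_node
        self.restart_limit = restart_limit
        self.failures = 0

    def initiate_failure(self):
        """Simulate one Initiate Failure click; return what happened."""
        self.failures += 1
        if self.failures <= self.restart_limit:
            # The cluster service recovers the resource on the same node.
            return "restarted"
        # Recovery attempts are exhausted: move to the other node and reset.
        self.owner, self.other_node = self.other_node, self.owner
        self.failures = 0
        return "failed over"


if __name__ == "__main__":
    resource = SimulatedResource("SQL2005A", "SQL2005B")
    for click in range(1, 5):
        print(click, resource.initiate_failure(), resource.owner)
```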

Manually Failover Nodes by Turning Them Off

This time, you will only use Cluster Administrator to watch the failover activity, not to initiate it. First, power off the active node abruptly (a hard shutdown, not a normal Windows shutdown). Once this happens, watch the failover in Cluster Administrator. Once the failover occurs, turn the former active node on and wait until it fully boots. Then power off the now current active node in the same abrupt way. And again, watch the failover in Cluster Administrator. After the failover occurs, turn the powered-off node back on. Check the Event Logs for possible problems. After this test, all of the resources in each group should be in the online state. If not, you have a problem that needs to be identified and corrected.

Manually Failover Nodes by Breaking the Public Network Connections

In this test, you will see what happens if network connectivity fails. First, both nodes being tested should be on. Second, unplug the public network connection from the active node. This will cause a failover to a passive node, which you can watch in Cluster Administrator. Third, replug the public network connection back into the server. Fourth, unplug the public network connection from the now active node. This will cause a failover to the current passive node, which you can watch in Cluster Administrator. Once the testing is complete, replug the network connection into the server. Check the Event Logs for possible problems. After this test, all of the resources in each group should be in the online state. If not, you have a problem that needs to be identified and corrected.

Manually Failover Nodes by Breaking the Shared Array Connection

This test is always exciting, as it is the test that is most apt to identify potential problems. First, from the active node, remove the shared array connection. This will cause a failover, which you can watch in Cluster Administrator. Second, reconnect the broken connection. Third, from the now active node, remove the shared array connection. Watch the failover in Cluster Administrator. When done, reconnect the broken connection. Check the Event Logs for possible problems. After this test, all of the resources in each group should be in the online state. If not, you have a problem that needs to be identified and corrected.

If you identify any problems, check the troubleshooting section found later in this chapter. As we mentioned before, if any particular test produces unexpected problems, such as a failover that does not work or errors in the Event Logs, identify and resolve them now before proceeding with the next test. Once you have resolved any problems, be sure to repeat the test that originally indicated the problem in order to verify that it has been fixed.
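The discipline described above (run one test, check that every resource is online and the Event Logs are clean, and stop at the first problem) can be written down as a small driver. The Python sketch below assumes you supply your own callables for performing each test and for the two checks; nothing here talks to a real cluster, and run_test_plan is a hypothetical name.

```python
def run_test_plan(tests, all_online, events_clean):
    """Run tests in order and stop at the first one that leaves a problem.

    tests: list of (name, action) pairs; action is a zero-argument callable.
    all_online, events_clean: zero-argument callables returning bool.
    Returns (passed_names, failed_name), where failed_name is None on success.
    """
    passed = []
    for name, action in tests:
        action()
        # After every test, both checks must pass before moving on.
        if not (all_online() and events_clean()):
            return passed, name
        passed.append(name)
    return passed, None


if __name__ == "__main__":
    state = {"online": True}

    def uneventful():
        pass

    def leaves_resource_offline():
        state["online"] = False

    plan = [
        ("move groups", uneventful),
        ("initiate failure", leaves_resource_offline),
        ("power off node", uneventful),
    ]
    print(run_test_plan(plan, lambda: state["online"], lambda: True))
```

After fixing whatever the failed test exposed, rerun the plan starting from that same test so the fix is verified before the later tests run.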



Professional SQL Server 2005 Administration (Wrox Professional Guides)
ISBN: 0470055200
Year: 2004
Pages: 193