|< Day Day Up >|| |
A critical piece in the high-availability puzzle is a piece that is often taken for granted: the client configuration. With each release, Oracle has attempted to move more and more of the configuration on to the server side, lessening the need to configure the client. In the traditional world of Oracle and SQL*Net configurations, a client-side tnsnames.ora is configured, which tells the client which host to connect to, on what port, and what instance to look for once the connection arrives. In Oracle Database 10g, it is possible to make connections without the benefit of even using a tnsnames.ora on the client, using the Easy Connect Net Naming Method.
Nevertheless, the client machine is where it all begins, as it is at the client where the connection is first initiated. As such, how the client is configured to connect can still play a critical role in determining the availability, or perceived availability, of the application. We saw in Chapter 6 how services are used at the server side to segment workloads and assign nodes. We will now see how this configuration works together with the client and listener configurations.
Let's begin our discussion by defining some terms, before we dive into the configuration details. You will hear many terms bandied about regarding various failovers, and load-balancing configurations, so a basic understanding of these terms is necessary. What is Transparent Application Failover? What is listener load-balancing vs. client- side load-balancing? What is connect-time failover? What is the meaning of life? These are the questions that haunt us, and keep us awake at night, and more specifically, keep the HA DBA awake at night, so we will attempt to answer them here.
We begin with connect-time failover, which is defined as a failed initial connection attempt that must be retried against a different address. Consider the following entry in a client's tnsnames.ora:
GRID = (DESCRIPTION = (ADDRESS = (PROTOCOL = TCP)(HOST = rmscvip1.us.oracle.com)(PORT = 1521)) (ADDRESS = (PROTOCOL = TCP)(HOST = rmscvip2.us.oracle.com)(PORT = 1521)) (CONNECT_DATA = (SERVICE_NAME = grid) ) )
Given the above entry for the alias called GRID, when a client attempts to connect, the first address in the list will be the one initially tried-in this case, rmscvip1. Should the initial connection attempt to rmscvip1 be unsuccessful, the next address in the list will be tried. This is known as a connect-time failover. You can add as many entries to the list as you have available nodes to accept connections. Note also that the entries in the list do not have to be RAC nodes. The above may apply in a Streams or advanced replication environment, or one of the entries in the list may be a standby database-either physical or logical, as long as there is an available service defined and running on that node. The other criteria, of course, is that the addresses in the list will all allow the client to get to the data that is needed by the application. In the case of a RAC environment, we know that the data is always the same, as it is the same database. In the case of Streams, advanced replication, and physical or logical standby, whether or not the data is accessible depends on how these options are configured.
Transparent Application Failover, or TAF, refers to the failover of a connection that has already been made when a node or instance goes down. This is controlled by the parameter FAILOVER=ON, in the tnsnames.ora. By default, this parameter will be set to on, if not defined. Setting it to OFF will disable failover. When it is enabled, there are two types of failover, defined as part of the FAILOVER_MODE parameter: session (the default), or select. With TYPE=SESSION, should an instance fail after a session is connected, that session will be failed over to another available address in the list, without the user needing to reconnect. However, any SQL statements that were in progress will need to be reissued. With TYPE=SELECT, if the session fails over in the middle of a query, the query will be restarted after the failover. Consider the following modification to the CONNECT_DATA section of the tnsnames.ora file:
(CONNECT_DATA = (SERVICE_NAME = grid) (FAILOVER_MODE = (TYPE = SELECT) (METHOD = BASIC) ) )
In this case, should a failure occur, the failed over session will keep track of the number of rows retrieved so far, so that when the query is restarted, the session will discard those rows that have already been returned.
Note that the query still must start over from the beginning; so if we had returned 9,000 out of 10,000 total rows, when the query is reissued it will have to go through those 9,000 rows again. The user or application will start to see a response again once we hit row number 9,001-at this point, new rows will be returned to the application. However, it will not be necessary to reissue the query. With DML statements, such as insert, update, or delete statements, an error will be returned if the statement is in progress during failover, and uncommitted transactions will be rolled back. In this case, the session is still failed over, so that the statements can be reissued without the need to reconnect.
In our example entry in the tnsnames.ora, we specified METHOD=BASIC. What this means is that the initial connection is made to only one node. In such a case, should a failure occur, it will take some additional time to failover the connection as it must reconnect to the next node in the list. With METHOD=PRECONNECT, at the initial connection, a session will be opened against all addresses in the list, with only one actually being used, but the others already connected. The advantage of this is that if a failure occurs, the failover will be quicker, since a connection is already opened on the alternate node(s). The disadvantage is that the preconnected sessions on the alternate node(s) are still consuming memory resources on the other nodes, and in most cases this might be considered a wasted resource since you would generally expect these sessions will not be needed.
Client-side load balancing is determined by configuring the tnsnames.ora with multiple addresses in the description, with the parameter LOAD_BALANCE=YES, as in this example:
GRID = (DESCRIPTION = (ADDRESS = (PROTOCOL = TCP)(HOST = rmscvip1.us.oracle.com)(PORT = 1521)) (ADDRESS = (PROTOCOL = TCP)(HOST = rmscvip2.us.oracle.com)(PORT = 1521)) (LOAD_BALANCE = yes) (CONNECT_DATA = (SERVICE_NAME = grid) ) )
In this entry, the addresses will be tried at random by the client, because the LOAD_BALANCE entry is set. So the first entry in the list will not always be the first one tried. The client will randomly pick an address to try, without regard to the availability of the node or to how heavily loaded the node may be. The choice is done at random, with the expectation that if all clients are randomly picking an address to connect to, the resulting distribution of the load should be fairly even.
Server-side load balancing (also known as listener load balancing), on the other hand, is a bit more intelligent. It is controlled by the REMOTE_LISTENER parameter in the server parameter file. The REMOTE_LISTENER parameter in the spfile should be pointing to a TNSNAMES entry on each server node, which in turn will list the IPs and ports of all available listeners in the cluster. The following is an example:
LISTENERS_GRID = (ADDRESS_LIST = (ADDRESS = (PROTOCOL = TCP)(HOST = rmscvip1.us.oracle.com)(PORT = 1521)) (ADDRESS = (PROTOCOL = TCP)(HOST = rmscvip2.us.oracle.com)(PORT = 1521)) )
The REMOTE_LISTENER parameter is then set to LISTENERS_GRID in the spfile:
As such, any connections coming in to a node can be redirected to another node by the listener, without any special client-side configuration. Since the load balancing is done on the server, the listener can take into account the load on the individual nodes, and direct new connections to the node with the least load. Client-side load balancing will not override this in any way. If REMOTE_LISTENER is defined as above, and combined with client-side load balancing, the client will first randomly pick an address from the client tnsnames.ora to connect to. Once that choice is made, and communication is initiated with the listener at that address on the server, the listener will then check the load (based on CPU load, by default) and may decide to either accept the connection or redirect it to another node with less load.
It is easy enough to say that the listener directs connections to the node with the least load, but how is that determined? The truth of the matter is that the algorithms used are not published, and are fairly complex. However, at a high level, the CPU runqueue length is used by default to determine the load. The rate at which PMON updates the listeners with this information varies, depending on whether meaningful changes in the load or number of sessions have occurred, or whether the state of one of the instances has changed. Changes in the load are only considered meaningful if they are proportionately great, based on the overall capacity of the system. These 'meaningful' changes may trigger PMON to refresh the information, but if not, it may be as long as 10 minutes between updates.
This may or may not be the best way to distribute the load, however. Therefore, it is possible to change that using an undocumented parameter in the listener.ora. By setting the parameter PREFER_LEAST_LOADED_NODE_<listener_name>= OFF, we suggest that the listener base its connection redirects on session_count, rather than the runqueue length. An example of this parameter is as follows:
PREFER_LEAST_LOADED_NODE_LISTENER_RMSCLNXCLU1= OFF PREFER_LEAST_LOADED_NODE_LISTENER_RMSCLNXCLU2= OFF
This parameter may be helpful in situations where multiple nodes in the cluster have just been rebooted, and thus have an equal load. Combine this with a mid-tier application that initiates many connections in a short period of time, and you may have a situation where PMON does not have time to update the runqueue information before all of the connections have come in. In this case, all or most of the connections may be established to a single node, instead of being properly balanced out. This is avoided by setting the above parameter to OFF, which not only causes us to look at the number of sessions as the first piece of the puzzle, but also allows the listener to utilize its knowledge of established session information in between PMON updates, so that the information is not outdated. Thus, if a login storm occurs, with many sessions connecting in the span of a minute or less, the listener will track these connections and will be able to distribute the load accordingly, based on current and accurate session information.
JDBC Connections, using both the Thin and Thick drivers, can take advantage of their own specific failover methods using 'Fast Connection Failover.' Fast Connection Failover works in conjunction with the Oracle Notification Services (ONS) in a RAC environment to allow for application_level connection retries and also load-balancing support. Though this is generally outside the scope of responsibility of the HA DBA, we refer you to Chapter 8 of the Oracle® Database JDBC Developer's Guide and Reference for an introduction to Fast Connection Failover.
|< Day Day Up >|| |