Managing WebLogic Server Applications


We are finally ready to talk about the toughest part of a WebLogic Server administrator's job: how to manage WebLogic Server-based applications. While this coverage is by no means comprehensive, we hope to cover the most common problems encountered while managing WebLogic Server applications. We start off this section by discussing how to manage applications by touching on such topics as application troubleshooting and versioning, and we finish with a discussion of handling failure conditions.

Troubleshooting Application Issues

Application troubleshooting can take many forms. Sometimes, you need to figure out why your application is not performing as fast or scaling as well as someone thinks it should. At other times, the application is not functioning properly and you need to determine the root cause of the problem. In a distributed system, this means that you must consider the entire application environment, from the client application and hardware, the network and network devices, the server hardware and operating system, the Web and application servers, the JVMs, and the database and other back-end server hardware and software, all the way to the application itself. This can be a daunting task, and the possibilities are endless. While we cannot expect to cover all the possible problems or diagnostic approaches, we do hope to describe the use of some of the tools that you have at your disposal to make it easier to narrow down the possible causes of the problem.

When problems arise with a distributed system, people naturally suspect the component(s) of the system for which they have the least knowledge or trust. In many cases, this means that WebLogic Server gets the blame, and it is your job as WebLogic Server administrator to prove the problem lies elsewhere (if, in fact, it does). When you encounter a problem, it is important to get as much information about the symptoms of the problem as possible while trying to recognize that people's biases for what they believe to be the problem may cause them to lead you in the wrong direction. Although it is important to listen to all the evidence, it is also important not to jump to conclusions that are not backed up by the facts.

In almost every situation where you suspect a problem might be related to WebLogic Server, you should use the WebLogic Console to determine the health of the server. The last section discussed many of the WebLogic Console's most important monitoring capabilities. Before doing anything, you should look at the relevant WebLogic Server log files to see if they contain any errors that might indicate the cause of the problem. If the problem at hand is performance-related, looking at the relevant execute queue lengths and throughputs as well as the JVM's heap usage profile should be one of the first pieces of evidence to examine. If the execute queue is empty or no longer than normal (and garbage collection does not appear to be unusually frequent) even though the clients are experiencing a significant degradation in response time, you need to determine whether the problem is with the server or the components in front of or behind the server.

To narrow down the possible causes of a performance problem, it is useful to be able to run your client application and the weblogic.Admin tool's PING command from various points in your application environment. For example, let's say that your Internet users are complaining of very slow response time. By running a browser on one of the Web server machines, you can determine whether the cause of the slowdown lies between the users and the Web server or somewhere from the Web server back into your application and database environment. By then moving the browser to an application server machine, you can isolate or eliminate the Web server and the intervening network as potential causes. If the application is not available, the admin tool's PING command can serve a similar purpose.
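When you cannot run the application itself from a given machine, the PING command can be invoked directly via weblogic.Admin. The following sketch assumes weblogic.jar is on the CLASSPATH; the URL, user name, and password variable are illustrative. PING's two optional arguments are the number of round trips and the message size in bytes:

```shell
# Hypothetical helper for pinging a WebLogic Server instance from any
# machine that has weblogic.jar available. The credentials and the
# WLS_PASSWORD environment variable are illustrative assumptions.
wls_ping() {
  url=$1
  # 10 round trips of 64 bytes each
  java weblogic.Admin -url "$url" -username system \
    -password "$WLS_PASSWORD" PING 10 64
}

# Example (run from the Web server machine, then the app server machine,
# and compare the reported round-trip times):
#   wls_ping t3://appserver1:7001
```

Running the same ping from successively closer machines lets you attribute latency to specific network segments rather than to the server itself.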

If you determine through testing that the problem appears to be that WebLogic Server is taking too long to process the requests (even though it is processing PING requests very quickly), the next step is to try to determine what is causing the application request processing to be so slow. Create a series of thread dumps over the span of a minute or so and look at the call stacks for the threads over time. This information will help you understand what the execute threads are doing and may tell you where they are spending most of their time. The optimum frequency and duration of the series of thread dumps depends on how long it takes to process an application request. For example, if a request is taking 15 seconds to process once it is picked up by an execute thread, taking thread dumps 60 seconds apart probably won't help you as much as taking them 5 seconds apart so that you can see if the same thread is in the same place in consecutive thread dumps.
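On Unix platforms, a series of thread dumps can be scripted by sending SIGQUIT to the server JVM, which writes each dump to the server's stdout log. A sketch, where the pid, count, and interval values are illustrative:

```shell
# Sketch: request a series of thread dumps from a running JVM (Unix only;
# on Windows, use Ctrl-Break in the server's console window instead).
# kill -QUIT asks the JVM to print a thread dump to its stdout log.
take_thread_dumps() {
  pid=$1; count=$2; interval=$3
  i=1
  while [ "$i" -le "$count" ]; do
    kill -QUIT "$pid" && echo "thread dump $i requested"
    i=$((i + 1))
    if [ "$i" -le "$count" ]; then sleep "$interval"; fi
  done
}

# Example: 5 dumps, 5 seconds apart, against the server JVM's pid
#   take_thread_dumps 4242 5 5
```

Comparing the same thread's call stack across consecutive dumps shows whether it is stuck in one place or making progress.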

Resource contention is a common cause of performance problems and can occur at many levels. Use the WebLogic Console monitoring tools to detect resource contention for things like JDBC connections and EJB instances. For other types of application- specific resource contention, thread dumps may be your only detection mechanism (outside of either a thorough understanding of how the application works or the use of profiling tools). Data access contention inside the database is best detected by database monitoring tools but may sometimes be seen in application thread dumps.

Best Practice  

Use a series of properly spaced thread dumps to gain insight into the possible causes of long-running requests.

Garbage collection is another common source of performance problems. While modern JVMs have much better garbage collection algorithms than their predecessors, these new garbage collectors can require much more tuning to get optimum, or even reasonable, performance. Most JVMs now have multiple garbage collection algorithms that allow a properly tuned JVM to minimize the number of full garbage collection cycles it runs. Typically, these full garbage collection runs must stop all other activity while the garbage collector scans the heap for unreachable objects, removes unreachable objects, and relocates reachable objects to compact the heap (packing the reachable objects together so as to maximize the contiguous free memory space within the heap). By looking at the JVM heap usage profile (for example, via the server's Performance Monitoring tab), you can detect how often these full garbage collection scans are occurring.

Whenever the heap usage reaches a certain percentage of capacity, the garbage collector will perform a full GC to reclaim as much free memory as possible. The result is that users will see that requests that are in-flight during a full GC take longer to process. In extreme situations, full GC sweeps can occur multiple times in the life of a single request. Because most server-side Java applications tend to create a lot of transient objects (ones that are used for a very short time and thrown away), it is often possible to reduce the number of full GC sweeps significantly by tuning the garbage collector. For more information about garbage collector tuning, and performance tuning in general, see Chapter 12, the WebLogic documentation at http://edocs.bea.com/wls/docs81/perform/index.html, and your JVM documentation (for example, the WebLogic JRockit documentation at http://edocs.bea.com/wljrockit/docs81/tuning/index.html).
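A common first step is enabling verbose GC output at server startup so that the frequency and duration of full GC sweeps are visible in the log. The following is a sketch only: -verbose:gc is standard, but the -XX flags and the heap sizes shown are HotSpot-specific examples and differ on other JVMs such as JRockit, so check your JVM documentation:

```shell
# Illustrative JVM startup options for observing garbage collection.
# The -XX flags are HotSpot-specific assumptions; JRockit uses different
# options. The 512m/128m sizes are examples, not recommendations.
JAVA_OPTIONS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps"
JAVA_OPTIONS="$JAVA_OPTIONS -Xms512m -Xmx512m -XX:NewSize=128m"
export JAVA_OPTIONS
# The standard startWebLogic.sh script typically picks up JAVA_OPTIONS
# when launching the server JVM.
```

With timestamps in the GC log, you can correlate full GC sweeps directly with the response-time spikes users report.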

Best Practice  

Frequent spikes that indicate high JVM heap usage can have a significant effect on the user's experience. Adjusting the heap size and garbage collection tuning parameters can significantly reduce the frequency of full GC sweeps and improve the user experience.

Versioning Applications

Applications change. Deployment strategies for putting in new versions of already running applications vary. Certain application characteristics can make rolling out new versions of an application messy. The purpose of this section is to point out the issues around application versioning. We will not attempt to propose a solution because there are currently no good solutions that address every possible situation. Choosing the right strategy for an application involves analyzing the application requirements and the sorts of application changes occurring, then trading those off against the pros and cons of the available approaches.

WebLogic Server supports the notion of hot deployment of an application. This means that you can take a new version of an application and push it into a set of running servers without restarting the servers. What it does not currently mean is that you can prevent the redeployment of the new version of the application from affecting the user experience. If there are periods of time when the application is quiescent, rolling out new versions of the application can be as simple as hot deploying the new version, assuming that you have no changes that affect other systems (for example, database schema changes). Many applications, however, do not have a well-defined time when the system is quiescent. Hot redeploying an application when active users are using the site will not only cause the application to be unavailable for a short period of time but may also cause the users to lose transient state data that may exist in memory. WebLogic Server 8.1 does provide improved support in this area through the retention of HttpSession data during a Web application redeployment, potentially allowing relatively transparent application upgrades if no conflicts with previous HttpSession data exist.

Things get even more complicated if you have Java application clients communicating via RMI. In many cases, the client application may need to be updated simultaneously with the server. This is nothing new, and it is something that we have seen in the client/server world for years. Even in cases where the client application does not require upgrading, the objects to which the client was talking will disappear. It may also mean that some of the dynamically generated RMI stubs the client is using will no longer be valid and may require the client application to be restarted.

In WebLogic Server's defense, no application server on the market today solves these fundamental problems. We hope that application server vendors will accept this as a challenge and recognize that solving these issues is fundamental to providing an enterprise-class application server on which to build mission-critical, 24 x 7 applications.

Other versioning strategies exist that work well for certain types of applications. For example, a common solution to the versioning problem for Web applications is to use a parallel cluster to deploy a new version of the application. Hardware load balancers are now sophisticated enough to allow you to redirect new users of the application to the new version while existing users continue to use the old version until they end their current session. Of course, this solution requires a lot of administrative work. We expect future versions of WebLogic Server to simplify this versioning strategy.

Managing Failure Conditions

Failures happen. As an administrator, you want to make your system as resilient as possible. Sometimes it is possible to automate processes so that when a particular type of failure occurs, the system can take steps to recover; other times, it is not. WebLogic Server provides some built-in mechanisms to help make applications fault tolerant and transparently recoverable (for example, clustering, in-memory replication, database connection testing). Currently, though, certain situations require manual intervention from an administrator, or at least require the administrator to provide the logic to automate the process. In general, these situations have sufficiently complex conditions that make it difficult to provide a general-purpose failover mechanism. For example, when a WebLogic Server instance fails, how do you decide whether to migrate the JMS servers and JTA transaction service associated with the server? In many cases, the server may restart faster than the services could be migrated. In this section, we talk about several common failure scenarios and the mechanisms that WebLogic Server provides for recovering from these situations.

Database Failures

A very common scenario is that a database goes down (or is taken down) and restarted, either automatically by a high-availability (HA) framework or by the database administrator. When this happens, the connections in WebLogic Server database connection pools become invalid and the applications trying to use them will begin to fail. As we discussed earlier, WebLogic Server does provide mechanisms to allow the server to eventually recover from the situation without any intervention, although these mechanisms come at the cost of some extra overhead. Depending on the mechanism(s) chosen and the configuration, the application may continue to fail for an extended period of time after the database recovers. Fortunately, WebLogic Server also provides a manual mechanism to tell a server to reset a connection pool.

To reset a database connection pool, you can use the admin tool's RESET_POOL command. Using this command, you can write a script that resets all of the connection pools associated with a particular database server. Once you have such a script, you need only have the database administrator or the HA framework run the script whenever the database startup completes (to the point where it is accepting connections). The following example demonstrates manually resetting the BigRezPool on Server1:

 > set URL=t3s://192.168.1.21:9002
 > weblogicAdmin RESET_POOL BigRezPool
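The script that resets every pool backed by one database server might be sketched as follows, using weblogic.Admin directly. The pool names, URL, and credentials are illustrative assumptions:

```shell
# Sketch: reset all connection pools associated with a given database
# server after it restarts. The pool list, URL, user name, and the
# WLS_PASSWORD environment variable are illustrative.
reset_pools() {
  url=$1; shift
  for pool in "$@"; do
    java weblogic.Admin -url "$url" -username system \
      -password "$WLS_PASSWORD" RESET_POOL "$pool"
  done
}

# Example, run by the DBA or HA framework once the database accepts
# connections again:
#   reset_pools t3s://192.168.1.21:9002 BigRezPool AuditPool
```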

The admin tool also provides other commands related to connection pools. Rather than covering them all here, we will refer you to the WebLogic Server documentation at http://edocs.bea.com/wls/docs81/admin_ref/cli.html. The DISABLE_POOL and ENABLE_POOL commands are ones that might prove useful in certain situations. DISABLE_POOL allows you not only to disable all access to the pooled connections but also to destroy all of the existing connections if you choose to do so. For planned database restarts, you might want to disable the pool and destroy the connections before shutting down the database. This might allow your application to trap the exception raised and display an error page indicating that the system is down. Once the database is back up, you can use the ENABLE_POOL command to re-enable the connection pool, causing the destroyed connections to be recreated.
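A planned database restart using these commands might be sequenced as in the following sketch. The URL and credentials are illustrative, and the exact DISABLE_POOL options (for example, how to request that existing connections be destroyed) should be confirmed against the command-line reference:

```shell
# Sketch of a planned database restart sequence. The URL, credentials,
# and pool name are illustrative assumptions; consult the WebLogic 8.1
# command-line reference for DISABLE_POOL's full argument list.
pool_cmd() {
  java weblogic.Admin -url t3s://192.168.1.21:9002 -username system \
    -password "$WLS_PASSWORD" "$1" BigRezPool
}

# Before shutting down the database:
#   pool_cmd DISABLE_POOL
# ... database maintenance happens here ...
# Once the database is accepting connections again:
#   pool_cmd ENABLE_POOL
```

Disabling the pool first lets the application trap a predictable exception and display a system-down page, rather than failing unpredictably on stale connections.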

Migrating the JTA Service

When machines fail, you need to be able to bring up services on other machines. Migrating the JTA service can be a critical factor in recovering from a failure scenario. In-flight transactions can hold locks on the underlying resources. If the transaction manager is not available to recover these transactions, resources may hold on to these locks for long periods of time, making it difficult for the application to function properly. JTA service migration is possible only if the JTA logs are accessible to the server to which the service will migrate. Once you guarantee this, migration is simple, although you must be careful how you share these files. Distributed file systems such as NFS typically do not provide the necessary semantics to guarantee the integrity and content of transaction logs. Typically, this means using some higher-end means of sharing the files, such as a multi-ported disk or storage area network (SAN).

Migrating the JTA service from one server to another is as simple as using the host server's JTA Migrate Control tab to select the target server and start the migration. You must stop the failing server before you can migrate the JTA service. Once migrated, the JTA service will not accept any new work; it focuses solely on the recovery of incomplete transactions. Once the destination server finishes recovering all of the incomplete transactions, it releases its claim on the migrated JTA service so that the original server can reclaim it once it restarts.

The admin tool also provides JTA service migration capabilities through its MIGRATE command. The basic syntax is as follows:

 MIGRATE [-jta] -migratabletarget {migratable_target_name | server_name}
     -destination server_name [-sourcedown] [-destinationdown]

For example, migrating the JTA service from Server1 to Server2 when Server1 is already down requires invoking our weblogicAdmin script with the following command:

 > weblogicAdmin MIGRATE -jta -migratabletarget "Server1 (migratable)"
     -destination Server2 -sourcedown
 Started attempt to migrate Transaction Recovery service(s) for Server1
     to destination server Server2 ...
 Transaction Recovery Migration succeeded.
 Ok

Migrating JMS Servers

JMS servers are another important service that must be migratable. Again, any persistent JMS store must be accessible from a shared disk or database. If using JTA transactions with persistent JMS messages, you may need to migrate the JTA service to recover incomplete transactions. We suggest migrating the JMS servers first so that the JMS server's XA resource manager is available for transaction recovery.

Again, you can accomplish migration via the server's JMS Migrate Control tab in the WebLogic Console or via the admin tool's MIGRATE command, as shown here:

 > weblogicAdmin MIGRATE -migratabletarget "Server1 (migratable)"
     -destination Server2
 Started attempt to migrate service(s) for Server1 (migratable)
     to destination server Server2 ...
 Migration succeeded.
 Ok

In this example, we migrated the JMS server on Server1 to Server2 before we shut down Server1.

Of course, once we are ready to let Server1 resume its duties, we must migrate the JMS server back to Server1:

 > weblogicAdmin MIGRATE -migratabletarget "Server1 (migratable)"
     -destination Server1
 Started attempt to migrate service(s) for Server1 (migratable)
     to destination server Server1 ...
 Migration succeeded.
 Ok

Migrating the Admin Server

The last thing we want to discuss is how to handle admin server availability because the admin server is not currently clusterable. This means that if the admin server goes down, you cannot administer your WebLogic Server domain until you bring it back up. In most cases, you may not be too concerned if the admin server goes down because all you need to do is restart it. What happens if the machine where the admin server runs fails in such a way that you cannot restart the admin server? The answer is simple if you prepare for this unlikely event.

Proper operation of the admin server relies on several configuration files and any application files it controls. Typically, the best thing to do is to store the admin server's directory tree on a shared disk. As long as the configuration and application files are accessible, you can restart the admin server on another machine. It is up to you to make sure that you don't have more than one admin server running at a time. If the new machine can assume the original admin server's Listen Address (or if it was not set), you can simply start the admin server on the new machine without any configuration changes. Otherwise, you will need to change the admin server's Listen Address. The managed servers will learn about the new admin server when the admin server contacts them on startup. If this is a graceful shutdown and migration, use the WebLogic Console to change the Listen Address just before shutting down the admin server. If not, you will need to edit the config.xml file by hand to replace the old Listen Address with the new one. You also need to make sure that your node managers' trusted hosts files include the DNS name or IP address for the new admin server. Typically, we recommend planning ahead so that everything you need is already in place to make admin server failover as painless as possible.
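For the non-graceful case, the config.xml edit can be scripted. The helper below is a hypothetical sketch: the domain path and addresses are illustrative, it assumes the Listen Address appears in config.xml as a simple ListenAddress attribute, and it keeps a backup. Be certain the original admin server is down before starting the new one:

```shell
# Hypothetical helper: rewrite the admin server's ListenAddress in a
# shared domain directory's config.xml before restarting the admin
# server on a standby machine. Always keep the .bak copy.
retarget_listen_address() {
  domain_dir=$1; old_addr=$2; new_addr=$3
  cp "$domain_dir/config.xml" "$domain_dir/config.xml.bak"
  sed "s/ListenAddress=\"$old_addr\"/ListenAddress=\"$new_addr\"/" \
    "$domain_dir/config.xml.bak" > "$domain_dir/config.xml"
}

# Example (paths and addresses are illustrative):
#   retarget_listen_address /shared/domains/bigrez 10.0.1.10 10.0.1.11
#   cd /shared/domains/bigrez && ./startWebLogic.sh
```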




Mastering BEA WebLogic Server: Best Practices for Building and Deploying J2EE Applications
ISBN: 047128128X
Year: 2003
Pages: 125