Active Monitoring


In addition to reviewing the ColdFusion logs and Web server logs, it's helpful to have a good picture of how your Web server looks from outside the network (especially if you think you might have a network bottleneck). If you are managing your own Web server and it is located offsite, a good network-monitoring package will give you some perspective on server uptime, as well as any network latency coming to and going from your Web site. If you don't have a monitoring package yet, take a look at Mercury SiteScope (http://www.mercury.com/us/products/application-management/foundation/monitors/sitescope/). When you run SiteScope on a machine connected to a network other than the one hosting your server, SiteScope will check the health of your site at specific intervals. It provides a graphical dashboard of server activity, viewable through a Web browser. Besides SiteScope, a number of great open-source monitoring tools are available that will run on Linux.

If your server is managed by someone else or hosted at a co-location facility, the management company should have a monitoring tool in place. It's good practice to ask routinely for the server's uptime percentage, as well as time frames and explanations for any outages. Not only will you be checking up on the efficiency of your management company, but you might also get an idea of how traffic and usage affect site downtime.

For more about active monitoring, see Chapter 1, "Understanding High Availability."


Server Probes

ColdFusion applications are often used for serious enterprise applications that rely on a variety of things beyond just databases, including LDAP, SMTP, POP, Web Services, ERP systems, and others. Knowing what is happening with all these disparate systems can be crucial to troubleshooting in a timely manner. ColdFusion offers a way to do just that: server probes. These not only monitor parts of your application, but also recognize failure conditions, send alerts, and even resolve the situation (for example, restarting a service by running a batch file). The results of your probes are also logged to the Scheduler.log.

Setting Up a Probe to Verify Content

The first type of server probe you should set up is a simple content match. This probe loads the Web page at an interval you set. ColdFusion Application Server (CFAS) then attempts to match your specified content with the Web page content (provided that CFAS can view the content as part of the source). If your Web server is delivering the content as expected, the System Probes page displays a status of OK. However, if the Web server is displaying anything other than the expected content (such as a ColdFusion error page), the System Probes page displays a status of Failed. When a probe fails, ColdFusion gives you the option of sending an email notification, executing a program, and logging the error.

For example, you could create a simple ColdFusion template called probe.cfm and put the word 'Alive' in it. Then you would set up a probe that calls this page, say every hour, and looks for the word 'Alive'. If 'Alive' is not returned, the probe emails you.
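A minimal sketch of such a template might look like the following. The cfsetting line is an assumption added here, not part of the description above: it suppresses debugging output so that the response body contains only the text the probe searches for.

```cfml
<!--- probe.cfm: minimal page for a content-match probe. --->
<!--- Assumption: debugging output is suppressed so the body stays predictable. --->
<cfsetting showdebugoutput="no">
<cfoutput>Alive</cfoutput>
```

Because the page does almost no work, a failure of this probe usually points at the Web server or at ColdFusion itself rather than at your application code.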

To set up a content match probe, follow these steps:

1. In ColdFusion Administrator, choose System Probes in the Debugging & Logging category. If you haven't set up any probes yet, your System Probes screen will be similar to that in Figure 2.2.

Figure 2.2. The System Probes screen before you've configured any probes.


2. Click the Define New Probe button to create a new probe.

3. In the Probe Name box, enter the name of the probe (Figure 2.3).

Figure 2.3. The configured content-match probe.


4. Enter the frequency with which you want ColdFusion to load the page. Set it to at least 60 seconds. Also make sure to set the Start Time, which is required. Optionally, you can set an End Time, which defines when your probe will stop running, much like a scheduled task, but you will not usually need this setting.

5. In the Probe URL box, enter the URL you want ColdFusion to verify. In the example in Figure 2.3, the URL is http://localhost/probe.cfm, indicating that ColdFusion should check a page called probe.cfm. In some situations you may want to probe parts of your site that are behind some sort of secure area. In this case, you can also supply credentials in the user name and password fields.

6. In the Timeout box, enter a timeout value of at least 30 seconds. If you have set ColdFusion in Server Settings to time out requests after a certain number of seconds, you should use the same value here.

7. Choose the Probe Failure settings. In this example, the probe will fail if the response does not contain the Alive string. If the content you want to match contains spaces, make sure to surround the text with quotation marks in ColdFusion Administrator. Then decide what ColdFusion should do if the probe can't verify your content: You can choose to send an email notification, execute a program, or log the error to a specific log file.

8. Click Submit.

After you click Submit and return to the System Probes page, it displays your content-match probe with a status of Unknown. Test the probe by clicking its URL. If the probe succeeds, the status will be OK; if the probe fails, you'll get a Failed status. If ColdFusion displays a Failed status but you can verify that the site is functioning properly (in other words, the page is rendering correctly), edit the probe and verify all the settings, especially the search string. Often a simple typo will make the difference between success and failure statuses on a functioning site. However, if ColdFusion displays a Failed status, and the page doesn't render correctly or at all when you browse it, you know you have set up a successful content-match probe. Now it's time to fix the problem.

You have just set up a basic content-match probe, but you might want to monitor other components of your Web application, such as database connectivity, SMTP connectivity, and availability of external programs and processes.

Other Probes

There are several other probes to help you verify that all areas of your application are working properly. By writing a simple ColdFusion page, you can connect to a database and run a query, then return a specific recordset. If the recordset can be retrieved, you know the database server is working properly. Or you can write an extensive ColdFusion page that performs a complete check on your application's components. ColdFusion server probes offer you another great tool to monitor, inform, and even resolve issues as they happen.
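As a rough sketch of what a more extensive probe page could look like, the template below checks two components and reports 'Success' only if both respond. The file name probe.full.cfm, the status URL, and the failures variable are hypothetical; the OWS datasource and contacts table are the sample ones used elsewhere in this chapter. Treat it as a starting point under those assumptions rather than a complete health check.

```cfml
<!--- probe.full.cfm: hypothetical multi-component probe page. --->
<cfsetting showdebugoutput="no">
<cfset failures = "">

<!--- 1. Database: a trivial query proves the datasource is reachable. --->
<cftry>
    <cfquery name="probeDB" datasource="OWS">
        SELECT FirstName FROM contacts WHERE 1 = 0
    </cfquery>
    <cfcatch type="any">
        <cfset failures = ListAppend(failures, "database")>
    </cfcatch>
</cftry>

<!--- 2. External HTTP dependency (placeholder URL, not a real service). --->
<cftry>
    <cfhttp url="http://partner.example.com/status" method="get"
            timeout="10" throwonerror="yes"></cfhttp>
    <cfcatch type="any">
        <cfset failures = ListAppend(failures, "http")>
    </cfcatch>
</cftry>

<!--- Report one word on success so the probe has a single string to match. --->
<cfif Len(failures) EQ 0>
    <cfoutput>Success</cfoutput>
<cfelse>
    <cfoutput>Error: #failures#</cfoutput>
</cfif>
```

Listing the failed components in the response keeps the probe's match string simple while still telling you, when you browse the page by hand, which dependency is down.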

Setting Up a System Probe to Verify External Connectivity

Let's look at how we can set up a system probe to check for database connectivity. The easiest way to do this is to run a simple query (it does not even have to return anything) inside a CFTRY block. If the page cannot connect to the database for any reason, it will throw an exception and return 'Error'; otherwise it will return 'Success'. Listing 2.1 gives an example of a simple database probe.

Listing 2.1. probe.database.cfm: Simple Database Connectivity Probe
 <cftry>
      <cfquery name="probeDB" datasource="OWS">
           SELECT contacts.FirstName
           FROM contacts
           WHERE contacts.FirstName = 'Ben'
      </cfquery>
      <!--- Reached only when the query ran without throwing. --->
      Success
 <cfcatch type="database">
      Error!
 </cfcatch>
 </cftry>

Save this probe somewhere in your webroot. You will need to make sure the datasource 'OWS' is configured in your ColdFusion Administrator. Now just follow the same steps as you would to configure the content probe. Here is a short summary again:

1. Click the Define New Probe button to create a new probe.

2. Enter the frequency with which you want ColdFusion to load the page. For this example, an hour is sufficient. Then add a start time and, if you want, an end time.

3. In the Probe Name box, enter a unique name for this probe that describes what you are testing, for example owsdatabaseprobe. In the URL box, enter the URL of the probe.database.cfm file.

4. Set the Probe Failure settings to fail if the response does not contain the string 'Success', then select the e-mail notification.

5. Click Submit Changes. You have now set up the probe.

You should now see your probe under the list of System Probes in the ColdFusion Administrator. You should also see four icons next to the name of your system probe. The second icon lets you run the probe immediately so you can test it; when you move your mouse pointer over it, you should see 'Run Task'. Select this icon to run the probe. If you have configured everything correctly, you should not get a notification. If anything is wrong with the database connection, you will.

You can configure all system probes to send emails when a probe fails. Monitoring the System Probes page in ColdFusion Administrator at all times is virtually impossible. Setting up email alarms is an essential way to remain up-to-date regarding the availability of your Web servers. It also helps you gather trend information to make educated choices on strengthening site availability. In the System Probes page, enter a list of email recipients to receive probe notifications, separating the addresses with semicolons, and then click Submit Changes.

By combining different kinds of probes with email alarm notification, you can get a pretty good idea of your Web applications' availability in real time. After you start to notice performance trends, you are ready to start looking for server bottlenecks.

NOTE

All probes run as scheduled tasks, so creating too many probes or setting them with high frequencies (such as every second) will adversely affect the system's performance.


System Monitors

System monitoring provides real-time statistics on your operating system, Web server, and application server. This can be invaluable information for diagnosing bottlenecks and system crashes. In this section you will learn some typical methods for monitoring performance in real time.

ColdFusion and Microsoft Windows

If you are running your ColdFusion servers in the Windows operating system, the two best places to find historical information about possible ColdFusion issues are the ColdFusion server logs and Windows Performance Monitor (perfmon). To utilize Performance monitoring, you must first turn this feature on in ColdFusion Administrator under Debugging Settings. Check both Enable Performance monitoring and Enable CFSTAT. See Figure 2.4 for an example of configuring performance monitoring and CFSTAT.

Figure 2.4. Configuring Performance Monitoring and CFSTAT.


ColdFusion and Solaris, Linux, or HP-UX Operating Systems

If you are running Solaris, Linux, or HP-UX, the Application.log file remains the same as in Windows; however, you will use different performance-monitoring tools. In most Unix environments, to collect performance-specific information you could set up cron jobs that run a program such as vmstat to report CPU, memory, and paging activity. Scheduling these jobs and reviewing the information gives you a good idea of what is happening on your servers.

If you are running Solaris, you'll want the SE Performance Toolkit. It gives you the same information as vmstat, and some additional information in the form of a GUI. In addition, it has a great scheduling feature that dramatically cuts down the time you would spend setting up cron jobs. You can download Toolkit from Sun at http://www.sunfreeware.com/. Sun makes the disclaimer that the SE Toolkit is unsupported Sun software; however, it really is an excellent tool. Before placing it on a production system, though, install it and play with it on a test server to ensure that it will meet your needs.

Another monitoring tool for Linux is LogTrend. This tool can be used to monitor a Linux server, including CPU load, memory, swap, disk space and several processes. It can be found at www.logtrend.org.

Finally, there are many enterprise-monitoring tools, such as Computer Associates' Unicenter TNG (www.cai.com) and BMC Software's BMC Patrol (www.bmc.com), which can provide explicit, detailed information on server utilization through specialized agents.

You can also run CFSTAT to see ColdFusion performance information (see Figure 2.4 earlier).

Monitoring ColdFusion Using Perfmon and Settings

Performance-monitoring tools enable you to watch your site's performance in real time or to record data for later analysis. After looking over the ColdFusion logs, you should gather information on server performance trends to correlate the errors you see in the logs with specific system events that may indicate a bottleneck. You can gather this information in various ways, depending on what platform is running ColdFusion Server. The following section discusses two tools: Windows NT Performance Monitor for Windows-based servers, and the CFSTAT tool for Windows and Unix-based machines.

Using Windows Performance Monitor

Windows NT and 2000 both provide a graphical tool called Performance Monitor (perfmon) to watch server performance over time. It's an especially good way to watch the use of server resources such as memory, processors, and drive space. Beginning with ColdFusion 4.0, Macromedia added several perfmon counters specific to the ColdFusion object that let you monitor various ColdFusion-specific statistics.

Perfmon Basics

If you've used perfmon before, you can skip this section and go straight to the section "The ColdFusion Server Object."

Perfmon is located under Administrative Tools in Windows 2000's Control Panel. When you open perfmon, you see an empty workspace.

In the perfmon Console, you can view the system monitor and set performance logs and alerts. The system monitor shows you a moving graph of performance information, updated at a regular interval that you specify. Use the system monitor to watch your server's current performance. By watching the system monitor closely over a period of time, you can get a feel for how your server performs under normal conditions. The system monitor also shows you symptoms of a performance problem, for example, an unusually large number of queued ColdFusion requests.

Here are some of the system monitor tools for setting performance logs and alerts:

  • The counter log allows you to store perfmon data for later analysis. You can open a log file in perfmon and review the data at any time. This view is useful if you want to study your server's performance over a known time period, say, right after market close for a financial site.

  • The trace log monitors trace data. This log differs from the counter log in that it monitors data continuously instead of at intervals.

  • Using alerts, you can set perfmon to take action when your server's statistics pass specified thresholds. Alerts let you know when your server has performance problems without requiring that you monitor it manually. You can even set perfmon to do system administration functions based on alert criteria. These alerts are logged to the Windows application log.

There are three different ways to view your system statistics and two ways to work with them. These are chart view, histogram view, report view, managing alerts, and logging performance data. They are discussed below.

In the chart view (or in the histogram view), you can add a statistic to perfmon by clicking the plus (+) button on the button bar. The Add Counter dialog window appears (Figure 2.5). Here, you can choose statistics to watch.

Figure 2.5. The Add Counters dialog box.


This dialog box contains a lot of information. At the top is the Uniform Naming Convention (UNC) name of the computer for which you want to see counters; by default, this is the local machine. You can type a UNC to monitor another machine, or click the drop-down to select a computer. Below the computer name is a drop-down lookup list of the objects available to perfmon; each object is a collection of statistics, called counters, which you can view using perfmon. The list of counters is located immediately below the Object lookup list.

On the right side of the screen is a list of instances of each object. Many objects have only one instance, in which case this box is blank. Other objects have many instances. For example, the Process object has as many instances as you have processes running on your server. If your server has multiple processors, each processor shows up as an instance of the Processor object.

By default, perfmon reads counters once per second. (You can change this setting via Data, from the Properties menu.) If you select the % Processor Time counter from the Processor object and let perfmon run for a minute or so, you should see a screen similar to the one in Figure 2.6. Take some time to explore the available objects, instances, and counters. If you haven't used perfmon before, you might be surprised at how many useful statistics are available. Perfmon counters are generally not well documented. If you want to get the most out of perfmon, you need to explore the objects provided by the applications you use.

Figure 2.6. A sample Performance Monitor chart, showing about a minute's worth of data.


If you get to know your server's behavior well, you can probably identify some warning signs: too many ColdFusion threads, unusually high processor utilization, and so on. You can add perfmon alerts to let you know when these warning signs are present. By setting up a perfmon alert that fires when a threshold is passed, you can go about your work confident that you'll be notified of a problem in time to take preventive action. You can even tell perfmon to run an external program if a counter passes a threshold you specify.

TIP

In Windows 2000, alerts continue to work behind the scenes even if perfmon is closed and they log all information to the Windows application log. You can view this log with the Event viewer, in the Control Panel under Administrative Tools.


To create an alert, right-click the Alerts item under the Performance Logs and Alerts menu, and choose New Alert Settings or New Alert Settings From. Specify a unique name for the alert, or choose a saved counter file from the Open dialog and click OK. Next, add counters.

When you add a counter to an alert, you see some options that are different from the other views (Figure 2.7). You can choose to receive an alert if your selected statistic goes over or under your chosen threshold. See the "ColdFusion Server Object" section for descriptions of the ColdFusion Application Server object counters.

Figure 2.7. The Alert Properties dialog box.


In the Action tab (Figure 2.8), you can select actions to perform if an alert condition occurs. For example, you can run a command-line email program, such as blat.exe (http://www.blat.net/194/), to send yourself a message if ColdFusion's average request time counter goes over a certain value, which would indicate that your server is having a performance problem.

Figure 2.8. The Alert Properties Action tab.


You can also have perfmon pop up a message box on the screen when an alert condition occurs. You do so by selecting Alert from the Options menu in Alert view. Check the Send Network Message check box, and enter your computer name in the Net Name field. In the Alert dialog box, you can also write events into the machine's application event log.

In addition to using perfmon for real-time monitoring, you can let it run for hours or days and have it save its data in a file for later review and analysis. Logging performance data stores counter statistics for a specific length of time. Perfmon can only write data at the object level (it writes every instance and counter for a given object to disk). So perfmon log files can get very big, very fast. If you intend to log perfmon data for any substantial length of time (more than a few hours), you probably need to reduce the sample frequency to once a minute or even once every 10 minutes. Although logged data can be useful for averaging and detecting general trends, the spiky and fast-changing nature of the ColdFusion load means you'll lose many of the useful details that can be gleaned by watching perfmon directly in chart or histogram view.

Perfmon log files can become corrupted if a process or machine you're running crashes or otherwise has problems. Corrupted files can seriously limit the usefulness of perfmon logging because it's hard to get a log file that shows both the events leading to a crash and the crash itself.

You must use perfmon to view a perfmon log file. To do so, select View Log File Data from the button bar. In the Select Log File dialog, select the log file and click Open. You can then view your stored data in chart, histogram, or report view. Realistically, only the chart and report views are useful for reviewing logged data.

To create a new log file, right-click Counter Log under the Performance Logs and Alerts menu and choose New Log Settings or New Log Settings From. Specify a unique name for the log file or choose a saved counter file from the Open dialog, and click OK. Next, add counters.

You then see the dialog box shown in Figure 2.9. When you choose to add a counter, you get the same Add Counter dialog box shown previously in Figure 2.5. You can select multiple objects, but remember that the more objects you choose to log, the faster your log file will grow in size.

Figure 2.9. The Log Properties dialog box.


TIP

Be sure to schedule perfmon logging to run for a limited period of time only. Otherwise your log files will become very large and consume too much hard drive space.


Report view is similar to chart view in that it provides a view of current performance statistics. Instead of a chart, however, report view simply displays a list of counters and their corresponding values. To use report view, select View Report from the button bar (Figure 2.10). Otherwise, report view functions almost identically to chart view, minus the options for chart formatting.

Figure 2.10. Report view.


The ColdFusion Server Object

The ColdFusion Server performance monitor object provides several useful counters. To view the CF performance monitor object, go to Debugging & Logging > Debugging Settings in ColdFusion Administrator and check the Enable Performance Monitor check box. (Refer to Figure 2.4 for configuring Performance Monitor.) You might need to restart the ColdFusion server after this step to register the object with Performance Monitor.

All processing of ColdFusion relies upon the underlying Java run-time engine, either the embedded JRun engine or an alternative Java application server.

The ColdFusion performance monitor object provides ten counters:

  • Avg DB Time (msec). This is the average of the amount of time in milliseconds a database operation, initiated by ColdFusion MX 7 Server, took to complete.

  • Avg Queue Time (msec). This provides an average of the amount of time in milliseconds that requests waited in the input queue before ColdFusion Server began to process them.

  • Avg Req Time (msec). This is the average of the total amount of time in milliseconds ColdFusion Server took to process a request. In addition to general page-processing time, this value includes both queue time and database processing time.

  • Bytes In/Sec. The number of bytes received per second by the ColdFusion Server.

  • Bytes Out/Sec. The number of bytes returned per second by the ColdFusion Server.

  • DB Hits/Sec. The number of database operations performed per second by the ColdFusion Server.

  • Page Hits/Sec. This represents the number of Web pages processed per second by the ColdFusion Server.

  • Queued Requests. This describes the number of requests currently waiting to be processed by the ColdFusion Server. These numbers can vary depending on traffic, but as a rule, if you consistently have more than five queued requests, you might have a bottleneck.

  • Running Requests. This is the number of requests the ColdFusion Server is currently actively processing.

  • Timed Out Requests. This is the total number of requests that timed out while waiting to be processed by the ColdFusion Server. You should investigate any number of timed-out requests, especially if they happen consistently.
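These counters can also be read from within CFML itself. The sketch below assumes the built-in GetMetricData function called with the PERF_MONITOR mode, which returns a structure of performance metrics; the page name metrics.cfm is hypothetical. Because the exact key names in that structure may vary between versions, the example dumps the whole structure instead of relying on individual keys.

```cfml
<!--- metrics.cfm: hypothetical page that shows the performance counters. --->
<!--- GetMetricData("PERF_MONITOR") returns a structure of current metrics. --->
<cfset pm = GetMetricData("PERF_MONITOR")>

<!--- Dump every key so you can see which counters your version exposes. --->
<cfdump var="#pm#" label="ColdFusion performance metrics">
```

A page like this should sit behind some form of access control, since it exposes details about server load.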

In addition to the counters associated with the ColdFusion object, you can monitor a few other aspects of ColdFusion performance via the Process object. Select the Process object, and then select the JRun process (or other underlying Java application server) from the Instance menu. Two particularly useful counters are % Processor Time and Thread Count.

The % Processor Time counter reports JRun's total processor utilization. This counter is an important performance indicator. If you set ColdFusion's maximum simultaneous requests too high, JRun can exhaust the server's processor resources trying to handle all the requests at once. If your ColdFusion pages are fairly simple (say, running a database query without much subsequent CFML processing), you might be able to handle a much higher number of simultaneous requests without a problem. Watch this counter as you adjust the number of simultaneous requests, and you should be able to tune your ColdFusion server to use processor resources effectively without overloading the machine.

The Thread Count counter reports the total number of threads in use by the JRun process. This number includes active request threads, queued request threads, and several utility threads that are always present. As such, it doesn't give you as much detail as the Queued Requests and Running Requests counters of the ColdFusion object.

By default, Performance Monitor caps all percentage statistics at 100 percent. If you're monitoring % Processor Time and you have only one processor in a machine, capping the statistics makes a lot of sense. If you have multiple processors in your Web server, however, how do you measure a single process's utilization? Does 100 percent mean the process is using 100 percent of each processor? No, because Performance Monitor counts multiple-processor utilization cumulatively. If a process is using, say, 75 percent of each of two processors, Performance Monitor reports that process's utilization as 150 percent. You can view processor utilizations over 100 percent by changing the value of this Registry entry from 1 to 0:

 HKEY_CURRENT_USER\Software\Microsoft\PerfMon\CapPercentsAt100

Configuration Options

Next, let's examine a few tricks to make Performance Monitor even more useful.

You can monitor multiple and remote servers by entering the UNC of the server in the "Select counters from computer" drop-down in the Add Counter dialog box (refer to Figure 2.5). You need to have administrative privileges to see counters for the server. If you enter one UNC and select some parameters, then enter another UNC and select some more parameters, you can set up a single Performance Monitor workspace that monitors multiple computers. For example, you could use a Performance Monitor workspace to show the Thread Count and % Processor Time for all the servers in your server farm, including your database server.

By selecting Save As from the Console menu, you can preserve your hard-earned Performance Monitor settings to use again. This will save the settings as a Microsoft Management Console file. You can even pass Performance Monitor configurations from machine to machine by copying the workspace files. You can also save a snapshot of your perfmon as HTML by right-clicking in the monitor space and choosing Save As. Enter a filename and choose Save.

Monitoring Performance on Unix and Linux Servers

All the Unix installations of ColdFusion server (including Linux) come with a program called CFSTAT. Similar to Windows Performance Monitor, CFSTAT displays performance statistics for the ColdFusion server. Schedule CFSTAT to run as a cron job to monitor performance-related statistics over time.

CFSTAT is located in the /CFusionMX7/bin subdirectory of the directory in which you installed ColdFusion. Be sure to log in with root privileges. You can also run cfstat.bat from the command line in Windows.

Running CFSTAT provides the following metrics with both current statistics (at the time of CFSTAT execution) and maximum statistics:

  • Pg/Sec. This statistic shows how many .cfm files ColdFusion is processing per second. Use it to determine your ColdFusion server's efficiency. Although every ColdFusion application generates a different range of numbers for this counter (especially depending on server traffic at a given point), pay specific attention to the range of numbers CFSTAT displays over time. If you notice that ColdFusion is servicing fewer requests than it should be, and you can confirm that no major code changes have occurred, it's time to start looking at your traffic statistics to determine whether a spike in traffic has occurred. You should also check CPU utilization to determine whether the CPU is overutilized. Look at the Hi number (highest number of requests serviced) and the Now number (current number of requests serviced) to compare server efficiency.

  • DB/Sec. This measures the number of database accesses ColdFusion makes per second. Like the first counter, it provides information about the efficiency of the interaction between ColdFusion and your database server. Again, look for performance trends over time to determine whether the application has become less efficient. If you notice that at a specific point in time the number of database accesses ColdFusion made per second has decreased, examine any code changes you made during that time period (specifically related to database queries). By looking at the performance statistics on your database server (CPU, memory, and so on), you can determine whether your database server is overutilized or whether the bottleneck lies in the realm of ColdFusion.

  • CP/Sec. This counter, which measures ColdFusion template cache pops per second, is application specific. If you haven't made any major code changes, a decrease in this number could indicate a hardware-related bottleneck.

  • Req Q'ed. This measures the number of ColdFusion requests waiting in the queue. This one's important: If you have a lot of queued requests waiting for processing by the ColdFusion server, it means those users are waiting for some response from ColdFusion. Examine the server's CPU and memory utilization. If the CPU utilization is very high, you might have a CPU bottleneck. Because each application is different, it's tough to say what a high number of queued requests would be; however, if your server has a sustained number of requests greater than five, you might have a bottleneck.

  • Req Run'g. This counter lets you know how many active requests ColdFusion is currently processing. Keep an eye on the Req Q'ed and the Req Run'g numbers over time to see how ColdFusion is dealing with queued requests that build up in high-traffic periods. If ColdFusion is not dealing with these requests, and you notice an increasing Req TO'ed number (see the next counter), examine your server's CPU and memory utilization.

  • Req TO'ed. This counter lets you know the total number of client requests generating a server timed-out message. If you are getting timeouts and your page timeout setting in ColdFusion Administrator is set to an acceptable number (for most applications, at least 30 seconds is recommended), this counter may indicate a hardware-related bottleneck during high-traffic times.

  • AvgQ Time. This counter does what it suggests: It gives you an average of how long requests are waiting in the queue for processing. If you notice this number increasing, especially during high-traffic times, and you haven't made any code changes, you might have a hardware-related bottleneck.

  • AvgReq Time. This counter lets you know the average amount of time ColdFusion spends processing requests and indicates the server's efficiency.

  • AvgDB Time. This counter lets you know the average amount of time ColdFusion spends performing database-related activities. It is useful for isolating database-related communication from other ColdFusion activity to determine whether a bottleneck is database related.

  • Bytes In/Sec. This is not an average number, but the actual number of bytes ColdFusion read in the last second.

  • Bytes Out/Sec. This is not an average, but describes the actual number of bytes ColdFusion wrote out in the last second.

See Figure 2.11 for sample output in Windows.

Figure 2.11. CFSTAT output in Windows after running for 60 seconds.


By using the CFSTAT help switch, you can display definitions for each of the previously listed counters from the Unix command prompt. There are a few switches that can be used to format the CFSTAT output and for setting display output time.

TIP

Use the # switch by specifying the number of seconds to run. CFSTAT will run for this many seconds and then display the results.


The best way to maximize your use of CFSTAT is to configure a cron job to execute CFSTAT at regular intervals and write the results to a log file. In addition, you should set up another cron job to run a performance statistics program, such as vmstat, to measure CPU, memory, and system I/O information.
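The logging half of that setup can be sketched in a few lines of Python. This is only an illustration, not part of ColdFusion: the function name log_command_output and the paths in the comments are hypothetical, and the same helper works for any command, so it can wrap CFSTAT or vmstat alike.

```python
import datetime
import subprocess


def log_command_output(command, log_path):
    """Run `command` once and append its output, under a timestamped
    header line, to the file at `log_path`. Returns the exit code."""
    result = subprocess.run(command, capture_output=True, text=True, timeout=300)
    stamp = datetime.datetime.now().isoformat(timespec="seconds")
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(f"=== {stamp} {' '.join(command)} (exit {result.returncode}) ===\n")
        log.write(result.stdout)
        if result.stderr:
            log.write(result.stderr)
    return result.returncode


# Hypothetical calls (install paths and log locations are assumptions):
# log_command_output(["/opt/coldfusionmx7/bin/cfstat", "60"], "/var/log/cfstat.log")
# log_command_output(["vmstat", "5", "12"], "/var/log/vmstat.log")
```

A cron entry would then invoke a small script containing one of these calls at whatever interval suits your traffic pattern. Because both logs carry timestamps, readings from the two tools can be lined up later.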

Pay specific attention to the requests queued, requests running, requests timed out, and average queue time during high-traffic periods. If you notice a high number of requests queued or requests timed out, or a longer-than-normal average queue time, look at the vmstat statistics to obtain CPU and memory utilization during that period. If you notice a high CPU utilization or a lack of free memory, you have discovered your bottleneck.
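The "sustained" part of that advice is easy to get wrong by eye, since a single spike in the queue is not the same as a queue that stays high. The sketch below assumes you have already extracted the Req Q'ed readings from your log into a list of numbers. The threshold of five comes from the guideline given earlier; the requirement of three consecutive readings is an arbitrary assumption you should adjust.

```python
def sustained_above(samples, threshold=5, min_consecutive=3):
    """Return True if `samples` stays strictly above `threshold` for at
    least `min_consecutive` readings in a row."""
    run = 0
    for value in samples:
        # Extend the current run on a high reading, otherwise reset it.
        run = run + 1 if value > threshold else 0
        if run >= min_consecutive:
            return True
    return False


# Hypothetical readings taken at one-minute intervals:
spiky = [0, 9, 1, 8, 0, 7]      # isolated spikes
backed_up = [2, 6, 7, 9, 8, 3]  # queue stays high for several readings
print(sustained_above(spiky), sustained_above(backed_up))
```

When the check fires, that time window is the one to look up in the vmstat log.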

JVM Monitoring

The Java Virtual Machine (JVM) is a critical part of your ColdFusion-based application. One way to think of the JVM is as an "execution engine" that runs the byte code in the Java classes on your server's CPU. Various things can negatively affect the JVM's ability to do its job effectively and efficiently, and thereby radically influence the performance of your ColdFusion application server. Two factors that heavily affect the JVM are the amount of memory assigned to it and how it goes about garbage collection, a special term referring to the JVM's cleanup of unused objects in memory.

The JVM creates objects and assigns them memory, and when it destroys an object it reclaims that memory to be used by other objects. The JVM does this in a variety of ways. In part, it makes assumptions about your application's needs, and it looks at specific attributes you set either in the ColdFusion Administrator or in the jvm.config file. Simple ColdFusion applications that do not experience much traffic will rarely encounter JVM-related problems. On the other hand, large, complex applications that handle a high load may have JVM issues, or could perform better if the JVM they are using is tuned for a specific application or server.

To troubleshoot your ColdFusion applications effectively, and to know whether a system problem is actually caused by the JVM or by something else, you have to be able to monitor the JVM. Many times what appear to be problems with your ColdFusion application are actually issues with the JVM, but this can be hard to determine from the normal ColdFusion log files alone. For this reason it is extremely useful to know how to monitor the JVM.

NOTE

The JVM, its memory usage, and garbage collection are complex topics. One of the best resources for understanding these issues is Sun's article on garbage collection and tuning, available at http://java.sun.com/docs/hotspot/gc1.4.2/. Another good resource is http://www.artima.com/insidejvm/ed2/gc.html.


There are a variety of ways to monitor your JVM's actual memory utilization under load, which can help you figure out whether you are having Java-related memory problems. One method is to log the JVM's garbage collection. You can do this either by going to the ColdFusion Administrator, selecting 'Java and JVM', and editing the JVM arguments there, or by opening the jvm.config file found in your ColdFusion root \runtime\bin\jvm.config and editing the JVM attributes. If you use the ColdFusion Administrator, all you need to do is add these attributes:

 -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintHeapAtGC 

right after the '-server' in the 'JVM Arguments' field, then click 'Submit Changes'. You should see something like Figure 2.12. You will have to restart ColdFusion MX 7 for the changes to take effect. If you decide to edit these arguments in the jvm.config file directly, add the same attributes right after 'java.args=-server' and save the file. You will also have to restart the ColdFusion MX 7 service for these changes to take effect.

Figure 2.12. ColdFusion JVM Arguments as seen through the ColdFusion Administrator.
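If you edit jvm.config by hand, the change amounts to splicing three tokens into the java.args line immediately after -server. The following Python sketch shows that splice on a single line of text. It is an illustration only: the function name add_gc_flags is hypothetical, the sample line is made up, and the code assumes the arguments contain no quoted values with embedded spaces, which may not hold for a real jvm.config.

```python
GC_FLAGS = ["-XX:+PrintGCDetails", "-XX:+PrintGCTimeStamps", "-XX:+PrintHeapAtGC"]


def add_gc_flags(java_args_line):
    """Return the java.args line with the GC logging flags placed right
    after -server. Flags that are already present are not added again."""
    prefix = "java.args="
    if not java_args_line.startswith(prefix):
        raise ValueError("not a java.args line")
    # Assumption: arguments are separated by whitespace and none is quoted.
    args = java_args_line[len(prefix):].split()
    missing = [flag for flag in GC_FLAGS if flag not in args]
    position = args.index("-server") + 1 if "-server" in args else 0
    args[position:position] = missing
    return prefix + " ".join(args)


# Made-up example line; a real java.args line carries more arguments.
print(add_gc_flags("java.args=-server -Xmx512m"))
```

Keep a backup copy of jvm.config before changing it, since a malformed arguments line can prevent ColdFusion from starting.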


These attributes turn on verbose garbage collection logging, which is written to ColdFusion root \runtime\logs\coldfusion-out.log. Typical output looks like this:

 01/13 13:39:23 user ColdFusionStartUpServlet: ColdFusion MX: application services are now available
 {Heap before GC invocations=55:
 Heap
  PSYoungGen      total 4352K, used 3319K [0x10010000, 0x10580000, 0x138f0000)
   eden space 3136K, 99% used [0x10010000,0x1031ffb8,0x10320000)
   from space 1216K, 15% used [0x10450000,0x1047e010,0x10580000)
   to   space 1216K, 0% used [0x10320000,0x10320000,0x10450000)
  PSOldGen        total 8960K, used 5625K [0x138f0000, 0x141b0000, 0x30010000)
   object space 8960K, 62% used [0x138f0000,0x13e6e480,0x141b0000)
  PSPermGen       total 21504K, used 16105K [0x30010000, 0x31510000, 0x38010000)
   object space 21504K, 74% used [0x30010000,0x30fca538,0x31510000)
 148.758: [GC 8945K->5961K(13312K), 0.0056180 secs]
 Heap after GC invocations=55:
 Heap
  PSYoungGen      total 4352K, used 312K [0x10010000, 0x10550000, 0x138f0000)
   eden space 3008K, 0% used [0x10010000,0x10010000,0x10300000)
   from space 1344K, 23% used [0x10300000,0x1034e020,0x10450000)
   to   space 1024K, 0% used [0x10450000,0x10450000,0x10550000)
  PSOldGen        total 8960K, used 5649K [0x138f0000, 0x141b0000, 0x30010000)
   object space 8960K, 63% used [0x138f0000,0x13e74480,0x141b0000)
  PSPermGen       total 21504K, used 16105K [0x30010000, 0x31510000, 0x38010000)
   object space 21504K, 74% used [0x30010000,0x30fca538,0x31510000)
 }

As you can see, this provides detailed information about how exactly the JVM is allocating memory at any point in time, as well as how long cleanup took.

If you see that, over time, the percentage of memory used for any specific space stays over 90 percent, and that garbage collection is taking a relatively long time (1 second would be considered a long time), then you most likely have a problem. The sample output above, for instance, is taken from a system under almost no load. Notice the line Heap before GC invocations=55, which tells you that the data that follows shows how the JVM has assigned memory before garbage collection. Then look at one of the generations, PSOldGen (generations are essentially buckets of memory where certain types of Java objects are stored, according to their frequency of use). Notice that it is using 62 percent of the memory available to it. PSOldGen is the name for the "old generation," which is where longer-term Java objects (such as cached templates, or data in session variables) are stored. After garbage collection, there is little change in the amount of memory used. In this case, we should consider increasing the maximum heap size, since under load we can expect PSOldGen to grow and perhaps lead to the JVM's running out of available memory.
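Scanning a long coldfusion-out.log for these conditions by hand is tedious, so a small script can help. The Python sketch below parses only the collection summary line in the form shown in the sample output (for example, 148.758: [GC 8945K->5961K(13312K), 0.0056180 secs]); other JVM versions and collectors print different formats. As a simplification, it checks whole-heap usage after collection rather than each individual space, and the function names are hypothetical.

```python
import re

# Matches the summary line as it appears in the sample output only.
GC_EVENT = re.compile(
    r"(?P<uptime>\d+\.\d+): \[GC "
    r"(?P<before>\d+)K->(?P<after>\d+)K\((?P<total>\d+)K\), "
    r"(?P<pause>\d+\.\d+) secs\]"
)


def parse_gc_event(text):
    """Return a dict describing the first GC summary found in `text`,
    or None if the text contains no such line."""
    match = GC_EVENT.search(text)
    if match is None:
        return None
    return {
        "uptime_secs": float(match.group("uptime")),
        "before_kb": int(match.group("before")),
        "after_kb": int(match.group("after")),
        "total_kb": int(match.group("total")),
        "pause_secs": float(match.group("pause")),
    }


def is_suspect(event, max_pause_secs=1.0, max_used_fraction=0.9):
    """Flag a collection that was slow AND left the heap nearly full.
    Uses whole-heap figures, a simplification of the per-space rule."""
    used_fraction = event["after_kb"] / event["total_kb"]
    return event["pause_secs"] >= max_pause_secs and used_fraction > max_used_fraction


event = parse_gc_event("148.758: [GC 8945K->5961K(13312K), 0.0056180 secs]")
print(event["after_kb"], event["total_kb"], is_suspect(event))
```

For the sample line, the heap is well under half full after collection and the pause is a few milliseconds, so the event is not flagged.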

This data is undeniably hard to read, and many people prefer some sort of visualization of memory usage rather than looking at log files. There are a number of tools to do this, such as JProbe or AppPerfect, that allow very sophisticated analysis of your J2EE application (in this case, ColdFusion MX 7), and of memory usage and garbage collection. A simple free tool from Sun called Visual GC, which is part of the jvmstat toolkit, allows you to visualize garbage collection as it is actually happening. You can use Visual GC to confirm memory leaks, memory problems (such as not enough memory), problems with JVM garbage collection, and other issues in real time, without much of the effort and drudgery of reading log files.

To use Visual GC you need the jvmstat toolkit. Download it from http://java.sun.com/performance/jvmstat/ and unpack it. (At this writing, jvmstat 3.0 requires that you download and install the JDK 5.0.) For the examples here, we will use the c:/jvmstat directory, but you can use any directory you wish.

Follow these steps to use Visual GC from the jvmstat toolkit with ColdFusion for monitoring the JVM:

1. Start ColdFusion as the user you are logged in as. The easiest way to do this on Windows is to stop ColdFusion and then start it from the command line, using the cfstart.bat file found in the cfusionmx7/bin directory.

2. Next, you need to get the ID for the ColdFusion JVM, which you will supply to jvmstat. Open up a command shell and navigate to /jvmstat/bat and then type jps. You will see the ID for the ColdFusion JVM (this ID will change each time you restart ColdFusion).

3. In the same command shell, type the command visualgc theID. This launches Visual GC's GUI (Figure 2.13).

Figure 2.13. ColdFusion JVM as seen through Sun's Visual GC (jvmstat).
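Because the JVM ID changes on every restart, steps 2 and 3 are a natural candidate for a little scripting. By default, jps prints one line per running JVM: the ID followed by a short class or JAR name. The Python sketch below parses output in that form and picks out matching IDs. The function names are hypothetical, and the sample text is made up, since the name under which the ColdFusion JVM appears depends on your installation.

```python
def parse_jps_output(text):
    """Turn jps output into a list of (jvm_id, name) pairs, skipping
    any line that does not start with a numeric ID."""
    entries = []
    for line in text.splitlines():
        parts = line.split(None, 1)
        if not parts or not parts[0].isdigit():
            continue
        name = parts[1].strip() if len(parts) > 1 else ""
        entries.append((int(parts[0]), name))
    return entries


def find_jvm_ids(entries, name_fragment):
    """Return the IDs whose name contains name_fragment (case-insensitive)."""
    wanted = name_fragment.lower()
    return [jvm_id for jvm_id, name in entries if wanted in name.lower()]


# Made-up sample; run jps yourself to see the real names on your server.
sample = "1234 Jps\n5678 jrun\n"
print(find_jvm_ids(parse_jps_output(sample), "jrun"))
```

In practice you would capture the text with something like subprocess.run(["jps"], capture_output=True, text=True).stdout and then pass the chosen ID to visualgc, provided both commands are on your path.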


As you can see in Figure 2.13, all the information about the JVM's memory usage is captured here. You can also see garbage collections happen in real time. A great way to use this tool is to monitor an application's JVM under load when you are initially testing the application. You can also use this data to troubleshoot your system, watching to see if garbage collections are taking too long and to get visual confirmation when you suspect that the JVM is running out of memory. In Chapter 4, we will discuss how to use this information for tuning the JVM to increase performance and system stability.



Advanced Macromedia ColdFusion MX 7 Application Development
ISBN: 0321292693
Year: 2006
Pages: 240
Authors: Ben Forta, et al
