4.5 CQS and System Logger performance considerations

System Logger performance is addressed in general in Chapter 8, "System Logger performance and tuning" on page 263. This section reviews, building on earlier discussions, the performance considerations unique to IMS shared queues and CQS.

The primary performance consideration for CQS with respect to the System Logger is how often a structure checkpoint is taken. Unless the System Logger is performing exceptionally poorly, the writing of log records is unlikely to have any measurable impact on the overall performance of the IMS shared queues environment. However, a structure checkpoint is required to establish a new recovery point for the message queues and to delete log records that are no longer required from the log stream, before the log stream grows too large and the directory entries in the LOGR policy fill up (message CQS0350W).

Before a structure checkpoint can be taken, all CQSs in the shared queues group must be temporarily quiesced (no message traffic to or from IMS). Since only one of the CQSs performs the structure checkpoint, that CQS must notify the others to quiesce, and then confirm that they have done so. This coordination is done by updating the CFRM CDS and requires several I/Os per CQS. It is this coordination, not the actual reading of the message queues into the dataspace, that typically takes most of the time for this process.

Because there is a small impact to the IMS service every time a structure checkpoint is taken, you do not want to take checkpoints too frequently. On the other hand, if structure recovery is required, you do not want to have to reapply gigabytes of log data, so you do not want to run too long between structure checkpoints.

4.5.1 Recommendations

To minimize the frequency of structure checkpoints, you should have offload data sets (LS_SIZE in the LOGR policy) large enough to hold all the data generated between structure checkpoints. The amount of offload DASD space required depends on how much data you log between structure checkpoints, which is a function of transaction volume and the amount of data logged per transaction. For example, a system processing 1000 transactions per second and logging an average of 2000 bytes per transaction would generate 2 MB of log data per second, as below:

    1000 transactions per second times 2000 bytes per transaction = 2 MB per second 

At this logging rate, you will generate 7.2 GB of log stream data per hour. Since the log stream must retain all the log data for up to two structure checkpoint intervals, a one-hour checkpoint interval means 14.4 GB of log stream data on DASD. Each shared queues environment will have different transaction rates and message sizes. Each installation will have different amounts of DASD available for offload data sets. And each application will have a different pain threshold for the quiesce time of a structure checkpoint. All these factors must be considered when deciding how large the offload data sets should be.
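As a rough sizing sketch (the one-hour checkpoint interval and the LS_SIZE value below are illustrative assumptions, not recommendations), LS_SIZE is specified in 4 KB units, so the retained log data translates into a number of offload data sets as follows:

    7.2 GB per hour times 2 checkpoint intervals = 14.4 GB retained on DASD 
    LS_SIZE(51200) = 51200 times 4 KB = about 200 MB per offload data set 
    14.4 GB / 200 MB per data set = about 72 offload data sets 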

To arrive at a "happy" frequency for your environment, we recommend that you take a structure checkpoint at a busy time and measure the impact. Similarly, you should test recovering a structure to determine how many seconds it takes to reapply each hour's worth of log data. With this information, you can decide how often you want to take structure checkpoints.

4.5.2 Tools

Some tools can be used to help size the log stream and logger structure. Others help analyze the performance of the log stream.

IMS shared queues log stream

Approximately 80–90% of the log volume generated is the logging of the input and output messages. To estimate the size of these messages, analyze the IMS logs to determine the average message size (record types X'01' and X'03' for full function; X'5901' and X'5903' for Fast Path EMH). Several utilities and tools can analyze the log stream and report on the number and size of messages, including the IMS Statistical Analysis utility (DFSISTS0), the Fast Path Log Analysis utility (DFSULTA0), and IMS Performance Analyzer (program number 5655-E15).

Administrative Data Utility (IXCMIAPU)

This system utility (see Example 4-6 on page 128) can be used to determine the current size of the log stream. Specifically, it reports the LOGR policy parameters for each log stream and how many offload data sets each log stream is currently using.
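A minimal sketch of such a job follows. The log stream name is an illustrative assumption; substitute the names of your own CQS log streams:

    //LISTLOGR JOB ...
    //*  List the LOGR policy definition and current offload
    //*  data set usage for a CQS log stream
    //LIST     EXEC PGM=IXCMIAPU
    //SYSPRINT DD SYSOUT=*
    //SYSIN    DD *
      DATA TYPE(LOGR) REPORT(NO)
      LIST LOGSTREAM NAME(IMSQSG1.QMSG.LOG) DETAIL(YES)
    /*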

SMF and RMF

The System Logger writes type 88 (X'58') SMF records, which can be used to monitor how much data is being written to, and deleted from, the log streams, broken down between interim storage (the logger structure) and the offload data sets. IXGRPT1 in SYS1.SAMPLIB can be used to report on these type 88 SMF records. See Appendix A, "Rebuilding the LOGR Policy" on page 297 for a detailed description of this report.
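Before running IXGRPT1, the type 88 records must be extracted from the SMF data sets. A minimal IFASMFDP sketch follows; the input and output data set names are illustrative assumptions:

    //SMF88    JOB ...
    //*  Extract SMF type 88 records for input to IXGRPT1
    //DUMP     EXEC PGM=IFASMFDP
    //*  Input: an SMF data set (name is illustrative)
    //INDD1    DD DSN=SYS1.MAN1,DISP=SHR
    //*  Output: type 88 records only
    //OUTDD1   DD DSN=MYHLQ.SMF88.DATA,DISP=(NEW,CATLG),
    //            UNIT=SYSDA,SPACE=(CYL,(10,10))
    //SYSPRINT DD SYSOUT=*
    //SYSIN    DD *
      INDD(INDD1,OPTIONS(DUMP))
      OUTDD(OUTDD1,TYPE(88))
    /*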

Coupling Facility structure activity is recorded in SMF type 74 (X'4A') records. RMF Monitor III can be used to report on this activity.

4.5.3 Warning signs

A structure checkpoint causes all message activity to quiesce until the message queues have been read into a dataspace. If you see periods of several seconds during which all response times suddenly increase, this may be due to a structure checkpoint. If this is unacceptable, you may need to reduce the frequency of, or reschedule, your structure checkpoints.

The IXGRPT1 report described in 4.5.4, "SMF 88 fields of interest" on page 133 also provides warning signs in the form of type 3 writes.

4.5.4 SMF 88 fields of interest

The System Logger will generate SMF type 88 records if requested in the SMFPRMxx member of SYS1.PARMLIB. These records can be gathered and printed using a sample program and JCL found in SYS1.SAMPLIB (IXGRPT1 or IXGRPT1J). A sample output listing, and a description of the output fields, can be found in "IXGRPT1 Field Summary and IXGSMF88 Cross Reference" on page 283.

For CQS, or any log writer, the first goal is not to let the logger structure fill up. This means that offload performance must be adequate to relieve structure constraints before they become a problem. The IXGRPT1 report shows the different types of writes, structure full conditions, and the average buffer size. All these numbers are for the SMF monitor interval only - they are not cumulative, so they may change significantly as different workloads become prevalent.

  • A type 1 write means that CQS wrote a log record while the structure was below its high threshold. These should be the most common.

  • A type 2 write means that CQS wrote a log record while the structure was at or above the HIGHOFFLOAD threshold. This is common and does not indicate any problem. An offload should already be in progress - probably just started.

  • A type 3 write indicates that CQS wrote a log record while the structure was more than 90% full. This should not be common, and may indicate that your HIGHOFFLOAD value is too high - you are close to filling up the structure. It may also be a sign that the offload process is not performing well enough to free up space; data set allocation can sometimes slow it down. If you are getting many type 3 writes, lower the HIGHOFFLOAD threshold and increase LS_SIZE (larger offload data sets mean fewer allocations).

  • Structure full is bad. It means that CQS tried to write a log record but could not, because there were no more entries or data elements in the structure. The write fails and CQS issues message CQS0350W. Structure full conditions are an extension of the type 3 writes, and you should take the same actions.

  • Average buffer size is the total number of bytes written by IXGWRITE users divided by the number of writes invoked. This is the number the System Logger uses to recalculate the entry-to-element ratio, although the System Logger uses a 30-minute interval. See the worked example following this list.
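As an illustrative sketch of how the average buffer size drives the entry-to-element ratio (the 2000-byte average is an assumption carried over from the earlier example, and the 256-byte element size assumes a logger structure with a MAXBUFSIZE of 65276 or less):

    2000 bytes average buffer size / 256 bytes per element = 8 elements (rounded up) 
    resulting entry-to-element ratio = 1:8 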

The SMF 88 records do not help you monitor the second important goal - not letting the log stream itself fill up (that is, all the offload data sets and the logger structure are full). This occurs when there are no more offload data sets available. It can be tracked using IXCMIAPU, as described in 4.3.4, "Monitoring for offload conditions" on page 128.

4.5.5 Sizing log streams

A lot has already been said about sizing the log streams and logger structures. The main criterion is to not run out of offload data set space: you always need enough room on DASD to hold all the log data generated through two structure checkpoint intervals. The size of the logger structure is not as critical; a smaller structure simply invokes the offload process more often. A good target is to make the structure large enough that, whatever your HIGHOFFLOAD threshold is, offload processing is not invoked more frequently than about once every 30–60 seconds during peak transaction processing times, as in the sketch below.
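Continuing the earlier 2 MB per second logging example (the 60-second offload interval and the HIGHOFFLOAD(50)/LOWOFFLOAD(0) settings are illustrative assumptions):

    2 MB per second times 60 seconds = 120 MB written between offloads 
    with HIGHOFFLOAD(50) and LOWOFFLOAD(0), each offload frees about half 
    the structure, so the structure should be roughly 
    120 MB / 0.50 = 240 MB 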

4.5.6 Threshold recommendations

The only thresholds of interest to CQS are the HIGHOFFLOAD and LOWOFFLOAD thresholds in the log stream definition in the LOGR policy. Since CQS rarely needs to read these logs after they are written, the LOWOFFLOAD threshold should always be set to zero (0%) - that is, when offload processing is invoked, offload as much as possible. The HIGHOFFLOAD threshold should be set between 50% and 80%. We recommend 50%, since this leaves plenty of room in the structure in case offload processing is delayed. A sample definition follows.
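A minimal IXCMIAPU sketch applying these thresholds follows. The log stream name is an illustrative assumption; substitute your own:

    //UPDLOGR  JOB ...
    //*  Apply the recommended CQS offload thresholds
    //UPDATE   EXEC PGM=IXCMIAPU
    //SYSPRINT DD SYSOUT=*
    //SYSIN    DD *
      DATA TYPE(LOGR) REPORT(NO)
      UPDATE LOGSTREAM NAME(IMSQSG1.QMSG.LOG)
             HIGHOFFLOAD(50) LOWOFFLOAD(0)
    /*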


