4.1 Functional description


This section provides an overview of the shared queues components of an IMSplex environment.

4.1.1 Overview of shared queues in the IMSplex

The term IMSplex is used by IMS to describe an IMS environment that uses Parallel Sysplex services to share data, message queues, and other resources. To share the message queues, IMS uses the Common Queue Server (CQS) to put messages on, and retrieve messages from, shared queue structures on a Coupling Facility. CQS uses the System Logger to log this activity. These log records are required only for shared queue structure recovery. Each IMS system continues to log its own information in its own online log data sets (OLDS).

Figure 4-1 shows a high-level view of the IMS components required for two IMSs to share a common set of message queues.

Figure 4-1: IMSplex shared queues component architecture

IMS subsystems and the OLDS

Each IMS subsystem (IMS1 and IMS2 in Figure 4-1) continues to provide connectivity to its logged-on end users. In a non-shared queues environment, IMS logs input messages received from these end users in its OLDS and queues the messages on a local transaction queue in its message queue pool (QPOOL). Since it resides only locally, only that IMS has access to the message (transaction). When the transaction is scheduled (locally) and a response returned, IMS logs the response message on the OLDS and queues it in the same message queue pool. These log records can be used to recover the message queues should an error occur resulting in the loss of the queues.

To share these message queues across multiple IMSs in the IMSplex, IMS execution parameters are set up to define a shared queues group to which each IMS belongs. In a shared queues environment, IMS still logs the input message on the OLDS as before, but instead of queuing it in a local message queue pool, IMS issues a PUT request to CQS to queue the message on a transaction queue in a Coupling Facility list structure. Once CQS has accepted the message, IMS deletes it from its local queue pool. Note, however, that these IMS log records cannot be used to recover the shared queues.

Common Queue Server

The Common Queue Server (CQS) functions as a server between IMS and the shared queue structures. IMS does not connect directly to these structures, but uses CQS services to put messages on the shared queue, to read messages from the shared queue, to move a message from one queue to another, and to delete messages when processing is complete. CQS is required on each z/OS image where an IMS control region sharing the message queues is running.

CQS is responsible for the integrity of these queues - not IMS. When you are not using shared queues, IMS recovers local message queues by using a copy of the message queues on the IMS logs (for example, a SNAPQ or DUMPQ) and then applying later-logged changes to that copy. When you are using shared queues, CQS follows an equivalent process. A copy of the contents of the shared queues structure is written periodically to a data set called the Structure Recovery Data Set (SRDS). Changes to the shared queues are logged by CQS. However, CQS does not have its own internal logger like IMS does. Instead, CQS uses the System Logger to log updates to the shared queues. Until CQS has successfully logged these changes, no updates are made to the shared queue structure.

Structure recovery data sets (SRDSs)

There is one set of SRDSs for the entire shared queues group. They are shown in Figure 4-1 on page 112 as SRDS1 and SRDS2. These establish a base point from which the log records can be applied to recover the shared queues if they are lost. The process of making the copy is called a structure checkpoint and is usually initiated by an IMS command, as follows:

    /CQCHKPT SHAREDQ STRUCTURE structure-name

Each structure checkpoint uses the older of SRDS1 and SRDS2. When it completes successfully, CQS issues a call to the System Logger to delete those log records which are older than the older of the two SRDSs. By taking frequent structure checkpoints, you can limit the amount of log data kept by the System Logger in the log stream. We'll address this in more detail later in this chapter.
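The alternation between the two SRDSs and the resulting log cleanup can be pictured with a minimal sketch. It is only a model of the behavior described above, not CQS code: the `Srds` class, the function name, and the use of plain integers as log sequence numbers are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Srds:
    name: str
    # Log sequence number at which this copy was taken; None means never used.
    checkpoint_seq: Optional[int] = None


def take_structure_checkpoint(srds_pair: List[Srds], log: List[int], current_seq: int) -> str:
    """Overwrite the older SRDS, then trim the log behind the older checkpoint."""
    # An SRDS that has never been written counts as the oldest.
    target = min(
        srds_pair,
        key=lambda s: -1 if s.checkpoint_seq is None else s.checkpoint_seq,
    )
    target.checkpoint_seq = current_seq

    taken = [s.checkpoint_seq for s in srds_pair if s.checkpoint_seq is not None]
    if len(taken) == len(srds_pair):
        oldest_needed = min(taken)
        # Delete, in place, every log record older than the older checkpoint.
        log[:] = [seq for seq in log if seq >= oldest_needed]
    return target.name
```

In this model, the log always retains everything written since the older of the two checkpoints, so either SRDS can serve as the base point for recovery.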

Note that, no matter which CQS invokes structure checkpoint, the same set of SRDSs is used. A method of serialization is provided to prevent two CQSs from taking a structure checkpoint at the same time.

CQS checkpoint data sets

Each CQS also takes a system checkpoint periodically. CQS system checkpoint data is written to the log stream and the log token associated with that log record is written to the checkpoint data set and to the shared queue structure. The checkpoint data is used when CQS is restarted to resynchronize CQS and IMS. Note, however, that this system checkpoint data may be deleted from the log stream if it is older than the older of the two SRDSs. If this happens, the next CQS restart will be a cold start. There is, however, no loss of message queue integrity, and no loss of messages, since IMS has enough information on its logs to resynchronize the queues.
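The cold start condition just described reduces to a single comparison. The following sketch assumes that checkpoints can be compared by an integer log sequence number; the function name and the "warm" label for a restart that still finds its checkpoint data are illustrative only.

```python
from typing import Iterable


def cqs_restart_mode(system_checkpoint_seq: int, srds_checkpoint_seqs: Iterable[int]) -> str:
    """Return "cold" if the system checkpoint record has been trimmed from the log."""
    # Log records older than the older structure checkpoint may have been deleted.
    oldest_retained = min(srds_checkpoint_seqs)
    return "warm" if system_checkpoint_seq >= oldest_retained else "cold"
```

For example, with structure checkpoints at positions 40 and 60, a system checkpoint at position 50 is still in the log stream, while one at position 30 is not.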

Shared queue structures

IMSs in a shared queues group can share a common set of messages by putting them in a shared queue list structure in a Coupling Facility. You will note in Figure 4-1 on page 112 that there are two of everything - one labeled MSGQ and the other EMHQ. These represent messages queued by IMS as full function messages, or fast path Expedited Message Handler (EMH) messages. EMH messages are defined to IMS separately and treated differently than full function messages. They have their own set of local and shared message queue structures, and CQS uses a separate log stream to log changes to these queue structures. The rest of this section, and this chapter, addresses full function message queues (MSGQ) only, but be aware that everything said also applies to the fast path EMH message queues (EMHQ).

Primary structure

This is the only shared queue structure that is always required in a shared queues group. It is implemented as a list structure and is the primary repository for IMS full function messages (and some control information). Within this structure there are list headers for input transactions, output messages, remote MSC destinations, and several special purpose list headers which are not of interest here.

In normal operation, there will be very few messages in the primary structure. Messages only reside in the structure for the life of the transaction. As soon as the transaction completes, the messages will be deleted.

Overflow structure

This list structure is optional. If it is defined, it is used to prevent the primary structure from filling up. Individual queues (for example, a queue of messages to a printer which is out of paper) may get long and utilize a lot of space in the structure. These queues are identified by CQS and moved from the primary structure to the overflow structure, reclaiming space in the primary structure and allowing normal processing to continue. Any more messages for those queues are put directly on the queues in the overflow structure instead of the primary structure. If the overflow structure fills up, only those queues in the overflow structure are affected.
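The routing rule for queues that have been moved can be sketched as follows. This is a simplified model: the class and method names are hypothetical, and how CQS decides which queues to move is deliberately left to the caller because it is not described here.

```python
class SharedQueues:
    """Toy model of a primary and an overflow list structure."""

    def __init__(self):
        self.primary = {}   # queue name -> list of messages
        self.overflow = {}  # queue name -> list of messages

    def put(self, queue, message):
        # Once a queue lives in the overflow structure, new messages go there directly.
        target = self.overflow if queue in self.overflow else self.primary
        target.setdefault(queue, []).append(message)

    def move_to_overflow(self, queue):
        # Reclaim primary structure space by relocating the whole queue.
        self.overflow[queue] = self.primary.pop(queue, [])
```

Because only the moved queues live in the overflow structure, a full overflow structure leaves the queues in the primary structure unaffected.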

System Logger

As already mentioned, CQS does not have its own internal logger. It uses the services of the System Logger (IXGLOGR) to log any changes to the shared queue structures. The System Logger is a component of z/OS and executes in a single address space in each z/OS image. All users of System Logger services (for example, CQS) use the same copy. When CQS wants to write a log record for a shared queue structure change, it issues an IXGWRITE request to the System Logger, which then externalizes the log record in the log stream identified on the write request. In an IMS shared queues environment, this log stream always exists in a logger structure in the Coupling Facility. Other users of the System Logger's services (for example, CICS) may elect to save their log stream on a staging data set. When this is done, the log stream cannot be shared across multiple z/OS images in the Parallel Sysplex.

Details of the System Logger can be found in Chapter 2, "System Logger fundamentals" on page 7 and in z/OS MVS Setting Up a Sysplex, SA22-7625.

Log streams

A log stream is a sequential series of log records written by the System Logger at the request of a log writer such as CQS. They are written in the order of their arrival and may be retrieved sequentially, forward or backward, or uniquely, by log token (key). For an IMS shared queues group, there may be two log streams - one for full function and another for fast path EMH. This is true even if the shared queues are split across a primary and an overflow structure. The more recent log stream records will reside in interim storage in a Coupling Facility list structure (the logger structure). Older records may be offloaded to permanent storage on DASD (offload data sets). Eventually CQS will delete the oldest log records from the log stream when they are no longer needed for message queue recovery. When this occurs, the logger structure space, or the DASD space, used to hold these log records is freed up.

System Logger structures

IMS shared queues always uses Coupling Facility list structures for its log streams. A DASD-only log stream is not an option for the interim storage of the log stream since the staging data sets cannot be shared across the sysplex by other members of the shared queues group. There is one logger structure for the full function log stream and another for the fast path EMH log stream (if it exists). The logger structure is defined in the CFRM policy and is allocated as a persistent list structure. Persistence means that the logger structure is retained even when there are no connected users. Although log streams can share a logger structure, IMS does not recommend placing the full function and fast path EMH log streams in the same structure. Each log stream would be allocated one half of the available space in the structure and it is not likely that the characteristics of the full function and EMH workloads would be similar.

Offload data sets

When the logger structure reaches a user-specified (or default) high offload threshold, one of the System Loggers in the sysplex will offload all or part of the log stream from the logger structure to permanent storage on an offload data set. The offload data sets are VSAM linear data sets. By default, up to 168 offload data sets may be allocated per log stream. Once in an offload data set, the log records remain there until deleted by CQS. Since the log stream can potentially become quite long, you may define, in the LOGR policy, multiple extents to contain the offloaded log stream. Each extent consists of another 168 data sets. It is very unlikely that CQS would require more than a single extent of 168 offload data sets.

Staging data set or dataspace

The System Logger on each z/OS image will always duplex the interim storage portion (that part in the logger structure) of a log stream. One copy of interim storage goes in the logger structure. You have a choice about where the second copy goes. It may go to a dataspace, or you may choose to keep the second copy in a data set called the staging data set. All log records which exist in the logger structure also exist in one of these two places. As soon as they are deleted from the structure, or offloaded from the structure to an offload data set, they are also deleted from the staging data set or dataspace. Note that log write requests are synchronous and do not complete until the log record is written to both copies of the interim storage. The use of staging data sets for the duplexed copy of interim storage may inhibit performance due to the synchronous DASD I/O.

Each copy of the System Logger has its own staging data set or dataspace in which it keeps a copy only of what that System Logger put in the structure. However, if there is only one CQS active in the shared queues group, or only one is generating a significant amount of log data, the log data will have been written by one System Logger, so the staging data set size (defined in the LOGR policy) should be large enough to hold all the log records in the structure. Because of the way space is used in the staging data set, the data set should be much larger than the structure.

Log stream size and relationships

Figure 4-2 on page 116 shows that portions of the log stream may exist in both the logger structure and on the offload data sets. The beginning of the log stream (the oldest active record) is on the offload data set containing the older of the two structure checkpoints. The end of the log stream is in the logger structure. When CQS takes a structure checkpoint, it will delete log records from the beginning of the log stream, establishing a new beginning just after the older structure checkpoint. The first log record at the new beginning will be a "structure checkpoint begin" (type x'4201'). This will be for the older of the two structure checkpoints.

Figure 4-2: Where is the log stream?

When sizing the logger structure (defined in the CFRM policy) and the offload data sets (defined in the LOGR policy), you must allow for enough space in the log stream to contain all the log record entries since the older structure checkpoint. The size of the structure determines how frequently the System Logger will offload, but the size of the offload data sets determines how much data can exist in the log stream.

The log stream shown in Figure 4-2 currently exists in the structure and four offload data sets. The next offload(s) will complete filling offload data set "5" and then allocate number "6". Data sets "0" and "1" contain data that has been logically deleted; however, the data sets will not be physically deleted until the System Logger goes through allocation processing for a new offload data set. There can be only 168 offload data sets allocated at one time, although the sequence numbers in the data set names do not reset to "0".

4.1.2 How CQS uses the System Logger

IMS may receive messages from several sources, including the network or an application program inserting to the message queues. Before processing that message, IMS must first queue it in the shared queue structure where it is available to all IMSs in the shared queues group.

Message logging

Figure 4-3 on page 117 shows how message and log data flow during the receiving and enqueuing of an IMS input transaction. Similar functions occur for output messages. Every activity that changes the shared queue structure is logged by CQS.

Figure 4-3: Flow of message and log traffic

1. When these messages are received in the local IMS message queue pool (QPOOL), they are logged to the IMS OLDS, after which a request is made to CQS to PUT the message on the shared queue. That IMS task (ITASK) then waits for CQS to accept the message and respond.
2. When CQS gets the PUT request from IMS to put a message on the shared queue, CQS creates a log record in a local CQS buffer and issues an IXGWRITE request to the System Logger to write the log record to the appropriate log stream. CQS then waits for acknowledgement that it has been logged before continuing with the original PUT request from IMS.
3. When the System Logger gets the IXGWRITE request from CQS, it writes the log record to the dataspace or staging data set (3a), then writes it to the LOGR structure (3b). When it successfully completes this process, it acknowledges positively to CQS.
4. When the System Logger acknowledges to CQS that it has successfully logged the data, CQS PUTs the message on the shared queue structure and responds to IMS. When CQS responds that it has successfully queued the message, IMS deletes that message from its local QPOOL.
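The ordering of these steps, in which the log write always precedes the structure update, can be sketched with three small classes. They are stand-ins invented for illustration: the `write` method merely plays the role of IXGWRITE, and nothing here reflects the real CQS or System Logger interfaces.

```python
class SystemLogger:
    def __init__(self):
        self.duplex_copy = []       # dataspace or staging data set
        self.logger_structure = []  # interim storage in the Coupling Facility

    def write(self, record):
        """Stand-in for an IXGWRITE request; completes only after both copies exist."""
        self.duplex_copy.append(record)       # step 3a
        self.logger_structure.append(record)  # step 3b
        return True


class Cqs:
    def __init__(self, logger):
        self.logger = logger
        self.shared_queue = []

    def put(self, message):
        # Step 2: log first, and wait for the acknowledgement.
        if not self.logger.write(("PUT", message)):
            return False
        # Step 4: only now update the shared queue structure.
        self.shared_queue.append(message)
        return True


class Ims:
    def __init__(self, cqs):
        self.cqs = cqs
        self.olds = []
        self.qpool = []

    def receive(self, message):
        # Step 1: queue locally, log to the OLDS, then ask CQS to PUT.
        self.qpool.append(message)
        self.olds.append(message)
        if self.cqs.put(message):
            # Step 4: CQS accepted the message, so the local copy can go.
            self.qpool.remove(message)
```

If the log write is not acknowledged in this model, the message stays in the local QPOOL and the shared queue is left untouched.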

Shared log stream

There are multiple IMSs and multiple CQSs in a shared queues group. But there is only one shared queue structure, and therefore only one log stream with updates to that shared queue structure. Every CQS in the IMSplex must use the same log stream. Each CQS writes update log records to that log stream and the System Loggers interleave those log records on the log stream in the order in which they are received—that is, in time sequence.
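As a rough illustration of the time-sequence interleaving, the following sketch merges the records written by two CQSs into one stream. The tuple layout (timestamp, writer, payload) and the function name are assumptions; the real System Logger builds the stream as the writes arrive rather than merging after the fact.

```python
import heapq


def merge_log_stream(*per_cqs_records):
    """Merge per-writer record lists, each already in time order, into one stream."""
    return list(heapq.merge(*per_cqs_records, key=lambda record: record[0]))


cqsa = [(1, "CQSA", "put m1"), (4, "CQSA", "delete m1")]
cqsb = [(2, "CQSB", "put m2"), (3, "CQSB", "put m3")]
stream = merge_log_stream(cqsa, cqsb)
```

A reader of the merged stream sees the updates from all CQSs in the order in which they were applied to the shared queue structure.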

Log writes

Figure 4-4 on page 118 shows how log records are interleaved on the log stream by the System Logger in the order in which they are written by the CQS clients.

Figure 4-4: Shared log stream

Log reads

The log records are needed by CQS when it detects that a shared queue structure has failed. When this happens, the first CQS to detect the failure is responsible for recovering the structure using the SRDS and all the log records from the time of the most recent complete structure checkpoint. Figure 4-5 on page 119 shows that one CQS can recover the message queue structure by reading the merged log stream. All log records from all CQSs are returned to CQSA.

Figure 4-5: Reading the log stream

1. When CQS discovers that a shared queue structure needs to be recovered, it allocates a new structure in the Coupling Facility, reads the SRDS, and restores the structure to its condition at the time the structure checkpoint was taken.
2. CQS then issues IXGBRWSE requests to the System Logger to read the log stream forward, beginning with the first log record created after the structure checkpoint was taken and continuing to the most recent log record. As each log record is read, it is used to update the shared queue structure. Although part of the log stream may be on DASD, a CQS reading the log stream merely has to issue IXGBRWSE requests to the System Logger to retrieve all required log records.
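Structure recovery as described here is a restore followed by a replay. The sketch below models that idea only: the record layout (sequence, operation, queue, message), the "PUT" and "DELETE" operation names, and the dictionary used for the structure are assumptions, not the CQS log record formats.

```python
def recover_structure(srds_snapshot, checkpoint_seq, log_stream):
    """Rebuild the shared queues from an SRDS copy plus the later log records."""
    # Restore the structure as it was when the structure checkpoint was taken.
    structure = {queue: list(msgs) for queue, msgs in srds_snapshot.items()}

    # Replay, oldest first, every change logged after the checkpoint.
    for seq, operation, queue, message in sorted(log_stream, key=lambda r: r[0]):
        if seq <= checkpoint_seq:
            continue
        if operation == "PUT":
            structure.setdefault(queue, []).append(message)
        elif operation == "DELETE" and message in structure.get(queue, []):
            structure[queue].remove(message)
    return structure
```

Records at or before the checkpoint are skipped because their effect is already reflected in the SRDS copy.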

Log records

In a shared queues environment, both IMS and CQS create log records with similar functions. For example, IMS logs an input message in a type x'01' log record, and an output message in a type x'03' log record. When IMS requests CQS to put either type of message on the shared queue, CQS logs it as a type x'07' log record. When IMS deletes a message from its local queues, it writes a type x'36' log record. CQS writes a x'0D' log record when it deletes a message from the shared queue. Both IMS and CQS logging occur in a shared queues environment.

Log record types

There are 13 CQS log record types, each with one or more subtypes, that CQS uses to manage itself and the message queue structures. For a description of the CQS log record types, see IMS Common Queue Server Guide and Reference, SC27-1292. The CQS log record DSECTs can be obtained by assembling macro CQSLGREC TYPE=ALL from SDFSMAC (IMS macro library). For the IMS log record DSECTs, assemble macro ILOGREC RECID=ALL in the same library. A description of these log records can be found in IMS V8 Diagnosis Guide and Reference, LY37-3742.

Printing the log stream

The IMS program File Select and Formatting Print Utility, DFSERA10, can be used to print a CQS log stream. Example 4-1 shows a sample job that will print all the log records in the log stream. The CQSERA30 exit formats the log records for you. Since the log stream is probably protected by RACF, the userid will have to be given access. See 4.2.3, "Security definitions" on page 127 for an example of how this is done. The utility is documented in IMS V8 Utilities Reference: System, SC27-1309.

Example 4-1: Sample job for printing the log stream

    //CQSERA10 JOB ....
    //STEP1    EXEC PGM=DFSERA10
    //STEPLIB  DD   DSN=IMSPLEX0.SDFSRESL,DISP=SHR
    //SYSPRINT DD   SYSOUT=A
    //TRPUNCH  DD   SYSOUT=A,DCB=BLKSIZE=80
    //SYSUT1   DD   DSN=MSGQ.LOG.STREAM,SUBSYS=(LOGR,IXGSEXIT),
    //             DCB=BLKSIZE=32760
    //SYSIN    DD   *
    CONTROL  CNTL H=EOF
    OPTION   PRINT EXITR=CQSERA30
    END
    /*

4.1.3 System Logger offload processing

Eventually the logger structure would become full, since IMS never stops putting messages on the shared queue and CQS never stops writing log records. There is a way to shorten the log stream by deleting log records no longer needed, which we will discuss later. But since this is not usually done frequently enough to keep the log stream small enough to fit in a logger structure, the System Logger must periodically copy the older log records to a DASD data set and delete them from the logger structure. This process is called offload processing and is driven by a user-defined high offload threshold percentage.

Figure 4-6 on page 121 shows, at a high level, the offload process. Note that this is strictly a System Logger process and that CQS is not involved in any way. CQS and System Logger activity is not quiesced during the offload process.

Figure 4-6: Offloading the log stream

When the System Logger detects that the logger structure has reached the high offload threshold, it reads enough of the oldest log records from the structure to reduce structure utilization to the user-defined low offload threshold. It writes these records to one or more DASD logger offload data sets before deleting them from the structure, which frees up the space. The records are also deleted from the duplex copy (dataspace or staging data set).
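The effect of the two thresholds can be shown with a small calculation. This sketch assumes that structure usage can be counted in records and that the thresholds are whole percentages; the function and parameter names are illustrative, and the deletion from the duplex copy is not modeled.

```python
def offload(structure_records, capacity, high_pct, low_pct):
    """Return (records left in the structure, records moved to offload data sets).

    structure_records is ordered oldest first; capacity is the number of
    records the structure can hold.
    """
    # Nothing happens until usage reaches the high offload threshold.
    if len(structure_records) * 100 < high_pct * capacity:
        return list(structure_records), []

    # Keep only as many of the newest records as the low threshold allows.
    keep = (low_pct * capacity) // 100
    split = len(structure_records) - keep
    return structure_records[split:], structure_records[:split]
```

With a capacity of 10 records, a high threshold of 80 and a low threshold of 20, the eighth record triggers an offload that moves the six oldest records to DASD and leaves the two newest in the structure.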

4.1.4 Offload data set deletion

An offload data set is eligible for deletion when it contains no active log record entries, that is, when CQS has deleted all log records in that data set. However, physical deletion does not occur until the next offload data set allocation. If this never occurs, then those data sets eligible for deletion may never get deleted, occupying DASD space without performing any useful function. A more likely scenario is that CQS deletes enough log records to bring the log stream entirely within the logger structure. This makes all offload data sets eligible for deletion. If, however, due to low activity or frequent structure checkpoints, allocation of another offload data set is never done, or not done for a long period of time, those data sets will not be physically deleted, or not for a long time. It is therefore of some importance to set the size of these data sets (LS_SIZE parameter in the LOGR policy) large enough so that you never run out of available extents (168), but not so large that a lot of DASD space is consumed with deleted log data.
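The gap between becoming eligible for deletion and being physically deleted can be modeled as follows. The class is hypothetical and tracks record identifiers per data set purely for illustration; it does not reflect how the System Logger manages its data set directory.

```python
class OffloadChain:
    """Toy model of the offload data sets belonging to one log stream."""

    def __init__(self):
        self.data_sets = {}  # sequence number -> active record ids
        self.next_seq = 0

    def eligible(self):
        # A data set with no active records is eligible for deletion.
        return [seq for seq, records in self.data_sets.items() if not records]

    def delete_records_before(self, oldest_needed):
        # Logical deletion only: the data sets themselves stay allocated.
        for seq in self.data_sets:
            self.data_sets[seq] = [r for r in self.data_sets[seq] if r >= oldest_needed]

    def allocate(self, records):
        # Physical deletion happens only as part of allocating a new data set.
        deleted = self.eligible()
        for seq in deleted:
            del self.data_sets[seq]
        self.data_sets[self.next_seq] = list(records)
        self.next_seq += 1
        return deleted
```

In this model, an empty data set continues to occupy space until the next call to `allocate`, which mirrors the situation described above when offload activity is low.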

4.1.5 Importance of the log stream data

Unlike other subsystems such as CICS, CQS only requires the log stream to recover the shared queue structure if it fails (structure failure, Coupling Facility failure, or loss of connectivity from all systems). It may be used by CQS to restart, but it is not required. If the logger structure fails, or becomes full, or an IXGWRITE request to the System Logger fails, CQS immediately initiates a structure checkpoint, eliminating the need for any log records. Of course, CQS cannot perform any more shared queue structure services until the log stream is available, but there is no loss of message queue data unless there is a double failure - that is, if both the log stream and the shared queue structure fail at the same time. Even though the System Logger has its own means of log stream recovery, it is usually wise to keep the two structures on different Coupling Facilities to avoid a single point of failure.
