The movement to a DB2 data sharing environment can be done in two ways. A new install of DB2 can be done, giving you the opportunity to start with a clean subsystem to move applications into. This also makes the monitoring of initial data sharing performance easier, and new naming standards could be implemented at this time. However, although a new install is less painful, it is often not practical.
The other option, of course, is to migrate existing subsystems together by enabling a DB2 subsystem as a data sharing subsystem and then adding members to the group. This is much easier for the movement of very large applications and has less impact on distributed processing. It is also the more common scenario for moving to a data sharing environment.
The complications come with the catalog merge process and the measurement of application performance as the migration occurs. Whether you decide to do a new install or to enable and migrate existing subsystems, there will be issues of how you can effectively measure the performance of data sharing and the impact it is having on your system. Keep in mind that not all applications belong in a data sharing environment. Some applications will still benefit from isolation.
Application analysis, or selection, is the process of evaluating which applications will benefit from data sharing and so belong in a data sharing environment. You will need to determine the application objectives for data sharing in order to set performance objectives. Ask such questions as the following.
These are just a few of the questions that should be addressed in order to implement data sharing with maximum performance-benefit achievement in mind.
Current Environment Evaluation
Evaluate your current DB2 environment in terms of both system and applications before moving to data sharing. Even a move to one-way data sharing can expose previously missed performance problems, although only a few, because interaction with the coupling facility is minimal; two-way data sharing can magnify them further. The time to fix known application and system performance issues is prior to any movement into the data sharing environment. Of course, these same items will still need to be investigated as workload and other factors change in the new data sharing environment. Items to evaluate include locking activity, application commit frequency, bind parameters, use of standard rule-of-thumb recommendations, DSNZPARMs, software maintenance schedule/HIPER application, buffer pools, and recovery/restart procedures.
Despite the many issues when migrating to a data sharing environment, careful planning and testing of the process will help things go smoothly.
Merging existing subsystems is the most common approach and has various pros and cons. The advantages include easier movement of large applications and fewer distributed processing implications. However, the disadvantages include complications with the catalog merge process, owing to the fact that no automated tool is available to help with this process. Depending on the number of objects and the methods used, this process can be laborious and error prone. You will also have to deal with naming convention issues.
You do not want to merge subsystems that do not need shared data; nor do you want to merge test and production subsystems. Rather, you would want to merge subsystems if they are split out only because of capacity constraints, if they need common data, or if they rely on distributed connections or replication to satisfy needs that could be resolved by moving to a data sharing environment. When merging subsystems be sure to evaluate the security schemes for both subsystems and ensure that the same level of security will be in place when they are merged.
Migration of the catalog is not too bad, compared to migration of the data. You first decide which catalog to migrate all other objects to, taking into consideration the preceding items. Query the catalog to determine which databases, table spaces, and indexes must be defined in the target system. Then use DDL to define the objects in the target catalog. Depending on the number of objects, the creation of DDL for this could become a cumbersome process, especially if your DDL is not current, or you do not have a product or process to re-engineer the DDL from the DB2 catalog.
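The comparison step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `CatalogObject` type and the `plan_catalog_merge` function are hypothetical names, and the input lists are assumed to have been extracted beforehand from each subsystem's catalog (for example, from SYSIBM.SYSDATABASE, SYSIBM.SYSTABLESPACE, and SYSIBM.SYSINDEXES). The sketch reports which source objects are missing from the target and which names already exist there and so need manual review.

```python
from typing import Iterable, List, NamedTuple, Tuple


class CatalogObject(NamedTuple):
    """Hypothetical record for one catalog entry pulled from a subsystem."""
    kind: str       # "DATABASE", "TABLESPACE", or "INDEX"
    qualifier: str  # e.g. owning database or index creator; "" for databases
    name: str


# Assumed creation order: databases first, then table spaces, then indexes.
_CREATE_ORDER = {"DATABASE": 0, "TABLESPACE": 1, "INDEX": 2}


def plan_catalog_merge(
    source: Iterable[CatalogObject],
    target: Iterable[CatalogObject],
) -> Tuple[List[CatalogObject], List[CatalogObject]]:
    """Return (objects to define in the target, names present in both)."""
    existing = set(target)
    to_define = set()
    collisions = set()
    for obj in source:
        if obj in existing:
            # Same kind/qualifier/name on both sides: flag for manual review.
            collisions.add(obj)
        else:
            to_define.add(obj)

    def sort_key(obj: CatalogObject):
        return (_CREATE_ORDER.get(obj.kind, len(_CREATE_ORDER)),
                obj.qualifier, obj.name)

    return sorted(to_define, key=sort_key), sorted(collisions, key=sort_key)


if __name__ == "__main__":
    # Placeholder object names, for illustration only.
    source_objects = [
        CatalogObject("INDEX", "PAYROLL", "XEMP1"),
        CatalogObject("DATABASE", "", "DBPAY"),
        CatalogObject("TABLESPACE", "DBPAY", "TSEMP"),
    ]
    target_objects = [CatalogObject("DATABASE", "", "DBPAY")]
    define, review = plan_catalog_merge(source_objects, target_objects)
    for obj in define:
        print("define:", obj.kind, obj.qualifier, obj.name)
    for obj in review:
        print("review:", obj.kind, obj.qualifier, obj.name)
```

An object that appears in both catalogs is not necessarily the same object; it may be a naming conflict, which is why the sketch lists such names separately rather than silently skipping them.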
The establishment of a flexible naming convention is the most important planning event in the process of migrating to a data sharing environment. When done properly, it will reduce operational and administration errors by reducing confusion and will allow easy extension to the sysplex. You will want to plan names carefully because some cannot be changed at all, and any changes are difficult and often error prone.
The names need to be unique within the sysplex, and several categories of names need to be decided on: group-level names, DB2 member-level names, IRLM group and member names, subsystem data sets (DB2), and ICF catalog names. Many may have to change from the current naming convention. Of course, this will depend on the migration option chosen, that is, whether you are starting a data sharing group from scratch or migrating existing systems into the group. For a complete list of items to be named, refer to the DB2 UDB for z/OS Version 8 Data Sharing Planning and Administration Guide.
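A planned naming scheme can be checked for duplicates before anything is installed. The following Python sketch is a minimal illustration, not a DB2 facility: the function name `find_duplicate_names`, the category labels, and the sample names are all hypothetical placeholders. It makes the simplifying assumption that any name used more than once anywhere in the plan deserves review.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def find_duplicate_names(
    planned_names: Dict[str, Iterable[str]],
) -> Dict[str, List[str]]:
    """Map each repeated name to the categories that use it.

    planned_names maps a category label (for example "member") to the
    names planned for that category. Comparison ignores letter case.
    """
    users = defaultdict(list)
    for category, names in planned_names.items():
        for name in names:
            users[name.upper()].append(category)
    return {name: cats for name, cats in users.items() if len(cats) > 1}


if __name__ == "__main__":
    # Placeholder plan: the IRLM member name accidentally reuses "DB1G".
    plan = {
        "group": ["DSNDB0G"],
        "member": ["DB1G", "DB2G"],
        "irlm_group": ["DXRDB0G"],
        "irlm_member": ["DB1G"],
    }
    for name, categories in find_duplicate_names(plan).items():
        print(f"{name} is used by: {', '.join(categories)}")
```

A check of this kind catches only literal duplicates; whether two particular categories are actually allowed to share a name is a question for the planning guide cited above.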
Workload Management and Affinity Processing
One of the biggest advantages of the data sharing environment is the ability to balance your workload across the members in the group. The DB2 subsystems work very closely with the workload manager (WLM) component of z/OS, which allows users to optimally balance incoming work across the subsystems in the group. WLM will manage workloads on the basis of predefined parameters, according to the characteristics and priorities of the work.
This allows you to give more priority and resources to OLTP transactions than to long-running batch queries. WLM balances these workloads by ensuring that the long-running batch queries do not consume the resources needed by the online transactions, allowing the OLTP transactions to achieve the desired response times.
Deciding how to distribute the workload across the data sharing group will affect the sizing of coupling facility structures and affect certain aspects of hardware configurations. With data sharing, you can move parts of a DB2 application workload across processors in the sysplex. All the processors in the sysplex will have efficient and direct access to the same DB2 data. It is up to you to decide how that workload is going to use its resources.
Data sharing allows you to move processing away from processors with capacity constraints. This is probably one of the best and quickest benefits that can be realized by an organization that is constrained and having problems completing its workload because it has simply outgrown its processor capacity.
You can have all members concurrently processing the same data, by allowing a transaction to run on any member in the group, or you can have only one member accessing the data. This decision will directly affect the amount of data sharing overhead you will experience, owing to the control of inter-DB2 read/write interest among the members to maintain coherency and concurrency. By using affinity processing, you can force an application to run on only one DB2 subsystem; perhaps you could run OLTP on one subsystem, batch on another, and ad hoc on a third.
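The difference between the two choices can be pictured with a small routing sketch in Python. It is purely conceptual: in a real sysplex the routing is done by WLM and the connection configuration, not by application code, and the `AffinityRouter` class, the workload labels, and the member names are hypothetical. Workload types with an affinity always go to their assigned member; everything else is spread across all members in turn.

```python
import itertools
from typing import Dict, Iterable, Optional


class AffinityRouter:
    """Toy model of affinity processing versus running on any member."""

    def __init__(self, members: Iterable[str],
                 affinities: Optional[Dict[str, str]] = None) -> None:
        self.members = list(members)
        if not self.members:
            raise ValueError("at least one member is required")
        self.affinities = dict(affinities or {})
        unknown = set(self.affinities.values()) - set(self.members)
        if unknown:
            raise ValueError(f"affinity to unknown member(s): {sorted(unknown)}")
        # Work without an affinity is spread round-robin over all members.
        self._next_member = itertools.cycle(self.members)

    def route(self, workload_type: str) -> str:
        """Return the member that should run this piece of work."""
        if workload_type in self.affinities:
            return self.affinities[workload_type]
        return next(self._next_member)


if __name__ == "__main__":
    router = AffinityRouter(
        ["DB1G", "DB2G", "DB3G"],
        {"OLTP": "DB1G", "BATCH": "DB2G"},
    )
    for work in ["OLTP", "BATCH", "ADHOC", "ADHOC", "OLTP"]:
        print(work, "->", router.route(work))
```

Pinning a workload to one member reduces inter-DB2 read/write interest on the data that workload touches, at the cost of giving up balancing for that workload.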
The DB2 data sharing group configuration can be transparent to SQL users and programs; they are not aware that a DB2 sysplex exists and that the system can select which DB2 member processes the SQL request. The group-generic, member-specific, and hard-coded methods support this.
Sysplex Query Parallelism
With sysplex query parallelism, a single query can be split over several members, providing a scalable solution for complex queries for decision support. Sysplex query parallelism works in much the same multitasking way as CPU parallelism but also enables you to take a complex query and run it across multiple members in a data sharing group, as shown in Figure 9-8.
Figure 9-8. Sysplex query parallelism
Good candidate queries for sysplex query parallelism are long-running read-only queries; static and dynamic, local and remote, private and DRDA protocols; table space and index scans; joins (nested loop, merge scan, hybrid without sort on new table); and sorts. Sysplex query parallelism is best used with isolation level uncommitted read to avoid excess lock propagation to the coupling facility.
A query is issued by a coordinator, which then sends the query to the assistant members in the group. The data is then processed and returned to the coordinator either by a work file, whereby the coordinator reads each assistant's work files, or by XCF links, if a work file was not necessary.
Sysplex query parallelism uses resources in the subsystems in which it runs: buffer pools and work files. If these resources are unavailable or not sufficient, the degree (the number of members used to process the query) may be decreased for the query.
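The coordinator and assistant roles, and the effect of a reduced degree, can be pictured with a small scatter-gather sketch in Python. This is an analogy only and says nothing about how DB2 implements parallelism internally: the function names, the use of threads to stand in for members, and the `has_resources` callback are all assumptions made for illustration. The first member acts as coordinator, members without sufficient resources are left out, and the partial results are combined by the coordinator.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence, Tuple


def split_evenly(rows: Sequence[int], parts: int) -> List[Sequence[int]]:
    """Deal the rows out into `parts` slices of near-equal size."""
    return [rows[i::parts] for i in range(parts)]


def run_parallel_sum(
    rows: Sequence[int],
    members: Sequence[str],
    has_resources: Callable[[str], bool],
) -> Tuple[int, int]:
    """Sum `rows` across the participating members.

    members[0] is treated as the coordinator and always takes part.
    Other members join as assistants only if has_resources(member) is
    true, so the degree drops when resources are short.
    Returns (total, degree).
    """
    coordinator, *others = members
    participants = [coordinator] + [m for m in others if has_resources(m)]
    degree = len(participants)
    slices = split_evenly(rows, degree)
    # Each participant computes a partial result for its slice.
    with ThreadPoolExecutor(max_workers=degree) as pool:
        partials = list(pool.map(sum, slices))
    # The coordinator combines the partial results.
    return sum(partials), degree


if __name__ == "__main__":
    # Placeholder member names; DB3G is assumed to be short of work files.
    total, degree = run_parallel_sum(
        list(range(1, 101)),
        ["DB1G", "DB2G", "DB3G"],
        has_resources=lambda member: member != "DB3G",
    )
    print("total:", total, "degree:", degree)
```

The answer is the same whatever the degree; only the number of members sharing the work changes, which is why a reduced degree shows up as longer elapsed time rather than as a different result.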