The New RECOUP - Part II
At the first two task force meetings in Poughkeepsie, the members agreed on objectives and requirements for the new package. Originally we had hoped that one of the members would have a package that met everyone else's requirements. It soon became clear that this was not the case. What we discovered was that 8 different members had devised 8 radically different approaches to improving file maintenance. Some of these approaches were truly impressive, and the task force decided to choose the best approaches and/or enhancements that addressed the requirements. The task force also agreed that new development would be kept to a minimum, since it appeared that most of the pieces had already been developed and were in production.
Once the contracts were signed and code began to be shared, a small dose of reality set in. The effort required to integrate the unique enhancements was a major undertaking, as the code had to be brought up to IBM standards and made to function for "all" environments. The package had to be made complete, i.e. pieces such as online deactivation had to be integrated with the online recoup and PDU phases. Every time we finished integrating large pieces together and using the package, we would discover that more development would be needed to make the implementation as clean as possible.
For example, chain chasing multiple ID's on multiple processors required additional controls to be added, in case users require that certain ID's be chain chased first, last, on which CPU; this then led to the concept of versions of each chain chase ID, with the appropriate modifications to several different parts of the package. Finally, the vanilla package is so old that we ran into numerous limitations that required re-design: timeout handling, error handling, fallback, bottleneck removal, etc.
The seven requirements listed above break down into about two dozen detailed requirements. The task force package meets 95% of them, and then some. Performance and integrity have been fully addressed. Only items 1) and 7) are "partially" unfulfilled. The complete elimination of tapes was not possible in all circumstances. For example, when running a recoup in 1052 state there is no pool available for logging broken chains and lost addresses, and we did not want to use fixed records for that. From a logging point of view, many users still prefer to keep chain chase data and/or erroneous/lost pool data in hardcopy format. Offline software is therefore sometimes needed. Tapes, if used, function only as output devices; whether to use them is up to the user. Overall, this package is orders of magnitude faster than the vanilla package, it is safer to use, it is easier to operate, and it will lower the cost of operations by reducing downtime and daily tape handling.
Online Chain Chase
The purpose of the online chain chase is to eliminate the need for RCP/RCI tapes, and to eliminate the offline phase 2 process entirely. Much of what will be discussed next was developed at US Airways, and other parts from British Airways and Worldspan. The RCP tape contains 3 pieces of information:
1) Pool ordinal and type
2) Pool record ID, and
3) Broken chain information.
The pool ordinal and type are used by DYOPM to set a bit to "in use" in a new set of pseudo directories (PD). This has been replaced online with the PD residing in 1K records in VFA (delay file candidates). Each pool record chain chased causes an update to a VFA delay file record which, if VFA is large enough, should be a non-I/O update and is therefore fast. The number of 1K records in VFA will be about the same as the number of #SONRI records. Once chain chase is complete, Phase 2 is effectively complete as well.
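As a rough illustration of the pseudo directory idea, the sketch below models a PD as a list of 1K records, each holding one bit per pool ordinal. The bit layout, the class name and the method names are assumptions made for this example; the actual package is written for TPF and its record layout is not described in this article.

```python
# Hypothetical model of a pseudo directory (PD): one bit per pool ordinal,
# packed into 1K (1024-byte) records. Layout and names are illustrative only.
RECORD_BYTES = 1024
BITS_PER_RECORD = RECORD_BYTES * 8


class PseudoDirectory:
    def __init__(self, pool_size):
        # Number of 1K records needed to cover every pool ordinal.
        n_records = (pool_size + BITS_PER_RECORD - 1) // BITS_PER_RECORD
        self.records = [bytearray(RECORD_BYTES) for _ in range(n_records)]

    @staticmethod
    def _locate(ordinal):
        # Map a pool ordinal to (record index, byte index, bit mask).
        rec, bit = divmod(ordinal, BITS_PER_RECORD)
        byte, shift = divmod(bit, 8)
        return rec, byte, 0x80 >> shift

    def mark_in_use(self, ordinal):
        """Set the in-use bit; return True only if the bit was newly set."""
        rec, byte, mask = self._locate(ordinal)
        already_set = bool(self.records[rec][byte] & mask)
        self.records[rec][byte] |= mask
        return not already_set

    def is_in_use(self, ordinal):
        rec, byte, mask = self._locate(ordinal)
        return bool(self.records[rec][byte] & mask)
```

In this model, marking a pool record as found touches a single byte of a single in-memory record, which mirrors why the update can avoid I/O when the record is already resident.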
The pool record ID has been replaced with a set of ID counts records. The ID counts records are SSU unique, and while an ID is being chain chased the counts are updated in core. When the ID is complete, the counts are written to file. The counts have to be 100% accurate, and the package has been designed to keep them 100% accurate across soft-IPL's, RCI processing, and any selective recoup (ZRECP SEL) done after chain chasing. The ID counts lose their accuracy only if there is a hard-IPL, in which case they will still be approximate. The broken chain information is written to an online database as well as to the RCP tape. Comprehensive displays can be run against this database. In addition, chain chase timeouts and fixed record errors are also written to this online database. There is also a way to limit the number of errors logged per chain chase item, so that a troublesome database does not flood the log.
All of the above is performed for both TPFDF recoup and traditional recoup. The RCI process has been eliminated from both as well. The new RCI process uses the VFA resident PD to determine if a chain has been chain chased, and if so, does not chain chase further. This ensures that pool records are found once and only once. This process is designed to 1) ensure that ID counts are accurate, 2) retrieve pool records only once, 3) detect broken chains properly, and 4) maintain integrity if there is an IPL. All of this has been accomplished. The RCI process is basically controlled via the descriptors (or DBDEFs) and is very efficient and fast.
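The "once and only once" behaviour can be sketched as follows. This is a minimal model, not the package's logic: the PD is represented as a plain set of addresses, chains are a dictionary of forward pointers, and all names are hypothetical. Broken chain detection, timeouts and IPL recovery are not modelled.

```python
def chase_chain(start, next_of, pd_in_use, id_of, id_counts):
    """Follow a chain from `start`, stopping at any record already in the PD.

    next_of    maps an address to the next address in its chain (hypothetical)
    pd_in_use  set of addresses already found by chain chase (stands in for the PD)
    id_of      maps an address to its record ID
    id_counts  running count of records found per record ID
    """
    addr = start
    while addr is not None:
        if addr in pd_in_use:
            # Someone already chased this record, so the rest of the chain
            # has been counted too. Stop here to avoid counting it twice.
            return "already chased"
        pd_in_use.add(addr)
        rid = id_of[addr]
        id_counts[rid] = id_counts.get(rid, 0) + 1
        addr = next_of.get(addr)
    return "end of chain"
```

With two chains that share a tail, the second chase counts only its own unshared records, so the per-ID totals stay exact.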
With the online recoup we have eliminated the need for the RCP, RCI, RPE tapes. Users may still prefer to use the RCP tape for logging and migration purposes, and that support remains. The BKD tape has also been eliminated. Descriptor records are now loaded as 4K programs via the loader package. This allows users to embed SVC's, macros, ENTRC's inside of the descriptors. Once loaded, a new functional message (ZRBKD) allows users to move the descriptor from the program area (which is unique to each TPF image) to a fixed file area. The ZRBKD tool has some features, like allowing users to load or fallback a single descriptor, and it has a display capability. TPFDF DBDEF's are loaded as programs already, and this was not changed.
Recoup is started via a ZRECP START. A display of the current recoup options is shown. There are several user-customizable options that allow users to change the way recoup is run:
The next command is a ZRECP RECALL, which starts the directory capture; following that, chain chase is automatically initiated. At this point, the user can choose to start chain chase on other loosely coupled processors as well via a ZRECP PROC x START command. Chain chase will complete and then the user can process any fixed errors that occurred with a ZRECP SEL command. ZRECP SEL processing has been enhanced with some safety checking to prevent an operator from accidentally entering the wrong information. Once all fixed errors are processed (or ZRECP CONT is entered), a second directory capture is performed, and online recoup phase 2 is automatically initiated. Online phase 2, which used to run offline on MVS for several hours, now runs online in about 10 minutes. Once that completes, the user has:
Because the PD is maintained online, a selective database protection feature has been added to the package. Should a bad roll-in occur due for example to an incorrectly coded descriptor, the operator can load the new descriptor, and selectively chain chase only that database. When chain chase completes, the operator can then apply the PD to the online pool directories and protect those pools found to be in use.
The online process eliminates the need for tapes and general files. This helps avoid human error, eliminates operator logging and tracking of various tapes (RTA, RCP, etc), and eliminates the long running offline Phase 2 operation. The online chain chase does not by itself improve Phase 1 performance. Phase 1 performance is significantly improved with a multi-processor and multi-group chain chase.
Multi-processor and Multi-Group Chain Chase
The purpose of the Multi-processor and Multi-Group chain chase is to significantly speed up the Phase 1 run time. It'll be any day now, any day,... any day. Really, we are not kidding. The vanilla package would have phase 1 running over 24 hours at some large installations. Phase 1 performance is one of the most pressing issues for the task force. One significant speed-up is being able to chain chase up to 8 chain chase items simultaneously on a given processor. This is known as multi-group and it is always turned on.
This piece was developed by Galileo and US Airways. Multi-group is important because certain databases have very long chains, and these chains take a while to chain chase. Multi-group allows other items to be chain chased while those long chains are being chased. The idea is simple: whenever an item has kicked off its last fixed record ordinal, another item can be started. This is how it is designed for both traditional and TPFDF recoup. Each item has its own timeout, ID counters, and broken chain counters. Another important speed-up was the re-write of the chain chase algorithms. Amadeus and Galileo contributed heavily towards this re-write. There are no longer any periodic pauses and waits for a given item. Each item runs flat out. Each item has an intelligent restart, so if there is a soft-IPL, it restarts where it left off. The maximum number of ECB's that a processor can run with was raised to 999, and the IBM LODIC macro is used throughout the package to prevent working storage depletion. In order to reduce the load on the main I-stream, this and all other parts of the package run on all I-streams. All of this put together should reduce phase 1 run time by a factor of about 3 compared with the standard vanilla package in uniprocessor operation.
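The multi-group rule (at most 8 items active, with a new item started as soon as an active one has kicked off its last fixed record ordinal) can be pictured with a small scheduling sketch. The function name, the notion of a per-item "dispatch time" and the time units are assumptions for illustration; the real package works with ECBs and fixed record ordinals rather than abstract durations.

```python
import heapq

MAX_GROUPS = 8  # the article's limit on simultaneous chain chase items


def schedule(items, max_groups=MAX_GROUPS):
    """Return {item name: start time} for items given as (name, dispatch_time).

    dispatch_time is the (hypothetical) time an item needs to kick off all of
    its fixed record ordinals; its slot is handed to the next item after that.
    """
    starts = {}
    free_at = []  # min-heap of times at which an active slot frees up
    for name, dispatch_time in items:
        if len(free_at) < max_groups:
            start = 0  # a slot is still unused at the beginning of the run
        else:
            start = heapq.heappop(free_at)  # earliest slot to free up
        starts[name] = start
        heapq.heappush(free_at, start + dispatch_time)
    return starts
```

For example, with two slots and items A (5 units), B (1), C (2) and D (1), item C starts at time 1 when B's slot frees up, and D starts at time 3, all while the long item A is still running.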
For loosely coupled customers, a significant reduction in phase 1 time is needed, and it is achieved by running a multi-processor chain chase. Enabling chain chase to run multi-processor was a very large effort; it was developed by Amadeus, with parts from EDS. As a result, it is possible to achieve an additional 3X improvement, for an overall phase 1 improvement of 9X over the standard vanilla package (author estimate). This is true parallel processing of Phase 1, and it will most likely result in the following additional Phase 1 savings for large installations (author estimate):
2 processors: Phase 1 time reduced 50%
3 processors: Phase 1 time reduced 65%
4 processors: Phase 1 time reduced 70%
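To relate the percentage reductions to the "X" speed-up figures, note that a run time reduced by p percent corresponds to a speed-up of 1 / (1 - p/100). The short sketch below does that arithmetic on the author's estimates; the function name is ours, and the numbers are the estimates quoted above, not measurements.

```python
def speedup_from_reduction(percent_reduced):
    """Convert 'run time reduced by N percent' into an 'N-times faster' factor."""
    return 1.0 / (1.0 - percent_reduced / 100.0)


# Author-estimated Phase 1 reductions by number of processors.
estimates = {2: 50, 3: 65, 4: 70}
factors = {cpus: round(speedup_from_reduction(p), 2) for cpus, p in estimates.items()}

# Combining the single-processor gain (about 3X) with the multi-processor
# gain (about 3X) multiplies the factors rather than adding them.
overall = 3 * 3
```

On these estimates, 2 processors give 2X, 3 processors about 2.86X and 4 processors about 3.33X, which is where the "additional 3X" and the overall 9X figure come from.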
The benefits diminish with more processors due to DASD overload, or due to one processor chain chasing an incredibly large database. Some members have listed their performance numbers in this article. Most of these numbers in the company sections reflect the new package versus a significantly modified user package, hence, it appears as if the Phase 1 time improvement (although very impressive) is < 9X. However, if the members were running with an unmodified vanilla recoup package, 9X would probably be achieved.
A primary and secondary processor concept is introduced. The primary processor is the one where recoup is started and rolled in on. The secondary processors are merely helpers for Phase 1. The primary processor can never be moved. It must remain the primary throughout. Secondary processors can be Started, Stopped, Exited, Restarted, or Leveled, via a command from the primary. Leveled means that the ECB's used for Phase 1 on that processor can be increased or decreased. Stopped means that the processor is made to stop chain chasing, and all of its work is valid. Exited is the same as Stopped, but that all its work completed is no longer valid and must be redone by another processor. The primary processor cannot be stopped or exited, but it can be restarted or leveled. The primary processor has the responsibility of ensuring that all traditional and TPFDF ID's have completed chain chase. If not, the primary processor will do so in lieu of the other processors.
Therefore, the primary processor will be the last processor to complete traditional recoup and then complete TPFDF recoup. It is usually a good idea to put the longest running traditional chain chase ID's on the primary because of this. The secondary processors are able to start chain chasing TPFDF structures even if other processors are still chain chasing traditional structures. Each processor performing chain chase has its own recoup keypoint, core and file copies of chain chase ID counts, RCP tape, and its own VFA-resident PD. Multi-group chain chase occurs on each processor as well. All processors share the broken chain database, descriptors, and DBDEFs. All tables necessary for chain chasing (such as a table used to convert file addresses to #SONRI directory ordinals) are kept in core to ensure fast access and throughput.
When chain chase is complete on all processors, the primary processor then will perform the Phase 2 operation. Since there are multiple sets of PD, one for each processor, these must be merged together to form one common PD for Phase 3. Also the ID counts records for each CPU are summed up to form one unified set of ID counts. Throughout this process integrity checking is performed to ensure that the VFA resident PD are accurate, since it is possible to lose VFA delay files under adverse circumstances. The final ID counts are added to a historical database where up to 10 runs worth of counts are stored.
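The merge step can be sketched as a bitwise OR of the per-processor PDs and a per-ID sum of the per-processor counts, with the history kept as a bounded queue. This is a simplified model under stated assumptions: PDs are equal-length byte strings, counts are dictionaries keyed by record ID, and the function names are invented for the example. The integrity checking mentioned above is not modelled.

```python
from collections import Counter, deque


def merge_pds(per_processor_pds):
    """OR together equal-length PD bitmaps, one per processor."""
    merged = bytearray(len(per_processor_pds[0]))
    for pd in per_processor_pds:
        if len(pd) != len(merged):
            raise ValueError("PDs must all cover the same pool")
        for i, byte in enumerate(pd):
            merged[i] |= byte  # a record is in use if any processor found it
    return bytes(merged)


def sum_id_counts(per_processor_counts):
    """Add up the ID counts from every processor into one unified set."""
    total = Counter()
    for counts in per_processor_counts:
        total.update(counts)
    return dict(total)


# Historical database: only the 10 most recent runs are retained.
history = deque(maxlen=10)
```

Appending the unified counts to `history` after each run silently drops the oldest entry once 10 runs are stored.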
Multi-processor chain chase tremendously speeds up Phase 1. However, there is one intractable problem. Chain chase items that are chain chased on different processors but that point to the same pool records will cause the ID counts for those pool records to be artificially inflated (doubled). This is because the PD are processor unique. Therefore, a practical solution was devised. Those ID's that were double counted are reported during online Phase 2. Before the next time recoup is run, the user should add controls to the descriptors/DBDEF's to force cross-chained databases to be chain chased on the same processor. This alleviates the problem. With multi-processor and multi-group chain chase, additional controls on descriptors and DBDEFs have been devised. There are controls to dynamically assign groups of chain chase items to be chased on the same CPU. There are controls to run one ID only when another has completed, or to run a series of ID's in succession, either in parallel or one-at-a-time (suspend multi-group). There are controls for the new RCI processing. It's just a bit more complicated, but 90% of the chain chase items do not need any controls. Because of all this, we have had to add version support to DBDEF's and descriptors. Version support is used to distinguish chain chase items with the same 2-char record ID from each other.
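The double counting effect is easy to reproduce in a toy model. The sketch below shows one plausible way to spot it: any pool address marked in more than one processor's PD has been counted more than once. How the package actually detects this during online Phase 2 is not described in the article; the function name and data shapes here are assumptions.

```python
from collections import Counter


def double_counted_ids(found_by_processor, id_of):
    """Report how many extra counts each record ID picked up.

    found_by_processor  {cpu: set of pool addresses marked in that CPU's PD}
    id_of               {pool address: record ID}
    """
    times_found = Counter()
    for addresses in found_by_processor.values():
        times_found.update(addresses)

    extra = Counter()
    for address, n in times_found.items():
        if n > 1:
            # Found on n processors, so counted n - 1 times too many.
            extra[id_of[address]] += n - 1
    return dict(extra)
```

For instance, if processor A finds addresses 1, 2 and 3 and processor B finds 3 and 4, address 3 is reported once as an extra count against its record ID, which is the cue to pin both databases to the same processor on the next run.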
Multi-Group and Multi-Processor chain chase are possible with traditional and TPFDF databases. We have also had to make sure that selective recoup is smart enough to keep ID counts 100% accurate. Lastly, no CPU's could be cycled down for pool directory capture or roll-in, so a smarter directory capture had to be implemented. All processors can now always remain in NORM state. Overall, Multi-Group and Multi-Processor chain chase provide a chain chase that is as fast as possible. For phases 1/2 the command sequence is as follows:
(all done when PHASE 2 COMPLETE message comes out)
Online Phases 3/4/5
The purpose of having phases 3/4/5 entirely online is to eliminate tape/general file mishandling, simplify and speed-up the roll-in process, and allow for greater control and provide more information to the operator making the critical roll-in decision. That said, the entire process was re-designed and developed from the ground up. US Airways, British Airways, and Japan Airlines contributed to this effort.
For phases 3/4/5 the command sequence is as follows:
This will use the captured sets of online directories and the merged PD to determine if any addresses are erroneously available. When this pass completes the user can display online a list of erroneously available addresses by ID. If there are erroneously available addresses the user should enter ZRECP PROTECT. If there are none, ZRECP IGNORE will suffice. ZRECP PROTECT will update the current online directories and mark all erroneously available pool addresses as in use to prevent database damage. This is also called the first roll-in and is basically a subtraction of available pool. The RESUME and the PROTECT run in minutes, and it was designed this way so that erroneously available addresses could be removed as soon as possible.
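In set terms, an erroneously available address is one that the online directories show as available while the merged PD shows it as in use, and the first roll-in subtracts those addresses from the available pool. The sketch below states that logic with plain Python sets; the function names and the use of sets instead of directory records are simplifications for illustration.

```python
def erroneously_available(available_online, in_use_per_pd):
    """Addresses the directories would dispense even though chain chase found them."""
    return set(available_online) & set(in_use_per_pd)


def protect(available_online, in_use_per_pd):
    """Model of the first roll-in: remove erroneously available addresses.

    Returns (new available set, set of addresses that were protected).
    """
    bad = erroneously_available(available_online, in_use_per_pd)
    return set(available_online) - bad, bad
```

Because this step only removes addresses from the available pool, it is safe to apply early, before the longer lost address pass has run.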
After the ZRECP PROTECT is kicked off, the lost address pass begins. All lost addresses are retrieved and once this (long running) step completes, the user can display online a list of lost addresses by ID. Because the number of lost addresses may be quite large, users can instruct recoup to find and report only a randomly selected subset of the lost pool addresses, so as to speed up the lost address process. Once the lost address process is complete, the operator can decide if the recoup should be rolled in or not. The operator has all of the necessary information to make the call (Chain chase counts, Lost/Erroneous address data, Broken Chains). Should there be any question about a particular database, the operator can enter a ZRECP ADD id id id etc, followed by a ZRECP REBUILD. If this is done, the resultant roll-in directories are adjusted to not roll in any lost addresses whose record ID matches the ones specified with ZRECP ADD. If all is well, the operator only has to enter a ZRECP NOREBUILD instead.
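The lost address logic and the effect of holding back certain record IDs can be expressed with sets, as in the sketch below. It assumes a lost address is one that is neither available in the online directories nor marked in use in the merged PD, and it models only the lost addresses, not pool released by applications. The function names and the `held_ids` parameter (standing in for the IDs given on ZRECP ADD) are hypothetical.

```python
def lost_addresses(all_pool, available_online, in_use_per_pd):
    """Addresses that are not available and were not found by chain chase."""
    return set(all_pool) - set(available_online) - set(in_use_per_pd)


def roll_in_set(lost, id_of, held_ids=()):
    """Lost addresses to return to the pool, skipping any held record IDs.

    id_of     {pool address: record ID read from the lost record}
    held_ids  record IDs the operator does not want rolled in
    """
    held = set(held_ids)
    return {addr for addr in lost if id_of.get(addr) not in held}
```

Filtering by record ID needs the ID of every lost record, so in this model it only works when the full lost set has been retrieved rather than a sample.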
One note about ZRECP REBUILD: this command is only possible if the user instructed recoup to find all lost addresses. Users of the subset option of the lost address processing would have to reset this option and rerun the lost address pass, which is possible. Following the REBUILD or NOREBUILD, the user may enter ZRECP PROCEED, which is the second roll-in. This is not the same roll-in as the old recoup roll-in. The online pool directories are not replaced. Pools are added just like an ordinary PDU. This second roll-in is basically an add. There is no need to perform a ZDUPD following this roll-in as the old package used to require. This 2nd roll-in will roll in all pool addresses that are either lost or have been released by applications prior to the 2nd directory capture. That's it! Depending on how long the lost address pass runs, a medium size shop can expect the run time for phases 3 & 4 to be about 45 minutes. Phase 5 (ZRECP DUMP) is optional; it causes all lost and erroneously available pool record data to be dumped to the RTL tape. Some users prefer to do this for historical tracking purposes. Phase 5 does not require an ADR tape anymore. It uses the online recoup structures.
Finally, in case the roll-in is bad, the operator can issue a command to fall back the pool directories that were rolled in. This automatically takes into account directories dispensed from in the interim and zeros out those directories. Online phases 3/4/5 run on the primary processor, and are designed to run on all I-streams. All operations are multi-threaded for performance. All phases of recoup can also be run in 1052 state.
That wraps up Part II of our series. In the September/October issue of ACPTPF Today we'll continue our RECOUP update with Part III of our series which will cover Online Pool Directory Update, and Generation/Reallocation/Deactivation.