(TPF Persistent) Collections Corner

by Michele Dalbo, IBM TPF ID Core Team, and Daniel Jacobs, IBM TPF Development

Just as we promised, we are back with the second in a series of articles about persistent collections. If you have forgotten what a persistent collection is, or you missed our first article, you might want to review "Persistent Collections 101," which was published in the May/June 1998 issue of ACP/TPF Today. Persistent Collections 101 provides a high-level overview and a figure showing just how these persistent collections might look on a TPF database.

OK, so you have reviewed our last article and you are ready for some interesting summer beach reading. Just tuck this issue under your arm along with your beach towel and read on because this time we are going to talk about two very similar collections: log and keyed log.

Log
Characteristics:
You might recall the two collections that we previously discussed: arrays and binary large objects (BLOBs). Logs are very similar to these two collection types. As with arrays and BLOBs, a log is a collection of ordered, fixed-length elements (up to a maximum length of 4000 bytes each) that are accessed by relative position (index) in the log, starting with index 1. Multiple elements can contain the same data. But that is where the similarities end. In arrays and BLOBs, you can place data at any position within the collection and expand the collection if necessary. With logs, the elements are ordered by arrival sequence. When the collection is full, instead of expanding, the collection wraps and starts overlaying the oldest elements at the start of the collection. The first element is always the oldest entry still in the collection, and the last element is the element most recently added to the collection. Elements cannot be removed from a log collection.
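
The wrapping behavior described above can be sketched with a small in-memory model. This is only an illustration in Python, not the TPF collections API: the class name Log, its methods add and at, and the choice to reject data of the wrong length are all assumptions made for this example.

```python
class Log:
    """Hypothetical in-memory model of a wrapping log (not the TPF API)."""

    MAX_ELEMENT_LENGTH = 4000  # bytes, the limit stated in the article

    def __init__(self, capacity, element_length):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        if not 1 <= element_length <= self.MAX_ELEMENT_LENGTH:
            raise ValueError("element length must be 1..4000 bytes")
        self._capacity = capacity
        self._element_length = element_length
        self._slots = []   # physical storage
        self._oldest = 0   # slot that holds the oldest element once full

    def __len__(self):
        return len(self._slots)

    def add(self, data):
        """Append in arrival order; overlay the oldest element when full."""
        if len(data) != self._element_length:
            # Assumption: this model rejects data that is not the fixed length.
            raise ValueError("element has the wrong length")
        if len(self._slots) < self._capacity:
            self._slots.append(data)
        else:
            self._slots[self._oldest] = data
            self._oldest = (self._oldest + 1) % self._capacity

    def at(self, index):
        """Return the element at a 1-based index; index 1 is the oldest."""
        if not 1 <= index <= len(self._slots):
            raise IndexError("index out of range")
        return self._slots[(self._oldest + index - 1) % len(self._slots)]


log = Log(capacity=3, element_length=4)
for data in (b"aaaa", b"bbbb", b"cccc", b"dddd"):
    log.add(data)

print(log.at(1))         # b'bbbb' (b"aaaa" was overlaid)
print(log.at(len(log)))  # b'dddd' (most recently added)
```

Note that the model has no remove method, which mirrors the rule that elements cannot be removed from a log.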

Advantages:
A log collection is perfect if you only need to keep the last “x” number of units, or elements, of data. The fact that older data is automatically replaced with newer data is transparent to the application.

Example:
Let's assume you are a banker and you need a simple program that will track the last 10 transactions for a customer account and the account's closing balances for the past 90 days. Your customers want to access your bank from home or the office and check their recent account activity and ending balance. A log collection with a size of 10 elements would allow your customers to see the last 10 transactions posted to their accounts. Another log, with a size of 90 elements, could be used to track the closing balances of an account. If one entry is made to this log at the end of each day, 90 entries will represent 90 days. Because the last element in each log is the one most recently added, customers could quickly check the last transaction posted to their account and find their last daily ending balance. After the first 90 days, the closing balance log becomes full and new balances will begin overlaying the oldest balances at the start of the collection. This process continues throughout the year, allowing quarterly statements to be generated and sent to your customers.
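
To make the arithmetic in this example concrete, here is a rough sketch that uses Python's standard collections.deque with a maxlen as a stand-in for the two logs. The variable names, the transaction labels, and the balance figures are invented for illustration, and a deque is not a TPF persistent collection. It simply drops its oldest entry when a new one arrives at full capacity, which gives the same "last x entries" view.

```python
from collections import deque

# Stand-ins for the two logs in the banking example (illustrative only).
recent_transactions = deque(maxlen=10)  # last 10 transactions
closing_balances = deque(maxlen=90)     # one closing balance per day

# Post 12 made-up transactions; only the last 10 are kept.
for n in range(1, 13):
    recent_transactions.append(f"txn-{n}")

# Record 95 made-up daily closing balances; only the last 90 are kept.
for day in range(1, 96):
    closing_balances.append(1000 + day)

print(len(recent_transactions))  # 10
print(recent_transactions[0])    # txn-3 (oldest entry still held)
print(recent_transactions[-1])   # txn-12 (last transaction posted)
print(len(closing_balances))     # 90
print(closing_balances[-1])      # 1095 (last daily ending balance)
```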

Keyed Log
Characteristics:
A keyed log collection is the same as a log collection except that each element consists not only of data, but also a key. In addition to accessing elements by position, you can use the key as a search value to access specific elements within the collection. There is no order to the keys of the collection, and keyed logs do not support duplicate keys (but, as in a log collection, element data fields may be non-unique). The elements can be a maximum length of 4000 bytes, and the maximum key length is 256 bytes. As with logs, the first element in a keyed log is always the oldest entry still in the collection, and the last element is the element most recently added to the collection.
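
A keyed log can be modeled in the same spirit. The sketch below is a hypothetical Python model, not the TPF collections API: the names KeyedLog, add, by_key, and at are invented, raising an error on a duplicate key is an assumption about how a rejection might surface, and for brevity the model checks only the maximum element length rather than enforcing one fixed length. The sample keys are dates with one entry per date, because keys must be unique.

```python
from collections import OrderedDict


class KeyedLog:
    """Hypothetical in-memory model of a keyed log (not the TPF API)."""

    MAX_KEY_LENGTH = 256       # bytes, the limit stated in the article
    MAX_ELEMENT_LENGTH = 4000  # bytes, the limit stated in the article

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self._capacity = capacity
        self._entries = OrderedDict()  # key -> data, kept in arrival order

    def __len__(self):
        return len(self._entries)

    def add(self, key, data):
        if not 1 <= len(key) <= self.MAX_KEY_LENGTH:
            raise ValueError("key must be 1..256 bytes")
        if len(data) > self.MAX_ELEMENT_LENGTH:
            raise ValueError("element must be at most 4000 bytes")
        if key in self._entries:
            # Assumption: a duplicate key is rejected with an error.
            raise KeyError("duplicate key")
        if len(self._entries) == self._capacity:
            self._entries.popitem(last=False)  # overlay the oldest element
        self._entries[key] = data

    def by_key(self, key):
        """Look up element data by key; raises KeyError if absent."""
        return self._entries[key]

    def at(self, index):
        """Return (key, data) at a 1-based index; index 1 is the oldest."""
        if not 1 <= index <= len(self._entries):
            raise IndexError("index out of range")
        return list(self._entries.items())[index - 1]


activity = KeyedLog(capacity=2)
activity.add(b"1998-07-01", b"check 101")
activity.add(b"1998-07-02", b"deposit")
activity.add(b"1998-07-03", b"check 102")  # overlays the 1998-07-01 entry

print(activity.by_key(b"1998-07-03"))  # b'check 102'
print(activity.at(1))                  # (b'1998-07-02', b'deposit')
```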

Advantages:
If you have data that you need to retrieve by a certain value, a keyed log collection will work for you. Just like a log collection, when a keyed log collection is full, the collection wraps and overlays the oldest elements.

Example:
Let's take the example we used for a log collection and expand it for customers who want more than the ability to just check their daily transactions and ending balances. These customers want to be able to access their accounts and retrieve transaction and ending balance information for specific dates from home or the office. Maybe they need to know if a specific check cleared on a particular date, or if a deposit dropped off at a night deposit box was posted to their account. A keyed log collection provides the ability to retrieve information this way. The key would be the date, and the element data might consist of the check number or deposit, for example.

Comparing the Log and Keyed Log Collections
Because these two collections are closely related, it might be helpful to compare them side by side:

Characteristics	Log	Keyed Log
Elements ordered by arrival sequence	Yes	Yes
When a collection is full, the collection wraps and elements are overlaid	Yes	Yes
Fixed-length elements	Yes (maximum length is 4000 bytes)	Yes (maximum length is 4000 bytes)
Accessed by relative position (index) starting with index 1	Yes	Yes
Access by key	No	Yes
Duplicate keys	Not Applicable	No (keys must be unique)
Maximum key length	Not Applicable	256 bytes

So now you have learned about two more collections in collection summer school. We bet that you cannot wait for our upcoming Fall article where we will continue this series! But enjoy your break while you can... the next article promises to be huge!