RelationManager Part 1

by David Bayliss

Published 1999-09-14

In my series of articles on the FileManager class I explained that the FileManager was logically there to embellish the underlying file drivers with information from the Clarion dictionary. The RelationManager class takes this dictionary embellishment one stage further to add the notion of related files. Currently there are three features this brings to the table:

  • Referential Integrity. It is quite possible for a file to be physically correct, pass the file level validation constraints, and yet still not correctly relate to the other files. The RelationManager therefore duplicates a number of file access functions, and the use of the RelationManager versions of these functions ensures that the file is correctly linked to other related files.
  • File Unification. This allows primary files which are linked to secondary children to be treated as a logical unity. This is a concept I occasionally refer to as BILF management (BILF stands for Bloomin' Irritating Little Files). A primary file could contain 100 fields, 10 of which are linked to children. Yet those child files don't actually mean anything; they are just created as part of the data normalisation process. It is really ugly if every time you use a BILF you have to go throughout your code opening it, preserving it, etc. The RelationManager therefore replicates some FileManager functions where the only service it performs is to perform the action upon all the related files in the tree.
  • Information provision. Other parts of ABC sometimes need to know information about relations (notably linking fields and keys). The RelationManager provides a portable interface to this information.

Considerations

To some extent all the considerations mentioned in the FileManager articles apply to the RelationManager, although less so. The RelationManager is built on top of the FileManager; specifically there is a one-to-one instance link between RelationManagers and FileManagers. As such the RelationManager always tries to use a FileManager function for a given activity if it can. This is not sheer laziness. By utilising the FileManager, any overriding of the FileManager automatically works for code using the RelationManager.

There were a couple of new issues too. One was sheer complexity (and thus the need for safety). The legacy relational integrity (RI) code went through at least a couple of iterations and to this day it still falls over in some cases and corrupts file buffers at will. For ABC we wanted an RI system that was rock solid, but also efficient. Legacy had another problem: for large dictionaries (especially heavily related ones) the code bloated horribly, and we wanted to reduce that drastically.

Further we wanted (in the future) to be able to extend the system to allow one-to-one and many-to-many relationships. Finally we wanted the RI code to simply drop away if it is handled by the back end (usually on an SQL database). That's a pretty long shopping list!

As I head through the code overview I will warn you that the RI methods are by far the most complex procedures in the whole of ABC. They are an interesting example of my belief that you should isolate complexity. Don't smear it throughout code (where everyone can stumble over it) but focus it into a small space that you can approach with caution. Well, here are six small procedures (the largest is 60 lines) that get the Bayliss classification of ice pack jobs. It is my job to make them clear enough that everyone (at least everyone who is prepared to try) can understand them. I hope I succeed. For the sake of brevity I shall assume that you have read the FieldClass design documents.

Coffee... Icepack .... Action .... (On the plus side, if you can handle this then you are over the ABC learning curve. From here it is just more, not harder).

I strongly urge you to have the source code to hand whilst going through this article; it really will make everything a bit clearer.

Initialisation

The file drivers have no knowledge of the relationships provided in the dictionary; for this reason all the relation information has to be provided by the templates to the base classes. This is done by the templates overriding the .Init method and making a succession of Addxxxxx calls.

AddRelation PROCEDURE(RelationManager RM),PROTECTED

A Clarion relation can be viewed from either end and it is not enforced that both directions have a key (although you do need a key both ways for RI). This AddRelation method is called when the file being initialised is related to the file being passed in but where there is no linking key on the file being passed in. You may prefer to look at this as saying "he is related to me."

AddRelation PROCEDURE(RelationManager RM,BYTE UpdateMode,BYTE DeleteMode,KEY His),PROTECTED

This method gives the ability to note a fully fledged relationship. The RelationManager passed in denotes the related file, His is the key you fill to get at his data. UpdateMode and DeleteMode specify the action to be taken upon a potential RI violation.

This AddRelation method has an interesting side effect: it primes the object to start accepting AddRelationLink method calls. There are OOP purists I know well (some I work with) who frown upon this kind of state within an object (the problem for the purists being that AddRelation must be called before AddRelationLink), but pragmatically it is efficient and encourages the object user to write readable code. What is actually happening is that this AddRelation creates a BufferedPairsClass which will then be filled with the linking fields of the relation.

AddRelationLink PROCEDURE(*? Left,*? Right),PROTECTED

There are two other AddRelationLink functions besides this one, but the variations are simply there to save code size. (A *? parameter takes about 50 bytes of code to pass, *LONG parameters take four bytes, *STRING parameters take six. Given that LONG and STRING cover 90% of all linking fields this efficiency is worth having.) What is going on here is simple, but needs grasping. This method is called from the templates with something like:

Relate:File1.AddRelationLink(File1.KeyField1,File2.KeyField1)

The *? parameter means the address of these fields is passed in and squirreled away for future use. Once this has been done for all the linking fields it is possible to assign from one set of linking fields to another using a single statement.
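The idea of remembering where the linking fields live, rather than what they currently hold, can be sketched outside Clarion. The following is a toy model in Python, not the ABC source: FieldRef, FieldPairs, add_pair and assign_left_to_right are names invented for this illustration, and a dict stands in for a record buffer.

```python
class FieldRef:
    """Stand-in for a *? parameter: remembers where a field lives, not its value."""

    def __init__(self, record, name):
        self.record = record  # the record buffer (a plain dict in this sketch)
        self.name = name      # which field inside that buffer

    def get(self):
        return self.record[self.name]

    def set(self, value):
        self.record[self.name] = value


class FieldPairs:
    """Hypothetical model of a list of linked (left, right) fields."""

    def __init__(self):
        self.pairs = []

    def add_pair(self, left, right):
        self.pairs.append((left, right))

    def assign_left_to_right(self):
        # Copy every left-hand linking field into its right-hand partner.
        for left, right in self.pairs:
            right.set(left.get())

    def equal(self):
        # True when every pair currently holds matching values.
        return all(left.get() == right.get() for left, right in self.pairs)


# Two record buffers standing in for File1 and File2.
file1 = {"KeyField1": 42, "KeyField2": "A7", "Name": "parent"}
file2 = {"KeyField1": 0, "KeyField2": "", "Qty": 3}

links = FieldPairs()
links.add_pair(FieldRef(file1, "KeyField1"), FieldRef(file2, "KeyField1"))
links.add_pair(FieldRef(file1, "KeyField2"), FieldRef(file2, "KeyField2"))

links.assign_left_to_right()  # one call copies all the linking fields
```

Because the references are stored once, the same object keeps working however often the buffers change afterwards; only the linking fields are touched, and the other fields of either record are left alone.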

Init PROCEDURE(FileManager FM,BYTE UseLogout=0)

The base Init method simply ties in the FileManager this RelationManager is based upon. It also creates a queue for the relations and sets an internal property to denote whether transactions are to be framed within LOGOUT/COMMIT sections. Remember however that in template usage the Init method will typically be derived (in generated source) and the derived method will be full of calls to AddRelation to describe the dictionary fully within ABC.

If used fully this approach gives tremendous flexibility. It is quite possible to add files into the RI tree, or cut them out, depending upon system configuration. For example, you could have a file that is only shipped to certain customers but which is in an RI chain if it is shipped.

Kill PROCEDURE,VIRTUAL

This method simply steps through the relation queue, killing off any FieldPairs classes that have been created (for the RelationLinks) and then disposing them.

SetAlias PROCEDURE(RelationManager RM)

This method is used to specify that the current RelationManager is managing an alias of the passed in RelationManager. This method doesn't really do anything; it is simply there to enable the AliasFile property to be private. I didn't want the property public as I expect it to die when the FileClass comes along.

FileManager Replacements

These are substitutes for the FileManager equivalents. As such their basic semantics are the same. The difference is the related files are taken into account. For ease of explanation I am not tackling these in alphabetical order.

CancelAutoInc PROCEDURE(),BYTE,PROC,VIRTUAL

This method enables the form to readily tackle the problem of orphaned child records. (See FileManager III). The form can simply call the RelationManager equivalent (you should always consider Relate:File.Thing as "Access:File.Thing(Taking into account related files)"). The RelationManager calls down into the FileManager (passing in itself) to ensure children are taken care of.

Close PROCEDURE(BYTE Cascading=0),BYTE,PROC,VIRTUAL

This method simply issues a FileManager close on the current file, and all the child files, grandchild files etc. You would think this is quite easy, and in principle it is, but there is one little gotcha that makes the code quite complex. First consider the logical implementation, taking Open as the example (Close follows exactly the same pattern). To Relate-Open file Fred you first open Fred then you open all of Fred's children. Then somehow you need to get the children to open their children.... Hang on, that's easy. Instead of opening Fred's children, you Relate-Open them and it all works. So a simple recursive solution for Close would be:

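RelationManager.Close PROCEDURE
I BYTE,AUTO
  CODE
  ASSERT(NOT SELF.Relations &= NULL)     ! the relation queue must exist
  SELF.Me.Close()                        ! close this file through its FileManager
  LOOP I = 1 TO RECORDS(SELF.Relations)  ! then walk every related file
    GET(SELF.Relations,I)
    SELF.Relations.File.Close(1)         ! recurse, with nothing to stop a cycle
  END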

Beautiful, elegant, efficient and liable to lock your machine the first time you try it. Imagine you have relationships A <->> B <->> C <->> D and A <->> D. Technically this is illegal in the Clarion paradigm (you need an alias for the second usage of D) but in practice you can usually get away with this (few procedures will have A, B, C and D all populated) and people's dictionaries are littered with cyclic dependencies.

Now the recursive solution dies horribly. Suppose you close A. This closes B which closes C which closes D which closes A which closes B which closes.... You get the picture.

There are many sophisticated and elegant algorithms for detecting loops in graphs; we opted for a simple one. The idea is roughly this: when you get the first (top-most) call to close then you note the time. You then recurse as before but when you do the close you note inside the RelationManager the time you did the close. Then when you call a RelationManager to close it, you see if it has been closed since (or at) the top-most call. If it has then you have already been here before so you exit without recursing. You can actually implement this using CLOCK but there is one more little trick to spot. You don't have to use real time; any time will do. So for efficiency I made my own time stored in the Epoc variable. This time only ticks when the top-most call is made.

Here's a look at the code. First I check the cascading flag. This flag is purely there to indicate whether this is the "top" of the tree. If it is the top of the tree (cascading false) then I increment the epoc timer; if not then I check for a touch in this "time-zone." If there has been a touch then the code returns; if not then I update the "last-touched" to prevent further recursion. Then it is just a case of closing this file, and then stepping through the children closing them. One extra tweak is an early out mechanism. Essentially if any of the FileManager.Close calls fail the tree walk stops. This is not particularly useful in the Close case but in general a FileManager method returning an error could easily have put up an error message to the user. If that has happened once the last thing the user wants is to step through error messages for each of the 150 related files as well.
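For readers who find the "private clock" easier to follow in running code, here is a small model of the walk just described. It is a Python sketch, not the ABC implementation: the Relation class, its attribute names and the close_count counter are invented for the illustration, and stamping the top-most node with the current epoch as well is an assumption of the sketch.

```python
class Relation:
    """Toy model of the epoch trick for walking a cyclic relation graph."""

    epoch = 0  # shared private clock: ticks once per top-most call

    def __init__(self, name):
        self.name = name
        self.related = []       # neighbouring Relation objects
        self.last_touched = -1  # epoch of the most recent visit
        self.close_count = 0    # how many times the file was really closed

    def link(self, other):
        # A relation is visible from both ends; that is what creates cycles.
        self.related.append(other)
        other.related.append(self)

    def close_file(self):
        # Stand-in for the FileManager close; 0 means success.
        self.close_count += 1
        return 0

    def close(self, cascading=False):
        if not cascading:
            Relation.epoch += 1  # top of the tree: tick the clock
        elif self.last_touched == Relation.epoch:
            return 0             # already visited during this walk
        self.last_touched = Relation.epoch
        err = self.close_file()
        if err:
            return err           # early out: stop the walk on the first error
        for other in self.related:
            err = other.close(cascading=True)
            if err:
                return err
        return 0


# The awkward dictionary from the text: A <->> B <->> C <->> D and A <->> D.
a, b, c, d = (Relation(n) for n in "ABCD")
a.link(b)
b.link(c)
c.link(d)
a.link(d)

result = a.close()  # terminates, and each file is closed exactly once
```

Nothing ever has to be reset between walks: the next top-most call simply ticks the clock again, which makes every stored stamp stale at a stroke.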

Open PROCEDURE(BYTE Cascading=0),BYTE,PROC,VIRTUAL

The Open code is actually very similar to Close. I'm surprised I didn't use a parameterised private method; watch this space, as it is possible Open and Close will both have become shells for an OpenCloseServer by the time you read this. As an aside, I wonder if that seems unprofessional to you? Making mistakes, owning up to them and fixing them? I never cease to be amazed by the people who write their code badly and then consider it inviolable. Encapsulation, a key feature of ABC, enables us to get the code right. Not OK, not working, but right.

The one tweak is the LazyOpen mechanism. The FileManager has an attitude that says it won't actually open a file just because you asked it to. However we felt it reasonable that the primary file should be opened straight away, so if this is the top of the open call tree (and Cascading is thus 0) we call UseFile to force the file open.

Delete PROCEDURE(BYTE Query=1),BYTE,VIRTUAL,PROC

This method is the first of the nasties. Delete is really just there to delete the primary record. There are two main complications: the first is the need to check that you can delete the primary record (i.e. there are no RI constraints), and the second is the need for transaction framing (the ability to abort the delete process halfway through if something goes wrong and you need to undo all the mess you made).

First is a fairly simple query as to whether or not the user actually wants this record deleted. One little trick is the use of the guard flag on the left hand side of the AND and the Throw on the right. This relies upon the fact that the compiler does short-circuit evaluation of logical conditions. In other words the compiler guarantees that if it knows the result of a logical expression simply by evaluating the left hand side then it will not evaluate the right. So if Query is zero the Throw will not be done.
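The guard-on-the-left pattern is not peculiar to Clarion. Here is the same shape in Python, whose "and" operator also short-circuits. This is only a sketch: confirm_delete is a hypothetical stand-in for the Throw, and the "asked" list exists purely so that the side effect can be observed.

```python
asked = []  # records every time the user would have been prompted


def confirm_delete():
    """Hypothetical stand-in for the Throw: prompts, returns True to go ahead."""
    asked.append("Delete this record?")
    return False  # pretend the user answered "no"


def delete(query=True):
    # Guard on the left, side-effecting call on the right. When query is
    # false the right-hand side is never evaluated, so nobody is prompted.
    if query and not confirm_delete():
        return "cancelled"
    return "deleted"


silent = delete(query=False)  # no prompt is raised on this path
```

The attraction of the idiom is that the prompt and the test of its answer live in one condition, with no nested IF to maintain.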

Next is the LOOP that operates the "Retry the delete?" message if the first attempt at deleting failed. Then the position of the record to be deleted is taken and is TryFetched. This is because the record needs to be full and accurate to allow the child links to be found and I cannot assume someone has made a record accurate just to delete it. Between the position and TryFetch is a block inside an IF SELF.UseLogout. This code is a horribly complex way of doing a simple thing. LogoutDelete (documented in part two of this article) simply finds out which files may be altered by this delete and adds them to the transaction frame.

Following this code is the main loop, which steps through all the relations calling DeleteSecondary for all files which are related with some form of constraint on the delete. (In C5 the LocalAction function filters out the RI done upon the server which does not require assistance from ABC). Note that DeleteSecondary is a method in the related RelationManager. This is a vital point! You do not go around deleting other RelationManagers' records; you ask them to do it for you. What gets passed in is the key of the His that this RelationManager is related to, the FieldClass containing the list of linking fields, and the action mode to say whether restriction, cascading or deleting is called for.

How does this function work? From the perspective of the current RelationManager, the answer is "Don't know, not my problem," but it does matter that I know if it worked. If it didn't I must stop processing myself. Note the little CheckError routine calls are pernicious: they can cause the whole method to be aborted. This code assumes the DeleteSecondary will have issued the ROLLBACK if required.

Assuming the children were OK then the RelationManager deletes its own record and handles any errors (including transaction rollbacks of child deletes if required).
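The overall shape of the delete (ask each related manager to deal with its own records, stop at the first refusal, then undo everything) can be modelled in a few lines. What follows is a Python sketch under loud assumptions: the Table class, the RESTRICT and CASCADE names and the snapshot-based rollback are all invented for the illustration, only two constraint modes are modelled, and in this sketch the caller performs the rollback, whereas the real code expects DeleteSecondary to have done it.

```python
RESTRICT, CASCADE = "restrict", "cascade"  # illustrative mode names only


class Table:
    """Toy stand-in for one file plus its relation manager."""

    def __init__(self, name):
        self.name = name
        self.rows = []      # records, as dicts
        self.children = []  # (child table, parent field, child field, mode)

    def add_relation(self, child, parent_field, child_field, mode):
        self.children.append((child, parent_field, child_field, mode))

    def delete_secondary(self, field, value, mode):
        # Called by the parent: this table deals with its own records.
        matches = [r for r in self.rows if r[field] == value]
        if matches and mode == RESTRICT:
            return False  # children exist, so the delete is refused
        for row in matches:
            if not self.delete(row):
                return False
        return True

    def delete(self, row):
        # Children first; the parent record goes only if they all agreed.
        for child, parent_field, child_field, mode in self.children:
            if not child.delete_secondary(child_field, row[parent_field], mode):
                return False
        self.rows.remove(row)
        return True


def delete_in_transaction(table, row, all_tables):
    snapshot = {t: list(t.rows) for t in all_tables}  # crude transaction frame
    if table.delete(row):
        return True                                   # commit
    for t, rows in snapshot.items():                  # roll back every table
        t.rows = rows
    return False


customers, orders, items = Table("Customers"), Table("Orders"), Table("Items")
customers.add_relation(orders, "Id", "Customer", CASCADE)
orders.add_relation(items, "Id", "Order", RESTRICT)

customers.rows = [{"Id": 1}, {"Id": 2}]
orders.rows = [{"Id": 11, "Customer": 1}, {"Id": 10, "Customer": 1},
               {"Id": 20, "Customer": 2}]
items.rows = [{"Order": 10}]

tables = [customers, orders, items]
# Order 11 is deleted, then order 10 is refused because it has an item,
# so the whole delete fails and order 11 is restored.
blocked = delete_in_transaction(customers, customers.rows[0], tables)
```

Note that the parent never touches the child rows itself. It only learns whether the child succeeded, which is the "not my problem" stance described above.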

Update PROCEDURE(BYTE FromForm=0),BYTE,VIRTUAL,PROC

The update code is very similar to the delete code so I'll focus on the differences. There is no need for the "Are you sure?" query. There's also no need for the Position/Reget as the code can assume someone doing an update has valid records in the buffer! Because updates cannot be restricted it's okay to update the primary record before cascading to the children. Again any errors are handled.

The real interest (and new code) comes in the secondary loop. Note the call to EqualLeftBuffer. When an update is commenced in a form the RelationManager's Save method is called which snapshots all of the values of the linking fields of the relations into the Buffer portion of the linking fields BufferedPairsClass. Thus at the update it's possible to compare the left (primary) record with those stored values. If they haven't changed (even if the record has) then there isn't anything to cascade.

Suppose the cascade fails. Now there's a primary record (in memory, the disk image will have been rolled-back) with linking fields that now don't point to the children. Yuk! So upon failure the code copies the linking fields from the child back into the parent to tie the records together again.
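The snapshot, compare and repair sequence can be sketched as follows. This is a Python toy, not the ABC classes: BufferedLinks, save, assign_right_to_left and the update function are names made up for the illustration (only equal_left_buffer deliberately echoes EqualLeftBuffer), and the cascade itself is reduced to a callable that reports success or failure.

```python
class BufferedLinks:
    """Toy model: linking fields plus a saved copy of the parent's values."""

    def __init__(self, parent, child, pairs):
        self.parent = parent  # parent record buffer (dict)
        self.child = child    # child record buffer (dict)
        self.pairs = pairs    # [(parent_field, child_field), ...]
        self.buffer = {}      # snapshot of the parent's linking fields

    def save(self):
        # Taken when the form starts editing the record.
        self.buffer = {p: self.parent[p] for p, _ in self.pairs}

    def equal_left_buffer(self):
        # Have any linking fields changed since the snapshot?
        return all(self.parent[p] == self.buffer[p] for p, _ in self.pairs)

    def assign_right_to_left(self):
        # Copy the child's linking values back into the parent.
        for p, c in self.pairs:
            self.parent[p] = self.child[c]


def update(links, cascade):
    """cascade is a hypothetical callable returning True on success."""
    if links.equal_left_buffer():
        return "nothing to cascade"
    if cascade():
        return "cascaded"
    links.assign_right_to_left()  # tie parent and children together again
    return "cascade failed, link restored"


parent = {"Id": 1, "Name": "Old name"}
child = {"ParentId": 1}
links = BufferedLinks(parent, child, [("Id", "ParentId")])

links.save()
parent["Name"] = "New name"  # the record changed, the linking field did not
outcome = update(links, cascade=lambda: True)
```

Comparing only the linking fields is what keeps the common case cheap: most edits never touch a key value, so most updates never walk the children at all.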

Halfway There

This ends the discussion of the methods that are clearly and logically related to each other. There are a number of methods that don't fall into as clear a classification, and I'll cover those next month in the second part of the RelationManager discussion.


David Bayliss is a Systems Architect for The TopSpeed Development Center. He has worked upon TopSpeed's compiler and was the chief architect of the Application Builder Classes.
