David Bayliss On The FileManager Part 2

by David Bayliss

Published 1999-06-14

(Part 2 of 3. Read Part 1 and Part 3)

This article is part two of the ABC design document on the FileManager class, and you really should read the first part first as this article simply continues the FileManager code overview. I hope I have also provided enough hyperlinks that people returning to the article for reference will be able to dig out the information they require. In this article I'll look at some of the administration functions, the error handling and the snapshot mechanism.

Whilst some of the aims of the class can be gleaned simply from reading this article, most fruit is available to those that actually settle down and read the ABC code along with the corresponding comments. This is actually something I would always encourage you to do. The backbone of ABC amounts to around 4,000 lines of code, so if you aim to master 100 lines a day you will understand the basic ABC paradigm completely within eight weeks! The FileClass amounts to 25% of that work.

Administration

This section details those methods provided almost entirely as wrappers upon internal information for the benefit of higher level methods and/or methods outside of the FileManager.

ClearKey PROCEDURE(KEY K,BYTE LowComp,BYTE HighComp,BYTE High)

This method is there to provide a shortcut for a piece of template code that occurred very frequently. Essentially it handles the problem of a multi-component key where you wish to perform a SET(KEY,KEY) but you only know the major components. In order to ensure the SET(KEY,KEY) gets you to the start of all the records you require you need to clear the low order key components. Clarion does not have a CLEAR(KEY) so the FileManager provides one for you.

Rather than clear the whole key the routine allows you to specify the low (majormost) and high (minormost) components you want cleared. This is to allow minor component clearing to happen when the major components have already been filled in.

The method works by first performing a SetKey so that the current record of the FileKeyQueue holds information for the current key. Then the key components are stepped through from low to high and the KeyFieldQueue is fetched to retrieve the information for the current component (see AddKey in the previous article). The GET is error trapped with a simple return if the component doesn't exist. This is to allow the HighComponent to be specified as 255, meaning "to the end."

The XOR logic illustrates a useful trick (and hides a complexity!). Remember that as you are trying to get all the records in a SET(KEY,KEY), you might assume that means you just CLEAR all the key values low. But wait a minute; suppose you are about to do a PREVIOUS rather than a NEXT. Then you need to clear all the values high. This works in the common case of ascending key components, but remember it is possible to have descending key components. Worse yet, you can mix ascending and descending in the same key. If you sit down with pencil and paper (you may be able to do this in your head, but I needed pencil and paper) you will find you need to clear a component low if it is ascending and you are clearing low, or if it is descending and you are clearing high. This can be expressed using a disjunction (OR) of conjunctions (AND) but the XOR operator wraps it up perfectly and goes down to one machine instruction. I could have coded this more tightly still as:

CLEAR(SELF.Keys.Fields.Field,CHOOSE(~(SELF.Keys.Fields.Ascend XOR High)))

But I thought that might be just a little too scary.
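The pencil-and-paper claim is easy to check exhaustively. What follows is a minimal sketch in Python rather than Clarion, written only to model the decision; it is not the ABC source, and the function names `clear_low_dnf` and `clear_low_xor` are hypothetical.

```python
from itertools import product


def clear_low_dnf(ascending: bool, high: bool) -> bool:
    """Pencil-and-paper form: an OR of two ANDs."""
    return (ascending and not high) or (not ascending and high)


def clear_low_xor(ascending: bool, high: bool) -> bool:
    """The same decision collapsed into a single XOR."""
    return ascending != high  # != on two booleans is exclusive-or


# Compare the two forms over all four combinations of inputs.
for ascending, high in product([False, True], repeat=2):
    assert clear_low_dnf(ascending, high) == clear_low_xor(ascending, high)
```

Both functions answer the question "should this component be cleared low?"; whenever the answer is no, the component is cleared high instead.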

GetComponents PROCEDURE(KEY K),BYTE

This little method simply returns the number of components in a key. It uses SetKey and the fact that there is one KeyFieldQueue record for each component of the key.

GetEOF PROCEDURE,BYTE

The FileManager has a very specific meaning for EndOfFile: it means the last attempt to NEXT or PREVIOUS a record failed because the end of file has been reached. Specifically, if you have a file with 10 records EOF is true after the 11th NEXT, not the 10th. As such GetEOF is really just a shorthand to detect a specific error condition.
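The "11th NEXT, not the 10th" rule can be modelled in a few lines. This is a toy sketch in Python, not the FileManager itself; the class `RecordCursor` and its members are hypothetical names invented for the illustration.

```python
class RecordCursor:
    """Toy sequential reader that models only the EOF rule."""

    def __init__(self, records):
        self._records = list(records)
        self._pos = 0
        self.eof = False  # set only by a read that fails

    def next(self):
        """Return the next record, or None once the end has been passed."""
        if self._pos >= len(self._records):
            self.eof = True
            return None
        record = self._records[self._pos]
        self._pos += 1
        return record


cursor = RecordCursor(range(10))
for _ in range(10):
    cursor.next()
eof_after_tenth = cursor.eof     # the 10th read succeeded, so no EOF yet
cursor.next()
eof_after_eleventh = cursor.eof  # the 11th read failed, so EOF is now set
```

The flag describes the outcome of the last read, not the position of the cursor, which is why reading the final record does not set it.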

The functionality could almost certainly be achieved by looking at the return code from NEXT/PREVIOUS and then delving to see what the error identifier was. Again this is a situation where the FileManager does work simply to reduce the amount of coding required by users of the object.

GetField PROCEDURE(KEY K,BYTE Component),*?

This method is used to return an ANY variable corresponding to a given component of a key. I didn't want to have to protect the rest of my code against GetField returning a null so the procedure ASSERTs that the incoming component will be found. In other words, GetField gracelessly handles out of range components.

This does illustrate another agenda within ABC: offensive programming. Defensively I would have coded so that an out of range value returned a null, which would take two lines of code. Then on the receiving end nulls would have been handled, presumably in some "see if we can still keep going" fashion.

There are four calls to GetField in abfile (i.e., this method is relatively underused). Each would have had to temporarily store the GetField result, test for the null and do something smart with it. This might have taken five lines of code each (one for the declaration, one for the extra assign, two for the null test, one to handle the null case). In total I would now need 22 lines of code (20 at the call sites plus the two inside GetField) to handle something that should never happen, as opposed to the one line of code used in ABC. Doing that throughout a heavily integrated file like ABFILE could turn 2000 lines of code into 40,000 lines of code, 95% of which would be rarely executed and thus minimally tested. QED.

GetFieldName PROCEDURE(KEY K,BYTE Component),STRING

This is really there for the benefit of methods using the PROP:Filter technology on a view. It provides the BIND name of a given key component.

GetName PROCEDURE,STRING

The FileManager has to cope with two possibilities for the name of a file. It may either be a constant or a variable (the latter corresponds to the case where the NAME attribute on a file contains a string variable). GetName is there to hide this dilemma from the rest of the class. If a variable file name has been assigned then it returns that, otherwise it returns the constant provided to it by the driver itself.

KeyToOrder PROCEDURE(KEY K,BYTE MajorComp),STRING

This method really takes GetFieldName one logical step further. Rather than just return a field name corresponding to a key component, this method returns an ORDER clause (in Clarion syntax) that is equivalent to this key starting at component MajorComp. A value of one thus gives the whole key as an order clause, two skips the leading component etc.

Note that the null key case is defended against. This is because it is totally reasonable to have a null key specified as the sort key of an object (corresponding to not specifying a key in the file schematic).

The only real complexity is in the RetVal assignment. The first CHOOSE is there to prepend the field name with a comma only if the string being built up is non-null. The second CHOOSE is there to place a leading '-' before a descending key component (the view driver treats -string as a descending string, it does not convert it to a number as the language would).
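The string building can be sketched as follows. This is a Python model of the logic described above, not the Clarion source: `key_to_order` is a hypothetical function, the list of `(field_name, ascending)` pairs stands in for the key field queue, and the `CUS:` field names are invented for the example.

```python
def key_to_order(components, major_comp=1):
    """Build an order clause from (field_name, ascending) pairs.

    components: hypothetical stand-in for the key's component list.
    major_comp: 1-based index of the first component to include.
    """
    if not components:  # null key: nothing to order by
        return ""
    ret_val = ""
    for name, ascending in components[major_comp - 1:]:
        separator = "," if ret_val else ""  # comma only once something is there
        sign = "" if ascending else "-"     # leading '-' marks a descending field
        ret_val += separator + sign + name
    return ret_val


key = [("CUS:State", True), ("CUS:Balance", False), ("CUS:Name", True)]
whole_key = key_to_order(key, 1)     # "CUS:State,-CUS:Balance,CUS:Name"
skip_leading = key_to_order(key, 2)  # "-CUS:Balance,CUS:Name"
```

The two conditional expressions play the roles of the two CHOOSE calls: one decides whether a comma is needed, the other whether the component is descending.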

SetKey PROCEDURE(KEY K),PROTECTED

SetKey is used to fetch the correct record within the FileManager key queue for the usage of a key passed in to it. You cannot sort a queue on a reference field so the method has to loop through the queue finding a match. Files don't have that many keys so this should not be too onerous. I could start the method with a check to see if the current record value already matches as a kind of first level cache, but the downside is that this would hide a raft of bugs where people had not done a PUT after modifying the key information.

The loop illustrates an interesting and occasionally useful quirk of Clarion. You can have loop head and loop tail conditions (WHILE and UNTIL) in the same loop. The conditions are tested (and code body executed) in the order they appear lexically.

Again note the assert. A failure to set the key throws an error; see the discussion in GetField.

SetName PROCEDURE(STRING Text)

This method is a counterpart to GetName; it only allows the name to be assigned if there is an underlying variable for the NAME attribute of the file. By having the GET/SET in the FileManager the burden of tracking the global variable name disappears (it simply becomes the province of the dictionary). This makes it far easier to have an automated path assignment system built in.

Error Handling

The action of the ErrorClass has already been covered, however each FileManager re-vectors the error manager calls through its Errors property. This serves one main purpose: it allows a global object to be referenced from within base class code. The secondary purpose is to make the error handler used by the FileManager re-assignable. This is useful as the file system is one of the major generators of errors and the file calls are usually out of the direct control of the programmer. The ability to intercept errors on a file by file basis allows fine grain recovery mechanisms to be written. In addition to having a single vector point, the FileManager has a small suite of routines through which all FileManager/ErrorClass interaction is managed. Again the purpose is to make errors and recovery mechanisms overridable with a minimum of effort.

GetError PROCEDURE,SIGNED

The FileManager stores the last file error thrown within it. The number is the ErrorClass number, and it has nothing to do with ErrorCode or Error. It should be noted that ErrorCode et al are not valid upon return from FileManager methods. In particular it is quite probable that the FileClass (coming in a future major release) will not utilise ErrorCode and Error in normal operation and thus the FileManager will not even have error codes available. The error suite is one of the instances of the FileManager trying to smother an encapsulation leakage coming from underneath.

SetError PROCEDURE(USHORT Number)

This method separates out the recording of an error condition from the Throw (or exception) that the error could raise. Occasionally this is used to simplify internal coding, but more usually it is used in the TryAction methods so that they can return an error signal and leave the ErrorClass able to Throw the error if the caller requires.

Throw PROCEDURE(USHORT  ErrorNumber),BYTE,PROC,VIRTUAL

This function is purely a syntactic convenience. It is equivalent to a SetError followed by a Throw.

Throw PROCEDURE,BYTE,PROC,VIRTUAL

This routine takes the last error number (as recorded by SetError) and simply forwards it to the ErrorClass stored in SELF.Errors. The main purpose of this routine is simply to provide a common focus point (and thus override point) for the FileManager error handling. The return value comes from the ErrorClass and denotes the severity level as attached by the error class. This could be used to provide a sophisticated error recovery mechanism; by default, however, most Throws are considered fatal and this facility is not used.

It is worth noting that although Throw does not pass on the file label at this point, the ErrorClass does have access to the file name as this has been set up by the SetThread method as detailed in the previous article.

ThrowMessage PROCEDURE(USHORT ErrorNumber,STRING Text),BYTE,PROC,VIRTUAL

This is a simple extension to Throw to allow an extra message to be passed on to the ErrorClass.
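The division of labour between these routines can be sketched in a few lines. The following Python model is illustrative only and is not the ABC implementation: the classes `ErrorHandler`, `QuietHandler` and `FileErrors` are hypothetical, and the severity values 3 and 0 are made-up placeholders rather than real ErrorClass levels.

```python
class ErrorHandler:
    """Stand-in for the ErrorClass: returns a severity level."""

    def throw(self, number, message=""):
        return 3  # hypothetical "fatal" level


class QuietHandler(ErrorHandler):
    """A replacement handler that records errors instead of stopping."""

    def __init__(self):
        self.seen = []

    def throw(self, number, message=""):
        self.seen.append((number, message))
        return 0  # hypothetical "carry on" level


class FileErrors:
    """Toy model of the FileManager error suite."""

    def __init__(self, errors):
        self.errors = errors  # re-assignable, like the Errors property
        self._last_error = 0

    def get_error(self):
        return self._last_error

    def set_error(self, number):
        self._last_error = number  # record only; nothing is raised yet

    def throw(self, number=None):
        # With a number: record it, then raise. Without: raise the last one.
        if number is not None:
            self.set_error(number)
        return self.errors.throw(self._last_error)

    def throw_message(self, number, text):
        self.set_error(number)
        return self.errors.throw(number, text)


fm = FileErrors(ErrorHandler())
fm.set_error(42)        # a Try-style method records the error...
severity = fm.throw()   # ...and the caller chooses to raise it
fm.errors = QuietHandler()
quiet = fm.throw_message(7, "customer file")
```

Because every call funnels through the one `errors` member, swapping the handler for a single file changes its recovery behaviour without touching any of the calling code.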

Snapshots

The snapshot interface's purpose is to allow file state and buffer contents to be saved and restored by anyone without them having to know the structure of the file. The routines all use a handle to denote a particular state. The meaning of this handle is deliberately left undefined (presently it is an ID number within a queue). Eventually these routines will become vectors for fresh instances of the file class to be created and destroyed.

The words buffer and file have specific meanings. Buffer means the contents of the current record; that is the record buffer but also the memo contents. Blob contents are not stored as the overhead is potentially too onerous. File means buffer plus additional file state information such as Held, Watched, auto-increment done etc. For this reason all of the Buffer methods are fairly cheap involving only memory copies, while the File methods also involve disk access.

EqualBuffer PROCEDURE(*USHORT Handle),BYTE,VIRTUAL

This method is used to check if the current record contents differ from those when the snapshot (denoted by the parameter) was taken. For example, this might be used to see if a cancel on a form should be allowed to happen without user intervention.

First the Handle is looked up in the buffer queue; this gives the previous contents of the record buffer which can be compared byte for byte against the present values (this function is boolean - it doesn't say how the two buffers differ). If the two record buffers are the same the routine steps through the memos of the file seeing if they differ. The stored memo buffers are (by convention) stored consecutively in the queue following the record buffer. The present contents of the memos are retrieved by using MyFile{PROP:Value,-memonumber} (the negative number indicates this is a memo). This was necessary as it is not possible to store ANY references to memos as memos are created on the heap at file open time (on each thread) and are thus highly treacherous when involved with references.

RestoreBuffer PROCEDURE(*USHORT Handle,BYTE DoRestore=1)

This routine is used to restore the contents of the file buffer to what they were when SaveBuffer was called. If you pass in a zero as the second parameter then no restoration is done but the memory is freed. Commencing with C5EEA this routine actually becomes a shell that calls into RestoreBuffer(handle,filemanager,byte).

RestoreBuffer  PROCEDURE(*USHORT Handle,FileManager FM,BYTE DoRestore = 1),PRIVATE

This routine allows the contents of a buffer to be restored to the present file buffer from contents snapshotted by the passed in FileManager. Now in general restoring to a file other than your own is a dangerous, unmaintainable and generally very stupid thing to do (this is why the only public interface to RestoreBuffer passes in SELF). However in the particular case where the "other" file is absolutely identical structurally to your own, and is guaranteed to be so, it does give an extra degree of flexibility. We use this facility when dealing with aliases. However when reading this code you should generally assume that Frm and SELF are the same thing (if you're writing it, the distinction is vital of course!). Other than that, this code is essentially analogous with EqualBuffer, the only extra being KillBuffer which first frees the memory used for the buffer contents and then kills the queue record.

RestoreFile PROCEDURE(*USHORT Handle)

This is used to restore a file to the state it was in when the snapshot was taken. The current file position, sort sequence, held and watch state are all recorded along with (since C5EEA) the auto-increment state. Note that additionally the record contents are restored after the file position. This is to allow for instances where the current record had begun to be modified at the point the snapshot was taken.

As with RestoreBuffer, RestoreFile has been split out to aid the use of aliases, or more specifically, to allow FileManagers of aliased files to re-vector their methods through the FileManager of the actual file without corrupting the current state of the actual file.

RestoreFile PROCEDURE(*USHORT Handle,FileManager FM),PRIVATE

The file state (as opposed to record contents) is retrieved from the Saved queue. The Saved.Key element is the key number of the key active when the snapshot was taken. If this is non-zero then the key reference is found from the file driver and used in the RESET (otherwise the File is used). Because Watched and Held are read-only properties in the file driver they have to be restored by re-arming them and applying a NEXT. Having performed the NEXT (and thus "corrupted" the buffers) the buffers are restored. The auto-increment state is then put in place. Note the PUT on the SELF.Info to store that information for the current thread.

Actually this raises a slight cheat. Many of the file methods need to start with a SetThread for reasons previously described in FileManager I. Many then needed a UseFile to prime the lazy open. UseFile also needed to do a SetThread, so SetThread was often called twice. This is clearly inefficient so we cheated and allowed an information leakage that stated that UseFile does, and will always, perform an implicit SetThread. Again we find that ABC is not just about science, it is also about engineering. We allowed for one assumption and removed 15 lines of code and an efficiency drag on most of our core functions, and also lost some conceptual purity.

SaveBuffer PROCEDURE,USHORT

This code snapshots the current contents of the record buffer; most of the code is analogous to EqualBuffer. The interesting piece is the allocation of the Id to act as a handle to the outside world. At first sight you can simply get the number of records in the queue, add on one and you have your new Id. Better yet, you don't need to store the Id in the queue; you simply use the Id as a record number.

Further, in just about all the testing you ever do, it will work beautifully. But sometimes, somehow, it will corrupt when the users use it. The reason is that simply counting the records only works if Save/Restore pairs are performed in a stack-wise manner. If a deletion from the queue has happened in the middle then the next RECORDS will return a value lower than the current highest Id. Actually it will even work if the restores are not done stack-wise provided the result has been stack-wise by the time you do the next Save. If you do the Save/Restores in an unpaired way you will actually get the identifiers duplicated in the queue and havoc ensues. The solution is that you get the final record in sorted order and then add one on to whatever you receive back. DupString is a private member function used to allocate heap for, and copy the value into, a temporary string (like strdup in C++).
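The failure is easy to reproduce in miniature. The sketch below is a Python model and not the ABC code: a plain list of ids stands in for the buffer queue, and `naive_id`, `safe_id` and `run` are hypothetical names invented for the illustration.

```python
def naive_id(ids):
    """Count-based allocation: only safe when saves and restores nest."""
    return len(ids) + 1


def safe_id(ids):
    """Take the highest id currently in use and add one."""
    return max(ids, default=0) + 1


def run(alloc):
    """Save twice, restore the first save out of order, then save again."""
    ids = []
    first = alloc(ids)
    ids.append(first)
    second = alloc(ids)
    ids.append(second)
    ids.remove(first)  # the unpaired restore: the older snapshot goes first
    third = alloc(ids)
    ids.append(third)
    return ids


naive_ids = run(naive_id)  # [2, 2]: two live snapshots share a handle
safe_ids = run(safe_id)    # [2, 3]: handles stay unique
```

With the count-based scheme the third save sees one record, hands out 2, and collides with the snapshot that is still live; taking the highest existing id avoids the collision regardless of the order of restores.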

SaveFile PROCEDURE,USHORT

This method is the mirror of RestoreFile. Note that rather than replicating the buffer storage code, SaveFile simply calls on to SaveBuffer and stores the result. It is worth mentioning that the handles returned from SaveFile have no relation to those returned by SaveBuffer. You cannot SaveFile / RestoreBuffer or vice versa.

One slight tweak is the storage of the current key. You cannot simply save a key reference as that will not work when you are restoring to a different FileManager. Instead you have to store an ordinal number corresponding to the declaration order of the key. That number is computed using the loop. Note too the usage of a cast from CK (which is a long) to a key reference:

K &= (CK)

The rule is that a numeric value can be assigned in place of a valid reference of the right type. CK in itself is not a value (it is a variable) so the parentheses are used to form a value. This form of casting (which can be used in conjunction with ADDRESS and references) allows all of the (horrible) type conversion common to C++. It should be used extremely sparingly, but when needed it is brilliant.

A Thought

If you have been following this article in the source code, reading and understanding as you went, it is quite probable that by this point you are thinking: Hey! This stuff is all obvious; what is all the fuss about? If so, this article has worked. If not, it may be worth your while backtracking to see where the confusion enters. Object systems are hierarchical; one layer builds upon another. Therefore, comprehension of object systems tends to be hierarchical. If one layer doesn't make sense it typically means you didn't quite catch hold of the layer underneath. Happy hunting...


David Bayliss is a Systems Architect for The TopSpeed Development Center. He has worked upon TopSpeed's compiler and was the chief architect of the Application Builder Classes.
