David Bayliss On The FileManager Part 2
Posted June 14 1999
(Part 2 of 3. Read Part 1 and Part 3)
This article is part two of the ABC design document on the FileManager class, and you really should read the first part first as this article simply continues the FileManager code overview. I hope I have also provided enough hyperlinks that people returning to the article for reference will be able to dig out the information they require. In this article I'll look at some of the administration functions, the error handling and the snapshot mechanism.
Whilst some of the aims of the class can be gleaned simply from reading this article, most fruit is available to those that actually settle down and read the ABC code along with the corresponding comments. This is actually something I would always encourage you to do. The backbone of ABC amounts to around 4,000 lines of code, so if you aim to master 100 lines a day you will understand the basic ABC paradigm completely within eight weeks! The FileClass amounts to 25% of that work.
Administration
This section details those methods provided almost entirely as wrappers upon internal information for the benefit of higher level methods and/or methods outside of the FileManager.
ClearKey PROCEDURE(KEY K,BYTE LowComp,BYTE HighComp,BYTE High)
This method is there to provide a shortcut for a piece of
template code that occurred very frequently. Essentially it handles
the problem of a multi-component key where you wish to perform a
SET(KEY,KEY) but you only know the major components.
In order to ensure the SET(KEY,KEY) gets you to the
start of all the records you require you need to clear the low
order key components. Clarion does not have a
CLEAR(KEY) so the FileManager provides one for
you.
Rather than clear the whole key the routine allows you to specify the low (majormost) and high (minormost) components you want cleared. This is to allow minor component clearing to happen when the major components have already been filled in.
The method works by first performing a SetKey so that the current record of the
FileKeyQueue holds information for the current key.
Then the key components are stepped through from low to high and
the KeyFieldQueue is fetched to retrieve the
information for the current component (see AddKey in the
previous article). The
GET is error trapped with a simple return if the
component doesn't exist. This is to allow
HighComp to be specified as 255, meaning "to the
end."
The XOR logic illustrates a useful trick (and hides a
complexity!). Remember that as you are trying to get all the
records in a SET(KEY,KEY), you might assume that means
you just CLEAR all the key values low. But wait a
minute; suppose you are about to do a PREVIOUS rather
than a NEXT. Then you need to clear all the values
high. This works in the common case of ascending key components,
but remember it is possible to have descending key components.
Worse yet you can mix ascending and descending in the same key. If
you sit down with pencil and paper (you may be able to do this in
your head, but I needed pencil and paper) you will find you need to
clear a component low if it is ascending and you are clearing low,
or if it is descending and you are clearing high. This can be
expressed using a disjunction (OR) of conjunctions (AND) but the
XOR operator wraps it up perfectly and goes down to one machine
instruction. I could have coded this more tightly still as:
CLEAR(SELF.Keys.Fields.Field,CHOOSE(~(SELF.Keys.Fields.Ascend XOR High)))
But I thought that might be just a little too scary.
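The pencil-and-paper result can also be checked mechanically. What follows is a small Python model, not ABC code, and the function names are invented for illustration. It compares the OR-of-ANDs form of the rule with the XOR form over all four combinations of ascending/descending and low/high.

```python
from itertools import product


def clear_low_dnf(ascending: bool, high: bool) -> bool:
    """Clear the component low if it is ascending and we are clearing low,
    or if it is descending and we are clearing high."""
    return (ascending and not high) or (not ascending and high)


def clear_low_xor(ascending: bool, high: bool) -> bool:
    """The same rule collapsed into a single XOR."""
    return ascending != high  # != on two booleans is XOR


# Exhaustively compare the two forms and print the resulting truth table.
for ascending, high in product((False, True), repeat=2):
    low = clear_low_xor(ascending, high)
    assert low == clear_low_dnf(ascending, high)
    print(f"ascending={ascending!s:5} high={high!s:5} -> clear {'low' if low else 'high'}")
```

The two forms agree on every row, which is all the XOR trick claims.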
GetComponents PROCEDURE(KEY K),BYTE
This little method simply returns the number of
components in a key. It uses SetKey and the fact that there is one
KeyFieldQueue record for each component of the
key.
GetEOF PROCEDURE,BYTE
The FileManager has a very specific meaning for
EndOfFile: it means the last attempt to
NEXT or PREVIOUS a record failed because
the end of file has been reached. Specifically, if you have a file
with 10 records EOF is true after the 11th
NEXT, not the 10th. As such
GetEOF is really just a shorthand to detect a
specific error condition.
The functionality could almost certainly be achieved by looking
at the return code from NEXT/PREVIOUS and then delving
to see what the error identifier was. Again this is a situation
where the FileManager does work simply to reduce the amount of
coding required by users of the object.
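The "11th NEXT, not the 10th" rule is easy to get wrong, so here is a minimal Python model of it. The TinyFile class and its members are hypothetical stand-ins, not part of ABC; the only point is that the flag is set when a read fails, never when the last record is successfully read.

```python
class TinyFile:
    """Hypothetical stand-in for a file that is read sequentially."""

    def __init__(self, records):
        self._records = list(records)
        self._pos = 0
        self.eof = False  # plays the role of GetEOF

    def next(self):
        """Return the next record, or None once the end has been passed."""
        if self._pos >= len(self._records):
            self.eof = True  # set only when a read actually fails
            return None
        self.eof = False
        record = self._records[self._pos]
        self._pos += 1
        return record


f = TinyFile(range(1, 11))  # a file with ten records
for _ in range(10):
    f.next()
eof_after_tenth = f.eof     # the tenth read succeeded, so this is False
f.next()
eof_after_eleventh = f.eof  # the eleventh read failed, so this is True
```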
GetField PROCEDURE(KEY K,BYTE Component),*?
This method is used to return an ANY variable
corresponding to a given component of a key. I didn't want to
have to protect the rest of my code against GetField
returning a null so the procedure ASSERTs that the
incoming component will be found. In other words,
GetField gracelessly handles out of range
components.
This does illustrate another agenda within ABC: offensive programming. Defensively I would have coded so that an out of range value returned a null, which would take two lines of code. Then on the receiving end nulls would have been handled, presumably in some "see if we can still keep going" fashion.
There are four calls to GetField in abfile (i.e.,
this method is relatively underused). Each would have had to
temporarily store the GetField result, test for the
null and do something smart with it. This might have taken five
lines of code each (one for the declaration, one for the extra
assign, two for the null test, one to handle the null case). In
total I would now need 22 lines of code (four callers at five lines
each, plus the two in GetField itself) to handle something that
should never happen, as opposed to the one line of code used
in ABC. Doing that throughout a heavily integrated file like abfile
could turn 2,000 lines of code into 40,000 lines of code, 95% of
which would be rarely executed and thus minimally tested. QED.
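To make the trade-off concrete, here is a rough Python sketch of the two styles. It is only an analogy: the function names and the sample key are invented, and Python's assert stands in for Clarion's ASSERT.

```python
def get_field(components, index):
    """Offensive style: an out-of-range component is a programming error."""
    assert 1 <= index <= len(components), f"no component {index}"
    return components[index - 1]


def get_field_defensive(components, index):
    """Defensive style: return None and push the check onto every caller."""
    if not 1 <= index <= len(components):
        return None
    return components[index - 1]


key = ["CustomerId", "OrderDate"]  # hypothetical key components

# Offensive caller: one line, no recovery path to write or test.
name = get_field(key, 2)

# Defensive caller: a temporary, a test and a fallback at every call site.
tmp = get_field_defensive(key, 3)
if tmp is None:
    tmp = "<unknown>"
```

The defensive fallback value here is arbitrary, which is rather the point: code written for a case that should never happen has no obviously right answer to return.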
GetFieldName PROCEDURE(KEY K,BYTE Component),STRING
This is really there for the benefit of methods using the
PROP:Filter technology on a view. It provides the
BIND name of a given key component.
GetName PROCEDURE,STRING
The FileManager has to cope with two possibilities for the name
of a file. It may either be a constant or a variable (the latter
corresponds to the case where the NAME attribute on a
file contains a string variable). GetName is there to
hide this dilemma from the rest of the class. If a variable
file name has been assigned then it returns that, otherwise it
returns the constant provided to it by the driver itself.
KeyToOrder PROCEDURE(KEY K,BYTE MajorComp),STRING
This method really takes GetFieldName one logical
step further. Rather than just return a field name corresponding to
a key component, this method returns an ORDER clause
(in Clarion syntax) that is equivalent to this key starting at
component MajorComp. A value of one thus gives the
whole key as an order clause, two skips the leading component
etc.
Note that the null key case is defended against. This is because it is totally reasonable to have a null key specified as the sort key of an object (corresponding to not specifying a key in the file schematic).
The only real complexity is in the RetVal
assignment. The first CHOOSE is there to prepend the
field name with a comma only if the string being built up is
non-null. The second CHOOSE is there to place a
leading '-' before a descending key component (the view
driver treats -string as a descending string, it does
not convert it to a number as the language would).
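The string building described above can be sketched in a few lines of Python. This is a model of the idea rather than the ABC implementation: the function name, the (name, ascending) pair representation and the sample field names are all assumptions made for the example.

```python
def key_to_order(components, major_comp=1):
    """Build an ORDER-style string from (field_name, ascending) pairs,
    starting at the 1-based component major_comp."""
    ret_val = ""
    for name, ascending in components[major_comp - 1:]:
        separator = "," if ret_val else ""  # comma only if something precedes
        sign = "" if ascending else "-"     # leading '-' marks descending
        ret_val += separator + sign + name
    return ret_val


# Hypothetical three-component key with a descending minor component.
key = [("CUS:State", True), ("CUS:City", True), ("CUS:Balance", False)]

whole_key = key_to_order(key)        # "CUS:State,CUS:City,-CUS:Balance"
skip_major = key_to_order(key, 2)    # "CUS:City,-CUS:Balance"
null_key = key_to_order([])          # "" for a null key
```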
SetKey PROCEDURE(KEY K),PROTECTED
SetKey is used to fetch the correct record within
the FileManager key queue for the usage of a key passed in to it.
You cannot sort a queue on a reference field so the method has to
loop through the queue finding a match. Files don't have
that many keys so this should not be too onerous. I could
start the method with a check to see if the current record value
already matches as a kind of first level cache, but the
downside is that this would hide a raft of bugs where people had
not done a PUT after modifying the key
information.
The loop illustrates an interesting and occasionally useful
quirk of Clarion. You can have loop head and loop tail
conditions (WHILE and UNTIL) in the same
loop. The conditions are tested (and code body executed) in the
order they appear lexically.
Again note the assert. A failure to set the key throws an error;
see the discussion in GetField.
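As a loose Python analogy of the lookup (the class and function names are hypothetical, and the queue is modelled as a plain list), the search compares references by identity and treats a miss as a programming error:

```python
class KeyInfo:
    """Hypothetical queue entry: a key reference plus its component list."""

    def __init__(self, key, fields):
        self.key = key
        self.fields = fields


def set_key(key_queue, key):
    """Linear search by identity, since the queue cannot be sorted on a reference."""
    for entry in key_queue:
        if entry.key is key:
            return entry
    raise AssertionError("key is not registered with this FileManager")


by_name, by_date = object(), object()  # stand-ins for two KEY declarations
queue = [KeyInfo(by_name, ["Name"]), KeyInfo(by_date, ["Date", "Name"])]

current = set_key(queue, by_date)
component_count = len(current.fields)  # the GetComponents idea: one entry per component
```

With only a handful of keys per file, the linear scan costs next to nothing.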
SetName PROCEDURE(STRING Text)
This method is a counterpart to GetName; it only
allows the name to be assigned if there is an underlying variable
for the NAME attribute of the file. By having the
GET/SET in the FileManager the burden of
tracking the global variable name disappears (it simply becomes the
province of the dictionary). This makes it far easier to have an
automated path assignment system built in.
Error Handling
The action of the ErrorClass
has already been covered; however, each FileManager re-vectors the
error manager calls through its Errors property. This
serves one main purpose: it allows a global object to be referenced
from within base class code. The secondary purpose is to make the
error handler used by the FileManager re-assignable. This is useful
as the file system is one of the major generators of errors and the
file calls are usually out of the direct control of the programmer.
The ability to intercept errors on a file by file basis allows fine
grain recovery mechanisms to be written. In addition to having a
single vector point, the FileManager has a small suite
of routines through which all FileManager/ErrorClass
interaction is managed. Again the purpose is to make errors and
recovery mechanisms overridable with a minimum of effort.
GetError PROCEDURE,SIGNED
The FileManager stores the last file error thrown
within it. The number is the ErrorClass number, and it
has nothing to do with ErrorCode or
Error. It should be noted that ErrorCode
et al are not valid upon return from FileManager methods. In
particular it is quite probable that the FileClass
(coming in a future major release) will not utilise
ErrorCode and Error in normal operation
and thus the FileManager will not even have error codes available.
The error suite is one of the instances of the FileManager trying
to smother an encapsulation leakage coming from underneath.
SetError PROCEDURE(USHORT Number)
This method separates out the recording of an error condition
from the Throw (or exception) that the error could
raise. Occasionally this is used to simplify internal coding, but
more usually it is used in the TryAction
methods so that they can return an error signal and leave the
ErrorClass able to Throw the error if the
caller requires.
Throw PROCEDURE(USHORT ErrorNumber),BYTE,PROC,VIRTUAL
This function is purely a syntactic convenience. It is
equivalent to a SetError followed by a
Throw.
Throw PROCEDURE,BYTE,PROC,VIRTUAL
This routine takes the last error number (as recorded by
SetError) and simply forwards it to the
ErrorClass stored in SELF.Errors. The
main purpose of this routine is simply to provide a common focus
point (and thus override point) for the FileManager error handling.
The return value comes from the ErrorClass and denotes
the severity level as attached by the error class. This could be
used to provide a sophisticated error recovery mechanism, but by
default most Throws are considered fatal and this
facility is not used.
It is worth noting that although Throw does not
pass on the file label at this point, the ErrorClass
does have access to the file name as this has been set up by the SetThread
method as detailed in the previous article.
ThrowMessage PROCEDURE(USHORT ErrorNumber,STRING Text),BYTE,PROC,VIRTUAL
This is a simple extension to Throw to allow an
extra message to be passed on to the ErrorClass.
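Taken together, the error suite can be modelled roughly as follows. This is a Python sketch under stated assumptions, not the ABC source: the class names, the severity value, the error number 42 and the try_fetch/fetch pair are all invented to show how recording an error is kept separate from throwing it.

```python
class ErrorManager:
    """Hypothetical stand-in for the ErrorClass: reports and grades errors."""

    FATAL = 3  # assumed severity value, for illustration only

    def __init__(self):
        self.log = []

    def throw(self, number, text=""):
        self.log.append((number, text))
        return self.FATAL


class FileManagerModel:
    def __init__(self, errors):
        self.errors = errors       # re-assignable, per-file error handler
        self._last_error = 0

    def get_error(self):
        return self._last_error

    def set_error(self, number):
        self._last_error = number  # record only; nothing is reported yet

    def throw(self, number=None):
        if number is not None:     # Throw(n) is SetError(n) followed by Throw()
            self.set_error(number)
        return self.errors.throw(self._last_error)

    def throw_message(self, number, text):
        self.set_error(number)
        return self.errors.throw(number, text)

    def try_fetch(self, found):
        """Try-style method: signal failure quietly and let the caller decide."""
        if not found:
            self.set_error(42)     # 42 is a made-up error number
            return False
        return True

    def fetch(self, found):
        """Non-Try method: the same work, but a failure is thrown."""
        if not self.try_fetch(found):
            return self.throw()
        return 0                   # 0 stands for "no error" in this model
```

Because every report funnels through the single throw method, a derived class has one place to override in order to intercept errors for its file.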
Snapshots
The snapshot interface's purpose is to allow file state and buffer contents to be saved and restored by anyone without them having to know the structure of the file. The routines all use a handle to denote a particular state. The meaning of this handle is deliberately left undefined (presently it is an ID number within a queue). Eventually these routines will become vectors for fresh instances of the file class to be created and destroyed.
The words buffer and file have specific meanings. Buffer means the contents of the current record; that is the record buffer but also the memo contents. Blob contents are not stored as the overhead is potentially too onerous. File means buffer plus additional file state information such as Held, Watched, auto-increment done etc. For this reason all of the Buffer methods are fairly cheap involving only memory copies, while the File methods also involve disk access.
EqualBuffer PROCEDURE(*USHORT Handle),BYTE,VIRTUAL
This method is used to check if the current record contents differ from those when the snapshot (denoted by the parameter) was taken. For example, this might be used to see if a cancel on a form should be allowed to happen without user intervention.
First the Handle is looked up in the buffer queue;
this gives the previous contents of the record buffer which can be
compared byte for byte against the present values (this function is
boolean - it doesn't say how the two buffers differ).
If the two record buffers are the same the routine steps through
the memos of the file seeing if they differ. The stored memo
buffers are (by convention) stored consecutively in the queue
following the record buffer. The present contents of the memos are
retrieved by using MyFile{PROP:Value,-memonumber} (the
negative number indicates this is a memo). This was necessary as it
is not possible to store ANY references to memos as
memos are created on the heap at file open time (on each thread)
and are thus highly treacherous when involved with references.
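The comparison order (record buffer first, then each memo in turn) can be sketched in Python as below. The list layout mirrors the convention described above, with the memos stored consecutively after the record buffer, but the function name and the sample data are assumptions made for the example.

```python
def equal_buffer(saved, record, memos):
    """saved is [record_bytes, memo_1, memo_2, ...] as captured by the snapshot.
    Returns True only if the record and every memo are unchanged."""
    assert len(saved) == 1 + len(memos)  # one saved entry per memo, plus the record
    if saved[0] != record:
        return False                     # the record differs, so the memos are not examined
    # Memos are stored consecutively after the record buffer.
    return all(old == new for old, new in zip(saved[1:], memos))


snapshot = [b"SMITH     0042", "first memo", "second memo"]

unchanged = equal_buffer(snapshot, b"SMITH     0042", ["first memo", "second memo"])
record_edited = equal_buffer(snapshot, b"SMYTHE    0042", ["first memo", "second memo"])
memo_edited = equal_buffer(snapshot, b"SMITH     0042", ["first memo", "edited"])
```

Like the original, the result is strictly boolean: it says that something changed, not what.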
RestoreBuffer PROCEDURE(*USHORT Handle,BYTE DoRestore=1)
This routine is used to restore the contents of the file buffer
to what they were when SaveBuffer was called.
If you pass in a zero as the second parameter then no restoration
is done but the memory is freed. Commencing with C5EEA this routine
actually becomes a shell that calls into
RestoreBuffer(handle,filemanager,byte).
RestoreBuffer PROCEDURE(*USHORT Handle,FileManager FM,BYTE DoRestore = 1),PRIVATE
This routine allows the contents of a buffer to be restored to
the present file buffer from contents snapshotted by the passed in
FileManager. Now in general restoring to a file
other than your own is a dangerous, unmaintainable and generally
very stupid thing to do (this is why the only public interface to
RestoreBuffer passes in SELF). However in
the particular case where the "other" file is absolutely identical
structurally to your own, and is guaranteed to be so, it does give
an extra degree of flexibility. We use this facility when dealing
with aliases. However when reading this code you should generally
assume that Frm and SELF are the same
thing (if you're writing it, the distinction is vital of
course!). Other than that, this code is essentially analogous to
EqualBuffer, the only extra
being KillBuffer, which first frees the memory used
for the buffer contents and then kills the queue record.
RestoreFile PROCEDURE(*USHORT Handle)
This is used to restore a file to the state it was in when the snapshot was taken. The current file position, sort sequence, held and watch state are all recorded, along with (since C5EEA) the auto-increment state. Note that additionally the record contents are restored after the file position. This is to allow for instances where the current record had begun to be modified at the point the snapshot was taken.
As with RestoreBuffer,
RestoreFile has been split out to aid the use of
aliases, or more specifically, to allow FileManagers
of aliased files to re-vector their methods through the
FileManager of the actual file without corrupting the
current state of the actual file.
RestoreFile PROCEDURE(*USHORT Handle,FileManager FM),PRIVATE
The file state (as opposed to record contents) is retrieved from
the Saved queue. The Saved.Key element is
the key number of the key active when the snapshot was
taken. If this is non-zero then the key reference is found from the
file driver and used in the RESET (otherwise the
File is used). Because Watched and
Held are read-only properties in the file driver they
have to be restored by re-arming them and applying a
NEXT. Having performed the NEXT (and thus
"corrupted" the buffers) the buffers are restored. The
auto-increment state is then put in place. Note the
PUT on the SELF.Info to store that
information for the current thread.
Actually this raises a slight cheat. Many of the file methods
need to start with a SetThread for reasons previously
described in FileManager I.
Many then needed a UseFile to prime the lazy open.
UseFile also needed to do a SetThread, so
SetThread was often called twice. This is clearly
inefficient so we cheated and allowed an information
leakage that stated that UseFile does, and will
always, perform an implicit SetThread. Again we find
that ABC is not just about science, it is also about engineering.
We allowed for one assumption and removed 15 lines of code and an
efficiency drag on most of our core functions, and also lost some
conceptual purity.
SaveBuffer PROCEDURE,USHORT
This code snapshots the current contents of the record buffer;
most of the code is analogous to EqualBuffer. The interesting piece
is the allocation of the Id to act as a handle to the
outside world. At first sight you can simply get the number of
records in the queue, add on one and you have your new
Id. Better yet, you don't need to store the
Id in the queue; you simply use the Id as
a record number.
Further, in just about all the testing you ever do, it will work
beautifully. But sometimes, somehow, it will corrupt when the users
use it. The reason is that simply counting the records only works
if Save/Restore pairs are performed in a stack-wise manner. If a
deletion from the queue has happened in the middle then the
next RECORDS will return a value lower than the
current highest Id. Actually it will even work if the
restores are not done stack-wise provided the result has been
stack-wise by the time you do the next Save. If you do
the Save/Restores in an unpaired way you will actually
get the identifiers duplicated in the queue and havoc ensues. The
solution is that you get the final record in sorted order
and then add one on to whatever you receive back.
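The failure is easy to reproduce in a small model. The Python below is a sketch only: the queue is reduced to a list of live Ids and the function names are invented. It compares "count the records and add one" with "take the highest Id and add one" when the first snapshot is released before the second.

```python
def naive_id(queue):
    """Count-based allocation: only safe if saves and restores nest like a stack."""
    return len(queue) + 1


def safe_id(queue):
    """Take the highest Id currently in the queue and add one."""
    return max(queue, default=0) + 1


def save_save_restore_save(allocate):
    """Save twice, release the FIRST snapshot, then save again."""
    ids = []                  # Ids currently held in the queue
    first = allocate(ids)
    ids.append(first)
    second = allocate(ids)
    ids.append(second)
    ids.remove(first)         # restore out of stack order
    third = allocate(ids)
    ids.append(third)
    return ids


naive_result = save_save_restore_save(naive_id)  # [2, 2]: duplicate handle
safe_result = save_save_restore_save(safe_id)    # [2, 3]: handles stay unique
```

With stack-wise pairs the two allocators give identical answers, which is why the naive version survives testing.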
DupString is a private member function used to
allocate heap for, and copy the value into, a temporary string
(like strdup in C++).
SaveFile PROCEDURE,USHORT
This method is the mirror of RestoreFile. Note that
rather than replicating the buffer storage code,
SaveFile simply calls on to SaveBuffer
and stores the result. It is worth mentioning that the handles
returned from SaveFile have no relation to those
returned by SaveBuffer. You cannot
SaveFile / RestoreBuffer or vice
versa.
One slight tweak is the storage of the current key. You cannot
simply save a key reference as that will not work when you are
restoring to a different FileManager. Instead you have to store an
ordinal number corresponding to the declaration order of the key.
That number is computed using the loop. Note too the usage of a
cast from CK (which is a long) to a key reference:
K &= (CK)
The rule is that a numeric value can be assigned in place of a
valid reference of the right type. CK in itself is not
a value (it is a variable) so the parentheses are used to form a
value. This form of casting (which can be used in conjunction with
ADDRESS and references) allows all of the (horrible)
type conversion common to C++. It should be used extremely
sparingly, but when needed it is brilliant.
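The reason for storing an ordinal rather than a key reference can be shown with a short Python model. Everything here is hypothetical (the function names and the lists of stand-in key objects); it simply shows that a declaration-order number survives the trip from a file to a structurally identical alias, where an object reference would not.

```python
def key_ordinal(keys, active):
    """1-based declaration order of the active key; 0 means no key is active."""
    for number, key in enumerate(keys, start=1):
        if key is active:
            return number
    return 0


def key_from_ordinal(keys, number):
    """Map a saved ordinal back onto whichever key list is being restored to."""
    return keys[number - 1] if number else None


file_keys = [object(), object(), object()]   # keys of the real file
alias_keys = [object(), object(), object()]  # same structure, different objects

saved = key_ordinal(file_keys, file_keys[1])     # 2: second key declared
restored = key_from_ordinal(alias_keys, saved)   # the alias's own second key
```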
A Thought
If you have been following this article in the source code, reading and understanding as you went, it is quite probable that by this point you are thinking, "Hey! This stuff is all obvious; what is all the fuss about?" If so, this article has worked. If not, it may be worth your while backtracking to see where the confusion enters. Object systems are hierarchical: one layer builds upon another. Therefore, comprehension of object systems tends to be hierarchical. If one layer doesn't make sense it typically means you didn't quite catch hold of the layer underneath. Happy hunting...
