![]() |
|
Published 1999-06-14 Printer-friendly version
NOTE: To avoid confusion with ABC terminology the file
called Classes in the previous article in this series has been
renamed to Courses. The prefix for this class has been changed from
CLS to CRS.
In the previous article in this series I began developing a data dictionary and application to track information about students attending a university or college. I began by defining a student file as well as an address file, on the premise that students may have one or more addresses. This dictionary design reflects a one-to-many relationship between students and addresses, or, if you look at it from the other side, a many-to-one relationship between addresses and students.
This example relationship is obvious and easy to understand. Real data, however, is often a bit more complex. Sometimes any number of records from one file can be related to any number of records from another file. These many-to-many relationships are quite common in databases, and most likely you'll have to know how to handle them. In this article I'll examine such a relationship and outline how it can be defined in the data dictionary.
It's now time to expand the demonstration application to handle not just students and addresses, but the courses for which students can register. You can probably guess at some of the fields required to describe courses:
As you create the data file keep in mind any fields which are defined in pool data and consider whether any other fields should be added to the pool. In the case of the Courses file the ID can come from the pool data (where it has the Do No Populate flag set). There are two date fields in the file and you will want them to have a standard date format, and you may wish to make them spin boxes as well. In either case a pool Date field is a good idea. Figure 1 shows the Courses fields in the data dictionary, and Figure 2 shows the Courses keys.


Now you have a way to store courses and a way to store students. How do you link students to courses? The problem is that one student can take a number of courses, and any course can be taken by a number of students. The relationship is many-to-many, diagramed in Figure 3. The two triangles indicate the "many" aspect of the relationship.

If you examine the data dictionary's relationship editor you won't find any options for creating a many-to-many relationship, because it's almost impossible to create using just two files.
The problem is that from a student perspective you need to store an unknown number of course IDs, and from the course perspective you need to store an unknown number of student IDs. Some developers approach this the same way they do many-to-one relationships: they use arrays.
As I indicated in the previous article, arrays are generally a bad idea for linking files, and they're even worse in a many-to-many situation. Arrays by definition limited in size (at least in Clarion), which means you have to make the array size as large as the highest possible value, thereby wasting a lot of space. Furthermore you cannot use arrayed fields in keys. That's not a problem in a many-to-one where only one side needs to be keyed, but in a many-to-many both sides need to be keyed. You need to see which courses a given student takes, and which students are in a given course.
The answer is to use a file as an intermediary between Students and Courses, as shown in Figure 4.

In Figure 4 the single triangles indicate the "one" side of the relationship and the double triangles indicate the "many" side. As this diagram shows, the many-to-many has been broken down into two many-to-one relationships. This is the standard approach to handling many-to-many situations.
The linking Registrations file is simplicity itself - it needs to contain only three fields: a unique autonumbered Registration ID, a student ID and a course ID. (You might want to add several additional fields, however, including the date the registration was taken.)
NOTE: In a TPS or other flat-file (i.e. non-SQL) database you can get away without the unique autoinc key for this record, but it's a good idea to have it anyway as you may wish to link other records to the registration record.
Figure 5 shows the fields used in the linking file, and Figure 6 shows the keys.


StudentCourseKey and CourseStudentKey both have the course and student IDs as elements, and if you look at the data dictionary you'll see that they have the unique attribute as well. This prevents a student from being registered for the same course twice.
Now define the relationships in the dictionary editor. Although both the keys in Registrations have two elements, you only want to make the links on the first elements in the key in each case. Figure 7 shows the relationship between Students and Registrations.

A similar relationship exists between Courses and Registrations, as shown in Figure 8.

If at this point you use the Browse wizard to create a browse and update form for Courses, you'll find a browse of Registrations records on the Course update form. That may not be the final place you'd like to have it, but it will show you how the link works. (Another option is to populate a browse of Registrations records below the Courses browse, and restrict its records based on the relationship with Courses.)
You'll want to display the student name on the Registrations browse rather than just the ID. Highlight Registrations in the file schematic and click Insert. Select the Students file from the related files list. This will ensure that the related Students record is retrieved for each Registrations record (see the accompanying application for an example).
Since the only fields in the Registrations file are two
LONGIDs, you'll need to set these values when you
add a registration. Whether you're updating Registrations
records from a browse on the Courses update, or from a child browse
you've placed on the Courses browse, you know that the current
Courses record is in memory. Use Field Priming on Insert on the
Registrations update form to preload REG:CourseID with
CRS:ID. Figure 9 shows the appropriate Field Priming
on Insert setting.

Since you already have REG:CourseID there's no
need to display it on the window: delete the entry field. Now all
the user has to do is choose a student. You probably don't
want the user entering in the student's ID directly, so you
should provide a lookup on a list of students. One way to do this
is to set up the REG:StudentID field's actions to
do a lookup, as shown in Figure 10.

After entering the settings in Figure 10 and saving the changes,
populate the FieldLookupButton template on the window and on its
Actions tab select ?REG:StudentID. Now you have a
button you can use to force a lookup on the Students browse.
Next bring up the property window forREG:StudentID.
Check the Hide box so the ID won't be visible. In the
form's file schematic, click on the Registrations file and
choose Students from the Related Files tab. Populate two string
fields on the form and on the string Properties window, Use field,
enter REG:FirstName for one field and
REG:LastName for the other (or use the field lookup
button to the right of the Use prompt). The relation manager will
take care of looking up the student record from the ID and will
display the name of the student associated with the
registration.
NOTE: This is just one approach to selecting records. Another would be to use the FileDrop template (though this code is due for major revision and may not be a safe choice). You may also want to consider a third party solution like ProDomus' highly-regarded PDLookup templates which add Quicken-style incremental lookups.
If you look at the data dictionary in the accompanying example you'll see that the Registrations file has a Registrants field which is defined as a SHORT variable. Assuming that a successful link is created between courses and students it should be possible to calculate this value, so it isn't strictly necessary to keep it on the course record.
In fact, good relational design suggests that this is unnecessarily duplicated data and should not be stored. In reality there are often tradeoffs involved in designing a database. Your aim should be to avoid situations where conflicting data can exist, but sometimes performance requirements must also be addressed. If one of the requirements is to display the number of registrants on a browse of courses, and you don't keep a total on the Courses record, then you'll have to loop through the registrant information for each record.
One easy way to do this is to use a hidden browse with the child
records and a total (count) field. If you take this approach make
sure that you set the child browse's
ActiveInvisible property to True as the default
behaviour for browse objects is to become idle when hidden.
If there are a large number of child records or you frequently run reports or other processes where the number of registrants is required then it may be to your advantage to count the registrations whenever one is added or deleted and update a Registrants field with this value.
You may think at first glance that a BYTE value
would suffice for Registrants, but for a particularly large lecture
course there might be more than 255 students. Should that happen
with a BYTE variable the numbers would wrap around and
the displayed count would be incorrect. If there's any
possibility that you'll overrun the variable's capacity
then use the next larger variable. A SHORT has a
maximum value of 32767, and while course overcrowding is often a
problem, its not likely to get that bad.
In my initial design of the course file I made the assumption that there would be only one instructor per course. This isn't necessarily the case. There may be multiple instructors, or an instructor and teaching/lab assistants who should also be associated with the course. A simple many-to-one relationship would be too restrictive. To handle this I'll need to create a linking file just as I did for the course registrations. As with the Registration file, this linking at a minimum would need only two IDs.
The only difference between this many-to-many and the registrations many-to-many is that this one is a bit more difficult to recognize. In the software requirements (if a formal document existed) you would probably come across the term "Course registration," which might have led you to think about storing this information in its own file.
What if a suitable term hasn't been defined for the link between instructors and courses? You can discover the need for such links by examining each file relationship carefully and asking yourself if the relationship will satisfy all the needs of the application.
By now you should be starting to see some other possible links
between data. If you're storing information about instructors,
then perhaps there's some possible duplication of the kind of
information you keep about students. Perhaps a generalized contacts
file would be better, with a Type field to
differentiate between students and instructors (and perhaps staff
as well). This way you can use the same approach for instructor
addresses as for student addresses.
Telephone numbers (and email addresses and the like) are another potential area for generalization. Rather than having two or three telephone number fields in the names file you might want to go with a Phones or other kind of contacts file. Remember that you can assign field pictures at runtime, so you could even use a simple string field and use various pictures to format the string for phone numbers, email addresses, and so forth. You might want to keep this kind of configuration information in an INI file (not a great idea) or a single-record control file in your database (a better option as it allows for encryption).
When I analyze or create a database design I'm constantly asking myself how the data will be used given the current requirements and what I anticipate the future requirements will become. Few of us ever actually finish a software project. We may leave it or hand it off to someone else, but there's always work to be done.
I find that the better I am at anticipating what the user will want to do with the program, the more robust my database design is likely to be. In the next article in this series I'll look at "use case" analysis, a text-only tool that I find helpful in fleshing out a design and discovering hidden requirements.
David Harms is an independent software developer and the editor and publisher of Clarion Magazine. He is also co-author with Ross Santos of Developing Clarion for Windows Applications, published by SAMS (1995), and has written or co-written several Java books. David is a member of the American Society of Journalists and Authors (ASJA).
Copyright © 1999-2008 by CoveComm Inc. All Rights Reserved. Reproduction in any form without the express written consent of CoveComm Inc., except as described in the subscription agreement, is prohibited.
Clarion Magazine ISSN 1718-9942
One year: $189
(includes all back issues since '99)
Renewals from $139
Two years: $289
Renewals from $239