The Novice's Corner - Designing Databases

by Dave Harms

Published 1999-05-03    Printer-friendly version

In the first two articles in this series I examined the underlying philosophy of developing software in Clarion, and reviewed the all-important role of Topspeed's template technology in creating applications with a minimum of coding.

All of that background information is, I think, vital to being productive with Clarion. On the other hand proficiency in Clarion is only part of the equation. You may understand all there is to know about the development environment, and you may be able to write Clarion code so pure it'll bring tears to David Bayliss's eyes, but that isn't enough to write a great application.

Software development is a bit like building a house. Many traditional programming tools correspond to hammers, nailers, saws, drills, and other power tools. In this analogy the tools help you shape the raw material into a structure. Newer software development tools, including Clarion, do the software equivalent of prefabricating large sections of a building. And very sophisticated tools like the Wizards and Wizatrons let you create the equivalent of an entire building with a certain look and feel.

This sort of technology is a great boon to many developers. Still, even the Wizatrons can't (yet) read your mind and know what kind of an application you'll want. They work on the basis of the application style you choose (or create) and the data dictionary you create (or use, if one already exists).

From the Wizatron's perspective application creation seems quite simple. All the Wizatron needs to know is the expected appearance and behaviour of the application and the data structures. Simple, right? All the developer has to do is choose how the application should look, how it should work, and what data it should store, and writing software becomes (in theory) a license to print money.

This is, of course, the reason the world still needs software developers as much as it needs software development tools. And as tempting as Wizzy technology is, there's still a lot of work for the developer to do before any given application development process hits its automation phase.

Begin At The Beginning

If you've accepted the basic assumptions of the Clarion way of building applications (MDI frame and windows, the browse/form paradigm), then you're probably not going to quibble with the fundamentals of your application's appearance. You are going to be concerned about the data dictionary (which in Clarion will determine a lot of the application's behaviour). Therefore much depends on the quality of the database design.

But how do you arrive at your database design? Unless you're into randomware, you begin with some sort of requirement, whether that's a formal document or just some half-baked ideas rattling around in your head. From this requirement you can begin to formulate some possible approaches. The possibilities you come up with will depend a lot on your past experience with application development, your familiarity with existing, similar applications, and your knowledge of what options Clarion makes available.

Application design, which ultimately includes database design, is itself a rather large subject and something I hope to touch on in future articles. I'm partial to something called "use case" analysis, but there are any number of techniques you can use to try to flesh out what the application will need to do. And you can be sure that whatever the initial requirement describes, the final product will be something more complex.

A Student Tracking Application

Somewhat irresponsibly I'm going to leave aside the principles and techniques of overall application design and focus on the data dictionary. My assumption is that you're comfortable with the kinds of applications Clarion creates by default (MDI, browse/form), and that the user's requirements very conveniently do not deviate in any significant way from the functionality provided by Clarion 5 with the ABC templates.

Imagine an application which will be used to track university students and the courses for which they register. The application should be able to generate reports showing students in a given course and the courses taken by a given student, and should provide standard student information such as address and phone number.

It's vital in any development effort to determine how to store the data in the most efficient, reliable, and useful way. As of this writing, for Clarion developers, this means using some sort of relational database. (If you're thinking about object-oriented databases, or object-relational databases, that's still in the future for us and for a lot of other developers too.)

Relational Databases

Clarion is built around the concept of the relational database. As you might guess from the name, a relational database deals with groups of information which are related in specific ways. These groups are typically called tables if you're dealing with an SQL database, or files (at least in the Clarion world) if you're using a non-SQL database. For the rest of this article I'll use the term file or data file to describe such a group, since I'll be creating a Topspeed database.

Each data file can contain zero to many records (rows in SQL parlance) of information. For instance a file of employee information would contain one record for each employee, and within that record there will be individual fields, such as one for name, another for telephone number, and so on.

One key rule of relational databases is that each record in a table must have a field which is a unique identifier. In Clarion this value is usually defined as a LONG integer and its value is typically created automatically (by the ABC class library or by code generated by the templates). This field should also not normally be available to the user to change.

Another of the rules which describe good relational database design says that data duplication must be kept to an absolute minimum. This rule is closely related to the first rule, as you will see.

Armed with these two rules you can avoid a lot of problems in your database design.

In the student tracking application you might begin with a table to hold student information. Fields could include FirstName, MiddleName, LastName, as well as fields for the address and telephone numbers.

Before your rush to the dictionary editor and begin creating the Students file (or any other file) ask yourself if you really know the extent of the data. Will the fields you propose be sufficient? Will each student, for instance, have only one address? University students are notoriously mobile. My personal best was 13 addresses in three years (and no, I wasn't running from the law). Often students will have a permanent or home address and a residence address, and perhaps a third address when away on a work term. If you don't have the familiarity with the data to know these sorts of things then you must rely heavily on those who do. This can be trickier than it sounds because oftentimes people who are familiar with the data will omit details that to them are second nature but non-obvious to anyone else.

Avoiding Data Duplication

Clearly you'll need to store more than one address per student. If you know the exact number of addresses, you can simply add extra fields to the file. But if some students will have one address, and others two then there will be a lot of wasted space in the database. And if some students have more than two addresses you won't be able to accommodate the extra information.

Some developers use arrayed fields in these kinds of situations. Arrays are a convenient solution from a coding perspective because you can use a consistent set of labels when processing the data. All you need to do is change the subscript. But arrays don't solve the problem of wasted space, and they tend to introduce a lot of other problems when it comes to sorting data or using the fields in list boxes. Arrays do have a place, but they should be used only when they are the best solution.

In the case of student addresses, a better solution is to have a separate address file, and link it to the Students record by using a unique identifier.

Listing 1 shows record structures for the student and address files as created in the data dictionary.

Students             FILE,DRIVER('TOPSPEED'),PRE(STU)
                    ,CREATE,BINDABLE,THREAD               
StudentIDKey             KEY(STU:StudentID),NOCASE,OPT,PRIMARY
Record                   RECORD,PRE()
FirstName                   STRING(30)
MiddleName                  STRING(30)
LastName                    STRING(30)
StudentID                   LONG
                         END
                     END                       

Addresses            FILE,DRIVER('TOPSPEED'),PRE(ADD)
                    ,CREATE,BINDABLE,THREAD              
AddressIDKey             KEY(ADD:AddressID),NOCASE,OPT,PRIMARY
StudentIDKey             KEY(ADD:StudentID),DUP,NOCASE,OPT
Record                   RECORD,PRE()
AddressID                   LONG
StudentID                   LONG
Address1                    STRING(30)
Address2                    STRING(30)
City                        STRING(30)
StateProv                   STRING(2)
                         END
                     END                      

You can see that both files have a field called StudentID which is a LONG. This field serves two purposes. It uniquely identifies each student record, and it also provides a link between the student record and one or more address records.

TIP: For fields that you'll be using in more than one file, create a file in the data dictionary of type "Pool" and enter the base field types there. When you need a field of a type you've defined, click on the button to the right of the Derived From field in the New Field Properties window and choose from the Pool file. Do this before you enter a value in the Field Name field or you'll also have to reenter the Prompt Text and Column Heading values.

Simply having a linking field isn't sufficient. You also need to declare a key on the linking field in each file so that the linked record can be quickly retrieved. In the Students file the StudentIDKey is a unique key (it lacks the DUP attribute). You set this option on the Key Properties dialog, on the Attributes tab, as shown in Figure 1.

Figure 1. The key properties Attribute tab for StudentIDKey in the Students file.

novicefig1.gif (4734 bytes)

StudentIDKey in the Students file has several other attributes. It's a primary key, which means that the key is unique and contains a value for every record in the databases. You cannot have a null value in a field that is in a primary key. The primary key ensures that there is always a way to retrieve a specific record in a file. This is essential in an SQL environment where there is normally no equivalent to a record number as there is in a flat file database.

This key also has the Auto Number box checked. Checking Auto Number won't make any difference to the way the file is declared, but it will cause code to be generated that creates a unique sequential number for the key value for every record added to this table.

(Note: QuickLoad now has an Auto-Increment Key option that will create an appropriate auto-numbering key.)

StudentIDKey in the Address file is slightly different. Here the StudentID field will contain a value obtained from the Students file, so rather than a primary key this is a foreign key. The Address file does have its own primary key in AddressIDKey. In an SQL database every file (table) should have a primary key.

Just as creating fields doesn't automatically give you a key, creating keys doesn't automatically give you a link between files. This has to be defined in the Data Dictionary so that the appropriate code can be generated to manage the relationship. In the case of an SQL database, you have the option of also defining the relationships in the database itself in which case the database server can enforce any constraints. In either case, you should have the relationship defined in the dictionary.

This particular link on StudentID is a one-to-many relationship between the student and address records. Any one student can have from one (or I suppose zero) or more addresses.

To define this relationship highlight the Students file in the dictionary main window and click on the Add Relation button. The Relationship properties window appears, and it has one field filled in, Relationship to Students, which defaults to 1:Many. (The only other option is Many:1 which you could choose if you wished to create or edit the relationship from the perspective of the Addresses file.)

Choose StudentIDKey as the primary key, Addresses as the related file, and StudentIDKey as the foreign key.

You'll see the key fields listed under Field Mapping. You can choose to automatically map the fields by name or order (either will work in this case) or you can double-click on the individual fields to specify which field is the linking field. This is more of an issue when creating relationships with multi-component keys.

All that's left is to choose update and delete constraints, which determine whether certain actions involving related files will be allowed. The options are No Action, Restrict, Cascade, and Clear. The last three options are repeated with the annotation (Server) meaning that the restrictions exist but will be managed by the SQL database server rather than by Clarion code.

In a Restrict constraint you may not change or delete a primary key value if a related record exists. In a Cascade constraint any change to the primary key value will result in the change being made in any related records, and on a delete the related records will also be deleted. A Clear constraint clears any foreign key values on a change or delete of the primary.

In general you don't want to give your users access to linking fields anyway, so Update constraints shouldn't be a major issue. Delete constraints require more thought, and usually the choice is between Restrict and Cascade. If someone deletes a student record you're probably going to want to delete all the address records as well.

Now that you've determined the relationship you can go ahead and generate an application using the Application Wizard or Wizatron. Choose only the Students file when asked which files you want to be able to browse. Figure 2 shows the resulting application tree.

Figure 2. An application which browses the Students file.

novicefig2.gif (9217 bytes)

Note that although you only specified that you wanted to browse the Students file Clarion has created a an UpdateAddresses form, called from the UpdateStudents form. On the UpdateAddresses form, on the second tab, you'll see a browse for the Addresses file, as shown in Figure 3.

Figure 3. The Addresses tab on the UpdateStudents form.

novicefig3.gif (4508 bytes)

The Addresses browse exists only because of the relationship defined in the Data Dictionary between Students and Addresses. The Wizard or Wizatron simply populated a browse control template on the tab and set it to show only those address records which are linked by StudentID to the current student record.

Although Clarion by default puts the address browse on the student form, you can also place an address form on the student browse window if you prefer. You'll have to do this manually as the templates don't have this option. I'll have more about this in next month's article.

A Course, Of Course

This application now has a minimal capability of handling student information. The other main requirement is to track which courses students take.

It's easy enough to define a record to represent a course. You might want fields such as CourseName, StartDate, Instructor, Location, and so forth. But think about the actual data. You have many students and many courses. In other words this is a many-to-many relationship, not a one-to-many relationship. How can you represent this information in the most compact way possible? And does the Data Dictionary support many-to-many relationships? Tune in next month to find out!

Download the source code


David Harms is an independent software developer and the editor and publisher of Clarion Magazine. He is also co-author with Ross Santos of Developing Clarion for Windows Applications, published by SAMS (1995), and has written or co-written several Java books. David is a member of the American Society of Journalists and Authors (ASJA).

Printer-friendly version

Reader Comments

To add a comment to this article you must log in.

 
 

Search

 

Advanced Search
Topical Index

Related Articles

Subscribe to
ClarionMag

One year: $189

(includes all back issues since '99)

Renewals from $139

Two years: $289

Renewals from $239

More Info

Subscribe Now!

ClarionMag Blog

RSS Feeds

Updates via Email

Enter your Email


Powered by FeedBlitz

Quick Links