Refining the Tables | Guerrilla Oracle: The Succinct Windows Perspective

When we left off in Chapter 4, we had defined eight tables (soon to be ten!) that we would need for our waiver exam database. Now our first step is to make sure we have all the items we will need for each table. Remember, these items are the columns of the tables; although we can add columns later, modify columns, and actually drop columns, it is really bad practice to have to make changes. Let's start with the STUDENTS table.

Chapter 4 defined just enough data elements so that we could do the data normalization necessary. However, suppose that you show your table design to some of the instructors, as a good database designer would, and their suggestion is to add a couple more items. The instructors want to know what department each student is in or, at a university level, what the student's major is. They believe that they'll be able to get some interesting demographic information once they have an operational database. This is the kind of thing researchers and marketing types are always looking for, and your proposed database is a natural. The instructors also mention that the TEST_HISTORY table should include the place where the test was taken, primarily for the sake of completeness.

As if this were not enough, the instructors tell you that they need to store the passing grade for each test. "Ah," you think, "simple." But then they say that sometimes the questions will have different weights. "Enough!" you think, as you take out another pencil to replace the one you just snapped. However, they show you the formula to use, and it becomes clear that all you have to do is add another column to each question on a particular test that will store the weight (with a default value of 1, since most exams are not weighted). As you're getting up to leave, another instructor stops you and asks the simple question: "When your system is done, can I get a list of all the questions by topic? Can I get all the ones for software or hardware or networking on a report? You know, things keep changing in the industry, and I want to make sure that what we ask is accurate and not outdated ."

You go back to your office thinking that you now have a couple of important changes to make: You need to be able to give them the report by type of question, and you have to store information on place, passing grade, and weight. The last three are easy, so you sit at your terminal and make some changes, and the tables now look like this:

STUDENTS Table	TEST_HISTORY Table
Student_ID	Test_ID
F_Name	Student_ID
M_I	Score
L_Name	Date_Taken
SSNum	Start_Time
B_Date	End_Time
Sex	Location
Street1
Street2
Town
State
Country
Zip
Level
Date_Created
Created_By
Department_Major
Email

You have to add the Passing Grade and Weight columns to two tables: TEST_ID and TEST_QUESTIONS_LINK. Fine; that's easy:

TEST_ID Table
Test_ID
Test_Name	TEST_QUESTIONS_LINK Table
Date_Created	Link_Test_ID
Author	Link_Question_ID
Comments	Weight
Type_ID
Passing_Grade
Time_Limit

What about the type of question? What did the instructor mean when she asked for "a list of all the questions by topic"? Let's see; she wanted a report by type of question ”that is, all the questions on a certain topic. So we need a way to mark each question with its "type" as well as to get a description of that type. Think back to normalization in Chapter 4. We automatically want to put Type and the Type_Desc with each question, right? Well, let's just add two columns: Question_Type and Type_Desc.

But wait a minute! Think about the nightmare that will be created when you want to change the description. You will have to go into every row in the QUESTIONS table and make the change! Instead, we do two things: (1) create a small table that has two columns ”Question_Type and Question_Type_Desc ”and then (2) add the Question_Type column to the QUESTIONS table. Anytime you need to report on the types of questions, you will just read the QUESTIONS table and link to the QUESTION_TYPE_DESC table. Because no other changes are necessary to handle question type, grade, location, and weight, all other tables stay as they are:

	TEST_TYPE_DESC Table
	Type_ID
	Type_Desc
QUESTIONS Table	ANSWERS Table
Question_ID	Answer_Question ID
Question	Answer_ID (answer #1 for the question)
Correct_Answer	Answer (actual answer for question #1)
Question_Type	. .
Date_Created	.
Author_ID	(These rows repeat for all the answers for each question.)

QUESTIONS_TYPE_DESC Table	STUDENT_ANSWER_HISTORY Table
Question_Type	Student_ID
Question_Type_Desc	Test_ID
	Question_ID
	Student_Answer

At the beginning of this chapter I told you that our eight tables would soon be ten. So now that we've added table nine, the QUESTION_TYPE_DESC table, what is the tenth one? If any of you caught the problem, congratulations!

It is a simple oversight, and I purposely waited until now to correct it. My thought was that it would make more sense to you at this point, where we've already covered normalization and preliminary table column definitions. What was missed in Chapter 4 was the fact that one person can author more than one test . See the break in normalization? We do not want to repeat the person's name over and over. Instead, we need a small AUTHORS table with just two entries ”an identifier and the author's name :

AUTHORS Table
Author_ID
Author

The significance is that this will become one of the base tables that another table is dependent on.

Note

By leaving out the AUTHORS table until now, I hope to have taught you to be very aware that table design and data normalization are difficult. In almost every case, these processes mean going over and over the tables and matching them against the real application. And even after you start building your tables, you may suddenly realize that something is missing ”be it a column, table, or link. Now is the time to make the change!