The following suggestions make table creation easier.
Always declare a primary key. The basis of relational databases is that each table has a way to refer to exactly one row. SQL can take care of this so that we don't have to. All we have to do is choose a column (or set of columns ) to be our primary key. It makes deleting an entry easier and selecting a specific record faster. And it takes almost no effort on our part.
There are a few exceptions, of course. You do not need a primary key, for example, if you plan to use the database as a very simple mailing list and will never refer to one particular column. Or you might not use a key if you add to a table orders of magnitude more often than you retrieve from it and retrieval time is less of an issue than update time (each primary key is an index and adds time to the update time). An example of an update- intensive application is a real time data collection application. It's better to create a key and not use it than need it and be unable to get to it.
Blast information into " atoms ." It is easy enough to concatenate a couple of pieces of information to form larger, more complex expressions, but breaking up text into smaller pieces involves parsing. We all know a few (very few) people who enjoy text parsing, but most of us don't. Let's assume that we want to avoid the necessity. Why would we want to store someone's entire name as one string when we can store a first name and a last name (and a middle initial too). It's easy enough to concatenate "Jesse" and "M" and "Jacobson" to make "Jesse M Jacobson," but we can also quickly make "Mr. Jacobson" or "Jesse" out of those fields. The smaller the data chunks , the more flexibility we have in how we use the data.
Consider an algorithm that would correctly parse the names in Table 12.2.
Table 12.2. Parsing Names
| Unparsed || Parsed |
| First || Middle || Last |
| Jake La Motta || Jake || || La Motta |
| Sue Ann Nivens || Sue Ann || || Nivens |
| Edna St. Vincent Millay || Edna || St. Vincent || Millay |
| Jesse M. Jacobson || Jesse || M. || Jacobson |
Avoid redundant columns. If we store the birthday of each of our players, there is no need to store their ages. Why? Because SQL always knows today's date and can quickly deduce age from birthday and vice versa. We delete the redundant column. How do we choose which redundant column to delete? We might chose arbitrarily. This is often the case. Or perhaps we decide that the data we store is used more often than the data we calculate. Or perhaps it's easier to calculate in one direction than another (like square root vs. square). But in our case, it's simple: A birthday is always the same, no matter what. A person's age changes constantly. So to avoid updates, we store just birthdate.
Use the same data types in columns that must be compared. It's hard to compare an int to a char , and even harder to compare it to a float or a date . By hard, we mean time-consuming . Therefore, if you plan on doing a lot of comparison between columns, it's best if they have the same general type. Their size doesn't really matter, though.
Be careful ”it's easier if you don't have to change the table. The table can be changed after it's created. But by thinking through exactly what we want a table to do before we create it, we can save ourselves a lot of agony later on. Review the table, perhaps with a colleague, before using it. It's painless to change the table before we enter any data, but it's not easy with data already in the table. Modifying the table (except deleting or renaming columns) is tricky.
For instance, suppose we enter a new column. All the entries already in the table would be assigned a null value for the column. This can create problems, especially if NULL is a meaningful state for that column. So we try to avoid it. But as the motto on a baby's bib goes, "Spit happens."