There has always existed a need to store information. To this end, people through the ages have written down information on papyrus, wood, stones, and paperanything they could use to ensure that the knowledge they had struggled to obtain would not be lost.
These knowledge gatherers were creating raw files of unstructured data, even if they did not understand the concept of a database. Searching for information stored in this way required reading a huge amount of information, and very few people were able to devote their lives to analyzing these documents. Some of these chosen few disseminated their information verbally, but vbase (verbal-based) data was not capable of spreading knowledge to all who may have needed it.
Simply storing data is not enough. A structure is needed to help you search for necessary information. In addition, a technique is needed that can help you navigate through this established structure.
With this in mind, there are three main components to a database system:
The structure of the data
The technology to store and retrieve the data
An example of a simple database would be a dictionary. It stores words and their meanings. However, it stores the words alphabetically so that the reader can use a searching technique to quickly find a specific word among tens of thousands of other unwanted words.
A more sophisticated database would be a telephone book. It can be considered as a sequential list of contacts (people or businesses) for which some key attributes are listed: name, address, and telephone number. Again, you can apply well known manual techniques to search for any contact by name, for this is the way the telephone book is structured. Once you find the name , you can discover the other attributes. However, if you want to search for all contacts that live in the same building, this telephone book would be useless. You would need to read the entire telephone book to obtain this information.
So far, only databases built on paper whose information must be processed manually using simple binary searching techniques have been discussed. When people built the first database systems, they were designed to store all data in flat files. Indexes were built on top of the system to find answers to specific questions. These database systems evolved over time to result in modern relational database systems, other types of database systems (object-oriented database systems), or even hybrid object-relational database systems. This book will focus on relational database systems, which is the basis for the way that Microsoft SQL Server 2005 structures data.
From a purely relational point of view, data is organized around entities and the relationship between these entities. A database designer must identify the following:
The entities that describe the reality of the business
The attributes of these entities
The roles that each one of these attributes play in the business
The relationship between entities and the nature of these relationships
The behavior of these entities and attributes when the data changes
The importance of the analysis above cannot be over- emphasized regardless of which type of database system you will be using. Your first and primary objective is to know your business and to design a model that appropriately describes your business data and procedures. Your database system will be limited by your design and will inherit any problems built into this design that are caused by a lack of understanding of your business needs.
With this design, you could build some sequential or random files (such as the old xBase files), or you could use something a little more sophisticated, such as Microsoft Access. This book will not cover these technologies. You might decide to build your own relational database system from this logical design. In this case, you would need to build tables containing columns and indexes, as well as constraints. You might perhaps need to add some behavior to them through programmable objects that would perform complex actions on this data.
However, designing a database in this way is not enough. You also need to take care of simple but important things, such as:
Authenticating the users who access the data
Authorizing or denying access to the data
Writing data to disk
Reading data from disk
Keeping indexes current so they can be used to search for the data needed
Checking data against business rules to ensure the quality of the data
Ensuring that relations between data are maintained , thereby avoiding disconnections between related objects
Isolating actions from concurrent users so that they cannot interfere with each other
Ensuring the durability of the data against system failures
Providing system information so as to monitor the system and take corrective action when necessary
This list can be lengthened, but the key point is that you need an intelligent system that can handle all of these tasks , regardless of the way the data is accessed. In the past, the database engine stored the data while the user application was responsible for performing the additional tasks listed above. However, it was impossible to guarantee that data was always managed through the user application, thereby leading to chances for data corruption.
This is why you can benefit from a server application that can both own and manage your data and perform all of these tasks to your advantage. This type of management is precisely the role of SQL Server 2005.
Having a system that can store and manage your data is important, but how should you read this data? How should you search for answers to your business questions? Relational database systems would be useless without a well defined query language that could help you write questions in a way that the system could understand, analyze, and execute. Most modern relational database systems use Structured Query Language (SQL) as the language of choice to interact with database systems. SQL Server 2005 uses a dialect of this language called Transact SQL, or T-SQL, as it is commonly called.
T-SQL is not only a query language, but also a programming language designed to interact with relational database systems. It might not be as sophisticated as other programming languages for procedural tasks, but it is extremely rich for database operations. In fact, it is an extremely difficult language to analyze by a computer system. Consider this fact: most programming languages are very predictive. In Visual Basic, if you write the keyword FOR, the system knows the exact structure of whatever you are going to write after it. However, when you write the keyword SELECT in T-SQL, it is difficult for the system to know what type of statement you are trying to write.
T-SQL possesses a complex and rich grammar that makes it extremely efficient for data access. However, this grammar would be useless without an intelligent query processor that was able to convert human-written T-SQL sentences into machine-executable instructions. In other languages, this is the role of the compiler. Yet a database needs an intermediary agent that makes sure that the code produced is optimum for the specific data it is to manage. This is the role of the Query Optimizer. There are always many ways to access the information you need, and the system should be able to identify the best manner to do it.