The problem of database design can be summarized in a question: Given some body of data to be represented in one or more databases, how do we decide on a suitable logical structure for that data such that the information needs of the users are properly accommodated?
Different authors suggest slightly different procedures for database design. In essence, all procedures contain the following stages:
Requirements collection and analysis: the process of collecting and analyzing information about the part of the organization that is to be supported by the database application, and using this information to identify the users' requirements for the new system. Requirement specification techniques include OOA (object-oriented analysis) and DFDs (data flow diagrams).
Conceptual database design: this phase involves two parallel activities: (a) conceptual schema design, which produces a conceptual database schema based on the requirements outlined in phase 1; and (b) transaction and application design, which produces high-level specifications for the applications analyzed in phase 1. Complex databases are normally designed using a top-down approach and are described in the terminology of the Entity-Relationship (ER) model or one of its variations.
Logical design: at this stage the conceptual schema produced in phase 2(a) is mapped into logical and external schemas. Ideally the resulting model should be independent of a particular DBMS or any physical consideration.
DBMS selection: the selection of a commercially available DBMS to support the database application is governed by technical, economic, and sometimes even political factors. Some of the most relevant technical factors include the type of data model used (e.g., relational or object), the supported storage structures and access paths, the types of high-level query languages, and the availability of development tools and utilities, among many others.
Physical design: the process of choosing the specific storage structures and access methods used to achieve efficient access to data. Typical activities in this phase are the choice of file organization (heap, hash, Indexed Sequential Access Method (ISAM), B+-tree, and so on), the choice of indexes and indexing strategies, and the estimation of disk requirements (e.g., access time, total capacity, buffering strategies).
Implementation and testing: the designed database is finally put to work, unanticipated problems are fixed, and the overall performance of the database system is fine-tuned.
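To make the conceptual, logical, and physical design stages above more concrete, the following is a minimal sketch in Python. Everything in it is an assumption made for illustration: the `Entity` and `Relationship` classes, the `to_ddl` helper, the department/employee example, and the use of SQLite as the target DBMS are hypothetical and not part of any procedure described above. It only handles one-to-many relationships, which a real ER-to-relational mapping would extend considerably.

```python
import sqlite3
from dataclasses import dataclass, field


# --- Conceptual design: a tiny, hypothetical ER-style description ---
@dataclass
class Entity:
    name: str
    key: str                                         # name of the key attribute
    attributes: dict = field(default_factory=dict)   # attribute name -> SQL type


@dataclass
class Relationship:
    """One-to-many: each row of `many` refers to exactly one row of `one`."""
    one: Entity
    many: Entity


# --- Logical design: map entities and relationships to relational tables ---
def to_ddl(entities, relationships):
    """Return one CREATE TABLE statement per entity (illustrative mapping only)."""
    parents = {}
    for rel in relationships:
        parents.setdefault(rel.many.name, []).append(rel.one)

    statements = []
    for e in entities:
        cols = [f"{e.key} INTEGER PRIMARY KEY"]
        cols += [f"{attr} {sql_type}" for attr, sql_type in e.attributes.items()]
        # A one-to-many relationship becomes a foreign key on the "many" side.
        for parent in parents.get(e.name, []):
            cols.append(
                f"{parent.key} INTEGER NOT NULL "
                f"REFERENCES {parent.name}({parent.key})"
            )
        statements.append(f"CREATE TABLE {e.name} ({', '.join(cols)})")
    return statements


department = Entity("department", "dept_id", {"name": "TEXT"})
employee = Entity("employee", "emp_id", {"name": "TEXT", "salary": "REAL"})
works_in = Relationship(one=department, many=employee)

ddl = to_ddl([department, employee], [works_in])

# --- Physical design: pick an access path (here, an index on the foreign key) ---
conn = sqlite3.connect(":memory:")
for stmt in ddl:
    conn.execute(stmt)
conn.execute("CREATE INDEX idx_employee_dept ON employee(dept_id)")

# Ask SQLite how it would evaluate a lookup by department.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT name FROM employee WHERE dept_id = ?", (1,)
).fetchall()
plan_text = " ".join(row[3] for row in plan)  # the fourth column holds the plan detail
print("\n".join(ddl))
print(plan_text)
```

The split mirrors the stages: the dataclasses say nothing about tables or indexes, `to_ddl` introduces relational structure without committing to storage details, and only the final step makes a DBMS-specific choice (an index) whose effect can be checked against the query plan.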