Methodology


MIDEA, a multidimensional database development methodology (Cavero, 2001), consists of conceptual, logical and physical design. In conceptual design, IDEA is used as the multidimensional data model. In logical and physical design, either multidimensional or relational logical models can be used. The methodology is supported by a CASE tool (Miguel, Cavero, Sánchez, & Canela, 2000) that translates conceptual IDEA schemes into logical schemes based on the models supported by multidimensional or relational products. Figure 4 shows a window of the tool prototype.

Figure 4: IDEA-DWCASE tool

The IDEA multidimensional conceptual model is used to understand and represent analytical users' requirements, much as the ER model is used when working with users of microdata (operational data). The data scheme of the preexisting OLTP system and the requirements obtained from analytical data users are the main inputs for building the IDEA multidimensional conceptual scheme.

The next step is to transform each previously defined conceptual scheme, using a set of methodological rules, into a logical scheme based on the model of each product (pure multidimensional, or relational with multidimensional features). The most common procedure in current projects is to translate the relational scheme directly into the multidimensional schemes supported by OLAP tools.
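As a rough illustration of what a rule-based conceptual-to-logical translation might look like for a relational (ROLAP) target, the sketch below maps a tiny fact/dimension description to star-schema DDL. It is not the IDEA model or the MIDEA rule set, neither of which is detailed here: the `Fact` and `Dimension` classes, the `to_star_schema` function and the single mapping rule (one table per dimension, one fact table with a foreign key to each) are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Dimension:
    # Hypothetical stand-in for a conceptual dimension
    name: str
    attributes: list[str]  # descriptive attributes, e.g. hierarchy levels


@dataclass
class Fact:
    # Hypothetical stand-in for a conceptual fact with its measures
    name: str
    measures: list[str]
    dimensions: list[Dimension] = field(default_factory=list)


def to_star_schema(fact: Fact) -> list[str]:
    """Apply one illustrative rule: a table per dimension plus one fact table."""
    statements = []
    for dim in fact.dimensions:
        cols = [f"{dim.name}_id INTEGER PRIMARY KEY"]
        cols += [f"{attr} TEXT" for attr in dim.attributes]
        statements.append(f"CREATE TABLE {dim.name} ({', '.join(cols)});")
    # The fact table references every dimension and stores the measures
    fact_cols = [
        f"{dim.name}_id INTEGER REFERENCES {dim.name}({dim.name}_id)"
        for dim in fact.dimensions
    ]
    fact_cols += [f"{measure} NUMERIC" for measure in fact.measures]
    statements.append(f"CREATE TABLE {fact.name} ({', '.join(fact_cols)});")
    return statements


sales = Fact(
    name="sales",
    measures=["amount", "units"],
    dimensions=[
        Dimension("product", ["category", "brand"]),
        Dimension("store", ["city", "region"]),
    ],
)
ddl = to_star_schema(sales)
for statement in ddl:
    print(statement)
```

A MOLAP target would need a different generator, since the logical scheme depends on the product chosen.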

The MIDEA approach allows reverse engineering of existing specific multidimensional schemes into IDEA conceptual schemes. These schemes could be checked against OLAP users' requirements to verify that the current data warehouse satisfies them.

In the same way, it is possible to create or modify elementary ER conceptual schemes, using a set of rules proposed in the methodology, to satisfy analytical users' requirements.

The methodology uses as its reference framework the Spanish public methodology METRICA version 3 (MV3), which is similar to the British SSADM or the French Merise. The MV3 processes considered are those on which data warehouse development has the most influence, that is, information system analysis, design and construction (ASI, DSI and CSI). The new processes, modified from the MV3 proposal, have been named ASI-MD (multidimensional), DSI-MD and CSI-MD. Considering only these three processes does not mean that the other processes can be ignored in a data warehouse development; rather, we consider that they do not differ significantly from those of any other information system development.

Every process is divided into activities and every activity is divided into tasks. The methodology is fundamentally focused on data modeling and does not take into consideration the functional aspects of the development. Therefore, extraction, translation and load functions are not fully considered.

Figure 5 shows an overview of the methodology, showing the scope of its three processes, ASI-MD, DSI-MD and CSI-MD.

Figure 5: Methodology overview

Below we offer a general overview of the three MIDEA processes.

The main purpose of the ASI-MD process is to obtain a detailed specification of the data warehouse. This specification has to satisfy the information needs of users (business analysts, specialists, etc.) and serve as a basis for the design.

Information gathering is mainly done in activity ASI-MD 2, "obtaining detailed requirements." The starting point is the general requirements catalogue and the high-level schemes. The catalogue consists of a set of generic, user-oriented requirements. These products should be refined with users in work sessions, so that the data warehouse requirements are specified in more detail. In addition, the nonfunctional requirements of the data warehouse must be identified (constraints that have to be met with respect to performance, security, etc.). The purpose of activity ASI-MD 2 is to define a detailed and validated requirements catalogue, which serves as a basis for testing the correctness of the schemes obtained in activity ASI-MD 3, "data warehouse conceptual modeling." Activity ASI-MD 3 contains a verification and validation task in which the scheme must be reviewed to guarantee that it is complete, complies with the requirements catalogue, and meets some predetermined quality criteria.
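One part of that verification task, checking the conceptual scheme against the requirements catalogue, can be sketched as a simple coverage check. This is only an illustration under assumptions: the catalogue layout (requirement id mapped to the scheme elements it needs), the element naming convention and the `uncovered_requirements` function are hypothetical, and the text does not prescribe any particular mechanism.

```python
def uncovered_requirements(catalogue, schema_elements):
    """Return, per requirement id, the needed elements missing from the scheme."""
    missing = {}
    for req_id, needed in catalogue.items():
        absent = sorted(set(needed) - schema_elements)
        if absent:
            missing[req_id] = absent
    return missing


# Hypothetical detailed requirements catalogue: id -> scheme elements required
catalogue = {
    "REQ-1": {"sales.amount", "product.category"},
    "REQ-2": {"sales.units", "store.region"},
    "REQ-3": {"sales.amount", "time.month"},
}

# Hypothetical elements present in the conceptual scheme under review
schema_elements = {
    "sales.amount", "sales.units",
    "product.category", "product.brand",
    "store.city", "store.region",
}

gaps = uncovered_requirements(catalogue, schema_elements)
print(gaps)  # {'REQ-3': ['time.month']}
```

An empty result would indicate that every catalogued requirement is covered by the scheme; completeness and quality criteria would still need their own checks.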

User participation is essential to this process because it helps guarantee that the requirements initially identified have been understood and incorporated into the system and, therefore, that the system will be accepted.

The DSI-MD process describes the activities needed to obtain the data warehouse design. The process starts from the software requirements specification obtained in the ASI-MD process. The design process describes "how" to implement the elements identified in the analysis process. The following tests are designed in this process: query tests, query consistency tests and data warehouse acceptance tests.

Because there is no standard or commonly accepted multidimensional logical model, data design should be done in one step, from the conceptual model to a specific (i.e., product-dependent) logical model. This "one-step" logical design is carried out in activity DSI-MD 2 for MOLAP (multidimensional OLAP) systems or in activity DSI-MD 3 for ROLAP (relational OLAP) systems. Before that, in activity DSI-MD 1, the appropriate technology (ROLAP or MOLAP) and product must be chosen. The process also includes three activities focused on physical design. Their purpose is to obtain and tune the physical design, starting from the logical design obtained in the previous activities (DSI-MD 2 and DSI-MD 3).

Finally, the main purposes of the CSI-MD process are the coding and testing of the data warehouse, starting from the design specification obtained in the DSI-MD process. Tests carried out during this process focus on queries and query consistency. Acceptance tests are carried out during system implementation.
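The text does not define what a query consistency test checks. One plausible reading, assumed here purely for illustration, is that the same measure aggregated at different levels of a dimension must yield the same grand total. The sketch below runs such a check against an in-memory SQLite database; the table names, column names and data are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE store (store_id INTEGER PRIMARY KEY, city TEXT, region TEXT);
CREATE TABLE sales (store_id INTEGER REFERENCES store(store_id), amount INTEGER);
INSERT INTO store VALUES (1, 'Madrid', 'Centre'), (2, 'Toledo', 'Centre'), (3, 'Bilbao', 'North');
INSERT INTO sales VALUES (1, 100), (1, 50), (2, 30), (3, 70);
""")


def totals_by(level):
    """Total sales amount grouped by one level of the store dimension."""
    # Only known column names are accepted, so interpolating them is safe here
    if level not in {"city", "region"}:
        raise ValueError(f"unknown level: {level}")
    rows = conn.execute(
        f"SELECT st.{level}, SUM(s.amount) FROM sales s "
        "JOIN store st ON st.store_id = s.store_id "
        f"GROUP BY st.{level}"
    ).fetchall()
    return dict(rows)


by_city = totals_by("city")
by_region = totals_by("region")
grand_total = conn.execute("SELECT SUM(amount) FROM sales").fetchone()[0]

# Assumed consistency property: every aggregation level adds up to the same total
consistent = sum(by_city.values()) == sum(by_region.values()) == grand_total
print(by_region, grand_total, consistent)
```

A failure of this kind of check would typically point to fact rows without a matching dimension member or to a hierarchy in which a member rolls up to more than one parent.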






Managing Data Mining Technologies in Organizations: Techniques and Applications
ISBN: 1591400570
Year: 2003
Pages: 174
