Data Mart


In Chapter 3, we were introduced to the concept of the data mart. We learned that a data mart is a repository for data to be used as a source for business intelligence. The data mart is not used as part of day-to-day operations. Instead, the data mart periodically receives data from the online transactional processing (OLTP) systems. The data in the data mart is then made available to Analysis Services for creating cubes with preprocessed aggregates. This is shown in Figure 6-1.

image from book
Figure 6-1: An Analysis Services cube receiving data from a data mart

Who Needs a Data Mart Anyway?

In Chapter 4, we learned about an exciting new feature of SQL Server 2005 called the Unified Dimensional Model (UDM). The UDM makes it possible for business intelligence to be extracted right from the OLTP systems in a manner that does not put undue stress on these systems, thus eliminating the need for data marts. This is shown in Figure 6-2.

image from book
Figure 6-2: The Unified Dimensional Model eliminates the need for a data mart

So, why are we back to talking about data marts? Why don't we skip over all of this and get right to the UDM? The answer is this: even with the UDM, situations still exist where a data mart may be the best choice as a source for business intelligence data. These situations are shown in Figure 6-3.

image from book
Figure 6-3: Situations where UDM cannot be used

Legacy Databases

The UDM's data sources utilize Windows OLE DB providers to make a connection to OLTP systems. A number of different OLE DB providers are available for use when creating data sources. However, some database systems do not have an appropriate OLE DB provider to make this connection. In fact, some systems, especially legacy systems, do not have any way to allow the type of external access required by an OLE DB provider.

In these situations, the data must be exported from the legacy system and copied into a database that can be utilized by the UDM. This database is the data mart. The data must be exported from the legacy system into a format that can then be imported into the data mart. Usually, this is in the form of some type of text file: delimited columns, fixed-width columns, or, perhaps, XML.

Each time the export is run, a text file is created. This file is then imported into the data mart. The import is accomplished using Integration Services.

Data from Nondatabase Source

In some cases, the data needed for business intelligence is not even stored in a database. Production information from the automated manufacturing system could be written to a text file. Customer service call records might be logged to an XML file. Perhaps customer orders only exist on paper forms.

Again, in these situations, we need to import the data into a data mart before it can be utilized by a UDM. The text and XML files can be imported directly by Integration Services. The paper order forms must be scanned in or entered by hand into an electronic format that can be imported into the data mart.

No Physical Connection

In other cases, the data may exist in an OLTP database that has an OLE DB provider, but does not have a full-time connection to the location where the business intelligence must be utilized. The UDM requires some type of a connection that will support an OLE DB connection. If there is none, we again need to set up a data mart to serve as a repository for this data at the location where the business intelligence is utilized.

Of course, we also need some way to transport an export file from the OLTP database to the data mart, so it can be imported. This might mean performing a File Transfer Protocol (FTP) transfer over a dialup connection. It could mean putting the export file on a backup tape or burning it on a DVD-R and carrying it or shipping it between the two sites. However the data is transported, it is once again imported into the data mart using Integration Services.

Dirty Data

Dirty data can also trigger the need for a data mart. If the data in our OLTP systems contains a number of errors, inconsistencies, or duplicate information, we may need to clean the data before we can use it as a source of accurate business intelligence. Because of limitations in the OLTP systems, it may not be possible to properly cleanse the data in the OLTP database. Instead, the data must be exported from the OLTP system, and then cleaned up as it is imported into a data mart by Integration Services.




Delivering Business Intelligence with Microsoft SQL Server 2005
Delivering Business Intelligence with Microsoft SQL Server 2005: Utilize Microsofts Data Warehousing, Mining & Reporting Tools to Provide Critical Intelligence to A
ISBN: 0072260904
EAN: 2147483647
Year: 2007
Pages: 112
Authors: Brian Larson

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net