The Current State of Affairs

A healthy DW/BI system grows and changes. In our consulting practice we often speak with CIOs who want to know when the data warehouse project will be done and resources can be redeployed. Although the initial development phase will eventually end, its important to understand that the best DW/BI systems will have a permanent level of development activity. As the business changes, especially in a company that engages in mergers and acquisitions, the DW/BI system must evolve in step.

Even the best-designed DW/BI systems can start to experience trouble as they mature. In a recent article titled Data Warehouse Check-Ups, Margy Ross and Ralph Kimball outlined some common disorders and solutions. Were adapting and extending this list to discuss how to use SQL Server 2005 to revitalize your DW/BI system.

Reference 

See the article Data Warehouse Check-Ups by Margy Ross and Ralph Kimball, Intelligent Enterprise (June, 2004). A link to the article is available at www.kimballgroup.com/html/articlesbydate/articles2004.html .

Data Quality

The most serious and common data disorders are poor quality data, incomplete data, and late data. Frequently we find that data quality and delivery problems lead to business users unwillingness to accept the BI/DW system: The business users simply dont trust the data. Often for good reason.

The good news is that you seldom need to do a lot of research to find out where data quality is suffering. Your users are shouting at you that the numbers are wrong, or they cant get to the data when they need it.

To craft a solution, identify and prioritize the core problems. The business users will help you find the most important starting points, although the DW/BI team needs to temper business needs with a feasibility assessment. For example, customer duplication is a common data quality issue. The business users may be hot to have your team identify unique customers as the first project in your data quality rework . Customer de-duplication is almost always a high value project, but its also complex and high risk. You may do better to develop your SQL Server 2005 expertise with a project thats a bit less ambitious.

If your existing Extract, Transformation, and Load (ETL) system uses SQL Server 2000 Data Transformation Services (DTS) or is hand coded, it should be easy to find ways to improve quality and speed. If speed is the main issue, turn on DTS logging or evaluate your custom system to identify the main bottlenecks. You may be able to use SQL Server 2005 Integration Services to surgically re-work the problem areas, and leave most of the existing system in place. The more you modularized your existing system, the easier this will be to do. If quality is your main problem, you may be able to rework a relatively small piece of the ETL application by adding data quality screens within an existing transformation process.

You have to prepare for and investigate the possibility that your existing data quality issues are sufficiently severe that the best solution is to start over, one dimension or business process dimensional model at a time. Integration Services is more effective and performs better than DTS or hand coding. However, the greatest improvements are likely to come from a new ETL architecture, rather than by using Integration Services to pave existing cow paths. As politically unattractive as this option is, it could be the best alternative. Major changes that you should consider include:

  • Processing daily rather than monthly

  • Moving to incremental processing from full refresh

  • Standardizing names and addresses in preparation for de-duplication

  • Implementing data quality screening

As we describe later in this chapter, when we discuss converting the ETL system from SQL Server 2000 to SQL Server 2005, you may find it useful to create an Integration Services master package that coordinates the ETL system activities. This master package can call DTS packages, SQL scripts, or other processes that may be part of your current environment. With this infrastructure in place, you can be selective about which pieces of your existing ETL system to convert.

Mart Madness

A second kind of data disorder has to do with the way the data is structured for business users access. We work with many customers whove invested in developing an atomic, non-dimensional, normalized data warehouse thats in good shape. The problem is that business users complain about how hard it is to use the data warehouse. Perhaps youve tried to address this usability issue by spinning out a bunch of departmental data marts, often with dimensional models. We call this mart madness, and if youre in this situation you already know how hard this structure is to maintain, how much time it takes to bring up a new departmental mart, how much duplication you have between different departments marts, how seldom a report from one departments mart will foot back to anothers, and how difficult or impossible it is to combine data across marts.

There are significant architectural differences between mart madness and the Kimball Method architecture of conformed atomic business process dimensional models. Conformed means that business process models use the same dimensions, and snap together to answer enterprise-wide questions and avoid data inconsistencies. Atomic means that Kimball Method models are built at the finest grain of data available. Most departmental data marts are built with aggregated data. Business process dimensional models means that the design is based on business processes rather than departmental access patterns.

To address mart madness, return to the business requirements and develop the dimensional bus matrix, as we described in Chapter 1. You may be able to use the original business requirements document (if you have one), but you should conduct some new interviews. The bus matrix identifies the link between core business process dimensional models, and also highlights opportunities for improving the ETL system.

The updated business requirements and bus matrix should help clarify where to begin. If you have a well-functioning, non-dimensional data warehouse, use it to populate a Kimball Method dimensional model instead of departmentally focused aggregate data marts. In the short run you may be able to gain significant leverage from your existing infrastructure by building Analysis Services databases directly from the existing data warehouse. In the long run you should tackle the problem of conforming dimensions across the enterprise. Although in theory this conformation might be achievable on the fly, realistically you should plan to save the conformed dimensions and facts as a dimensional data warehouse in the relational database. This is especially true if the existing data warehouse has not been tracking the history of dimension attribute changes.

Business Acceptance Disorder

No matter how correctly your DW/BI system adheres to a methodology, no matter how accurate and timely the data is loaded, no matter how sophisticated the tools you provide to your users, if the business users arent using the DW/BI system, its a failure. The underlying problem may be any of those that we discuss in this section: data quality issues or cultural issues. The problem may derive from an insufficient focus on business requirements at the outset. Maybe the problem is something relatively easy to fix, like not enough predefined reports in the BI portal, or not enough user training. The only way to find out whats wrong is to ask the business users.

After youve revisited the business requirements and priorities, you may find a deeper problemyour initial business process dimensional model was of low value to the business users, or it was not delivered properly. Even in the case of a fundamental problem like these, you may be able to get a quick win by loading existing data into Analysis Services. You can build an Analysis Services 2005 cube from a normalized database, although it wont work as well or be as easy to maintain as one thats built from a dimensional model. Combine the new OLAP database with a query and analysis tool and some Reporting Services reports that are aligned with business requirements, and you may find youve greatly increased business users valuation of the DW/BI system, at a modest incremental cost.

Infrastructure Disorder

Its common to have a mix of tools and approaches in an enterprises information infrastructure. As the DW/BI system matures, you bolt on a new ETL product here, a new reporting application there, until you end up with Frankensteins monster.

SQL Server 2005 in and of itself isnt going to solve this problem. In fact, its likely to worsen the problem in the short run, as you add new tools and new approaches to the mix. However, if you commit wholly or partially to the Microsoft platform, you can be sensible about which components you migrate to SQL Server, choosing first those parts of your exiting infrastructure that are most problematic and require expensive software licenses.

Infrastructure problems are usually the result of a poorly thought out or non-existent architecture plan, which in turn results from a weak or nonexistent business requirements understanding. The best way to address infrastructure disorder is to go back to the business requirements and clarify what the DW/BI system needs to do to meet those requirements. Revisit the architecture design process to create a new high-level architecture model. Next, take this model and overlay the existing product mix and the functions available in SQL Server 2005. Identify the overlaps between the old and new, compare the capabilities and costs, and select the best option for meeting the business requirements. Finally, develop a phased implementation plan to migrate to the new architecture, starting with the capabilities that will support the most important business requirements.

Political and Organizational Problems

The most common reasons for a mature DW/BI system to be in poor health are organizational rather than technical. A healthy DW/BI system is one that supports a lot of use, grows with the addition of new business processes, updates its user interface every few years , and has business users who pepper the DW/BI team with requests for training or new reports.

If your existing DW/BI system is at all successfulor was ever successful in the pastyou had a business sponsor. The original business sponsor may have moved on to a new job, or may have lost interest in the project. If you go more than three months without meeting with your business sponsor, or if all new initiatives are driven by IT, youre likely suffering from sponsorship disorder.

The treatment for sponsorship disorder is to recruit a new business sponsor. Even if you plan to upgrade or migrate to SQL Server 2005 for some technical reason (improved features) or other reason (reduced licensing costs), you need to ensure you have an active and engaged business sponsor. Without a business sponsor, you should question your rationale for your proposed project.

Other political and organizational disorders are rooted in the corporate culture. These are difficult to address and again have very little to do with the technology. Examples of problematic corporate cultures are those with an excessive tendency to consensual decision-makinga common characteristic of academic and government organizationsor a culture that does not value informed decision-making but instead makes decisions on intuition. The only real solution to these problems lies with corporate management. The best you can realistically do is to bring the issue to managements attention.

Tip 

This problem is almost insurmountable for us as external consultants . When we encounter such organizations in our consulting practice, we usually walk away. Weve battered our heads against this wall too many times to sign up for it again. For you to succeed, you may need patience more than any technology or technique. If you can outlast the old guard and continue to work to find enlightened sponsors, you will eventually succeed.

Perfect Health

We hope that none of the disorders discussed applies to your organization, and your existing DW/BI system is in perfect health. If this is your scenario, youre probably reading this book because you have a new business process that users want you to add to the DW/BI system.

You can and should think critically about which components of the new business process dimensional model to host in SQL Server 2005. To reduce the projects risk, you may want to confine your initial use of SQL Server 2005 to only a few components, like Analysis Services. It would certainly be reasonable to put the relational data for the new dimensional model in the same database as the existing data warehouse, rather than hosting it separately in the SQL Server 2005 relational database. You may have your existing ETL tool make calls to invoke new Integration Services packages, or you may choose to continue using the tools youre comfortable with. Later in this chapter we discuss some integration issues in a heterogeneous environment.

One incentive for migrating part or all of an existing healthy DW/BI system is to minimize license costs. SQL Server is reasonably priced, but be certain youve evaluated the true licensing cost savings. In particular, consider whether the component youre evaluating, like Reporting Services or Analysis Services, is going to run on a new server. If so, thats an incremental licensing cost unless you have a site license. Also evaluate, as early as possible, whether youll require SQL Server 2005 Enterprise Edition. This is particularly likely for low latency or high volume systems.

It may very well be true that the licensing cost savings are sufficient to justify a migration project. But its best to correctly evaluate those savings before all the work is done.



Microsoft Data Warehouse Toolkit. With SQL Server 2005 and the Microsoft Business Intelligence Toolset
The MicrosoftВ Data Warehouse Toolkit: With SQL ServerВ 2005 and the MicrosoftВ Business Intelligence Toolset
ISBN: B000YIVXC2
EAN: N/A
Year: 2006
Pages: 125

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net