1335-1337

Previous Table of Contents Next

Page 1335

CHAPTER 58

Data Warehouses
and Data Marts

IN THIS CHAPTER

  • An Introduction to Data Warehouses and Data Marts 1336
  • Typical Uses of Data Warehouses 1344
  • Designing a Data Warehouse 1345
  • Star Join Schema 1350
  • Tuning Data Warehouses in Oracle 1351
  • Oracle Data Marts 1351
  • OLAP Engines 1353

Page 1336

Oracle produces a wide range of products. The database management system (DBMS) works on a wide range of platforms and operating systems. The DBMS also is designed to support a wide range of user applications. This chapter discusses one of the more challenging uses of the Oracle DBMS: data warehouses. These are large repositories of information that can bring even the largest multiprocessing system to its knees. These warehouses can eat up amounts of memory that were unthinkable not too many years ago. They also can create the need for very high-speed data storage and transfer systems.

So what are these data warehouses used for? The most common application that I have run across deals with storing large volumes of sales data for analysis. The logic behind this is that sales are what generates revenue and profits for the company, therefore it is relatively easy to justify the cost of the system based on increases in efficiency in purchasing and marketing efforts. However, there are any number of uses of data warehouse technology including storing production data, scientific information or even information related to stock performance. So what is not a data warehouse? Typically systems that perform a lot of user data entry or control operations are referred to as online transaction processing (OLTP) systems. Data warehouses, on the other hand, receive data from other systems (not end user data entry) and support user queries and analysis.

When implemented properly, though, data warehouses can enable you to run your business based on facts in addition to intuition. The term information overload has been used many times to describe the amount of information available in the world today. The real challenge is finding that one key piece of information you need. Most organizations scatter information across a number of production systems. This method is somewhat analogous to storing the information in file folders in everyone's desk drawers. It works, but it can be hard for you to hunt down the right person, get access to that information, and then figure out that person's filing system. Conceptually, the data warehouse is similar to a warehouse that stores supplies or products produced by your organization. It is a centralized location where information is stored in an orderly fashion that enables users to quickly retrieve and ship data where it needs to go.

This chapter provides an overview of data warehouses in the Oracle environment. Its goal is to give you an appreciation of what is involved with a data warehouse and an understanding of a few of the concepts that differentiate an efficient and useful data warehouse from a jumble of data that happens to be stored in a single database. This chapter also discusses the concept of data marts, which is a recent term that describes smaller data warehouses focused on a particular subject area. These smaller data marts can be developed more quickly and at a lower cost. Data marts also require a smaller host computer and network capacity.

An Introduction to Data Warehouses and
Data Marts

More elegant definitions than the one I propose here for data warehouses have been published in the industry. Complexity and elegance tend to confuse me, though. Instead, I try to focus on the simple concepts at the core of a subject and then deal with the complexity after I understand

Page 1337

the basics. So what are the basic concepts at the core of data warehouses? The following list is a good start:

  • A data warehouse is an organized collection of information. Who cares if the information exists if it cannot be found? This information needs to be accessible not only to computer professionals who have a deep understanding of proper database schemas, but also to users who might be developing their own ad-hoc queries. Therefore, you usually have to make the schema accessible to people who might not want to form a SQL statement that joins a dozen tables together based on proper primary and foreign key relationships.
  • A data warehouse provides a central location from which to find key information needed to answer user questions. The information stored in the data warehouse almost always is available online somewhere else in your organization. Typically, the information is stored first in some OLTP system. Users could access this information by writing the appropriate software to avoid having to duplicate the information. There are three reasons why this approach usually is not practical. First, a number of different OLTP systems usually exist, and it would take some time to set up the connections (if it were even possible with some data storage formats) and train users to access these systems. Second, these systems have been optimized for speed of executing transactions as opposed to speed of executing large queries and have tight security restrictions associated with them. And third, you do not want a large query to bog down your day-to-day operations as it eats up CPU time.
  • A data warehouse needs to contain all the information that commonly would be needed to answer business questions. If you make it too hard for people to find the information, they probably will not put forth the effort and instead will rely on less accurate information sources, such as printed reports and the opinions of coworkers.
  • A data warehouse typically needs to be extended to answer ad-hoc questions that come up during the operation of a business. One of the tasks that kills most information systems (IS) organizations is keeping up with the never-ending series of reports and report modifications from the user community. Businesses continue to change; therefore, their information needs must change with them. In a way, that is what keeps all programmers and systems professionals employed. Keeping a dedicated emergency reports employee on the IS staff can be very expensive, however. Therefore, easy-to-use report-generation tools have been developed to enable end users who are at least a little bit computer savvy to develop reports. These tools enable you to answer ad-hoc reporting needs as they come up, which further enhances the value of the data warehouse.
  • Users need to be able to access information in the data warehouse in an easy and relatively quick fashion. From reading the earlier chapters in this book, you should realize that you can tune Oracle to perform at peak efficiency while using sophisticated data-search algorithms, such as those derived from the cost-based optimizer. That's a fine start, but it takes more than just technological sophistication and tuning
Previous Table of Contents Next


Oracle Unleashed
Oracle Development Unleashed (3rd Edition)
ISBN: 0672315750
EAN: 2147483647
Year: 1997
Pages: 391

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net