Dimensional Data Warehousing with MySQL. A Tutorial
Authors: Darmawikarta J.
Published year: 2004
Pages: 6-9/149
Buy this book on amazon.com >>

Who This Book Is For

Data warehousing is applicable to all types of organizations and businesses, from government agencies and non-profit organizations to schools , from manufacturers to retail stores, from financial institutions to health services, from brick-and-mortar to dotcom businesses.

This book is primarily for data warehouse developers. However, IT managers and other IT professionals, especially those interested in MIS (management reporting) and DSS (decision support application), will find this book useful too. In general, this book is for those responsible for or involved in preparing data for analytical applications and/or delivering information as printed or online reports .

This book is also for data warehousing beginners . It will, directly and instantly, be beneficial to those who are currently planning to develop their first data warehouse.

Teachers and students can use this book along with their text books to confirm their understanding of the concept and theory of data warehousing. Most chapters can be tailored and used for lab exercises.



Prerequisite Skills

This book is not for IT novices. To benefit from this book, you must have some system development experience. However, no prior data warehousing experience is necessary.

Those who will try the examples must have RDBMS and SQL hands-on skills.



What You Will Get from This Book

You will use only one example, a data warehouse that initially stores sales data, and by doing so, sharpen your data warehousing knowledge and practical skills. The example is a scaled down and simplified version of a real world data warehouse found in many types of businesses.

You will develop the example sales data warehouse on a MySQL database, step by step by using the techniques introduced in this book. These are techniques for resolving problems most commonly encountered in data warehouse development. By the time you finish reading this book and doing all the exercises, you will have acquired a working experience and be ready to get involved in your first real-world data warehousing project.



Chapters Overview

This book consists of 25 chapters and one appendix. The chapters are organized into four parts . Part I covers data warehousing basics. Part II explains the moving of data from the source to the data warehouse database. Part III talks about the techniques to handle growth. Part IV deals with some advanced dimensional techniques. The following section provides an overview of each chapter.

Part I: Fundamentals

Part I, covering the fundamentals of dimensional data warehouse, has four chapters.

Chapter 1, “Basic Components ” introduces the star schema (a database schema that has a fact table surrounded by dimension tables) and explains the basic components of the schema.

Chapter 2, “Dimension History” discusses the use of surrogate keys for maintaining dimension history.

Chapter 3, “Measure Additivity” covers one of the most fundamental characteristics of a dimensional data warehouse, namely the additivity of measurement stored in the data warehouse fact tables.

Chapter 4, “Dimensional Queries” introduces a type of SQL query that is most suitably applied to a star schema. A dimensional query is a way to prove the two most fundamental design points of a dimensional data warehouse: simplicity and performance.

Part II: Extract, Transform, and Load

All five chapters in Part II deal with data population and the fact and dimension tables.

Chapter 5, “Source Extraction” explains the various types of data extraction.

Chapter 6, “Populating the Date Dimension” covers the three most common techniques for populating the date dimension.

Chapter 7 , “Initial Population” and Chapter 8, “Regular Population” deal with the two types of population techniques: initial and regular.

Chapter 9, “Regular Population Scheduling” concludes Part II by providing step-by-step instructions to schedule regular population using Windows Task Manager.

Part III: Growth

Part III presents the various techniques for resolving problems associated with the growth of a successful dimensional data warehouse. There are ten chapters in Part III.

Chapter 10, “Adding Columns” deals with the techniques for adding columns to tables in the existing dimensional data warehouse.

Chapter 11, “On-Demand Population” covers the on-demand population technique.

Chapter 12, “Subset Dimensions” explains the techniques for helping users with subset dimensions.

Chapter 13, “Dimension Role Playing” is about using a dimension more than once in a fact table.

Chapter 14, “Snapshots” helps you deliver fast performance queries for users that need to work out summarized data.

Chapter 15, “Dimension Hierarchies” and Chapter 16, “Multipath and Ragged Hierarchies” are about single and multipath hierarchical techniques, respectively. These techniques help users with grouping and drilling analysis.

Chapter 17, “Degenerate Dimensions” shows you how to reduce the complexity of a data warehouse schema by applying the dimension degeneration technique.

Chapter 18, “Junk Dimensions” is about the junk dimension, a technique for selecting seemingly unrelated analytical pieces of data often required by users and organizing them dimensionally.

Chapter 19, “Multi-Star Schemas” shows you how to add more starts to your schema.

Part IV: Advanced Techniques

There are six chapters in this part.

Chapter 20, “Non-Straight Sources” explains how to deal with data sources whose structures do not map directly to the target tables in the data warehouse.

Chapter 21, “Factless Facts” helps you build an analytical aid, a factless fact table, for your users on data that does not have measure from its source.

Chapter 22, “Late Arrival Facts” covers a technique that you use when source data, particularly facts, does not come all together at its scheduled population time.

Chapter 23, “External Data Sources and Dimension Consolidation” covers two topics: handling external data sources and the technique for consolidating scattered attributes in multiple dimensions into one dimension.

Chapter 24, “ Accumulated Measures” discusses two related topics: computed measures and non-additivity of the accumulated measures.

Chapter 25, “Band Dimensions” explains a technique that helps users with their needs to analyze data on continuously valued attributes.

Appendix

Appendix A, “Flat File Data Sources” presents instructions on how to use the flat file data sources used in the book examples.


Dimensional Data Warehousing with MySQL. A Tutorial
Authors: Darmawikarta J.
Published year: 2004
Pages: 6-9/149
Buy this book on amazon.com >>

Similar books on Amazon