We have seen in this chapter how to create our database using the GUI tools, but many readers may prefer to create the database directly from SQL. The SQL to achieve this is shown in the following code, but it assumes that the database has already been created.
The example shown here has been created for a Windows 2000 system; a simple edit of the file specs is all that is required for a different platform. Also note that the file sizes here are very small compared with what you would use in a production environment.
The first step is to connect to the database using a powerful user name:
connect system/manager;
The next step is to create the tablespaces where the data will reside and their associated data files:
-- Create Tablespace for Undo Area CREATE TABLESPACE easy_undo datafile 'D:\EASYDW\easy_undo.f' size 8m reuse autoextend on default storage (initial 16k next 16k pctincrease 0 maxextents unlimited); -- Temporary Tablespace CREATE TEMPORARY TABLESPACE easy_temp tempfile 'D:\EASYDW\easy_temp.f' size 10m reuse autoextend on next 16k ; -- Tablespace to store Materialized Views CREATE TABLESPACE mview datafile 'D:\EASYDW\easy_mview.f' size 6m reuse autoextend on default storage (initial 16k next 16k pctincrease 0 maxextents unlimited); -- Tablespace for Dimensions CREATE TABLESPACE easy_dim datafile 'D:\EASYDW\dimensions.f' size 5m reuse autoextend on default storage (initial 16k next 16k pctincrease 0 maxextents unlimited); -- Tablespace for the INDEXES CREATE TABLESPACE easy_idx datafile 'D:\EASYDW\index.f' size 5m reuse autoextend on default storage (initial 16k next 16k pctincrease 0 maxextents unlimited); -- Default Tablespace CREATE TABLESPACE easydw_default datafile 'D:\EASYDW\easydw_default.f' size 5m reuse autoextend on default storage (initial 16k next 16k pctincrease 0 maxextents unlimited);
Once the tablespaces have been created for the dimensions, we can create the tablespaces for the fact table, "purchases." Since we will be partitioning the data, we must now create the tablespace for each partition. There will be one partition per month for the data and another partition for the indexes. Here we will create only the January partition for the data and index; simply repeat this process for the other partitions.
-- create the 3 month tablespaces for the fact partitions CREATE TABLESPACE purchases_jan2002 datafile 'D:\EASYDW\PURCHASESJAN2002.f' size 5m reuse autoextend on default storage (initial 16k next 16k pctincrease 0 maxextents unlimited); -- create the 3 month tablespaces for the fact indexes CREATE TABLESPACE purchases_jan2002_idx datafile 'D:\EASYDW\PURCHASESJAN2002_IDX.f' size 3M reuse autoextend on default storage (initial 16k next 16k pctincrease 0 maxextents unlimited);
Once the tablespaces have been defined, the EASYDW user can be created, which will create the schema where the data will be stored:
-- create a user called EASYDW -- this will be the schema where the objects will reside connect system/manager; CREATE USER easydw identified by easydw default tablespace easydw_default temporary tablespace temp profile default account unlock; GRANT unlimited tablespace to easydw ; GRANT dba to easydw ; GRANT create session TO easydw;
The DBA privilege has been granted to the users so they can create and manage the tables and indexes.
Hint: | Don't forget to connect as user EASYDW before creating the tables and indexes, or the tables and indexes will be defined in the wrong schema. |
-- now create the database CONNECT easydw/easydw; -- CUSTOMER Dimension CREATE TABLE easydw.customer (customer_id varchar2(10), town varchar2(10), county varchar2(10), postal_code varchar2(10), dob date, country varchar2(20), occupation varchar2(10)) pctfree 0 pctused 99 tablespace easy_dim storage (initial 16k next 16k pctincrease 0) ; CREATE UNIQUE INDEX easydw.customer_pk_index ON customer (customer_id) pctfree 5 tablespace easy_idx storage (initial 16k next 16k pctincrease 0) ; ALTER TABLE customer ADD CONSTRAINT pk_customer PRIMARY KEY (customer_id);
We have defined the constraint here by adding it later. It could also be created within the CREATE TABLE definition, but if this approach is used, you cannot control the name of the index it creates by default to ensure fast validation of the primary key. Therefore, we create the index first, and, when the constraint is defined, it will use this index rather than create its own.
-- PRODUCT Dimension CREATE TABLE easydw.product (product_id varchar2(8), product_name varchar2(30), category varchar2(4), cost_price number (6,2) constraint cost_price_not_null NOT NULL, sell_price number (6,2) constraint sell_price_not_null NOT NULL, weight number (6,2), shipping_charge number (5,2) constraint shipping_charge_not_null NOT NULL, manufacturer varchar2(20), supplier varchar2(10)) pctfree 0 pctused 99 tablespace easy_dim storage (initial 16k next 16k pctincrease 0) ; CREATE UNIQUE INDEX easydw.product_pk_index ON product (product_id) pctfree 5 tablespace easy_idx storage (initial 16k next 16k pctincrease 0) ; ALTER TABLE product ADD CONSTRAINT pk_product PRIMARY KEY (product_id);
Next the Time table is created. In this table, we have included a CHECK constraint to ensure that the column "public_holiday" can take only the values "Y" or "N."
-- TIME Dimension CREATE TABLE easydw.time (time_key date, day number (2,0), month number (2,0), quarter number (2,0), year number (4,0), day_number number (3,0), day_of_the_week varchar2(8), week_number number (2,0), public_holiday varchar2(1) constraint public_holiday CHECK (public_holiday IN ('Y','N'))) pctfree 0 pctused 99 tablespace easy_dim storage (initial 16k next 16k pctincrease 0) ; CREATE UNIQUE INDEX easydw.time_pk_index ON time (time_key) pctfree 5 tablespace easy_idx storage (initial 16k next 16k pctincrease 0) ; ALTER TABLE time ADD CONSTRAINT pk_time PRIMARY KEY (time_key);
For the TODAYS_SPECIAL_OFFERS table, we have defined the primary key to include two columns rather than a single column.
-- TODAYS_SPECIAL_OFFERS Dimension CREATE TABLE easydw.todays_special_offers (product_id varchar2(8), offer_date date, special_price number (6,2), offer_price number (6,2)) pctfree 0 pctused 99 tablespace easy_dim storage (initial 16k next 16k pctincrease 0) ; CREATE UNIQUE INDEX easydw.tso_pk_index ON todays_special_offers (offer_date, product_id) pctfree 5 tablespace easy_idx storage (initial 16k next 16k pctincrease 0) ; ALTER TABLE todays_special_offers ADD CONSTRAINT pk_specials PRIMARY KEY (offer_date,product_id);
Now we come to creating the all-important fact table, which is called PURCHASES. This table definition is quite complex, because it includes FOREIGN KEYS to several tables that are identified by the REFERENCES clause. For the column product_id, two constraints have been defined on the table: a NOT NULL and a foreign-key constraint. Provided each constraint is given a unique name, then it is allowed in a column in a table.
Here we have illustrated how to partition the table, which will be discussed in detail in Chapter 3.
-- Fact Table PURCHASES CREATE TABLE easydw.purchases (product_id varchar2(8) constraint not_null_product_id NOT NULL constraint fk_product_id REFERENCES product(product_id), time_key date constraint not_null_time NOT NULL constraint fk_time REFERENCES time(time_key), customer_id varchar2(10) constraint not_null_customer_id NOT NULL constraint fk_customer_id REFERENCES customer(customer_id), purchase_date date, purchase_time number(4,0), purchase_price number(6,2), shipping_charge number(5,2), today_special_offer varchar2(1) constraint special_offer CHECK (today_special_offer IN ('Y','N'))) PARTITION by RANGE (time_key) ( partition purchases_jan2002 values less than (TO_DATE('01-02-2002', 'DD-MM-YYYY')) pctfree 0 pctused 99 storage (initial 64k next 16k pctincrease 0) tablespace purchases_jan2002 , partition purchases_feb2002 values less than (TO_DATE('01-03-2002', 'DD-MM-YYYY')) pctfree 0 pctused 99 storage (initial 64k next 16k pctincrease 0) tablespace purchases_feb2002 , partition purchases_mar2002 values less than (TO_DATE('01-04-2002', 'DD-MM-YYYY')) pctfree 0 pctused 99 storage (initial 64k next 16k pctincrease 0) tablespace purchases_mar2002);
In this example, we have created the indexes immediately after the table definition. In a real data warehouse, the number of indexes created prior to loading the data is kept to an absolute minimum to ensure that the loading time is as fast as possible. Therefore, indexes exist usually only to check constraints.
-- Now create the indexes -- Partition on the Time Key Local prefixed index CREATE BITMAP INDEX easydw.purchase_time_index ON purchases (time_key) local (partition indexJan2002 tablespace purchases_jan2002_idx, partition indexFeb2002 tablespace purchases_feb2002_idx, partition indexMar2002 tablespace purchases_mar2002_idx); CREATE BITMAP INDEX easydw.purchase_product_index ON purchases (product_id) local pctfree 5 tablespace indx storage (initial 64k next 64k pctincrease 0) ; CREATE INDEX easydw.purchase_customer_index ON purchases (customer_id) local pctfree 5 tablespace indx storage (initial 64k next 64k pctincrease 0) ; CREATE BITMAP INDEX easydw.purchase_special_index ON purchases (today_special_offer) local pctfree 5 tablespace indx storage (initial 64k next 64k pctincrease 0) ;
The next step is to grant some privileges to our user, EASYDW. We will start with the following ones, which will allow us to use Summary Management, and, as we progress through this book, we will discuss other privileges that should be granted to users.
connect system/manager; -- Add privileges GRANT SELECT ANY TABLE TO easydw; GRANT EXECUTE ANY PROCEDURE TO easydw; -- Add privileges for summary management GRANT CREATE ANY DIMENSION TO easydw; GRANT ALTER ANY DIMENSION TO easydw; GRANT DROP ANY DIMENSION TO easydw; GRANT CREATE ANY MATERIALIZED VIEW TO easydw; GRANT ALTER ANY MATERIALIZED VIEW TO easydw; GRANT DROP ANY MATERIALIZED VIEW TO easydw; GRANT QUERY REWRITE TO easydw; GRANT GLOBAL QUERY REWRITE TO easydw;
You will have to repeat these steps for every user, and the privileges granted will, of course, vary by user.
The final step is actually not completed now but after the data is loaded. However, it is included here to remind you not to forget this important step, which is to analyze the table and indexes. These statistics are gathered using the package DBMS_STATS. The statistics gathered by the package are used by the optimizer. Without these statistics, features such as Summary Management will be unavailable. The DBMS_STATS command will be explained further in Chapter 3.
-- Now Analyze the Tables and Indexes EXECUTE dbms_stats.gather_table_stats('EASYDW','CUSTOMER'); EXECUTE dbms_stats.gather_table_stats('EASYDW','TODAYS_SPECIAL_ OFFERS'); EXECUTE dbms_stats.gather_table_stats('EASYDW','product'); EXECUTE dbms_stats.gather_index_stats('EASYDW', 'purchase_ customer_index');
Our database is now complete. We have a basic framework, and in the next chapter we will learn how to enhance our basic database design to include and use the sophisticated features that are available in Oracle 9i.