Data Source Views (DSV) enable you to create a logical view of only the tables involved in your data warehouse design; in this way system tables and other tables not pertinent to your efforts are excluded from the virtual workspace. In other words, you don't have to look at what you're never going to use directly anyway. DSVs are a powerful tool; in fact, you have the power to create DSVs that contain tables from multiple data sources which you learn later in this chapter. You need to create a DSV in your Analysis Services data since cubes and dimensions are created from the DSV rather than directly from the data source object. The DSV wizard retrieves the schema information including relationships so that joins between tables are stored in the DSV. These relationships help cube and dimension wizards to identify fact and dimension tables as well as hierarchies. If the relationships do not exist in the data source we recommend you to create the right relationships within the DSV. Defining the relationships between the tables in the DSV helps you to get a better overview of your data warehouse in the DSV. When you take the time to create a DSV it ultimately pays for itself in terms of speeding up the design of your data warehouse.
Back in Chapter 2 you used the DSV wizard to create a view on the Sales fact table in Adventure Works DW. The DSV wizard is a great way to get a jump-start on DSV creation. Then, once the DSV is created you can perform several operations on it such as adding or removing tables, specifying primary keys for tables, and establishing relationships between tables. These operations are accomplished within the DSV designer. You learn more about the DSV and operations within the DSV in the following sections.
The DSV Designer contains three panes, as shown in Figure 4-5. The center pane contains a graphical view of all the tables in the DSV; its primary keys and the relationships between tables are represented by lines with an arrow at the end. The top-left pane is called the Diagram Organizer, which is helpful in creating and saving concise views within large DSVs. When a DSV contains more than 20 tables it is difficult to visualize the complete DSV within the graphical view. When there are a large number of tables, typically only a subset of the tables is related. Hence you will likely perform operations only on a subset of these tables at any given time, and the Diagram Organizer is a handy way to create several diagrams that include just such subsets of relevant tables. Note that operations done on the tables within this diagram are reflected real-time in the entire DSV. By default you get one diagram that is called All Tables.
Figure 4-5 shows part of the default diagram All Tables that is created at the completion of the DSV wizard. The lower left pane is called Tables and is used to show the tree view of all the tables of the DSV along with their relationship with other tables. Figure 4-6 shows the Tables pane with detailed information of the DimCurrency table; you can see the primary key of the DimCurrency table and the CurrencyKey, which is distinguished by a key icon. In addition, there is a folder that indicates all the relationships between the DimCurrency table and other tables in the DSV. If you expand the Relationships folder (as shown in Figure 4-6) you will see that the DimCurrency table joins to the FactInternetSales and FactResellerSales through the CurrencyKey — where the join is specified within parentheses.
It is most common to initially create DSVs using the DSV wizard. Also common is the desire to modify what the wizard comes up with to maximize usefulness of the view; what you get initially is usually good, but subject to improvements. The DSV designer is what provides you with the capability to easily modify the existing tables in the DSV. To modify the existing tables, right-click the diagram view pane and select Add/Remove Tables, as shown in Figure 4-7.
This invokes the Add/Remove Tables dialog shown in Figure 4-8. Using this dialog you can add additional tables to the DSV by moving tables from the Available objects to the Included objects or remove existing tables by moving them from Included objects to Available objects. If you want to remove a table from the DSV, you can do so either in the DSV Designer or in the table view by using the following steps:
Select the table to be deleted.
Right-click the table and click Delete.
Click OK in the confirmation dialog that appears.
It is likely that you will encounter underlying databases without the primary key to foreign key relationships that you will need in place for preparation of data for analysis; that is, building dimensions and cubes. The DSV wizard extracts primary keys and the relationships specified in the underlying relational database to form primary keys and the relationships represented in the DSV. As mentioned, perhaps some of the OLTP systems you use do not have the primary keys and relationships specified in the relevant tables — or when you design your data warehouse you might want to change these to suit your data warehouse design. The DSV designer provides you with the functionality to specify primary keys for the tables that do not have them already, and in this way you can effectively modify or add new relationships between the tables in the DSV.
To specify the primary key(s) for a table, you need to do the following in the DSV Designer:
Select the column in the table that you want to specify as a primary key. If there is more than one column that forms the primary key, you can do multiple selections by holding the Ctrl key. If the tables have auto-increment setup for the key column in the database then you will not be able to change the primary key(s) of the tables.
Right-click and select Set Logical Primary Key.
When there is a relationship between two tables F and D, you typically have columns A and B in tables F and D that are involved in the join. Typically column B is the primary key in table D, but not always. Column A is referred to as the foreign key. An example would be a Sales fact table that has a product id as a column that joins with the product id in the dimension table Products. In order to specify relationships between tables in the DSV, you need to use the following steps:
Select the column A in table F that is involved in the join to another table.
With column A selected, drag and drop it to column B in table D.
This forms a relationship between tables F and D and a line will be created between these two tables with an arrow pointing toward table D. If you double-click this line you will see details on the relationship — tables involved in the relationship and columns used for the join. Figure 4-8 shows the relationship between FactResellersSales and DimReseller tables. You can modify the relationship using this edit relationship dialog by either changing the columns involved in the join or by adding additional columns that are involved in the join.
You can also create a new relationship by right-clicking a table and selecting New Relationship. You will be asked to specify the relationship in the Create Relationship dialog which is similar to the Edit Relationship dialog shown in Figure 4-9 where you need to choose the columns in the source and destination tables are involved in the join.
All graphical operations such as drag and drop and specifying primary keys that are accomplished in the diagram view can also be accomplished in the table view.
While modeling your data warehouse often you will want to select a few columns from tables, or restrict the fact table rows based on some specific criteria. Or you might want to merge columns from several tables into a single table. All these operations can be done by creating views in the relational database. Analysis Services 2005 provides the functionality of performing all these particular operations within the DSV using Named Query. You can invoke Named Query editor by right-clicking a table and then selecting Replace TableWith New Named Query…, as shown in Figure 4-10. If you want to add a specific table twice in your DSV or add some columns of a new table, you can launch the query designer by right-clicking the DSV Designer and selecting "With New Named Query…."
Named Query manifests itself as a query designer that helps you to build custom queries to create a view. The Create Named Query designer dialog is invoked when "With New Named Query…" is selected as shown in Figure 4-11. In this dialog you can add tables from the data source; select specific columns from the tables and apply restrictions or filters using the graphical interface. A SQL query is created based on your selections and is displayed in a pane. If you're a SQL wizard, you can forego filling out the dialog elements and paste a valid SQL query in the query pane. We recommend that you then execute the query to make sure the query is correct. The results from the underlying relational database will then be visible in a new pane beneath the query pane. Click OK once you have formed and validated your query. The table is now replaced with results from the query you have specified in the DSV with selected columns.
In certain instances you might want to create a new column in the table. An example of this would be to create a Full Name of an Employee from the first name, middle initial, and last name. One way to accomplish this task would be to replace the table with a named query and write the appropriate SQL to create this additional column. However, Analysis Services 2005 provides a simpler way to do the same operation. Right-click on the Employee table and select New Named Calculation. This action invokes the Edit Named Calculation dialog shown in Figure 4-12. To add a column called Full Name to the Employee table you just need to combine the first name, middle name, and last name. You can type the expression for this in the Expression pane as shown in Figure 4-12 and then click the OK button.
A new column is added to the Employee table as shown in Figure 4-13. The data type of this calculated column will be determined based on the data types of the actual columns involved in the calculation or data used within the expression. If the expression results in a number, the data type for this column will be an integer. In the preceding example the data type of this column is a string.
The DSV maintains the calculated column of a table as a computed column in the metadata; it does not write it out to the underlying tables. When you want to view the data of this table (which you see later in this chapter), the expression must be added to the SQL query so that you can see the data of this computed column.