SQL Server Reporting Services (SSRS) relies on the SQL Server 2005 database, SQL Server 2005 Analysis Services, and online analytical processing (OLAP) to mine project data. Figure 16-1 illustrates the overall architecture of Team System's data warehouse.
Figure 16-1
In Team System, metrics from every tool (testing tools, build, version control, and work items) are automatically pushed into several relational databases (operational stores), and then processed into an analysis cube (called TFSWarehouse). The default reports that ship with Team System tap into the warehouse to quickly generate graphs that provide a near-real-time view of the progress of your project.
Important | SQL Server Analysis Services is an OLAP database system. |
Online analytical processing (OLAP) is a term that was coined by Dr. Edgar F. Codd, considered by many as the inventor of the relational database model. Dr. Codd came up with 12 rules that allow anyone to correctly identify any OLAP system. For example, the OLAP approach is designed to provide summarized data very quickly, is typically used in business or sales applications, and is an important component of business intelligence (BI).
You aren't restricted to use Microsoft tools to access an OLAP cube. Several third-party products can help you interact with them (for example, ProClarity). Team System provides two primary ways of creating cube-based reports: via Microsoft Excel and the Business Intelligence Projects within Team Suite.
Important | All the default Team System reports use the TfsWarehouse OLAP cube as the primary data source. |
Team System relies on multidimensional OLAP (or MOLAP) to manage the data warehouse. By multidimensional, we mean that you can correlate many of the metrics of Team System; for example, you can view the work items by build-by date. Another example is a view by changeset, test results and work items. OLAP uses multidimensional expressions (MDX) query language to manipulate multidimensional databases.
A cube is a database structure that is specifically designed for OLAP, and provides a way for team members to collect and view aggregated data. For example, totals for work items, builds, test results, and much more.
Think of your typical database table; it is flat and represented in two dimensions. One axis represents the fields in your table. The other axis contains the rows of data in your table. An OLAP cube can represent data in three or more dimensions. Figure 16-2 shows an OLAP cube with three dimensions (including builds, changesets, and work items).
Figure 16-2
The Team System data warehouse is composed of four core components:
Operational stores - The operational stores gather live information about each component of Team System such as build, version control and work items. For example, the data source for work items is called TfsWorkItemTracking.
Adapters - The adapters handle a variety of tasks, including managing cube processing cycles, and adapting and customizing the data warehouse as your process templates begin to change (through customization). These adapters tap into the data warehouse API and Web services to perform the tasks in question.
Relational database (fact tables) - The relational database correlates and normalizes the data from each operational store. Because the information is integrated, it is a lot easier to manage and report against. Each of these tables contains measures, dimensions, and details.
OLAP Cube - Aggregates and optimizes the Team Project data to be processed in reports.
Figure 16-3 outlines how Team Project data flows through the warehouse. Your project data are captured within each operational store. Adapters normalize the data into an established format, which makes it easier to interrelate. The data are then pulled on a periodic basis into the OLAP cube.
Figure 16-3
Important | If you ever try to publish your test results to Team Foundation Server, you'll sometimes notice that the IDE complains that a build is not present within your current (or new) Team Project. The reason is that the test data are typically contained in the Team Foundation Build operational store. |
Figure 16-3 shows the structure of Team System's data warehouse.
The operational stores contain raw data about test results, builds, code churn, work items, and so forth. Each store has a unique schema tailored to each Team System feature and your operational stores are optimized to contain live data. The adapters perform a variety of tasks including data synchronization and correlation (for example, correlating builds and work items) and the normalization of data into the relational database. Adapters are also used in the processing of the cube.
When you create custom work items (or other types of customizations), the adapters also perform the task of synchronizing these new custom fields on the fly within the OLAP and relational schemas. All the adapters implement the IWarehouseAdapter interface, and are DLLs stored in the following directory: C:\Program Files\Microsoft Visual Studio 2005 Team Foundation Server\Web Services\Warehouse\bin\Plugins. For example, the build adapter is called Microsoft.TeamFoundation.Build.Adapter.dll.
Fact tables are simply used to aggregate and relate Team Project data. For example, the fact table will tell you that 12 work items are associated to a specific build for a specific Team Project. If you want precise details about each build, then you have to look into the operational tables. Fact tables usually contain numerical data - for example, summaries of all the current work items. Fact tables also help provide a historical view of your data. For example, fact tables contain dates and record counts that can be correlated.
You'll hear Microsoft say repeatedly that the relational database system uses a star schema. What exactly does that mean? Figure 16-4 shows the star relationship between the operational stores and the fact table. The fact tables contain measures, dimensions, and details.
Figure 16-4
The Team System relational database contains a variety of metrics pertaining to work items, changesets, builds, test results, and much more. Fact tables establish relationships between different project metric types - for example, work items and changesets, or builds and work items. Fact tables don't actually contain any details, only raw numerical data.
Important | The default Team System reports access the relational warehouse to get a timestamp. But you can easily report against the relational database (especially if all you want is an up-to-the-minute view of the data, or you just want to query a list - such as a list of work items). |
For example, the Build Changeset fact table only contains a list of unique identifiers for builds and their associated changesets. The table shown in Figure 16-5 was obtained by connecting to the SQL Server 2005 database using the SQL Server Management Studio, expanding the TfsWarehouse table, and generating a database diagram:
Figure 16-5
You'll notice that it contains a field for work items, a field for changesets, and a field for Team Projects. Work item and changesets are the dimensions of that particular fact table. Running a SELECT * query on the table reveals the following information shown in Figure 16-6.
Figure 16-6
As you can see, the unique identifiers (foreign keys) for both of the work items and changesets establish a relationship between the project data types.
There is a lot of terminology associated with OLAP, warehouses, and reporting. Let's get some clearer meaning of the terminology to get a better idea of how the parts all work together.
Measures allow you to tally up, count, and aggregate your data in a Team System report. For example, in the Build perspective, you have measures such as a build count and a build project count. These counts are stored in the relational warehouse within the fact table. Team System's OLAP cube contains seventeen measure groups (as shown in Figure 16-7).
Figure 16-7
A dimension is used within an OLAP cube to provide a summary of measures and contain dimension attributes. The entire column name in dimensions are Reference Names and some of the data rows are designed to be an internal resource within the OLAP cube. If you are planning to use dimensions within relational warehouse reports, you should use SQL views (set up in a separate database) to make the tables easier to work with. You can also set up indexes to boost up your performance. The separate database is important because it separates out the customizations you have made to the actual product, and prevents issues if the database schema is changed for whatever reason (for example, a service pack). Otherwise, all of your reports will break. Figure 16-8 shows all the dimensions available in the OLAP data cube.
Figure 16-8
As you can see, some examples of dimensions include Rank, Person, Date, and so forth. Dimensions are useful for slicing up report data. For example, you may have a list of work items. If you filter them using the Date dimension, you will get a view of all the work items by date values. Dimensions are implemented in the relational database as a set of tables that tie into the fact table.
Dimension members are a named instance of a dimension. For example, you can have dimension members for changesets called Changeset1, Changeset2, and so forth. This also applies to dates and other dimensions. If you try to do a query within a pivot table, you may find that too much information is returned. A filter is a collection of dimension members that limit (or filter) the amount of data provided in a query.
You can also view a dimensions hierarchy. For example, the Date dimension has a hierarchy as shown in Figure 16-9:
Figure 16-9
Details provide titles that are used in list reports in the relational database. They are composed of strings of text that can be used within reports, for example as a descriptive list. In a nutshell, details are used to provide more "detail" in an OLAP report.
The amount of data contained in the Team Foundation OLAP cubes is massive. Trying to create custom reports can be tricky because it can be difficult to filter out only the information that you need. You can use perspectives to filter your data and reduce the complexity and manageability of your reports. Perspectives will make any element of your data cube visible or invisible, based on your needs. Across all cubes, you have a shared set of dimensions across all cube perspectives. They include the following:
Team Project
Area
Iteration
Person
Date
You can use them to tie them against other measure groups. Note that you can only use perspectives if you install Team System with Microsoft SQL Server 2005 Enterprise Edition (x86 or x64) as your data tier.
Important | This section of the chapter is meant to provide a solid overview of the concepts around Team System's data warehouse. There is so much information that it would be quite easy to write an entire book on the topic. If you are craving deeper details, the best source of information bar none is the MSDN documentation. Refer to "Team Foundation Server Data Warehouse" at http://msdn2.microsoft.com/en-us/library/ms244712.aspx. |
The Microsoft Solutions Framework provides a good starting point in trying to understand what the Team System reports are for. To navigate to the list of reports, you can click Index⇨Reports in MSF for Agile Software Development and Index⇨Reports in MSF for CMMI Process Improvement. The documentation provides diagnosis information such as healthy and unhealthy examples of report data. You can then do a project course correction to fix the issues.