Unfortunately, testing, like reconciliation, is often done very poorly on BI projects, if at all. That is not acceptable ”and neither is the excuse that "it can be fixed in the next release." The next release will be even larger and more complicated, and it will require more time to test. In other words, if it takes too long to test now, it will take even longer to test later, which usually means that adequate testing is never done. Rushing into deployment at the expense of testing is inexcusable, especially if it is done to meet some artificial deadline. The same types of testing that apply to operational systems also apply to BI applications (Figure 11.5). The following sections briefly describe each type. Figure 11.5. Types of Testing
Unit TestingUnit testing refers to the testing of discrete program modules and scripts, not to testing the flow or the links between the programs. Every developer must test his or her program modules and scripts individually (if XP techniques are utilized, this is done in pairs). There are three components to unit testing: compilation, functionality, and edits.
If an ETL tool is used, unit testing still applies to the individual ETL tool modules. In that case, you are testing the validity of your ETL instructions, that is, the ETL technical meta data. It is important to note that if an ETL tool does not meet the standards of your organization, you will have to supplement the tool with your own custom-written code. In that case, unit testing will be a combination of testing the ETL technical meta data, testing the custom-written code, and testing the "handshake" between them. Integration TestingIntegration testing, also known as system testing, is the first complete ETL process run. This includes all three sets of ETL processes: the initial load, the historical load, and the incremental load. Just because all program modules pass their individual unit tests, it cannot be assumed that the entire ETL process will run smoothly. The interactions and flow of all programs, as specified in the ETL process flow diagram, must be observed and validated .
If an ETL tool is used, the entire ETL process must still be tested from beginning to end, except that you are running the ETL tool processes instead of custom-written programs. The business representative and the subject matter expert should be involved in integration testing. They are the first to know whether a particular run was successful or not. This is also an excellent opportunity for some advanced training for them, and it will go a long way toward allaying any future suspicions about the accuracy and quality of the data in the BI target databases. Integration testing, like unit testing, requires many test runs to remove all the defects and to tune the flow. Every time the actual test results do not equal the expected test results, the program producing the error must be corrected, and all programs must be rerun. But unlike unit testing, integration testing is far too complicated to be performed without a formal test plan, which should include a description of the test cases and the sequence in which the programs should be executed. Regression TestingThe most complicated and most time-consuming of all types of testing is regression testing. It is similar to integration testing, but this time the programs that are being tested are not new. Since the BI decision-support environment is evolving and new BI applications are added to the ETL process continuously, the project team will have to perform extensive regression testing on all releases except the first one. With every new BI release the ETL process has to be modified (enhanced) to extract more data from the operational systems for the new BI application. The new BI application may have a separate set of data access and analysis programs, but the ETL process is shared. The main goal of regression testing is to make sure that the modifications to existing ETL programs did not inadvertently produce some errors that did not exist before. If new programs were added to the ETL process flow, the new interactions between programs must be tested, and any affected subsequent old programs must be retested. If the ETL process flow had to be enhanced or redesigned to increase efficiency, the entire ETL process has to be retested.
Performance TestingPerformance testing, also known as stress testing, is performed to predict system behavior and performance. This is very similar to traditional performance testing on operational systems, but BI performance testing is more complicated because of the enormous volumes of data in the BI target databases. Since most organizations do not have more than three to four hours left in their nightly batch windows , the goal is to find out how much data can be processed during that time and how many nights it will take to complete the entire ETL process. Unlike integration testing or regression testing, performance testing does not have to be performed on every program module. Performance testing could be limited to only the most critical program modules with the highest volumes of data and the longest runtimes . In addition to running physical tests, you can use stress test simulation tools. These simulation tools allow you to describe the production platform, including other programs running on the same server and sharing the same space. Based on your input, the tools calculate and project estimated performance numbers . It is highly recommended to run a simulation prior to actual performance testing with real data. Quality Assurance TestingMost large organizations have strict procedures for moving an application into production. These procedures usually include QA testing, and in most cases a separate QA environment is established for such testing. Operations staff direct the developers in moving databases and programs into the QA environment. Then all operating instructions and scheduled jobs have to be turned over to the operations staff for testing. They will go through a simulated production run before allowing the application components to transfer to the production environment. Acceptance TestingAcceptance testing can be performed in one of two ways, depending on how testing as a whole is set up. Ideally, there should be a separate acceptance test environment, which could also be used for regression testing of future releases. With a separate acceptance test environment, QA testing and acceptance testing could be done at the same time. However, it may not be feasible or justifiable to maintain a separate acceptance test environment. A simpler alternative is to perform acceptance testing after QA testing in the same QA environment. If the business representative actively participated during integration testing or regression testing, there should be very few surprises during acceptance testing. In fact, if the business representative is comfortable with the integration or regression test results, and barring any unforeseen problems detected during QA testing, separate acceptance testing may not be necessary at all. However, if a traditional approach was followed in which the business representative was not involved in any design or testing activities except for occasional reviews, acceptance testing is the most important test of all. Some project teams limit acceptance testing to the access and analysis portion of the BI application and exclude the business representative from ETL testing. That is a big mistake. When business analysts and business managers complain about incorrect data in the BI target databases, the reason may not be that the report programs do not work properly but that the ETL process is faulty. Therefore, testing how to get the data into the BI target databases correctly is more important than testing how to get it out correctly because an error in a report program is a lot easier to find and correct than an error in the ETL process. Moreover, since the business representative is involved in source data analysis and in providing the business rules for data cleansing, it is only logical that he or she should test the ETL process that implements those rules. The business representative should ask some of the following questions.
|