Prototyping, Benchmarking, and Testing | Inside Microsoft SQL Server 7.0 (Mps)

As you make changes to your application design or hardware configuration, you should measure the effects of these changes. A simple benchmark test to measure differences is a tremendous asset. The benchmark system should correlate well with the expected performance of the real system, but it should be relatively easy to run. If you have a development "acceptance test suite" that you run before checking in any significant changes, you should add the benchmark to that test suite.

TIP
You should measure performance with at least a proxy test; otherwise , you're setting yourself up for failure. Optimism without data to back it up is usually misguided.

Your benchmark doesn't have to be sophisticated initially. You can first create your database and populate it with a nontrivial amount of datathousands of rows at a minimum. The data can be randomly generated, although the more representative you can make the data the better. Ideally, you should use data from the existing system, if there is one. For example, if a particular part represents 80 percent of your orders, you shouldn't make all your test data randomly dispersed. Any differences in the selectivity of indexes between your real data and the test data will probably cause significant differences in the execution plans you choose. You should also be sure that you have data in related tables if you use Foreign Key constraints. As we explained earlier in this book, the enforcement of Foreign Key constraints requires that those related tables (either referenced or referencing) be accessed if you are modifying data in a column that is participating in the constraint. So the execution plan is sometimes considerably more complicated due to the constraints than might be apparent, and a plethora of constraints can result in a system that has no simple operations.

As a rule of thumb, you should start with at least enough data so the difference between selecting a single row based on its primary key by using an index is dramatically faster than selecting such a row using a table scan. (This assumes that the table in production will be large enough to reflect that difference.) Remember that the system will perform much differently depending on whether I/O operations are physical or from the cache. So don't base your conclusions on a system that is getting high cache-hit ratios unless you have enough data to be confident that this behavior also will be true for your production system. In addition, keep in mind that running the same test multiple times might yield increasingly short response times. If you are testing on a dedicated machine, no other processes will be using SQL Server's memory, and the data you read in from the disk the first time will already be in cache for subsequent tests.

TIP
If you want to run your tests repeatedly under the same conditions, use the command DBCC DROPCLEANBUFFERS after each test run to remove all data from memory.

Early in the development process, you should identify areas of lock contention between transactions and any specific queries or transactions that take a long time to run. The SQL Server Profiler, discussed in Chapter 15, can be a wonderful tool for tracking down your long-running queries. And if table scans will be a drain on the production system, you should have enough data early on so that the drain is apparent when you scan. If you can run with several thousand rows of data without lock contention problems and with good response time on queries, you're in a good position to proceed with a successful development cycle. Of course, you must continue to monitor and make adjustments as you ramp up to the actual system and add more realistic amounts of data and simultaneous users. And, of course, your system test should take place on an ongoing basis before you deploy your application. It's not a one-time thing that you do the night before you go live with a new system.

TIP
Before you roll out your production system, you should be able to conduct system tests with the same volumes of data and usage that the real system will have when it goes live. Crossing your fingers and hoping is not good enough. Two tools in the Microsoft BackOffice Resource Kit might prove useful for this purpose. The filltabl utility populates a specified table with any number of rows of random data. A load simulator ( sqlls ) lets you run one or more SQL scripts with up to 64 operating system threads executing each script. These and other tools from the BackOffice Resource Kit are written for the previous version of SQL Server, but they will work with SQL Server 7.

Obviously, if your smallish prototype is exhibiting lock contention problems or the queries do not perform well within your desired goals, it's unlikely that your real system will perform as desired. Run the stored procedures that constitute your critical transactions. You can use a simple tool like Osql.exe to dispatch them. First run each query or transaction alonetime it in isolation and check the execution plans ( SET SHOWPLAN_TEXT ON ). Then run multiple sessions to simulate multiple users, either by manually firing off multiple OSQL commands or by using the SQL Load Simulator mentioned above. (A bit later, we'll look at how to analyze and improve a slow-running query.)

Also, based on your CRUD chart analysis (or on any other analysis of critical transactions), you should identify tasks that will run at the same time as your critical transactions. Add these tasks to your testing routine to determine whether they will lead to contention when they run simultaneously with your critical transactions. For example, suppose proc_take_orders is your critical transaction. When it runs, some reports and customer status inquiries will also run. You should run some mixture of these types of processes when you analyze proc_take_orders . This will help you identify potential lock contention issues or other resource issues, such as high CPU usage or low cache.

You might also want to use the SQL Server benchmark kit, which is available on the companion CD as well as at the Microsoft Web site (http://www.microsoft.com/sql). Although the transaction in the kit will probably not be directly applicable to your situation, the benchmark framework provides the infrastructure you need to launch and coordinate multiple client tasks simultaneously to time their work for throughput and response time. The kit also provides some sample code for populating the test database. Other custom benchmarking tools are available from other companies (such as Dynameasure from Bluecurve, Inc. at http://www. bluecurve .com.)

A Rant on Development Methodologies

How much you spec, how you spec, when you start coding, and the role of prototyping are all matters on which there is no consensus; there is no one right answer. What's right for one team or project might be wrong for another. You must remember that the point of development is to ship products or deploy applications. Your ultimate purpose is not to produce specs and design documents.

The spec exists to clearly articulate and document how a module or system should work. It ensures that the developer thinks through the approach before writing code. It is also vitally important to others who come in later and are new to the system. While writing the design document, the developer might have to write some quick prototype code to help think through an issue. Or the developer should at least write some pseudocode and include it as part of the document.

Any development task that will take more than a couple of days to implement deserves a simple design document. Ideally, such a document should be at most 15 pages, and often about 3 succinct pages are ideal for even a complex component. The best design documents can be read in one sitting, and the reader should come away understanding clearly how the system will be built. The document should be reviewed by other developers before coding begins. This is best done in a positive, informal atmosphere in which others can contribute ideas. And, of course, in a healthy environment, developers are continually bouncing ideas off their peers.

The document should assume that the reader already knows why a module or component is needed. The document provides the how . Too many specs have 40 pages describing the market need for something or why a potential product would be really cool, and then they have 1 page that says, in essence, "We'll figure it out when we code it up."

No one can write a perfect spec up front. Prototyping the spec in an iterative fashion works far better. Areas that are clear don't need a prototype; use prototypes for areas fraught with risk, for your critical transactions, and for critical areas of the system in which multiple approaches are possible or reasonable. For the tricky stuff that you can't describe or predict a best performance, prototypes provide enormous benefits.

A useful prototype can be "quick and dirty." If you don't worry about all the failure cases and every conceivable state in which something can exist, useful prototype code can often be produced in 5 percent of the time it takes to create production-caliber code (in which you must worry about those things). However, the production code might be 10 times better because you have recognized and corrected deficiencies early on. You'll either junk the prototype code or use it for the skeleton of the real system. Prototyping lets you learn and prove or disprove ideas, and then you can update the spec based on what you learn. And if you're selling your ideas to management or customers, your prototype can be useful to demonstrate proof of your concept.

As we mentioned before, every nontrivial development task deserves a brief design document. You need enough documentation to lay out the framework and system architecture and to detail how pieces fit together. But at the detailed levels, it's better to simply comment the source code liberally while you write and modify it. External design documentation rarely gets updated in a timely manner, so it quickly becomes useless. No one has much use for worthless comments (like adding incrementing i before the statement i++ in C). Rather, comments should describe the approach and intent of each routine, the expected inputs and outputs, and any side effects that might occur by changing the code. Explain operations when their purpose is not obvious. Don't assume that your readers will immediately grasp all the subtleties. And when you change something after the fact, add a comment about what you changed, when you changed it, and why. A module should be commented well enough so that testers or technical writers can gain a good understanding of what is going on just from reading the comments, even if they are not skilled programmers.