Counting Rows

Sometimes in your applications, you need to tell the user how many items are in a data table.

If you look at a sampling of e-commerce sites on the Internet, you will see that many of them on the home page inform the reader about how many items the site has available for purchase. This kind of information might be obtained by using an algorithm like this:

The same example written in Visual Basic looks like this:

Dim Counter As Integer = 0 While dataReader.Read     Counter += 1 End While dataReader.Close()

However, using this procedure in a table with many rows will take a very long time and require a great deal of resources. This slowdown occurs because the script must access each row, one at a time, to count them.

There is another way to get the number of rows in the same DataReader object, as shown here:

Dim Counter As Integer = dataReader.RecordsAffected()

However, this procedure uses the same amount of network resources as the previous one by connecting to the database.

As in other database-related tasks, system performance will be optimized by accessing data from within the database itself.

Using T-SQL Functions to Find the Number of Records

Transact-SQL (T-SQL) has special functions for aggregation (summarization of information about the data); these functions are scalar functions (which return only one value) and usually operate over a set of records.

Using COUNT

If you want to count the records in a table, you can use the following script:

SELECT COUNT(*) AS [Count] FROM Production.Product

Tip

Yes! You can use * here (but only here).

Running a Visual Basic Sample Application that Utilizes the Count Function

1.	On the sample cd, open Chapter05\Chapter 05.sln.
2.	Microsoft Visual Studio will open. From the Build menu, select Build Chapter 05.
3.	From the Debug Menu, select Start Debugging. This will open the application called Aggregate functions. It is important to note that you must have the AdventureWorks database attached locally at this point for the following steps to work.
4.	From the Database menu, select Connect.
5.	From the Demonstrations menu, select Counting Records. This will open the Counting Records application. You can now click three Run buttons to perform the three different aggregate functions and compare their runtimes.

These time-elapsed values will not be the same each time the operations are performed. Other factors affect the results, like connection pooling and the size of the script cache. However, you can get an idea about the average length of time each operation takes to complete.

Important

The most important lesson here is: always try to get the information directly from the database instead of making your own loops to get it.

You can use the COUNT function in several ways. The first way simply counts records. In this case, the COUNT function receives an asterisk (*) as argument.

However, sometimes you need to count only those records that do not have a NULL value in a particular field. In that case, you can use the name of the field as an argument.

The following sentence returns two different counts for the same table:

SELECT COUNT(*) AS [Count],        COUNT(Class) AS [Classes with values] FROM Production.Product

Table 5-1. Results
Count	Classes with Values
504	247

Perhaps you would like to know how many different credit card brands have been used by your customers to make their purchases. In this case, you need to count the number of distinct brands of credit cards, not the total number of credit card entries listed in the database. To do this, you need to apply the DISTINCT modifier to the COUNT function.

SELECT COUNT(DISTINCT CardType) AS [Different Credit Cards] FROM Sales.CreditCard

Tip

The COUNT function returns an int datatype, and the COUNT_BIG function performs the same aggregation but returns a bigint datatype. You have to use the CONT_BIG function when operating on tables containing billions of records.

Filtering Your Results

Sometimes, you have to query the database to answer questions like: "How many customers do we have from Seattle?" To answer this question, you can't use a simple COUNT(*) or a COUNT(DISTINCT field_name). You have to filter the counting process.

In fact, the aggregate syntax is simply a modification of the SELECT statement. If you consider this, you can filter the query using the WHERE clause just as in any other SELECT statement.

SELECT COUNT(*) AS [Customers In Seattle] FROM Person.Address WHERE (City = 'Seattle')

However, it would be better if you could give the user information for multiple cities instead of the count for just one city. Of course, you could repeat the query with a parameter in the WHERE clause so the user can specify the city they are interested in. What if you need to the answer the question: "How many customers do we have in each city?"

To answer this question, you have to change your SELECT statement by adding a GROUP BY clause. Look at this script and its result:

SELECT City, COUNT(*) AS [Count of Customers] FROM Person.Address GROUP BY City

Table 5-2. Result
City	Count of Customers
Cheltenham	55
Kingsport	1
Baltimore	1
Reading	31
...	...

It is hard to find a specific city in a long list unless it is sorted. To sort the information for the users, do the following.

Creating a Sorted Totals Summary

From the Start Menu, select All Programs | Microsoft SQL Server 2005 | SQL Server Management Studio. Open a New Query window by clicking the New Query toolbar button. Enter and execute the following script to sort your results using the ORDER BY clause. (This script and all the other COUNT examples from this section are included in the sample files as CountExamplesFromText.sql in the \SqlScripts folder.)

SELECT City, COUNT(*) AS [Count of Customers] FROM Person.Address GROUP BY City ORDER BY City

Perhaps you will be required to filter the data even further by answering the question: "How many customers do we have in each city, considering only those cities with more than 50 customers?" This is a special case because you have to filter the results of the COUNT aggregation after it performs the calculation. However, T-SQL has the solution for you. Click the New Query button, then enter and execute the following script to filter the COUNT aggregate with the HAVING clause:

SELECT City, COUNT(*) AS [Count of Customers] FROM Person.Address GROUP BY City HAVING (COUNT(*) > 50) ORDER BY City

The difference between the WHERE clause and the HAVING clause is over what kind of information the filter is applied. The WHERE clause filters the original set of information from your table. The HAVING clause filters the information after the execution of the aggregation function and normally filters based on the results of the aggregate.

If we look at the estimated execution plan for one script using the WHERE clause and another using the HAVING clause, the difference between the two is very clear. View the estimated execution plan for the two previous queries using the following procedure. The two execution plans are shown in Figure 5-1 and Figure 5-2.

Figure 5-1. Estimated execution plan for a T-SQL sentence using aggregates and the WHERE clause.

Figure 5-2. Estimated execution plan for a T-SQL sentence using aggregates and the HAVING clause.

Viewing the Estimated Execution Plan

Highlight the script whose execution plan you want to view.

Right-click the script, and select Display Estimated Execution Plan from the context menu.

Note

The estimated execution plan shows you graphically how a query will be processed and how much each step influences the final cost of the query. See the SQL Server Books Online topic "How To: Display the Estimated Execution Plan" to learn about the meaning of each part of the graphical representation.