Filtering and Formatting Data


  • Filter data by using Transact -SQL.

Formatting of data can be performed in T-SQL through a variety of statements, functions, and command options where the filtering of data is a much more basic concept. This unit is therefore separated into two basic divisions to cover what this objective entails. Formatting involves making the data look better and more meaningful to the end user . Filtering, on the other hand, is for determining the data to be selected based upon conditional requirements.

If you are going to understand this very broad objective, the first thing to do is to ask the most basic of questions: What is a comparison? A comparison involves taking two or more values, presenting them against each other and through a defined logical statement determining the winners and losers from the matchups. Essentially, the comparison comes down to a Boolean expression that has three possible outcomes . If two values are compared, the Boolean result is either positive, negative, or equal (greater than, less than, or equal to).

Filtering Data

Operators play an important part in determining the content of any conditional operations. An operator is a symbol specifying an action that is performed on one or more expressions. In SQL Server these operators are divided into a few elementary categories:

  • Comparison operators

  • Arithmetic operators

  • Logical operators

  • Assignment operators

  • String concatenation operators

  • Bitwise operators

  • Unary operators

Each operator category represents a piece of functionality that provides many operators. In some instances there are several dozen operator choices.

Comparison Operators

Comparison operators can be used in conjunction with the WHERE and HAVING clauses to ensure filtration in a number of ways. Comparison operators can be used with character, numeric, or date data expressions. Table 4.2 lists all comparison operators that are valid in SQL Server.

Table 4.2. Comparison Operators

Operator

Meaning

<

Less then

>

Greater than

=

Equal to

  <=  

Less than or equal to

  >=  

Greater than or equal to

  !=  

Not equal to

  <>  

Not equal to

  !<  

Not less than

  !>  

Not greater than

The following is an example of using comparison operators:

 SELECT * FROM Pubs.dbo.Publishers WHERE Pub_id>9000 

Step by Step 4.3 shows you how to use the comparison operators.

STEP BY STEP

4.3 Basic Comparisons

  1. If it is not already open , load the SQL Query Analyzer. Supply the logon connection information if requested and select the Northwind database.

  2. Enter a standard general query to test by entering and executing the following:

     SELECT OrderID, CustomerID, OrderDate FROM Orders 
  3. Change and execute the query so that customers are shown for only OrderID greater than 11,000:

     SELECT OrderID, CustomerID, OrderDate FROM Orders  WHERE OrderID > 11000 
  4. Change and execute the query so that customers are shown for only OrderID less than 11,000:

     SELECT OrderID, CustomerID, OrderDate FROM Orders  WHERE OrderID < 11000 
  5. Change the query so that customers are shown for only OrderID greater than 11,000 yet less than 11,010:

     SELECT OrderID, CustomerID, OrderDate FROM Orders     WHERE OrderID > 11000 AND OrderID < 11010 

You are not restricted to the use of only numeric data with comparison operators. You can also set up a comparison between character data and date data.

Examine the following query:

 SELECT * from Authors WHERE State='CA' 

This returns all authors who live in the state of California.

Whereas comparison operators evaluate the differences between two or more values, arithmetic operators aid in processing mathematical functions against values.

Arithmetic Operators

Arithmetic operators perform mathematical operations on two expressions of any of the data types of the numeric data type category. For more information about data type categories, see Transact-SQL Syntax Conventions in SQL Server Books OnLine. A complete listing of the available arithmetic operators and their uses is shown in Table 4.3.

Table 4.3. Arithmetic Operators

Operator

Meaning

+ (Add)

Addition

(Subtract)

Subtraction

* (Multiply)

Multiplication

/ (Divide)

Division

% (Modulo)

Returns the integer remainder of a division

The plus (+) and minus () can also be used to perform arithmetic operations on date values.

Arithmetic operators help perform mathematical operations and have other similar uses in performing statistical operations. Logical operators act against values in a manner that provides for the implementation of multiple conditions.

Logical Operators

Logical operators, also known as Boolean operators, can also be used within the queries you execute. The three logical operators are AND , OR , and NOT . Their meanings are pretty straightforward: AND adds an additional filter condition to the one specified and returns TRUE only when both or all conditions specified are met. The OR logical operator adds another filter condition to the existing condition as well, but returns TRUE when either condition is met. NOT tells SQL to get everything in the query except for what it has specified. Here is an example of using AND , NOT , and OR :

 SELECT title_id,type, advance FROM titles WHERE (type='business' or type= 'psychology') AND NOT advance>5500 

This query selects only records that are business-type or records that are psychology-type, both having an advance value not greater then 5500.

Note the parentheses around the type condition. They indicate that you want the AND NOT condition to apply to both types. If you wrote:

 WHERE type = 'business' or type= 'psychology' AND NOT advance>5500 

You would get only psychology books where the advance is greater than 5,500. A contrasting use of the logical operators is shown in Figure 4.2.

Figure 4.2. Comparison and contrast of logical operators.

graphics/04fig02.jpg

Brackets can significantly alter the results of operations that use logical operators. Test all conditions and alter the precedence by using brackets. According to the order of operations, bracketed comparisons are performed first. The logical operators are evaluated in the order: () first, then NOT , AND , and finally OR .

BETWEEN and IN

Ranges can also be specified when using the WHERE clause with the help of the BETWEEN keyword. Simply put, BETWEEN provides a range of values within which the data should lie; otherwise , the data does not meet the condition. BETWEEN is inclusive, meaning that the range includes the lower value specified and the upper value specified. The syntax in BETWEEN must be loval and hival , it does not work as expected if the hival is placed first in the expression.

Here is an example of how BETWEEN is used:

 SELECT title_id, ytd_sales FROM titles WHERE ytd_sales BETWEEN 1000 AND 11999 

Remember, the values are inclusive.

You can also incorporate something known as a list when using the WHERE clause. Essentially, a list specifies the exact values a column may or may not take. If the record does not contain the value for the column specified in the IN list, it is not selected. IN determines whether a given value matches a set of values listed. For example:

 SELECT state FROM Authors WHERE state IN ('TN', 'UT') 

This example limits the values of State to only UT and TN . Authors that live in the states mentioned in the IN list will be the only ones listed. The next section talks about using the LIKE clause to perform wildcard searches, a further add-on to the WHERE clause. The use of IN is similar to that of EXISTS .

When a subquery is introduced with the keyword EXISTS , it functions as an existence test. The WHERE clause of the outer query tests for the existence of rows returned by the subquery. The subquery does not actually produce any data; it returns a value of TRUE or FALSE . The following is an example of the use of EXISTS to find only those publishers in the United States.

 SELECT DISTINCT pub_name FROM Pubs.dbo.Publishers WHERE EXISTS    (SELECT *    FROM titles    WHERE Pub_id = Publishers.Pub_id    AND Country = 'USA') 

IN THE FIELD: BETWEEN versus IN

If you had the choice of using either the BETWEEN or IN keyword, the BETWEEN keyword should usually be used. If the range includes lots of data with different values, BETWEEN is a lot more efficient.

Using the LIKE Clause

You can retrieve rows that are based on portions of character strings by using the LIKE predicate. The LIKE predicate determines whether a given character string matches a specified pattern. The types of data a LIKE statement can work with are char , varchar , nvarchar , nchar , datetime , smalldatetime , and text . A pattern specified in the LIKE predicate can include regular characters and wildcard characters . An example of the output generated using LIKE is given in Figure 4.3.

Figure 4.3. Using the LIKE predicate.

graphics/04fig03.jpg

During pattern matching, regular characters must exactly match the characters specified in the column value. Wildcard characters can be matched with any character or set of characters according to the wildcard character used, as shown in Table 4.4.

The general syntax in which you submit a LIKE -style query is as follows :

  Expression to be matched  [NOT] LIKE pattern 

Patterns to be matched are dependant on the wildcard operators that are used.

Table 4.4. The Wildcard Characters Allowed in T-SQL

Character

Meaning

[ ]

Any one single character within the specified range ([f-j]) or set ([fghij])

_ ( underscore )

Any one single character

%

Any number of zero or more characters

[ ^ ]

Any single character not in the specified range or set

WARNING

Full-Text Search If your application repeatedly calls the LIKE predicate and performs numerous wildcard searches, you should consider using the MS Search Service if it is installed and in use on the server. Consider the value of the response time over the storage resources that the MS Search Service and full-text search capabilities require. MS Search Service is required to use full-text search. Full-text search allows you a variety of powerful wildcard searches. Full-text search is discussed later in Chapter 10.


The following example of how to use the LIKE clause uses the % wildcard to select all titles from the titles table where there is a character string match for 'computer' . (Full-text indexing could perform this search more efficiently if available; you should avoid LIKE searches that have a % wildcard at both the beginning and end.)

 SELECT title FROM titles WHERE title LIKE '%computer%' 

The following example uses the '[ ]' wildcard:

 SELECT Au_id from Authors WHERE Au_id LIKE '[0-2]%' 

You can also use the NOT keyword with the LIKE predicate, which simply retrieves a query that does not contain records matching the specified elements in the LIKE clause. With character matching it is sometimes more efficient to exclude characters with the use of NOT . It is common to use a negative test in particular when looking for values that represent true data, as in NOT NULL .

Selecting Rows Based on NULL Values

Before you can look at selecting and not selecting NULL values, you must understand what a NULL value is. A NULL value is a value given to a field that that has no value. Many people confuse NULL values with zero-length strings or the value zero, but such is not the case. NULL is just a fancy word for a value that is unknown.

In SQL Server, you can select the desired NULL values or reject them using IS NULL or IS NOT NULL . Don't enclose NULL in quotes because it is a keyword. Enclosing NULL in quotes would produce a character match for 'N' , 'U' , 'L' , 'L' , and not a NULL value. Look at the following example:

 SELECT State FROM Pubs.dbo.Authors WHERE State IS NOT NULL 

As you have already noticed, this query selects all records where the State field is NOT NULL . NULL behaves in a different manner with regular comparison operators, (>, <, =, and so on). This behavior depends on the ANSI connection settings for the session.

When SET ANSI_NULLS is ON , a comparison in which one or more of the expressions is NULL returns UNKNOWN as the result of the comparison. A value that is not known cannot be compared against another value. Use the IS NULL or IS NOT NULL instead to test for NULL values. T-SQL does support an extension of the traditional behavior that allows for comparison operators to return TRUE or FALSE as a comparison result. If you set ANSI_NULLS OFF , comparisons to NULL return TRUE when the value being compared against is also a null. The comparison returns FALSE for any other value. Regardless of the ANSI_NULLS setting, NULL values are always considered equal for the purposes of the ORDER BY , GROUP BY , and DISTINCT keywords.

Returning TOP rows

Returning limited entries to only a few records is possible in T-SQL. You may, for instance, need to select the top three people in the school or maybe only return the top ten bestsellers ; whatever the case may be, it can be accomplished. Use the following process shown in Step by Step 4.4 to return the highest and lowest values:

STEP BY STEP

4.4 Selecting TOP data

  1. If it is not already open, load the SQL Query Analyzer. Supply the logon connection information if requested and select the Northwind database.

  2. Enter a standard general query to test by entering and executing the following:

     SELECT OrderID, CustomerID, OrderDate FROM Orders  ORDER BY OrderDate ASC 
  3. Change and execute the query so that only the five earliest orders are shown:

     SELECT TOP 5 OrderID, CustomerID, OrderDate FROM Orders  ORDER BY OrderDate ASC 
  4. Change and execute the query so that only the 5 most recent orders are displayed:

     SELECT TOP 5 OrderID, CustomerID, OrderDate FROM Orders  ORDER BY OrderDate DESC 

Limiting your selection to only a few records is made possible by the TOP clause. The TOP clause limits the number of rows returned in the resultset to a specified number or percentage.

TOP is generally simple and may be accomplished with the following syntax:

 TOP  n  [percent] 

In this syntax, n is a number that specifies how many rows are returned. If additionally [percent] is specified, n is the percent of rows that are to be returned in the resultset. Examine the following queries:

[View full width]
 
[View full width]
Select top 50 returns the top 50 rows Select top 50 percent returns the top 50 percent of the result-set rows If a SELECT statement that includes TOP also has an ORDER BY clause, the rows to be graphics/ccc.gif returned are selected from the ordered result set. Look at the following example:SELECT graphics/ccc.gif TOP 5 Au_fname FROM Pubs.dbo.Authors ORDER BY Au_fname

There is, however, an alternative to TOP . You can also limit the number of rows to return using SET ROWCOUNT N . The difference between these two is that the TOP keyword applies to the single SELECT statement in which it is specified. SET ROWCOUNT stays in effect until another SET ROWCOUNT statement is executed, such as SET ROWCOUNT 0 to turn the option off. The TOP n keyword is the preferred between the two and so should always be used when limited rows need to be returned.

You can optionally specify that the TOP keyword is to use the WITH TIES option, in which case any number of records can possibly be displayed. WITH TIES displays all records that are equivalent to the last matching element. If you are looking for the top 10 employees and there is a tie for tenth between two employees , 11 records would be displayed. If the tie was for ninth or a higher position, only 10 records would be listed.

Bitwise and Unary Operators

Bitwise operators (see Table 4.5) are used on int , smallint , or tinyint data. The ~(Bitwise NOT) operator can also use bit data. All bitwise operators prform an operation on the one or more specified integer values as translated to binary expressions within T-SQL statements. For example, the ~(Bitwise NOT) operator changes binary 1s to 0s and 0s to 1s. To check bitwise operations, you can convert or calculate decimal values.

Bitwise operators perform bit manipulations between two expressions of any of the data types of the integer data type category.

Table 4.5. Bitwise Operators

Operator

Meaning

&

Bitwise AND

Bitwise OR

^

Bitwise exclusive OR

The operands for bitwise operators can be any of the data types of the integer or binary string data type categories (except for the image data type), with the exception that both operands cannot be any of the data types of the binary string data type category.

Unary operators (see Table 4.6) perform an operation on only one expression of any of the data types of the numeric data type category.

The + (Positive) and - (Negative) operators can be used on any expression of any of the data types of the numeric data type category. The ~ (Bitwise NOT) operator can be used only on expressions of any of the data types of the integer data type category.

Learning to use the entire set of SQL Server operators will take a considerable amount of practice with a wide variety of applications. Obviously it is difficult, even in a book this size, to provide details and examples covering every one.

Table 4.6. Unary Operators

Operator

Meaning

+

Value is positive

-

Value is negative

~

Returns the ones complement of the number

Formatting the Data

Often data returned from the database is not presented in a manner that the end user can identify with and access in a reasonable manner. Many SQL Server functions allow for the alteration or more appropriate presentation to the user. If data is not in a type presentable to the user, then the CONVERT and CAST functions will be the first ones used.

Converting Data with CONVERT and CAST Functions

Data needs to be converted from one form to another when you are using diverse functions or data types that don't match up with the type needed by a particular function. Data can be converted from one data type to another with the CONVERT and CAST functions.

In converting data types, two things need to be supplied:

  • The expression that needs to be converted

  • The data type to convert the given expression to

Here are some notes that should be examined before converting:

  • If you don't specify a length when converting, SQL Server automatically supplies a length of 30.

  • CAST is based on the SQL-92 standard and is favored over CONVERT .

  • SQL Server automatically converts certain data from one data type to another. For example, if a smallint is compared to an int , the smallint is automatically converted to int before the comparison proceeds. These are called implicit transactions because you don't have to use the CAST or CONVERT functions.

  • When data is converted to a new data type and the data is too large for the new data type to handle, SQL Server displays an asterisk or identifies the problem with an error message, depending on the data types involved.

Both CONVERT and CAST have their particular strengths in handling individual types of data. The two functions are compared and contrasted in the next two sections.

The CONVERT Function

The CONVERT function transforms data from one data type to another. CONVERT is also used extensively with date operations to format a date. For use in date operations, see Books Online CAST and CONVERT .

The syntax is

 CONVERT  (data type  [(length)],  expression  ) 

Where

  • Data type is the data type you want to convert to

  • Expression is the data being converted

The following example converts the Price column into a CHAR data type so that you can concatenate it with another string expression. Remember, concatenation can only be done using strings, so it must be converted:

 SELECT 'The book costs ' + CONVERT(CHAR(5),price) AS 'Price of book' FROM titles 
Using the CAST Function

The CAST function is similar to the CONVERT function in that it converts data. It is preferred over the CONVERT function because it's based on the SQL-92 standard. Of course, there is no harm in using either, but I personally use this more frequently. The syntax for CAST is:

 CAST(expression AS data type) 

NOTE

Character Formatting When data is converted to a char data type that will not adequately store the value, an asterisk (*) denotes the problem. However, when you convert to integer or Unicode data, an error message appears. The example in Figure 4.4 clarifies this point.

Figure 4.4. Data error clarification .

graphics/04fig04.gif


Notice that the parameters are switched around in comparison to the CONVERT function.

CAST is usually preferred over CONVERT for operations with more advanced processing requirements.

REVIEW BREAK: Data Filtering

Filtering of data is the selection of the required information, using supplied criteria, to create a resultset with only the requested data. Filtering minimizes the use of valuable bandwidth, provides for more efficient applications, and in general better suits the end user and business needs.

Formatting data is presenting the information so that more value is produced in the finished product. Formatting information can be as simple as placing output in the correct sequence; however, formatting can also mean the use of highly advanced procedures for the analysis and display of data.



MCSE Training Guide (70-229). Designing and Implementing Databases with MicrosoftR SQL ServerT 2000 Enterprise Edition
MCSE Training Guide (70-229). Designing and Implementing Databases with MicrosoftR SQL ServerT 2000 Enterprise Edition
ISBN: N/A
EAN: N/A
Year: 2003
Pages: 228

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net