Integrating Ad Hoc Queries into Your Application | Hitchhikers Guide to Visual Studio and SQL Server: Best Practice Architectures and Examples, 7th Edition (Microsoft Windows Server System Series)

An ad hoc query is simply a T-SQL statement or a batch of several statements that you hard-code into your application. Generally, ad hoc queries are provided as a properly formatted SQL string. No, this does not mean you need to add extra line-feed or carriage-return characters to the string to make it look nice. Sure, this T-SQL string can be built separately and used to set the CommandText property or used with the SqlCommand New Constructors.

The CommandText string can be built on-the-fly, and you can concatenate the query components one at a time, as shown in Figure 10.8but before you start pounding in this code^[2] or loading up the DVD, let's take a closer look at the process of creating and managing ad hoc queriesand the litany of problems they introduce. The point of this exercise is to ensure that you understand the issues and learn to avoid this approachdespite the fact you'll see it illustrated throughout this book and most other books and documentation. I tend to use this approach as it's the best way to illustrate integration of T-SQL queries into your application. In most cases, developers move from embedded T-SQL in their code to stored procedures once they get the database schema and procedures created. The T-SQL embedded in your application will be a good starting point for these procedures.

^[2] This sample is named "SqlCommandTextExamples" on the DVD.

Figure 10.8. Creating a SqlCommand.CommandText with concatenation.

As I complete the final draft of my EBook, "Hitchhiker's Guide to SQL Server Everywhere", I find the use of concatenated queries even more important than ever before. Consider that SSEv does not support stored procedures so all of the queries you executeespecially parameter-driven queries, are going to involve concatenation on on-the-fly SQL construction of one kind or another. I've been asked any number of times if using ad-hoc queries hurts performance. Frankly, it used to, but given the way that SQL Server caches its query plans, you'll find that there is very little performance benefit from a properly constructed parameter query. The key phrase here is "properly constructed". If the query parameters substantively change how the query would be created, subsequent invocations of a query can perform poorly because the cached query plan does not efficiently return rows given a different set of parameter values. I talked about this in Chapter 2, "How Does SQL Server Work?," so you might want to revisit the section that discusses how the query optimizer works.

As you decide which coding strategy to use when building concatenated strings to pass to the CommandText, consider the following alternatives that balance ease of use, human readability, and performance:

Concatenate using the language-specific "&" (or "+" in C#). This approach is not always as fast as using the StringBuilder Append techniques but can be just as fast for small stringsand easier to code.
Concatenate using the String.Format method. This provides some datatype conversion but is limited in depthyou can only concatenate a few arguments.
Concatenate using the StringBuilder Append or (better yet) AppendFormat methods. These can improve performance if the strings are fairly long.

The following example illustrates the first suggested concatenation technique and suggests alternatives. Remember that the performance gained here must be balanced with the ability to clearly see what's being sent to the CommandText to help others maintain the code you write.

Yes, there are several problems with this code. Can you spot them all? No, I'm not talking about the fonts. Let's step through the issues:

First, the "&" concatenation operator is somewhat expensiveit's slower and trouble-prone when compared to other techniques. While it's fine for short strings, each time it's used, the string is copied several times behind the scenes. While book authors sometimes have to illustrate long queries by breaking them up like this, it's not a good idea at all to write your programs like this. Remember, it's the computer that has to read these queries most oftenmaking them easier for you to read is also important, but balanced with the expense of code execution performance. If you insist on using ad hoc queries, I suggest using the StringBuilder class or String.Format to construct them. I'll show you how in a minute.
When you concatenate values into the WHERE clause, as shown in Figure 10.8, you open yourself (as in, you, personally) for attackthat is, when your company data is penetrated or destroyed because of a SQL injection attack, everyone is going to blame you. If you're lucky, you'll just lose your job. If you work for some of the folks I know in Texas, you might not make it out of the parking lot with your dignity intact.
In this T-SQL, the LIKE expression requires a quoted string. Do you see any single quotes to frame the inbound strCustWanted string? The application assumes (quite foolishly) that the user will somehow know to enter the single quotes around the customer name they want.
Okay, let's say the user is (by some miracle) smart enough to frame the customer's name in single quotes. But what if the customer's name is "O'Malley"? Try it and see what happensyou'll get a syntax exception, and so will they. Okay, in this case, you need to train the user to double-up any single quotes embedded in a name they're searching for. Yes, you can scan the string and use the Replace method to substitute each single quote ("'") with two single quotes ("''"). Another approach is to attempt to remove all of the customers in the database whose names have embedded quotes. This was tried in a northern European countryit didn't work very well.

Tip

I call the embedded single-quote problem "The O'Malley Issue."
And what about the LIKE expression wildcard character? If you don't use one, the LIKE expression returns no rows. Right, in this case, you'll need to train the user to add a "%" to the name in the correct placeno, not "*", but "%". As it is, your training manual is getting kinda thick and the users are going to be a bit grumpy when told they need to learn all of these rules. So, you can (and should) add the wildcard character yourself, if it's not already there or is in the wrong place.

Let's go over several additional points to help you avoid trouble if you still want to use this ad hoc query approach.

Did you notice each line of the T-SQL is concatenated with an extra space added at the beginning? I began teaching this technique years ago when my student's concatenated queries failed with syntax errors. It seems that the T-SQL parser did not understand "CustomersWHERE". Remember that extra whitespace is ignored by T-SQL.
Consider that the flexibility of this approach is one of its strengths. You can build up a T-SQL statement that can specify the columns, tables, and entire WHERE-clause, as well as ORDER BY expressions. T-SQL will let you supply only WHERE clause constraints in a parameter query.
When you hard-code T-SQL in an application, if that query has to change even in simple ways, you're looking at a recode, retest, and redeploy, as well as possibly retraining your staff every time it needs to change. Wouldn't it be nice if you could make simple (or even moderately complex) changes to the queries without recoding? I'm about to talk about that optionit's called using stored procedures.
Notice that the T-SQL asked for specific columnsI didn't use SELECT *. As I've said before (and I'll say again), I think SELECT * has its place, but not in most queries. If you're dumping a table schema, sure, use it, but if you're building a query, return only the columns you need and no more. This helps eliminate performance bottlenecks when SQL Server is forced to return columns you (or your user) don't need or don't expect.
In this example, I opened the pre-instantiated SqlConnection object just before I executed the query. I also made sure the SqlConnection was closedeven if something went wrong (as it often does)by putting the Cn.Close in the Finally block. This code gets executed whether or not the code between Try and the first Catch works. It's a common mistake to place the Cn.Close just below the code that sucks the rows out of the DataReader. As I'll show you later, you shouldn't close the SqlConnection associated with a DataReader until you're done with the rowset.
Did you notice that I included an exception handler to trap the SqlException (that I really expect to get) in addition to the general Exception that I don't expect to get? And, no, I don't recommend showing any of these messages to the userI don't expect the average off-the-street user is capable of figuring out what to do to solve the problem (unless that street is NE 156th in Redmond in front of Building 40 on the Microsoft campus).
Last, but not least, consider that the O'Malley issue is only one of the many hidden issues you need to be aware of if you choose to roll your own ad hoc queries. You'll also have to deal with (complex) date formatting and framing, datatype assignments, and inbound data validity checks, among others. But far and above SQL injection is the most serious issue you need to be constantly vigilant to prevent. As a rule of thumb, you should never accept unvalidated strings from users if these are to be concatenated into your SQL. However, even complex validation routines have been fooled. It's far (far) safer to use parameter-based queries, as they can foil most SQL injection attacks.

Tip

How many of these issues did you catch? Once you can spot most of them in the code you're reviewing or writing, you're ready for the big leagues.

Preventing SQL Injection Attacks

I'm not going to give you (or anyone) the sequence of characters used to inject destructive (or simply intrusive) SQL into your queries. Suffice it to say that SQL injection is a very widely practiced and, sadly, widely published and discussed technique that permits hackers to gain full access to your server and do what they will. Consider that SQL injection attacks usually focus on the ability of SQL Server (and other more sophisticated DBMS SQL engines) to execute more than one statement in a single batch. That is, a SQL query can contain several (N) operations. While your query might only intend to return a few rows, SQL injection attacks "inject" additional (malicious) SQL into your benign query. Incidentally, because SSEv does not support multiple statements in a SQL query, it is not as vulnerable to SQL injection attacks.

There are several ways to protect your database, your company, and your career. All of these are fairly easy to implement, but let's discuss some broader security guidelines that reduce the frontal area of your server's security risk.

IMHO

Sadly, there seems to be no end to the number of articles that clearly show how to perform SQL injection attackssome by companies simply trying to sell injection-prevention software. It's kinda like lawyers drumming up business by dropping banana peels on the sidewalk.

Use the SqlCommand class to manage the query and any parameters it requires. Because ADO.NET knows how to deal with string and date framing and formatting issues, your code can actually be simpler to write. As another side-effect of the way ADO.NET handles parameters and executes the query, you and hackers will find that it's virtually impossible to insert anything into your SQL via the parameters.
Treat all input and all new data as "evil" and untrusted. Okay, this sounds a bit paranoid and rings of a Homeland Security tactic, but that's where the industry has evolved. As a rule of thumb, I never permit unvalidated data from entering the system in the first place. Knowing where the data comes from is not enough. Hacking can result from human input or spurious data introduced by a trusted programthat perhaps has been hacked. Remember that over 80% of all security breaches are from inside your firewallfrom your coworkers and those interlopers that have gained access to idle systems inside your buildings.
Avoid ad hoc or system-generated "dynamic" SQL either in your application or in a stored procedure. Concatenated SQL is the first route a SQL injection attacker will attemptit's the ground-level bedroom window left open to catch the summer breeze. These hacking attempts can be made as you capture a Login ID or the "Product Wanted" search argument.
Practice "least privilege" security: That is, set up specific accounts for your applications and users that grant only the specific rights they need. Given the credentials you're using (those provided by the Windows login account or IIS, or those hard-coded in the application), hackers running injected SQL with those rights can't do any damage or extract protected data. Don't put all of your data eggs in one basket. By this, I mean don't create a single application login that has broad rightsremember to provide just enough rights to get the job done.
Use encryption for all sensitive data: Make sure that credit card numbers, Social Security numbers (or the equivalent), and other private data is encrypted in place. Enable the SSL option in SQL Server to prevent hacking data feeds. Of course, enabling SSL won't help if the server is not configured to support encryption.
Don't depend on filtering algorithms that attempt to ferret out those keywords known to trigger SQL injection. While this might work in some cases, they're vulnerable to the seemingly infinite patience of hackers to figure out ways to bypass the filters.
Don't reveal the secrets of the inner sanctum by dumping your exception messages to the user. Remember, the user is the person least able to fix your problem, but a hacker is the person most likely to compromise your system with the inside information revealed by an Exception.ToString dump.

In your ASP.NET application, when an unhandled exception is generated, make sure only minimal help is offered to the hacker. This is accomplished by setting the debug attribute of the compilation element (in the Web.config file) to False and setting the mode attribute of the customErrors element to either On or RemoteOnly. For example:

<compilation defaultLanguage="c#"  debug="false" <customErrors mode="RemoteOnly" />

The RemoteOnly setting will ensure that users accessing the site from localhost will get informative error messages, while those accessing the site from a remote location will receive generic error messages that reveal no useful information about the exception. Use the On setting to have all users, including local users, see the generic error messages. Never use the Off setting in a production environment.

Building Safer SqlCommand Objects

When you build your SqlCommand objects, you can prevent SQL injection attacks (or at least reduce them) by

Using strongly typed data fields: Don't use string datatypes for any user input that can be expressed as an integer or other number. This means you're going to have to take the TextBox, ComboBox, or other control's Text property and cast it to the correct type.
Using SqlParameter objects (which are strongly typed) to manage all values passed to the server. In other words, don't pass any string to the server as a concatenated value in a SQL statement. Because of the way that ADO.NET executes parameter queries, it's virtually impossible to pass stray SQL by way of a Parameter.
Calling stored procedures that further filter the data based on business rules. Stored procedures can also limit the length of the inbound string parameters to help reduce the chance that extra values are introduced on obscure parameters.