Chapter 9: Scaling SQL Server | ppk on JavaScript, 1/e

How many times have you heard that SQL Server can't scale and isn't reliable? In this chapter, I hope to break that myth. I'll discuss linked servers quite extensively, and that information is useful later in the chapter, when I dive into distributed partitioned views.

Linked Servers

Linked server technology is one of the core components of the Microsoft scale-out tactic. This technology allows SQL Server to connect to any OLE DB-compliant source, and run queries or execute remote procedure calls (RPCs) as if it were a local SQL Server.

Data Flow of Linked Servers

Figure 9-1 shows how SQL Server implements this technology. A client connects to SQL Server and requests data from the linked server. SQL Server establishes a connection to the remote system on behalf of the client. The connection could be to another SQL Server or to any OLE DB-compliant data source. The remote data source executes the query, and then passes the results back to the SQL Server, which hands the results back to the requesting client.

click to expand
Figure 9-1: The basic architecture behind linked servers

Note

Extra work may have to occur on the SQL Server once it receives the data, depending on the database. I'll cover this in the 'Querying Linked Servers' section.

With linked servers, you can run ad hoc queries against remote systems as if they were SQL Servers. Linked servers give you the ability to install third-party connectivity tools (such as DB2 Connect) on the SQL Server, create a linked server to use them, and have all the connections use the linked server's connection to DB2. Without the linked server's technology, you'd have to install DB2 Connect or a similar connectivity tool on each workstation that wanted to pull data from DB2.

Myriad of Uses for Linked Servers

Linked servers also provide a method for spreading your SQL Server data across multiple servers and then querying as if you're querying one database. To view your linked servers, go to Enterprise Manager and select Linked Servers under the Security group. You can then select a linked server from the list, and view any existing tables and views for the server on the selected or default database.

You can also use the sp_linkedservers system stored procedure to pull back a list of installed linked servers, including configuration information. If you execute sp_linkedservers on the server that has the linked servers configured, the server displays its own name, as well as any linked servers that have been installed on it. Here is the abridged result of running the query:

SRV_NAME      SRV_PROVIDERNAME  SRV_DATASOURCE  SRV_CAT ------------- ----------------- --------------- ----------- SERVERB       SQLOLEDB          SQLCentral      Northwind XANADU        SQLOLEDB          XANADU          NULL (2 row(s) affected)

Note

Linked servers have a distant relative called remote servers. Remote servers are provided for backward compatibility and are used for replication. Remote servers can only execute remote stored procedures and return the results of the stored procedures. Linked servers, on the other hand, allow you to run ad hoc queries. You should only use linked servers, because remote servers could be discontinued in a later release of SQL Server. Linked servers also offer added abilities that I'll discuss throughout this chapter.

A great application for linked servers is to place central tables, used by all of your various applications, onto a single server. In several places in this book, I stressed the importance of having a similar database structure for similar products. For example, it's important that among the products your company offers, a Customer table in one product should always look like another Customer table in another product internally. If this is the case, you can place the Customer table on the linked server and have everyone use a central table. This way, if a customer representative updates the Customer table in one application, another representative will see the update seamlessly. With linked servers, you could place the customers in DB2 or Oracle and your queries won't be affected.

Tip

The catch to linked servers is that for them to be truly effective, you'll have to have a fast connection between the local server and remote servers. This can be quite a bottleneck if the connection is not fast enough.

Creating a Linked Server in Enterprise Manager

The primary methods for creating linked servers are via Enterprise Manager or through T-SQL (covered next). To create a linked server through Enterprise Manager, right- click the Linked Server group and select New Linked Server. This opens the Linked Server Properties screen shown in Figure 9-2.

click to expand
Figure 9-2: Creating a linked server

For the Linked Server field, specify a logical name for the server. Even though this is a logical name or alias, it is a good idea to use the name of the remote server. Then, if you select SQL Server as the Server Type, the Linked Server field represents the server's name so the other options can be ignored.

Caution

Make sure the linked server name is uppercase. Enterprise Manager forces uppercase names, but T-SQL doesn't. Use uppercase names everywhere, because there are rare cases where providers require uppercase names.

In the Trenches

If you are trying to create a linked server to a SQL Server 6.5 server using the OLE DB provider for SQL Server, you must run the Instcat.sql script against the SQL Server 6.5 machine. The script is in the \Microsoft SQL Server\MSSQL\Install directory, and it is required for running distributed queries against a SQL Server 6.5 server. Otherwise you'll receive error number 7399 when running queries.

For the Server Type selection, choose Other Data Source and then select the type of database system from the Provider Name drop-down box. As you can see, you can connect to a wide variety of database systems. There is even an OLE DB provider for DTS packages. Even if you're creating a connection to SQL Server, don't choose the SQL Server radio button for the Provider Name. If you select the SQL Server radio button, you lose some control of the default database.

Note

The list you see in the Provider Name drop-down box is from the server you're connected to in Enterprise Manager, not the providers on your workstation. If you install software such as Host Integration Services, the software may add providers in this list (for example, the OLE DB provider for DB2).

The Product Name is the name of the provider, and the field is optional. You can enter anything except 'SQL Server,' which would be the equivalent of selecting the SQL Server radio button. In fact, if you open an existing linked server and modify the Product Name to SQL Server, the system erases some of your properties. I generally use the name 'SQLServer' for the OLE DB Provider option when connecting to a remote SQL Server.

The Data Source option varies based on the provider. In SQL Server it is the server's name or IP address. Some providers require a location and possibly a provider string. Table 9-1 shows the various options for this screen, taking into consideration the type of provider. You can use this later when I discuss adding a linked server through T-SQL.

Table 9-1: Provider Types for Linked Servers
Remote OLE DB Data Source	OLE DB Provider	Provider Name	Data Source	Provider String
Access	Microsoft OLE DB provider for Jet	Microsoft.Jet.OLEDB.4.0	Path and filename of Access database	None
Excel	Microsoft OLE DB provider for Jet	Microsoft.Jet.OLEDB.4.0	Path and filename of Excel spreadsheet	None
SQL Server	Microsoft OLE DB provider for SQL Server	SQLOLEDB	SQL Server name	None
Oracle	Microsoft OLE DB provider for Oracle	MSDAORA	SQL*Net alias for the Oracle database	None
ODBC	Microsoft OLE DB provider for ODBC	MSDASQL	DSN name	Optional provider string
DB2	Microsoft OLE DB Provider for DB2	DB2OLEDB	Connection string fulfills this	Needed for DB2; can be generated by generating a data link

The last option in this screen is the Catalog option. This option allows you to specify a default database or schema that the linked server will see by default. If you don't enter an option, the system uses the default catalog for the login that is used to connect to the remote server. Keep in mind that in most cases in SQL Server the default database for most logins is the Master database. This could potentially be dangerous if your security is not tight. For some database types it is wise to set this option, because if you don't, the linked server will return a lot more tables than you'd be concerned with.

Creating a Linked Server Through T-SQL

You may decide to script out a linked server so you can deliver it with an application's setup program. Or you may just prefer to use T-SQL because you've created a template that's easy to use. I use T-SQL to create linked servers, because there are some details that Enterprise Manager doesn't offer.

In the Trenches

No validation of your input is performed when creating a linked server. You won't know there is a problem until you actually try to query it. For example, if you change the default catalog to a bad name like BadNorthwind, you would receive the following error when you try to connect to the linked server to run a query:

Server: Msg 4060, Level 11, State 1, Line 1 Cannot open database requested in login 'BadNorthwind'. Login fails.

You can use the sp_addlinkedserver system stored procedure to add the linked server. The basic syntax for the stored procedure looks like this:

sp_addlinkedserver [@server =] '<linked server name> ' [, [@srvproduct =] '<product name>'] [, [@provider =] '<provider name>'] [, [@datasrc =] '<data source>'] [, [@location =] '<location>'] [, [@provstr =] '<provider string>'] [, [@catalog =] '<default catalog>']

As you can see, the options line up with what you saw in the Enterprise Manager Linked Server Properties screen. For example, you can add the same linked server
I created earlier in Enterprise Manager with the following syntax:

sp_addlinkedserver    @server=SERVERB,    @srvproduct = 'SQLServer OLEDB Provider',    @provider = 'SQLOLEDB',    @datasrc = 'SQLCENTRAL'

In the Trenches

Remember never to use 'SQL Server' for the Product Name parameter. If you do, you'll receive the following error message:

Server: Msg 15428, Level 16, State 1, Procedure sp_addlinkedserver,Line 67 You cannot specify a provider or any properties for product 'SQL Server'.

If you use 'SQL Server' for the Product Name parameter, you cannot use the rest of the variables.

You're not limited to SQL Server when you're creating linked servers. You can create a linked server to Access, in which case you use the value 'Microsoft.Jet.OLEDB 4.0' for the @provider parameter. For example, the following query will set up a linked server to an Access database:

EXEC sp_addlinkedserver    @server=ACCESSDB,    @srvproduct='Jet 4.0',    @provider='Microsoft.Jet.OLEDB.4.0',    @datasrc='/books/1/517/1/html/2/C:\nwind.mdb'

Once you add a linked server through T-SQL, if you want to modify any of the primary properties, you have to drop it and re-create it. If you modify a linked server through Enterprise Manager, the GUI automatically modifies the system catalog for you. A quick trace of the system catalog can show you what's happening under the covers when you modify the Catalog option in Enterprise Manager. The server issues the following command after you click Apply to save your changes:

UPDATE master.dbo.sysservers SET catalog = 'Northwind2' where srvname = 'SERVERB'

You can use this trace to devise a way to use this information for your own installation or upgrade process. This is useful if you want to create an upgrade script to update a customer's linked server default catalog. The full script would look like the following:

sp_configure 'Allow Updates', 1 RECONFIGURE WITH OVERRIDE GO UPDATE master.dbo.sysservers SET catalog = 'Northwind2' where srvname = 'SERVERB' GO sp_configure 'Allow Updates', 0 RECONFIGURE WITH OVERRIDE

Tip

If you're ever curious about how Microsoft has implemented a certain feature in Enterprise Manager, try tracing it. You may find internal stored procedures that can save you loads of time in your own code. If you use internal stored procedures like sp_msforeachtable, be aware that there is always a slim chance that these procedures may eventually be removed. Microsoft does not support most of the cool stored procedures.

If you do not want to update the sysservers table, you could drop and re-create the linked server. In Table 9-2, you'll find a list of the mappings between the parameters in sp_addlinkedserver and the sysservers table.

Table 9-2: Mapping of Parameters to Values in the sysservers Table
Parameter	Field in sysservers Table
@@server	srvname
@@srvproduct	srvproduct
@@location	location
@@provider	providername
@@provstr	providerstring
@@datasrc	datasource

Note

Be careful when modifying any table in the system catalog. If, for example, you forget a WHERE clause, you can cause harm to your system. It has been rumored for several versions that Microsoft will eventually do away with DBA access to the system catalog. Although Microsoft may want to do this, too many third-party programs use it directly to do away with it outright.

Linked Server Security

Once you create the linked server, you're not quite ready to use it yet. You still have to set up the security context the linked server will use to connect to the remote system. If you don't specify any linked server security, your server attempts to pass the security of the user who is using the linked server. The following is a typical error when security is not established:

Server: Msg 18456, Level 14, State 1, Line 1 Login failed for user '(null)'.

To fix the problem, you have to configure the linked server's security.

Configuring Security in Enterprise Manager

You can configure the linked server's security in Enterprise Manager, using the Security tab of the Linked Servers Properties screen (see Figure 9-3). The following list shows some guidelines for determining the strategy you want to take:

click to expand
Figure 9-3: Configuring linked server security

Local Server Login to Remote Server Login Mappings This option maps a local login to a remote login while connecting to the remote server. For example, if you map the login bknight on SQL Server to the sa login on the remote server, SQL Server logs in as sa on the remote system any time it sees that the login bknight is requesting data from the linked server. If you check the Impersonate box, SQL Server passes the login credentials of the person who is connected to the local server to the remote system. The login and password must exactly match on both servers for this to work.
Not Be Made Selecting this option means that the system does not use any type of security when trying to connect to the remote server. This is an option that works well in conjunction with the Local Server Login To Remote Server Login Mappings option. In this model, you can map the users who are allowed to use the linked server, and any other user is denied access.
Be Made Without Using a Security Context Selecting this option means the system does not use any type of security. Some systems, such as Access databases, flat files, or Excel spreadsheets, do not require security if they're not protected.
Be Made Using the Login's Current Security Context Selecting this option passes the login name and password of the requesting user to the remote system.
Be Made Using this Security Context This option allows you to specify one login name and password that applies to every user. This is the easiest way to configure the linked server, but it's also the least secure. It's also sometimes difficult to debug a problem when every user shows up as the same login and host name, especially if you need to trace who is logged in.

Configuring Security Through T-SQL

You can also configure security through T-SQL. To map the linked server's security information through T-SQL, use the sp_addlinkedsrvlogin stored procedure. This stored procedure accepts the same parameters as those available on Enterprise Manager. The syntax is

EXEC sp_addlinkedsrvlogin   @rmtsrvname='<linked server name>',   @useself='<pass current login to remote server>',   @rmtuser='<remote login to login as>',   @rmtpassword='<remote password to login as>'

Only the @rmtsrvname parameter is required. To configure a linked server to log in to the remote system with the login name bknight and password badideaforapw, use the following syntax:

EXEC sp_addlinkedsrvlogin   @rmtsrvname='SERVERB',   @useself='false',   @rmtuser='bknight',   @rmtpassword='badideaforapw'

To use the easiest linked server security (the option Be Made Using The Login's Current Security Context option in Enterprise Manager), set the @useself parameter to true as shown here:

EXEC sp_addlinkedsrvlogin 'SERVERB', 'true'

In the Trenches

Enterprise Manager is excellent about caching data. This prevents the tool from constantly querying SQL Server for more metadata. However, the drawback is that when you make a change through T-SQL to a linked server, you may have to refresh the Linked Servers tree. You can accomplish this by right-clicking on the Linked Servers group and selecting Refresh. You would only have this information cached if you had already opened Enterprise Manager, and had drilled down to the Linked Servers tree. This can be a problem if you modify system-level information in T-SQL while someone is viewing the data in Enterprise Manager.

Once you have a linked server mapped, you can drop the mapping by using the sp_droplinkedsrvlogin stored procedure as shown here:

sp_droplinkedsrvlogin 'SERVERB', 'bknight'

You can ascertain the type of security the remote server is using through T-SQL by using the sp_helplinkedsrvlogn system stored procedure. The only parameter that is recommended is the linked server's name. If you don't provide a linked server name, the stored procedure returns every linked server's security information.

EXEC sp_helplinkedsrvlogin 'SERVERB'

This outputs the following results:

Linked server   Local Login  Is Self Mapping Remote Login --------------- ------------ --------------- ------------- SERVERB         NULL         0               bknight (1 row(s) affected)

Configuring the Linked Server

After you set up security, you're ready to configure the linked server. Configuration options can be tricky-invoking a single option could mean a 100 percent performance boost in some cases.

Configuring Linked Servers in Enterprise Manager

To configure the linked server in Enterprise Manager, open the Linked Server Properties dialog box and go to the Server Options tab as shown in Figure 9-4. The options are as follows:

click to expand
Figure 9-4: Configuring linked servers in Enterprise Manager

Collation Compatible Pay particular attention to this option. When your query involves string comparisons, the linked server doesn't know what type of collation to use by default. If you don't specify this, queries that need to make these types of comparisons are shipped to the local server, since that is the executing server.

If you know that your remote system is configured to support the collation you're looking for (for instance, Oracle), make sure you check this option. This option substantially speeds up those queries that have a WHERE clause. Otherwise, the entire table is shipped back to the local server to apply the WHERE clause.

Note
You don't have to set this option if you're specifying a SQL Server as your linked server.
SQL Server automatically detects the collation of the remote SQL Server system for you.
Collation Name This option is an attendant, equally important, specification. It allows you to specify which collation you want to use on a system that is not SQL Server. This setting is only used if the remote server is not a SQL Server and the Use Remote Collation option is enabled.
Use Remote Collation If you're using a SQL Server, you need only check the Use Remote Collation option to enable the linked server to use the remote collation. The linked server can only use standard collations that are available to SQL Server. If the collation is not available for your data source, you cannot use this option, and queries will have to be shipped back to the local server.
Data Access This option enables users to run queries against your remote system. (Even if the Data Access option is unchecked, you can still run stored procedures.) If you deselect this option, you prevent users from accessing tables directly. Users must then execute all of their queries through a controlled stored procedure. If they try to run a query on a linked server with this option disabled, they receive the following error:
```
Server: Msg 7411, Level 16, State 1, Line 1 Server 'SERVERB' is not configured for DATA ACCESS.
```
RPC and RPC Out These options enable execution of stored procedures on the remote server. The RPC option enables you to execute remote procedure calls (RPCs) on the remote server, and the RPC Out option enables you to execute RPCs to a remote server from the local server. If you try to execute a stored procedure on a remote system with the RPC Out option disabled, you receive the following error:
```
Server: Msg 7411, Level 16, State 1, Line 1 Server 'SERVERB' is not configured for RPC.
```
Connection Timeout and Query Timeout These options can also impact performance. Remote systems require differing amounts of time to connect to and run queries against. You can set these options to control both time spans. For example, you probably don't want users to run a query for an hour, so set the Query Timeout to an appropriate setting for your system.

Configuring Linked Servers Through T-SQL

You can configure a linked server by using the sp_serveroption stored procedure. The only parameters the stored procedure requires are the server name, the option name, and the setting. For example, to set the Collation Compatible option in T-SQL, use the following syntax:

EXEC sp_serveroption 'SERVERB', 'collation compatible', 'true'

Additional options are available through T-SQL configuration measures, and they're specific to scalability features, such as distributed partitioned views. I discuss those in the 'Distributed Partitioned Views' section, later in this chapter.

Linked Servers Between Heterogeneous Databases

The linked server feature is not limited to SQL Server. You can also create linked servers to other database systems, such as Oracle. (Database systems other than SQL Server are considered 'heterogeneous' and SQL Server systems are considered 'homogenous.')

In this section, I discuss the steps you must take to establish a linked server connection to DB2 and Oracle. You may have to consult with your Oracle or DB2 administrator to perform some of these steps.

Connecting to Oracle

To connect to Oracle, use the OLE DB provider for Oracle that ships with MDAC. This requires the following steps:

Install the Oracle connectivity tools. The OLE DB provider for Oracle requires Oracle Client Software Support File version 7.3.3.4.0 or later, and SQL*Net version 2.3.3.0.4. These tools should be installed on the machine that will hold the linked server.
Configure the SQL*Net alias name on the server to point to the Oracle instance. Missing this step is the most common mistake DBAs make. You may have to see your Oracle DBA or your Oracle documentation for this step. (Oracle ships with tools to configure the alias name.)

Create your linked server.

Note

If you're planning on installing tools and a linked server on a Windows 98 machine, note that the Oracle client software does not support distributed transactions through the Microsoft OLE DB provider for Oracle on Windows 98. When querying an Oracle system, always use object names.

Create your linked server to an Oracle system with the following query:

EXEC sp_addlinkedserver @server = 'ORACLESERVER' @srvproduct = 'Microsoft OLE DB Provider for Oracle', @provider = 'MSDAORA', @datasrc = 'OracleServerName'

Caution

A quirk you must be aware of is that when you're inserting into an Oracle system, you must qualify each column, even if the column allows NULLs.

Connecting to DB2

DB2 is a little trickier to connect to, requiring more configuration tasks, and also requiring changes in both DB2 and the server that'll be querying DB2. To connect to DB2, use the following steps:

Install the client for Host Integration Services (HIS) on the SQL Server, which gives you the option to install the OLE DB provider for DB2. You can also use similar products, such as DB2 Connect from IBM, or StarSQL from StarQuest.
Configure the OLE DB provider for DB2 (or a similar product). See your DB2 administrator for the DB2 settings you should use.
Install the appropriate packages on DB2 to support the connectivity option you have chosen. Whether you choose HIS or StarSQL, a package must be installed in DB2. A DB2 administrator can do this for you. If you choose to use the OLE DB provider for DB2, install the package from the program group Host Integration Services | Data Integration | Packages for DB2. Five packages are installed on the DB2 server to expand its connectivity support.
Configure the linked server with the DB2OLEDB provider name. You will also need the connection string. Host Integration Services ships with a utility to create the connection string by creating a data link.

As you're converting data to and from DB2, keep in mind that some data types may not be supported by DB2. Table 9-3 indicates the support for various data types:

S = Supported
L = Limited support

N = Not supported

Table 9-3: DB2 Support for SQL Server Data Types
Data Type in SQL Server	IBM DB2 Mvs	IBM DB2 AS400
binary(n)	S	L
Bit	S	S
char(n)	S	S
Datetime	S	S
Decimal	S	S
Float	L	S
Image	N	N
Int	S	S
Money	S	S
nchar(n)	S	S
Ntext	N	N
Numeric	S	S
nvarchar(n)	S	S
Real	S	S
Smalldatetime	S	S
Smallint	S	S
Smallmoney	S	S
Text	N	S
Timestamp	S	S
Tinyint	S	S
Uniqueidentifier	S	S
varbinary(n)	S	S
varchar(n)	S	S

Deleting a Linked Server

To drop a linked server and the mapped logins associated with it, right-click the server object in Enterprise Manager, and press DELETE.

You can also drop a linked server with the sp_dropserver stored procedure followed by the linked server name, as shown here:

sp_dropserver 'SERVERB'

If any logins for the linked server were established, you receive the following error:

Server: Msg 15190, Level 16, State 1, Procedure sp_dropserver, Line 44 There are still remote logins for the server 'SERVERB'.

This error is by design and prevents SQL Server from leaving orphaned records for the logins. To fix this problem, use the ‘droplogins' option to remove the logins and the linked server at the same time, as shown here:

sp_dropserver 'SERVERB', 'droplogins'

Querying Linked Servers

Now let's get to the fun stuff. Once you have a linked server created, how do you actually use it? Let's dive into the trenches and get our hands a little dirty. One of the ways of querying a linked server is to use a four-part qualifier. The four-part qualifier uses the following syntax:

<linked server name>.<catalog name>.<owner>.<object name>

Note

The catalog name and owner qualifier may not be needed for all data sources. I like to use it to avoid any confusion. When connecting to another SQL Server data source, you must use the owner name (generally dbo).

Once you have the qualifier down, everything else is easy. For example, to select data from a linked server, you could use the following query:

SELECT * FROM SERVERB.Northwind.dbo.orders WHERE OrderDate < '1997-01-01 00:00:00.000'

This query should return items for the orders taken before 1997. It's that simple! You can run almost any type of SQL query using the four-part name. For example, you can run an INSERT statement into the Northwind's Category table by using the following syntax:

INSERT INTO SERVERB.Northwind.dbo.Categories  (CategoryName, Description) Values('Bait','Sample category for bait')

Note

Some providers do not support INSERT, UPDATE, and DELETE SQL statements. All the ones mentioned in these examples do, however.

Joining Multiple Servers

Suppose you have your data spread across multiple servers? For example, you have all of your orders before 1997 on one server, and all of your orders from 1997 to date on another server. To tie the data together, use the UNION ALL clause in a query as shown in the following:

SELECT * FROM SERVERB.Northwind.dbo.orders WHERE OrderDate < '1997-01-01 00:00:00.000' UNION ALL  SELECT * FROM Northwind..orders WHERE OrderDate > '1997-01-01 00:00:00.000' ORDER BY OrderID

This executes the query in parallel on both servers. The results are brought back together to the local SQL Server.

Note

You can see the Execution Plan of queries in Query Analyzer under the Query Menu or by pressing CTRL-L.

It is important to notice the behavior of the ORDER BY clause in this query. The ORDER BY clause orders the entire resultset, not just the result from a particular server. Since the Orders table has a clustered index on OrderID on both the remote server and the local server, there is no need to sort the data once it's pulled back from the remote server.

Tip

Use Profiler to determine which queries are being submitted to the local and remote servers. This allows you to pinpoint any trouble spots.

Earlier in this chapter, I said you need to have a common table that all the applications could share. This approach allows multiple servers to operate in parallel. For example, you could place orders on a local server and maintain employee information in a Human Resources system. You can tie the systems together to see which employee took an order by using a JOIN clause as shown here:

SELECT Orders.OrderID, Employees.LastName, Employees.FirstName, Employees.EmployeeID, Employees.Title, Orders.CustomerID, Orders.OrderDate FROM Orders INNER JOIN SERVERB.Northwind.dbo.Employees as Employees ON Orders.EmployeeID = Employees.EmployeeID ORDER BY Employees.LastName

To make things easier, I aliased the SERVERB.Northwind.dbo.Employees four-part name as Employees. This saves a few keystrokes later in the query, and helps to visualize the Employees table as local, even though it's on another physical server. This time I issued an ORDER BY clause on the Employee's table on the LastName column. This is much more taxing on the local server since the column LastName is not a clustered index. In this case, the data is brought back to the local server for sorting.

Try to experiment with various types of ordering to see if your performance improves. However, if you can, avoid using ordering. For example, the same query I ran against the HR system would run with a considerable performance boost by avoiding the ORDER BY clause, because no sorting needs to take place after the data is pulled back.

Executing Stored Procedures on Linked Servers

Linked servers aren't limited to ad hoc queries-you can run stored procedures and query views on remote servers. Execute a stored procedure as you would normally, but add the four-part name before the stored procedure. For example, here's code that runs the CustOrderHist stored procedure in the Northwind database and passes the variable ‘Quick':

EXEC SERVERB.Northwind.dbo.CustOrderHist @CustomerID='Quick'

Before you run the stored procedure, make sure that the RPC settings for the linked server are set. To set the RPC and RPC Out options to true, use the following syntax in Query Analyzer:

sp_serveroption 'SERVERB', 'rpc', 'true' sp_serveroption 'SERVERB', 'rpc out', 'true'

Getting Verbose Error Messages

The error messages that linked servers return are sometimes very generic. Turn on the 7300 trace flag if you want to receive more detailed information. You can turn it on by using the following syntax:

DBCC TRACEON(7300)

Using openquery() and openrowset()

You should try to limit the work that the local SQL Server must do when querying
a linked server. If your local server has to perform an enormous amount of work to sort the data, you lose the benefit of linked servers.

With some providers, you can't execute proprietary syntax for the provider through four-part names, and an attempt to do so generates a syntax error. Instead, you can use pass-through queries to pass the query directly to the remote server, without checking the syntax locally.

To use pass-through queries, use the openquery() or openrowset() function. The openquery() function is one of the easiest methods of using an existing linked server for pass-through queries. The simple syntax looks like this:

SELECT * FROM openquery(SERVERB, 'SELECT * FROM Orders')

For example, here's how to use the openquery() function for the same query discussed earlier in this chapter:

SET QUOTED_IDENTIFIER OFF SELECT * FROM openquery(SERVERB, "SELECT * FROM Orders  WHERE OrderDate < '1997-01-01 00:00:00.000'")

Notice that the first line turns off quoted identifiers. I do this because I'm using character data in the WHERE clause, which requires quotes. If I use single quotes around the date, it breaks the openquery() function, which also is using the single quotes. The workaround is to use double quotes around the SELECT statement, and single quotes around the character data. Since double quotes are around the SELECT statement, the openquery() function won't be broken by single quotes.

Caution

It's a good idea to add the SET QUOTED_IDENTIFIER ON clause at the end of this query to ensure that this option is set back to the Query Analyzer default.

The openrowset() function allows you to create a linked server on demand. This is useful if you don't know where you want to connect until the application's runtime. However, I usually recommend avoiding this situation if you can, since it takes an extra step (and more complex code).

In the Trenches

If you use compatibility levels when you issue a query from a database with a lower compatibility level than 70, you may see the following error:

Server: Msg 155, Level 15, State 1, Line 1 'SERVERB' is not a recognized OPTIMIZER LOCK HINTS option.

This occurs because the linked server syntax wasn't supported until SQL Server 7.0. To fix this problem, adjust the compatibility level back to at least 70 for the database you're connected to when running the query.

Use openrowset() with the following (possibly more intimidating) syntax:

openrowset ( '<provider name>'      , { '<data source>' ; '<login name>' ; '<password>'          | '<provider_string>' }      , { [ <catalog>. ] [ <schema>. ] <object>          | '<query>' })

For example, you can use the following syntax to run the familiar query that I've used throughout this chapter:

SET QUOTED_IDENTIFIER OFF SELECT * FROM openrowset('SQLOLEDB','SQLCENTRAL';'sa';'password', "SELECT * FROM Northwind.dbo.Orders WHERE OrderDate < '1997-01-01 00:00:00.000'")

Again, since I use character data in the WHERE clause, I can use the workaround of turning off quoted identifiers temporarily. Otherwise, SQL Server thinks whatever is in the column is an object name and issues the following error:

Server: Msg 7314, Level 16, State 1, Line 2 OLE DB provider 'SQLOLEDB' does not contain table 'SELECT * FROM Northwind.dbo.Orders WHERE OrderDate < '1997-01-01 00:00:00.000''. The table either does not exist or the current user does not have permissions on that table.

Caution

Anyone who is using Profiler to trace the local server can see the password of the remote server when the openrowset() function is called.

Joining Tables with the openquery() Function

You can also join tables, just as I did with the four-part name earlier in this chapter. I'll take the UNION ALL query I demonstrated earlier and turn it into a pass-through query:

SET QUOTED_IDENTIFIER OFF SELECT * FROM openquery(SERVERB, "SELECT * FROM Orders  WHERE OrderDate < '1997-01-01 00:00:00.000'") UNION ALL  SELECT * FROM Northwind..orders WHERE OrderDate > '1997-01-01 00:00:00.000' ORDER BY OrderID

You can also join tables through a standard JOIN clause as shown next:

SET QUOTED_IDENTIFIER OFF SELECT Orders.OrderID, Employees.LastName, Employees.FirstName,  Employees.EmployeeID, Employees.Title, Orders.CustomerID, Orders.OrderDate FROM Orders INNER JOIN openquery(SERVERB, "SELECT EmployeeID, LastName,  FirstName, Title FROM Employees")  as Employees ON Orders.EmployeeID = Employees.EmployeeID Order by Employees.LastName

In this example, I aliased the Employees table to save a few keystrokes. This query will transfer the Employees table to the local server before performing the join. I'll discuss ways to optimize this in the next section.

Using openquery()

Even with the RPC Out and RPC options turned off, you can still run return results from the following stored procedures with the openquery() and openrowset() functions. This is because these functions send the query unchecked from the local server.
To execute a stored procedure using these functions, use the following syntax:

SELECT * FROM openquery(SERVERB, "CustOrderHist @CustomerID='Quick'")

Tuning and Supporting Linked Servers

You can take steps to make sure the processes you use on linked servers are running at optimum levels. In addition, you can take steps to keep an eye out for potential problems. I discuss these procedures in this section.

Matching Queries to Server Types

SQL Server tries to allow the remote server to perform as much of the query processing as possible. However, this approach doesn't always work. For example, if you have a linked server set up against a text file, the text file cannot handle a WHERE clause or ORDER BY clause remotely, because the file is flat and there is no server to filter the query remotely. Instead, SQL Server pulls the entire dataset into SQL Server, and then performs the appropriate relational query.

If you're querying a linked server that is defined as a SQL Server, much of this work can be handled by the remote system, and only the results the user cares about would be returned to the local SQL Server. Indexes on the remote server will also be used where applicable.

When a query occurs, SQL Server queries the OLE DB provider to determine if the query you're submitting has any functions that are not permitted by the OLE DB provider. Once SQL Server determines that the command you're executing is allowed by the remote system, it determines which parts of the query can be delegated to the remote system for execution. Some syntax is always pulled back and evaluated locally, no matter what your remote server's database type is. The following items are never delegated:

Queries that use bit, timestamp, and uniqueidentifier data types
TOP clauses
Updates, inserts, and deletes
Data conversion operations

The following actions can be executed remotely only if the remote server is a SQL Server:

Clauses such as CUBES, OUTER JOINS, and ROLLUP
Bitwise operators
Like queries
String and math functions

Providers that support SQL-92 also allow delegation of UNION and UNION ALL clauses. Other OLE DB providers may support statements such as the DISTINCT clause. If this is available for the provider, it is delegated. If it is not available for the provider, the results are evaluated locally.

All of these limitations are by design. As you're designing your linked server's queries, keep in mind what the remote server can support. Watch the execution plan of your query to make sure that unnecessary processing is kept away from your local server. You want to also use more stored procedures if applicable to enforce remote processing.

Viewing Metadata on the Linked Server

Linked servers can be quite difficult to troubleshoot, because you're dealing with multiple heterogeneous systems in addition to the local SQL Server. That's why Microsoft has provided a number of stored procedures to help you debug problems.

The sp_linkedservers stored procedure, which requires no parameters, lists all the linked servers on the server you're connected to, along with their general properties (the properties in the General tab of the Linked Server Properties screen in Enterprise Manager).

The sp_helpserver stored procedure returns the options configured in the Server Options tab of the Linked Server Properties screen. This procedure takes an optional value of the linked server name, but you can run it without any parameters. If you don't designate any parameters, all the linked servers are returned.

Viewing Catalog-Level Metadata

To determine which catalogs exist on the remote server, run the sp_catalogs system stored procedure with the required parameter of the linked server name, as shown here:

EXEC sp_catalogs  @server_name ='SERVERB'

This will return a list of all the catalogs (or databases) on the remote server.

Viewing Table-Level Metadata

One of the most important debug stored procedures is the sp_tables_ex system stored procedure. This procedure shows you the tables that exist on the remote server. More important, it shows you the table names by which the remote server wants you to reference the tables. The only parameter that's required to run this stored procedure is @table_server, which is the linked server's name. To run the stored procedure, use the following syntax:

EXEC sp_tables_ex    @table_server = 'SERVERB'

This outputs the following results:

Catalog name
Schema name or owner name
Table or view name
Table type (system table, table, or view)

The problem with running this stored procedure without additional parameters is that in some systems, such as DB2, there may be thousands of tables in one catalog. To narrow your query, you can run the stored procedure with additional optional parameters, as shown here:

EXEC sp_tables_ex    @table_server = 'SERVERB',    @table_catalog='Northwind',    @table_schema='dbo',    @table_name='Suppliers'

Viewing Column-Level Metadata

Some of your developers may not have installed the tools that let them determine which columns on a remote linked server are available to them. To get this information, use the sp_columns_ex stored procedure. The only required parameter is the @table_server parameter, which is the linked server's name.

EXEC sp_columns_ex     @table_server = 'SERVERB'

This returns tons of valuable data about the columns in your remote system:

Catalog name
Schema or owner
Table or view name
Column names
Data type
Column size, precision, and nullability

I highly recommend you don't run the stored procedure with only the @table_server parameter, because it returns every column in the catalog. Narrow your results with any of the following optional parameters:

EXEC sp_columns_ex     @table_server = 'SERVERB',     @table_catalog = 'Northwind',    @table_name  = 'Employees',    @table_schema = 'dbo',    @column_name='BirthDate'