Coding for Distributed Data | MCAD/MCSE/MCDBA 70-229 Exam Cram 2: Designing & Implementing Databases w/SQL Server 2000 Enterprise Edition

So far, all the data you've been able to access has been stored in databases on one SQL Server. This section talks about how to query data that's on other SQL Servers and in other systems without importing the data. Later sections in this chapter deal with how to import and export data with SQL Server.

You can use three types of syntax to access data; each one serves a different purpose. First, you're going to see OPENROWSET, which is the most flexible but hardest to use of the three. OPENQUERY is easier to use but requires a bit more setup. The third option is linked servers, which you will see how to set up and use in a multiserver relationship.

Each of the three methodsOPENROWSET, OPENQUERY, and linked serversrelies on having an operable OLE-DB driver for the system you're trying to talk to. SQL Server 2000 ships with OLE-DB drivers for SQL Server, text files, Oracle, IBM's DB/2, and others. To understand how to use each of these methods, you need to understand how to talk to an OLE-DB provider.

A Brief Discussion on OLE-DB

OLE-DB is a low-level database access protocol that is designed to enable a client to access different systems without having to understand how each system works. The foundation of OLE-DB is the OLE-DB provider, which has to implement certain methods. At this level, an OLE-DB provider needs to know certain things to make a connection to the system.

You can find a list of the OLE-DB providers that you have installed in the Registry at HKEY_LOCAL_MACHINE\Software\_Microsoft\MSSQLServer\Providers\ _SQLOLEDB.

In the case of another SQL Server, the OLE-DB provider needs to know the name of the server and how to log in to the server. In the case of a text file, the OLE-DB provider needs to know where the file is and what format the file is in, as well as other properties about the file, such as delimiters, whether the first row contains column names, and so on. The point here is that different providers have different requirements, but that after you get past those requirements they all work the same way as far as retrieving data.

The only way to understand how to use the provider is to consult the documentation for that provider and for that version of the provider and to actually work with it in a test or production environment.

Using OPENROWSET

The OPENROWSET function returns a rowset to SQL Server. It's used in a SELECT statement in the same place as a table. Here are the parameters for the OPENROWSET function with a short description of how to use each one:

provider_name This is the name of the provider, as specified in the Registry. It is not optional.
datasource This is the name of the file, server, or whatever the OLE-DB provider needs in order to figure out what it should be talking to.
user_id This is the username that the provider understands. Some providers may not need a username, so this could be blank.
password This is the password that the provider understands. Some providers may not need a password, so this could be blank also.
provider_string This is a free text field that has everything else that the provider needs in order to initiate the connection. Only the provider knows for sure what this is; you'll have to dig through piles of documentation.
catalog This is the name of the database, catalog, or whatever the provider understands. It's the top level of the hierarchy of object names. It's also likely to be blank if the provider doesn't use it.
schema This is the name of the owner of the object. It also may be blank if the provider doesn't use it.
object This is the name of the object being manipulated. Believe it or not, this may be blank if there's only one object in the data source, such as a text file.
query This is a string that's provider-specific, and it's passed directly to the provider as a query. It may be a SQL query; it may be something else. It's also not processed by SQL Server, so you should make sure that the syntax is valid before you send it; otherwise, you'll get a very incomprehensible error message.

Nearly every single argument in the entire OPENROWSET function is optional, and its use or lack thereof depends on the provider you are using. Experience dictates that the documentation and examples for how to use the particular provider in which you are interested will be either non-existent or inaccurate. In other words, good luck.

Actually, OPENROWSET is a very useful function; it's just complicated and difficult to set up. After you have it figured out for a particular provider, it works very well. You should plan on spending several hours with a new OLE-DB provider to figure out how it works. Here's an example of OPENROWSET in use:

 SELECT * FROM openrowset('sqloledb',        'SQLTest1';'sa';'', 'SELECT * FROM master.dbo.sysobjects')

This example is a fairly simple case; it just returns a rowset from another SQL Serverin this case a SQL Server running on a box named SQLTest1, which has a blank SA password. You can tell it's a request to another SQL Server because 'sqloledb' is the provider, and that's the name of the provider used to talk to other SQL Servers. It runs a simple SELECT statement and returns a simple rowset. Notice that it uses a full three-part name for the table. This isn't required, but it is good practice for doing remote queries. You can then use the results just as you would any other table: You can join them to other local tables, filter them, or whatever you like:

 SELECT SQLTest1sysobjects.name FROM openrowset('sqloledb',        'SQLTest1';'sa';'', 'SELECT * FROM sysobjects') SqlTest1sysobjects INNER JOIN sysobjects        ON sysobjects.name = SQLTest1sysobjects.name WHERE SQLTest1sysobjects.type = 'u'

Notice that this uses a table alias to reference the OPENROWSET return values; it filters on the object type from the remote table and does a join. So it really does work as a table does. But the syntax is incredibly cumbersome. You're probably hoping there's some shortcut you can use to avoid all the complexity. Well, read ahead to see how that can happen.

Creating a Linked Server

You can think of a linked server as a prebuilt set of arguments for OPENROWSET stored in a database object. You can just create an object called a linked server, and create it with all the attributes you'd normally use in a call to OPENROWSET. Then, rather than having to type all those parameters over and over again, you can just use the linked server.

There are two ways to create a linked server: using the Query Analyzer and using Enterprise Manager. In Query Analyzer, you create the linked server using the sp_addlinkedserver system stored procedure as described in the next section. Using the Enterprise Manager, you can select the options needed through the graphical interface, shown in Figure 10.1.

Figure 10.1. Linked Server options using the Enterprise Manager.

The Enterprise Manager interface is fairly straightforward. The only parameters that aren't part of the OPENROWSET you've already learned about are server and provider; the rest are just bad names on the same parameters. Server is the name of the linked server you are creating. Any SQL Server-approved object name works; it doesn't necessarily have to be the name of the server, but it probably should be. The srvproduct argument is the name of the OLE-DB data source. The provider argument specifies the same thing as the srvproduct, but does it with an OLE-DB PROGID, which is a long identifier. The rest of the options are from OPENROWSET. If you're linking to another Microsoft SQL Server, you can specify an @srvproduct of SQL Server only, and the @server argument must be the server's name. None of the other arguments is required or allowed. Here's a linked server definition for the server you saw used in the preceding section:

 sp_addlinkedserver @Server = 'OpenRowsetTest',        @SrvProduct = 'sqloledb',        @Provider = 'sqloledb',        @DataSrc = 'SQLTest1'

Thenand this is the really great partyou can query it like this:

 SELECT * FROM openrowsettest.master.dbo.sysobjects

That's a four-part name. Four-part names are required when using linked servers. You can't shortcut the object owner part either; you have to fully specify the entire thing. You could, of course, alias it, join it, or whatever you want to do; it's going to act just like a normal table.

The Security tab needs a little more elaboration on a couple of fronts. First of all, in Figure 10.1 you'll notice that there are two parts to the window. In the top part you can specify local logins and map them to logins on the remote server. In the bottom part you can specify what happens to local logins that aren't in the list. You can specify that users who are not on the list will not be allowed to use the remote server by choosing the Not Be Made option. You can choose that the users not on the list be forced into "guest" status, if available, on the remote server by choosing the Be Made Without Using a Security Context option. You can choose to attempt to pass through the authentication that the user has on the local server by choosing the Be Made Using the Login's Current Security Context item, or you can force everyone who isn't on the list in the top part of the window to use a specific login and password with the Be Made Using This Security Context option.

Now that you know how to create linked servers, it's time to look at using them in a little more detail.

Using Linked Servers

There are two ways to use an established linked server. The firstand easiest to understandmethod is to use the four-part name as shown previously. As you remember, a three-part name is database.owner.object, so for the sysobjects table in the Master database, you can reference it as master.dbo.sysobjects. For a linked server, you can use a four-part name, which is server.database.owner.object, or, in the preceding example, sqltest1.master.dbo.sysobjects.

The four-part name works great when you're using other SQL Servers as the remotes. What happens when you want to use something other than SQL Server on the other side? Then you need to use the OPENQUERY function. OPENQUERY looks suspiciously similar to OPENROWSET, but it takes only two arguments: the linked server name and a query. So you could run this:

 SELECT * FROM  OPENROWSET test.master.dbo.sysobjects

Or you could run this:

 SELECT * from OPENQUERY(OpenRowsetTest,   'SELECT * FROM master.dbo.sysobjects')

Notice that the linked server name does not have quotation marks around it. SQL Server doesn't like quotation marks there. Both queries return the same resultset.

These linked servers are really great, but don't get any ideas about using them to import data. Although it works well for smaller datasets (less than a few hundred thousand rows), there are significantly more efficient ways to handle data imports.

Remote Servers

A remote server is a logical definition of an instance of SQL Server that can be used by remote stored procedure calls. Remote servers have been present in previous releases of SQL Server and are still supported in SQL Server 2000, but linked servers offer greater functionality.

You may hit linked and remote servers on the exam as possible answers to connecting to other SQL Servers. When you do, remember that it is really the linked server that provides the most flexibility and can be applied in almost any circumstances.

A remote server configuration allows a client connected to one instance of SQL Server to execute a stored procedure on another instance without establishing another connection. The server to which the client is connected accepts the request and sends information to the remote server on behalf of the client. The remote server processes the request and returns any results to the original server.