Layered Service Provider | Network Programming for Microsoft Windows (Microsoft Professional Series)

Layered Service Provider

As we mentioned, a layered service provider (LSP) installs itself into the Winsock catalog so that an application that creates a socket will call into it without necessarily having any awareness of the LSP. This is useful for developing system components that modify or monitor any portion of the Winsock API. For example, a secure socket provider that implements SSL can be implemented as a layered service provider. In this example, the LSP would negotiate the SSL connection when the application issues a connect, encrypt data sent via any Winsock send command, and decrypt data returned from the receive commands. Other possibilities include Winsock proxy clients and content filtering.

An LSP accomplishes this by installing an entirely new Winsock provider that mimics or extends an existing provider. For example, if you were developing an LSP that filters HTTP requests, you would need to layer your provider over the Microsoft TCP provider because the HTTP protocol runs over TCP. You would want this new provider to be virtually indistinguishable (at least from an application's perspective) from the base Microsoft provider because you want any application that uses TCP to go through your provider first. Of course, it is possible to create an LSP that implements an entirely different protocol with different semantics on top of an existing Winsock provider.

In Chapter 2, you saw how Winsock selects the appropriate provider to load when a socket is created. When an LSP is installed, it is placed in the catalog in a certain order. When an application creates a socket, the catalog is enumerated in order until the best match is found, at which point the system loads that provider. This allows a layered provider to be loaded instead of the default Microsoft provider.

When an application that created a socket from the layered provider makes a Winsock call, the system routes the call into the LSP. At that point, the LSP can perform its necessary tasks. It can also pass the request to the provider below itself if further action is required. For example, in our HTTP content filtering example, we may want to intercept HTTP requests and modify them before actually making the request. This would require the LSP to perform some action for any of the Winsock APIs that send data. When the application calls any Winsock send function, the call is routed to the LSP, which examines the send buffer and makes the appropriate modifications to it. Of course, the LSP doesn't actually know how to send TCP data; it relies on the underlying TCP provider, which has a kernel mode driver that implements the protocol. The LSP must know where it resides in the protocol chain to pass the modified send request to the provider beneath it. In many cases, this will be the base provider, but an LSP can be installed over other LSPs. Eventually, the request will make it to a base provider, which will perform the appropriate action. Because these chains must be built when an LSP is installed, the next section shows exactly how protocol chains are implemented. Figure 12-2 shows the relationship between applications, layered service providers, and base providers.

Figure 12-2 Layered provider architecture

Winsock LSPs are implemented as a standard Windows dynamic-link library that must export a single function entry named WSPStartup. When the system invokes the layered provider's WSPStartup, the provider must expose 30 additional SPI functions that make up the LSP via a function dispatch table passed as a parameter. Table 12-2 lists the SPI functions that must be implemented within the DLL.

**Table 12-2** *Transport Provider Support Functions*
API Function	Maps to SPI Function
WSAAccept (accept also indirectly maps to WSPAccept)	WSPAccept
WSAAddressToString	WSPAddressToString
WSAAsyncSelect	WSPAsyncSelect
bind	WSPBind
WSACancelBlockingCall	WSPCancelBlockingCall
WSACleanup	WSPCleanup
closesocket	WSPCloseSocket
WSAConnect (connect also indirectly maps to WSPConnect)	WSPConnect
WSADuplicateSocket	WSPDuplicateSocket
WSAEnumNetworkEvents	WSPEnumNetworkEvents
WSAEventSelect	WSPEventSelect
WSAGetOverlappedResult	WSPGetOverlappedResult
getpeername	WSPGetPeerName
getsockname	WSPGetSockName
getsockopt	WSPGetSockOpt
WSAGetQOSByName	WSPGetQOSByName
WSAIoctl	WSPIoctl
WSAJoinLeaf	WSPJoinLeaf
listen	WSPListen
WSARecv (recv also indirectly maps to WSPRecv)	WSPRecv
WSARecvDisconnect	WSPRecvDisconnect
WSARecvFrom (recvfrom also indirectly maps to WSPRecvFrom)	WSPRecvFrom
select	WSPSelect
WSASend (send also indirectly maps to WSPSend)	WSPSend
WSASendDisconnect	WSPSendDisconnect
WSASendTo (sendto also indirectly maps to WSPSendTo)	WSPSendTo
setsockopt	WSPSetSockOpt
shutdown	WSPShutdown
WSASocket (socket also indirectly maps to WSPSocket)	WSPSocket
WSAStringToAddress	WSPStringToAddress

In most cases, when an application calls a Winsock function, WS2_32.DLL calls a corresponding Winsock SPI function to carry out the request using a specific service provider. For example, select maps to WSPSelect, WSAConnect maps to WSPConnect, and WSAAccept maps to WSPAccept. However, not all Winsock functions have a corresponding SPI function. The following list details these exceptions.

Support functions such as htonl, htons, ntohl, and ntohs are implemented within WS2_32.DLL and aren't passed down to a service provider. The same holds true for the WSA versions of these functions.
IP conversion functions such as inet_addr and inet_ntoa are implemented only within WS2_32.DLL.
All of the IP-specific name conversion and resolution functions (i.e., the getXbyY and WSAAsyncGetXbyY functions) as well as WSACancelAsyncRequest and gethostname are implemented within WS2_32.DLL.
Winsock catalog functions and blocking hook-related functions are implemented within WS2_32.DLL. Thus, WSAEnumProtocols, WSAIsBlocking, WSASetBlockingHook, and WSAUnhookBlockingHook do not have SPI equivalent functions.
Winsock error codes are managed within WS2_32.DLL and as such WSAGetLastError and WSASetLastError are not mapped to service providers.
The event object manipulation and wait functions—including WSACreateEvent, WSACloseEvent, WSASetEvent, WSAResetEvent, and WSAWaitForMultipleEvents—are mapped directly to native Windows operating system calls and aren't present in the service provider.

Also, a sample LSP is included on the companion CD in the directory Lsp. This LSP is a pass-through LSP. It doesn't modify any of the Winsock API calls; it simply passes each call down to the lower layer. Throughout our discussion of layered providers, we'll refer to the sample code to illustrate various points.

Before getting into the details of installing and implementing an LSP, we should discuss error handling. Winsock applications often use WSAGetLastError and sometimes WSASetLastError. However, as we have pointed out, there is no SPI equivalent to these functions. Instead, each of the SPI functions an LSP must implement (listed in Table 12-2) exactly mirrors its API equivalent in terms of parameters, except for an additional parameter, lpErrno. This is a pointer to an integer that should be set to the correct error code if the LSP function fails. Those APIs that can be called in an overlapped manner take one more parameter in addition to lpErrno: the thread ID of the calling thread (which is discussed in the “Handling I/O” section). To indicate a failure, the LSP function should return SOCKET_ERROR and set lpErrno. For success, NO_ERROR is returned and the lpErrno value is ignored. The only exception is WSPStartup, which returns either NO_ERROR or the actual error code that caused startup to fail.
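The error convention described above can be sketched in portable C. This is only an illustration under stated assumptions: the constants are stand-ins for the definitions in the Winsock headers, the socket is modeled as a plain int, and LspListen and LOWER_LISTEN_FN are hypothetical names that do not come from the sample LSP.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins so the sketch compiles anywhere; a real LSP would take
   these from the Winsock headers instead. */
#define NO_ERROR      0
#define SOCKET_ERROR  (-1)
#define WSAEINVAL     10022

/* Hypothetical type for a lower provider's entry point that follows
   the SPI convention: same parameters as the API plus lpErrno. */
typedef int (*LOWER_LISTEN_FN)(int s, int backlog, int *lpErrno);

/* Hypothetical pass-through function. On failure it returns
   SOCKET_ERROR and stores the error code through lpErrno; on success
   it returns NO_ERROR and the caller ignores *lpErrno. */
int LspListen(LOWER_LISTEN_FN lower, int s, int backlog, int *lpErrno)
{
    assert(lpErrno != NULL);

    if (lower == NULL) {
        /* There is no SetLastError-style call here: the error code
           travels back through the extra parameter. */
        *lpErrno = WSAEINVAL;
        return SOCKET_ERROR;
    }

    /* The lower provider sets *lpErrno itself if it fails. */
    return lower(s, backlog, lpErrno);
}
```

The point of the sketch is that the return value only signals success or failure, while the specific error code always travels through lpErrno.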

Installing an LSP

Before an LSP can be used, it must be installed into the Winsock catalog, which can become very complicated in itself. In Chapter 2, you saw how an application can enumerate the Winsock catalog, along with a code sample illustrating that. Installing an LSP consists of installing a WSAPROTOCOL_INFOW structure defining the characteristics of the layered provider as well as how the LSP fits into the “chain.” As the name “layered service provider” implies, providers are layered on top of one another to form a protocol chain that is defined as

typedef struct _WSAPROTOCOLCHAIN {
    int   ChainLen;
    DWORD ChainEntries[MAX_PROTOCOL_CHAIN];
} WSAPROTOCOLCHAIN, FAR * LPWSAPROTOCOLCHAIN;

The ChainLen field is important because it indicates the type of provider the entry is. Table 12-3 lists the possible values. When ChainLen is zero or one, the data contained in the ChainEntries array is meaningless. The value of one indicates a base provider, such as the Microsoft TCP and UDP providers. Typically, a base provider has a kernel mode protocol driver associated with it. For example, the Microsoft TCP and UDP providers require the TCP/IP driver TCPIP.SYS to ultimately function. It is also possible to develop your own base providers, but that is beyond the scope of this book. For more information about base providers, consult the Windows Driver Development Kit (DDK).

**Table 12-3** *Chain Length and Type of Provider*
ChainLen Value	Description
0	Layered provider entry
1	Base provider
2 or more	Layered chain entry

Layered providers use a chain length of zero or greater than 1. Entries whose chain length is zero are special. When a layered provider is installed, the protocol chain must be constructed that describes where the layered provider resides. This is done by filling in the ChainEntries array with the catalog IDs for each protocol in the chain. The catalog ID is the dwCatalogEntryId contained in the WSAPROTOCOL_INFOW structure.
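The mapping in Table 12-3 can be written as a small helper. This is a minimal sketch, not code from the sample LSP: the ProviderKind enumeration and the ClassifyProvider function are hypothetical names introduced here for illustration.

```c
#include <assert.h>

/* Hypothetical labels for the three rows of Table 12-3. */
typedef enum {
    PROVIDER_LAYERED,       /* ChainLen == 0: the "dummy" layered provider entry */
    PROVIDER_BASE,          /* ChainLen == 1: a base provider */
    PROVIDER_LAYERED_CHAIN  /* ChainLen >= 2: a layered chain entry */
} ProviderKind;

/* Classify a catalog entry from the ChainLen field of its protocol chain. */
ProviderKind ClassifyProvider(int chainLen)
{
    assert(chainLen >= 0);

    if (chainLen == 0)
        return PROVIDER_LAYERED;
    if (chainLen == 1)
        return PROVIDER_BASE;
    return PROVIDER_LAYERED_CHAIN;
}
```

Install and remove code could use a check like this while walking the enumerated catalog to decide which entries carry a meaningful ChainEntries array.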

Let's look at a quick example before going any further. Say we're developing an LSP that will be layered over the base Microsoft TCP provider. This will require us to install a single provider whose ChainLen will be two. The ChainEntries array will contain two entries: first is the layered provider catalog ID and second is the Microsoft TCP provider catalog ID. The problem is the value to use for the layered provider's catalog ID. When constructing the WSAPROTOCOL_INFOW structure that describes the layered chain for our LSP, the dwCatalogEntryId is not initialized and we cannot simply make one up. A catalog ID is assigned only when a provider is installed via WSCInstallProvider. To solve this problem, a dummy provider entry is installed first whose ChainLen is zero. Once this dummy provider (also known as the layered provider) is installed, the system assigns the catalog ID, which we can then use to install the actual layered chain entry.

The dummy layered provider's WSAPROTOCOL_INFOW structure contains meaningless data (except for the path to the provider's DLL, which will be discussed later). Furthermore, an application that calls WSAEnumProtocols will not see any entry with a chain length of zero; only WSCEnumProtocols will return these entries (along with all other entries). When writing the install (and remove) code for a service provider, you want to use WSCEnumProtocols or you'll never see the layered provider dummy entries, only base and layered chain entries.

Getting back to our example LSP, first the dummy LSP entry is installed, after which the catalog is enumerated so we can find the catalog ID of the dummy entry. Then we build the WSAPROTOCOL_INFOW structure, which describes our layered chain. In this structure the ChainLen is 2 and ChainEntries contains two values. The first value is the catalog entry ID of the dummy entry just installed and the second is the catalog entry ID of the base TCP provider. Figure 12-3 illustrates three WSAPROTOCOL_INFOW structures. The structure on the left is the default Microsoft TCP provider. The structure in the middle is the dummy LSP entry, and the structure on the right is the layered chain entry for the LSP. Notice that the protocol chain for the LSP contains two entries. Also notice that the figure illustrates only the first four protocol chain entries while the WSAPROTOCOL_INFOW structure actually contains MAX_PROTOCOL_CHAIN entries (which is seven).
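Filling in the chain for this example can be sketched as follows. The ProtocolChain type is a portable stand-in that mirrors WSAPROTOCOLCHAIN (with uint32_t in place of DWORD), and BuildLayeredChain is a hypothetical helper; in a real installer the two IDs would come from enumerating the catalog after the dummy entry is installed.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_PROTOCOL_CHAIN 7

/* Portable stand-in mirroring the WSAPROTOCOLCHAIN layout. */
typedef struct {
    int      ChainLen;
    uint32_t ChainEntries[MAX_PROTOCOL_CHAIN];
} ProtocolChain;

/* Hypothetical helper: describe an LSP layered directly over one
   base provider. The dummy entry's catalog ID goes first, followed
   by the catalog ID of the provider being layered over. */
void BuildLayeredChain(ProtocolChain *chain,
                       uint32_t dummyCatalogId,
                       uint32_t baseCatalogId)
{
    assert(chain != NULL);

    memset(chain, 0, sizeof(*chain));   /* unused slots stay zero */
    chain->ChainLen        = 2;
    chain->ChainEntries[0] = dummyCatalogId;
    chain->ChainEntries[1] = baseCatalogId;
}
```

The resulting chain would then be placed in the WSAPROTOCOL_INFOW structure for the layered chain entry before that entry is installed.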

Figure 12-3 Example LSP layered over the base Microsoft TCP Provider

Installing a Provider Entry

Now that we've covered the basics, let's look at the API used to install a Winsock provider, WSCInstallProvider. The API is defined as

int WSPAPI WSCInstallProvider(
    IN  LPGUID lpProviderId,
    IN  const WCHAR FAR *lpszProviderDllPath,
    IN  const LPWSAPROTOCOL_INFOW lpProtocolInfoList,
    IN  DWORD dwNumberOfEntries,
    OUT LPINT lpErrno
);

The first thing to notice is this API comes in only a UNICODE version. The parameter list is almost self-explanatory. Each provider installed requires a GUID to uniquely identify that provider entry. A GUID can be generated by the command line utility UUIDGEN.EXE or programmatically by UuidCreate. One GUID is required for the dummy layered provider entry and one (or more) is required for the layered chain entry or entries. The lpszProviderDllPath parameter is a UNICODE string that contains the path to the DLL that implements the layered provider. The DLL path can contain environment variables such as %SYSTEMROOT%. This provider path should be correct for both the layered provider entries and layered chain entries. Lastly, note that only members of the Administrators group can install (and remove) Winsock catalog entries.

The lpProtocolInfoList is an array of WSAPROTOCOL_INFOW structures. Each entry in the array is a separate provider entry to be installed. dwNumberOfEntries indicates the number of entries in the array. If the provider being installed is layered over multiple providers, they may be installed all at once or one at a time—which is an issue to consider as we will find out later. Of course, the dummy layered provider entry must be installed first by itself to obtain a catalog ID entry used by the layered protocol entries. The last parameter returns the error code in case of a failure, at which point the API returns SOCKET_ERROR.

As we have already mentioned, the layered provider entry is meaningless and is installed only to obtain a catalog ID. For layered protocol entries, the WSAPROTOCOL_INFOW structure is typically copied from the provider that is to be layered over with two exceptions. First, the szProtocol field is modified to contain the name of the new provider. Second, the flag XP1_IFS_HANDLES is removed from the dwServiceFlags1 field if present. When this flag is set, it indicates that the socket handles that this provider returns are true operating system handles and may be passed interchangeably to APIs that don't specifically take SOCKET handles (such as ReadFile and WriteFile) without taking a performance penalty. For a layered provider to return IFS handles there must be an associated kernel mode component that creates these handles such as what TCPIP.SYS does for the Microsoft TCP and UDP providers. We'll discuss socket handles in more detail later in this chapter.

Of course, if the LSP being developed is a completely new protocol, the install application must set the proper flags and fields within the WSAPROTOCOL_INFOW structure to accurately describe the behavior that the provider exposes. See Chapter 2 for a full description of the protocol structure.

Finally, when installing new provider entries, the entries by default appear at the end of the Winsock catalog when enumerated. If your LSP mimics a TCP/IP provider, it will never be called by default because the system will always match socket creation calls to the MSAFD TCP/IP provider that appears before your LSP's entry in the enumeration (see Chapter 2 for more information on how the system finds the appropriate Winsock provider to load). As a result, it may be necessary to reorder the catalog so that the newly installed LSP entries appear first. This is done with the API WSCWriteProviderOrder defined as

int WSPAPI WSCWriteProviderOrder( IN LPDWORD lpdwCatalogEntryId, IN DWORD dwNumberOfEntries );

The first parameter is an array of DWORD, which contains the catalog entry IDs for every provider in the catalog in the order in which they should be written. For example, if there are 20 entries in the Winsock catalog (as returned from WSCEnumProtocols), the array should contain 20 entries, each with the catalog ID of an existing provider. After the API is called, the catalog will be reordered in the sequence specified. The array should not contain any duplicates. Note that this API is defined in the header file SPORDER.H and in the library SPORDER.LIB. On recent Platform SDK releases, the definition of this function has been moved into WS2_32.LIB, and SPORDER.LIB simply contains a forward to that definition.
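Building the reordered ID array is plain array manipulation, sketched below under stated assumptions. MoveToFront and Contains are hypothetical helpers, and catalog IDs are modeled as uint32_t in place of DWORD. The output keeps every ID from the enumerated catalog exactly once, which matches the requirements above that the array cover every provider and contain no duplicates.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if id appears in ids[0..n-1], otherwise 0. */
static int Contains(const uint32_t *ids, size_t n, uint32_t id)
{
    size_t i;
    for (i = 0; i < n; i++) {
        if (ids[i] == id)
            return 1;
    }
    return 0;
}

/* Hypothetical helper: copy the catalog IDs into out so that every ID
   listed in front comes first, followed by all remaining IDs. Relative
   catalog order is preserved within each group. out must have room for
   count entries. Returns the number of IDs written (always count). */
size_t MoveToFront(const uint32_t *catalog, size_t count,
                   const uint32_t *front, size_t frontCount,
                   uint32_t *out)
{
    size_t i, n = 0;

    assert(catalog != NULL && out != NULL);

    for (i = 0; i < count; i++) {       /* pass 1: the LSP's entries */
        if (Contains(front, frontCount, catalog[i]))
            out[n++] = catalog[i];
    }
    for (i = 0; i < count; i++) {       /* pass 2: everything else */
        if (!Contains(front, frontCount, catalog[i]))
            out[n++] = catalog[i];
    }
    return n;
}
```

An installer would pass the resulting array and its length to WSCWriteProviderOrder. Which of the LSP's entries to move to the front is left to the caller in this sketch.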

After successfully installing an LSP it is a good idea to reboot the machine. Many of the system services, such as the Local Security Authority Subsystem Service (LSASS), create most of their sockets at bootup and create additional sockets as time goes on. The problem is that after an LSP is installed over the providers they are using, these services now have a mixed set of sockets from multiple providers. This can be problematic if these applications use the select API.

To summarize, installing a provider requires the installation of the “dummy” layered provider entry with a chain length of zero. This is necessary to obtain a catalog ID that the layered chain entries can later reference. After the layered provider is installed, each layered chain is installed that references the catalog ID of the layered dummy provider as its first chain entry. The subsequent chain entries are the catalog IDs of the providers layered under this one. Then, in most cases, the Winsock catalog needs to be reordered so that most applications will call into the LSP rather than into the base providers.

Note that special consideration must be taken when manipulating the Winsock catalog on 64-bit Windows. In order for 32-bit applications to run on 64-bit Windows, two separate Winsock catalogs are maintained—one for 32-bit applications and one for 64-bit native applications. To manipulate the catalog, several new WSC functions have been introduced. There is a new API for each WSC function that has the string “32” appended to the function name. Note that the parameters remain the same. For example, the function WSCInstallProvider has a corresponding function WSCInstallProvider32. The “normal” function (i.e., not ending in 32) operates on the Winsock catalog for the platform the install application is compiled for. That is, if our LSP install program is compiled for 64-bit Windows, the WSCInstallProvider function installs the LSP into the 64-bit catalog. Likewise, if it was compiled for 32-bit Windows, the LSP would be installed into the 32-bit catalog. The new functions ending in 32 can be used by a native 64-bit application to manipulate the 32-bit catalog.

The only problem is what to do if an LSP needs to be installed into both the 32-bit and 64-bit catalogs. The protocol chain we build contains the catalog ID of the dummy provider, and separate installs into each catalog would not necessarily be assigned the same ID. To solve this problem, there is another version of the install function, WSCInstallProvider64_32. This function installs the provider into both catalogs so that the same catalog ID is assigned to the 32-bit and 64-bit entries. Also note that when installing into both catalogs, two versions of the LSP's DLL need to be present. The native 64-bit compiled version goes in %SYSTEMROOT%\system32, while the 32-bit version is placed in %SYSTEMROOT%\syswow64. Note that it still requires two separate calls to remove the LSP once it is installed into both catalogs (i.e., there is no equivalent uninstall routine that operates on both catalogs).

Finally, the Winsock catalog functions (the WSC variation) also have a new version ending in 32 that follows the same rules discussed above. If a native 64-bit application needs to enumerate both the 32-bit and 64-bit catalogs, it must call WSCEnumProtocols to obtain the 64-bit catalog followed by WSCEnumProtocols32 to obtain the 32-bit catalog. There is no method for a 32-bit application to obtain the 64-bit catalog, which is true for all 32-bit applications (i.e., a 32-bit application has no way of manipulating the 64-bit catalog).

Removing an LSP

Once an LSP is installed into the Winsock catalog, removing the provider is, in most cases, an easy process. The function WSCDeinstallProvider will remove all the catalog entries associated with the given GUID. This API is defined as

int WSPAPI WSCDeinstallProvider( IN LPGUID lpProviderId, OUT LPINT lpErrno );

In the simple case, an LSP will require two GUIDs: one for the dummy entry and one for all of the layered chain entries. To completely remove the LSP in this case, call WSCDeinstallProvider once for each GUID. Of course, if the layered chain providers were installed using multiple GUIDs, the uninstall code will have to call WSCDeinstallProvider on each one.

Uninstalling an LSP becomes extremely complicated if after your LSP is installed, another LSP is installed over it. The second LSP's chain entries will contain references to your LSP's catalog IDs. If your LSP is blindly uninstalled, the second LSP is broken. If the second LSP is exposing itself as a TCP/IP and UDP/IP provider, most likely the system won't boot or will have no network access.

In this situation, the uninstall code for your LSP must check for any other Winsock entries that reference your LSP's catalog IDs within their protocol chains. If it finds such entries, the uninstaller must copy them, uninstall them, remove your LSP's entries from their protocol chains, and install them back into the catalog. For example, consider the case illustrated in Table 12-4. There are two LSPs in this example. LSP1 is installed over the base TCP/IP and UDP/IP providers, and LSP2 is installed over LSP1's TCP/IP and UDP/IP providers. To remove LSP1, we must also fix LSP2's entries so that they no longer reference catalog IDs 1010 and 1011. To do this, the uninstaller for LSP1 must find all entries that reference any of LSP1's catalog IDs, which in this case are entries 1021 and 1022. These entries should be saved off and uninstalled. Then the saved entries should be modified so that their chain lengths are 2 and any references to 1010 or 1011 in their protocol chains are removed. Finally, these entries should be installed under the same GUID as before. The entry for 1021 should have a protocol chain of 1020 followed by 1001 and the entry for 1022 should be 1020 and 1002. Note that after reinstalling the provider, the entries will appear at the end of the catalog when enumerated. It may be necessary to reorder the catalog to put the LSP2 entries back in their original locations.

**Table 12-4** *Winsock Catalog with Multiple LSPs*
Catalog ID	Description	Address Family/Protocol	Chain Length	Protocol Chain
1021	LSP2	TCP/IP	3	1020 —> 1010 —> 1001
1022	LSP2	UDP/IP	3	1020 —> 1011 —> 1002
1020	LSP2 Dummy	N/A	0	N/A
1010	LSP1	TCP/IP	2	1009 —> 1001
1011	LSP1	UDP/IP	2	1009 —> 1002
1009	LSP1 Dummy	N/A	0	N/A
1001	MSAFD TCP/IP	TCP/IP	1	N/A
1002	MSAFD UDP/IP	UDP/IP	1	N/A
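The chain repair described for Table 12-4 can be sketched as a function that strips a set of catalog IDs from a saved protocol chain. The ProtocolChain type is a portable stand-in mirroring WSAPROTOCOLCHAIN, and RemoveFromChain and Contains are hypothetical helpers; saving, uninstalling, and reinstalling the entries are separate steps not shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_PROTOCOL_CHAIN 7

/* Portable stand-in mirroring the WSAPROTOCOLCHAIN layout. */
typedef struct {
    int      ChainLen;
    uint32_t ChainEntries[MAX_PROTOCOL_CHAIN];
} ProtocolChain;

/* Returns 1 if id appears in ids[0..n-1], otherwise 0. */
static int Contains(const uint32_t *ids, size_t n, uint32_t id)
{
    size_t i;
    for (i = 0; i < n; i++) {
        if (ids[i] == id)
            return 1;
    }
    return 0;
}

/* Hypothetical helper: remove every reference to the given catalog
   IDs from the chain, compacting the remaining entries and shrinking
   ChainLen. Returns the number of entries removed. */
int RemoveFromChain(ProtocolChain *chain,
                    const uint32_t *ids, size_t idCount)
{
    int i, kept = 0, removed;

    assert(chain != NULL);
    assert(chain->ChainLen >= 0 && chain->ChainLen <= MAX_PROTOCOL_CHAIN);

    for (i = 0; i < chain->ChainLen; i++) {
        if (!Contains(ids, idCount, chain->ChainEntries[i]))
            chain->ChainEntries[kept++] = chain->ChainEntries[i];
    }
    for (i = kept; i < chain->ChainLen; i++)
        chain->ChainEntries[i] = 0;     /* clear the vacated slots */

    removed = chain->ChainLen - kept;
    chain->ChainLen = kept;
    return removed;
}
```

Applied to entry 1021 from Table 12-4 with the IDs 1010 and 1011, this leaves a chain of length 2 containing 1020 followed by 1001, as described above.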

Modifying LSP Entries

As you can see, uninstalling a provider when other providers have layered over it is a horrendous task—especially in the complicated cases in which the second provider layered over yours is layered over many other providers and all installed with the same GUID! As a result, Windows XP introduces a new API designed to ease the pain of uninstalling providers. This API allows you to modify an existing provider without uninstalling it. This function is WSCUpdateProvider and is defined as

int WSPAPI WSCUpdateProvider(
    IN  LPGUID lpProviderId,
    IN  const WCHAR FAR *lpszProviderDllPath,
    IN  const LPWSAPROTOCOL_INFOW lpProtocolInfoList,
    IN  DWORD dwNumberOfEntries,
    OUT LPINT lpErrno
);

The parameter list is the same as for WSCInstallProvider except that instead of installing a new provider in the catalog, this API updates the providers referenced by the supplied GUID. With this API you can update any of the fields within an existing provider's entry except for its provider ID (GUID). So for the example given in Table 12-4, this makes fixing LSP2 almost trivial because all that needs to be modified are the protocol chains and chain length.

Writing the Layered Provider

As we mentioned previously, an LSP is implemented in a DLL. There are three important parts to layered providers: the WSPStartup function, tracking socket handles, and handling the various I/O models (such as select, WSAEventSelect, WSAAsyncSelect, overlapped, and completion ports). Of course, this doesn't include the difficulty of implementing the functionality that the LSP provides (such as an HTTP proxy or SSL sockets).

In addition to the three important tasks there is the matter of handling the Microsoft-specific Winsock extensions such as AcceptEx and TransmitFile. This topic is covered after the main three tasks. Finally, there are a few minor tasks that an LSP must implement to achieve 100 percent compatibility. The last few sections discuss these tasks in detail.

Initializing the Provider

Each LSP must implement and export the WSPStartup function. The function is prototyped as

int WSPAPI WSPStartup(
    WORD wVersion,
    LPWSPDATA lpWSPData,
    LPWSAPROTOCOL_INFOW lpProtocolInfo,
    WSPUPCALLTABLE UpCallTable,
    LPWSPPROC_TABLE lpProcTable
);

The first parameter is the Winsock version that the application requested. The lpWSPData parameter is a WSPDATA structure, which is a subset of the WSADATA structure seen in Chapter 1. The LSP must fill in the WSPDATA structure to indicate the Winsock version supported. The lpProtocolInfo parameter is a WSAPROTOCOL_INFOW structure corresponding to the provider being loaded. With an LSP, this is one of the protocol structures of our LSP. The UpCallTable parameter is an input parameter that contains function pointers to various Winsock support routines, which we will discuss later. Finally, lpProcTable is a structure of function pointers to the Winsock provider functions that the LSP implements; the LSP must completely fill it in before returning.

Before getting into the specifics of initializing a layered service provider, let's look at what happens when the system invokes an LSP's WSPStartup function. When an application calls WSAStartup, the system takes no action and it's not until the application actually creates a socket that a provider's WSPStartup is called. When the application creates the socket, the system searches the Winsock catalog for a matching entry as described in Chapter 2. When the matching entry is found, the system loads the provider's DLL and invokes its WSPStartup function.

The second issue is how many times you can expect your LSP's WSPStartup to be invoked, which depends on how the LSP is installed. For example, consider a system with two layered protocol entries, as shown in Figure 12-4. Two layered protocol entries are on the left side of the diagram: one layered over the MSAFD TCP/IP provider and the other layered over the MSAFD IPX provider. Note that both of these entries were installed in the same call to WSCInstallProvider because they both contain the same provider GUID. If an application creates a TCP/IP socket, at that point the system loads the LSP and calls its WSPStartup function. Afterward, if the application creates an IPX socket, the system will not invoke the WSPStartup function again as it has already been invoked for the provider with GUID A. Also, any further TCP/IP or IPX sockets created will not invoke additional calls to WSPStartup.

Figure 12-4 WSPStartup calling behavior

However, if the layered protocol entries for TCP/IP and IPX were installed separately with two calls to WSCInstallProvider (and therefore two different GUIDs), when the application creates the first TCP/IP socket the LSP's WSPStartup is invoked. Then when the application creates an IPX socket the system invokes the WSPStartup function once more as the provider corresponding to GUID B has not been initialized yet.

This is an important consideration because it dictates how much initialization overhead is required. For example, consider the LSP that layers over every installed protocol—such as a content filter. In this case, there could be multiple protocols and multiple provider entries for each protocol. If the LSP's layered protocol entries are installed all at once under the same GUID, the LSP's WSPStartup will be called no more than once when the application creates a socket. However, most applications tend to create sockets from a single address family that corresponds to two or three provider entries (for example, IPv4 has three entries: TCP/IP, UDP/IP, and RAW/IP). In this case, the LSP may require setting up multiple internal data structures for each provider: IPv4, IPv6, IPX/SPX, NetBIOS, AppleTalk, and IrDA. However, if the LSP's layered protocol entries are installed in groups corresponding to the different address families, the necessary internal data structure can be allocated only when the calling application decides to use that particular protocol. The decision to install under a single or multiple GUIDs is completely up to the LSP implementer, but it is a good idea to know how often WSPStartup will be invoked.

Getting back to the initialization steps, here are the three tasks an LSP must perform within its WSPStartup function:

Keep track of how many times WSPStartup has been invoked.
Initialize the lpWSPData and lpProcTable parameters.
Find its location within the protocol chain and initialize the lower layer(s).

The first requirement is fairly simple. The LSP should keep a simple reference count that is incremented each time WSPStartup is invoked. A provider may need to initialize some internal data structures, which will most likely occur on the first call to WSPStartup. Likewise, an LSP must implement WSPCleanup, at which point the reference count should be decremented. Once the count reaches zero, any internal data structures and all other dynamically allocated resources should be freed.
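The reference-counting pattern can be sketched as follows. The function and variable names are hypothetical, and the "internal data structures" are reduced to a single flag. The sketch is deliberately single-threaded; a real provider can be entered from several threads and would presumably need to guard the counter with a lock.

```c
#include <assert.h>

static int g_startupCount = 0;  /* outstanding WSPStartup calls */
static int g_initialized  = 0;  /* stands in for the LSP's internal state */

/* Called at the top of the LSP's WSPStartup. One-time setup happens
   only on the first call. Returns the new reference count. */
int LspStartupRef(void)
{
    if (g_startupCount == 0) {
        /* first call: allocate internal data structures here */
        g_initialized = 1;
    }
    return ++g_startupCount;
}

/* Called from the LSP's WSPCleanup. Teardown happens only when the
   last reference goes away. Returns the new reference count. */
int LspCleanupRef(void)
{
    assert(g_startupCount > 0);

    if (--g_startupCount == 0) {
        /* last reference: free internal data structures here */
        g_initialized = 0;
    }
    return g_startupCount;
}

/* Reports whether the internal state currently exists. */
int LspIsInitialized(void)
{
    return g_initialized;
}
```

Two startup calls followed by two cleanup calls leave the state initialized after the first cleanup and torn down only after the second.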

The second requirement is also simple. The LSP must initialize the WSPDATA and WSPPROC_TABLE parameters. For the WSPDATA structure, the LSP can either verify that the Winsock version is correct itself or it may pass the lpWSPData parameter into the lower provider's WSPStartup function when initializing the lower layers (discussed next) if the LSP does not want to validate the parameters. The WSPPROC_TABLE is a giant structure containing function pointers for all the functions implemented in the LSP. The layered service provider must initialize every pointer.

The third requirement is a bit more complex. The layered provider needs to “load” the providers appearing beneath it in the protocol chain. If the LSP is layered above multiple lower layers, load the provider underneath each of the LSP's provider entries. How does the LSP find where it resides within the chain? The lpProtocolInfo parameter passed to WSPStartup is the provider entry for one of the LSP's protocol chain entries. As we touched on previously, this will match one of the LSP's providers depending on the type of socket the application created first.

As we mentioned, the system passes a WSAPROTOCOL_INFOW structure of the layered provider corresponding to the socket that the application is creating from one of the LSP's layered protocol entries. The first entry within the protocol chain array is the catalog ID for the LSP. Given this value, the Winsock catalog can be enumerated and the remaining layered chain providers can be found. The second index of the chain array of each LSP entry contains the catalog ID of the underlying provider that needs to be loaded.
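The lookup just described can be sketched over a simplified model of the catalog. CatalogEntry and ProtocolChain are portable stand-ins for the relevant fields of WSAPROTOCOL_INFOW, and FindLowerProviders is a hypothetical helper. A real LSP would obtain the entries by enumerating the catalog, and lspId would be the first chain entry of the structure passed to WSPStartup.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_PROTOCOL_CHAIN 7

/* Portable stand-in mirroring the WSAPROTOCOLCHAIN layout. */
typedef struct {
    int      ChainLen;
    uint32_t ChainEntries[MAX_PROTOCOL_CHAIN];
} ProtocolChain;

/* Stand-in for the two WSAPROTOCOL_INFOW fields the lookup needs. */
typedef struct {
    uint32_t      CatalogEntryId;
    ProtocolChain Chain;
} CatalogEntry;

/* Hypothetical helper: for every layered chain entry whose first
   chain entry is lspId, record the second chain entry, which is the
   catalog ID of the provider directly beneath the LSP. Returns the
   number of IDs stored in lowerIds (at most maxLower). */
size_t FindLowerProviders(const CatalogEntry *catalog, size_t count,
                          uint32_t lspId,
                          uint32_t *lowerIds, size_t maxLower)
{
    size_t i, found = 0;

    assert(catalog != NULL && lowerIds != NULL);

    for (i = 0; i < count && found < maxLower; i++) {
        const ProtocolChain *chain = &catalog[i].Chain;

        /* Only layered chain entries (ChainLen > 1) carry a chain. */
        if (chain->ChainLen > 1 && chain->ChainEntries[0] == lspId)
            lowerIds[found++] = chain->ChainEntries[1];
    }
    return found;
}
```

Using the catalog from Table 12-4 with LSP1's dummy ID 1009, the helper reports 1001 and 1002, the two base providers that LSP1 would need to load.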

Loading the underlying provider is a simple process. For each underlying provider, call WSCGetProviderPath to obtain the location of the DLL implementing that provider. This function is prototyped as

int WSCGetProviderPath( LPGUID lpProviderId, LPWSTR lpszProviderDllPath, LPINT lpProviderDllPathLen, LPINT lpErrno );

After obtaining the provider's DLL path, LoadLibrary is called on it, followed by GetProcAddress for that DLL's WSPStartup function. Initializing the lower layer is simply calling that DLL's WSPStartup. Again, the lpWSPData passed to your LSP's WSPStartup can be passed to the lower provider's WSPStartup call so that it can verify the version requested is correct.


	For Windows 95, Windows 98, and Windows Me, a provider's DLL path is always returned as a UNICODE string, so it must be converted to ANSI and loaded with LoadLibraryA.

Each lower provider initialized by calling its WSPStartup must follow the same rules that are applied to your LSP's WSPStartup. You must pass in a WSPDATA structure as well as the WSAPROTOCOL_INFOW structure for that provider's entry regardless of whether it is a base provider or another layered provider. For example, if the underlying provider is the MSAFD TCP/IP provider, the WSAPROTOCOL_INFOW structure passed is the MSAFD TCP/IP provider and not the LSP's layered protocol entry. Each lower provider will initialize the WSPPROC_TABLE passed into it with a list of its function pointers. The LSP must save off this function table. When an application makes a Winsock call into your LSP, your LSP must eventually call the lower provider's corresponding Winsock function (unless the LSP's purpose is to prevent or prohibit that action).

In the sample LSP provided on the companion CD, the following structure is allocated for each layered chain entry that comprises our LSP:

typedef struct _PROVIDER
{
    WSAPROTOCOL_INFOW NextProvider,
                      LayeredProvider;
    WSPPROC_TABLE     NextProcTable;
    EXT_WSPPROC_TABLE NextProcTableExt;
    WCHAR             ProviderPathW[MAX_PATH],
                      LibraryPathW[MAX_PATH];
    char              ProviderPathA[MAX_PATH],
                      LibraryPathA[MAX_PATH];
    int               ProviderPathLen;
    HINSTANCE         hProvider;
    LPWSPSTARTUP      lpWSPStartup;
    SOCK_INFO        *SocketList;
} PROVIDER;

This structure keeps track of a layered chain entry's protocol structure (LayeredProvider) as well as the underlying provider's protocol structure (NextProvider). The underlying provider's procedure table is stored in NextProcTable, and NextProcTableExt is our own structure of Microsoft-specific Winsock functions that the lower provider exposes. We'll discuss these entries in detail later. In addition, both the UNICODE and ANSI versions of the provider's path are saved (ProviderPath). The LibraryPath fields contain the same data as the ProviderPath fields except that all system variables are expanded via the ExpandEnvironmentStrings API. hProvider saves off the handle returned from LoadLibrary and lpWSPStartup contains the address for that DLL's WSPStartup function. Lastly, SocketList is a linked list of all sockets this provider created. This field will become important later.

Before going any further, let's summarize the steps involved in initializing an LSP:

Verify that the requested Winsock version is supported.
Increment the reference count.
Save the WSPUPCALLTABLE passed in.
Find the Winsock providers layered underneath this LSP's providers. (Note that this may be a subset if this LSP's layered protocol entries were installed under separate GUIDs.)
Allocate a PROVIDER structure for each layered entry.
Load each underlying provider's DLL and invoke its WSPStartup.
Save the WSPPROC_TABLE returned from each underlying provider.

If any one of these steps fails, the LSP's WSPStartup should return an error. There are several relevant Winsock error codes usually associated with startup, which are

WSAEINVALIDPROCTABLE Indicates the lower layer returned an invalid proc table (for example, one or more entries are NULL).
WSAEPROVIDERFAILEDINIT Indicates the LSP encountered an error that prevents it from initializing properly.

Of course, if there is a more specific Winsock error code for the type of error encountered, that should be used. For example, if during startup the LSP dynamically allocates memory but that call fails, WSAENOBUFS should be returned.

Creating Sockets

The next important task of a layered service provider is creating socket handles. The sequence of events covered so far is as follows: the application creates a socket whose parameters match an entry of our LSP; the system loads our LSP by calling its WSPStartup and obtains the LSP's function dispatch table; and the system then calls our LSP's WSPSocket function to create a socket to return to the application. Because we are dealing with layered providers and not transport providers, the LSP has no way of creating true operating system handles. As a result, the “real” socket handle is obtained by calling the underlying provider's WSPSocket function. Remember that we obtained the lower provider's WSPPROC_TABLE when it was loaded by our LSP's WSPStartup.

Your LSP's WSPSocket function must validate the address family, socket type, and protocol parameters, and find which underlying provider should be used (assuming your LSP is installed over multiple entries). Keep in mind that for some transport protocols it is valid for the protocol parameter to the socket creation API call to be zero. For example, if our LSP is layered over MSAFD TCP/IP and MSAFD UDP/IP and the application makes the following socket call: s = socket(AF_INET, SOCK_STREAM, 0); our LSP's WSPSocket function will be called with those same parameters. The LSP must then determine that this matches the LSP's entry layered over MSAFD TCP/IP. Then we must find the function table returned from our startup call to the DLL implementing MSAFD TCP/IP, at which point its WSPSocket can be called. In addition, the calling application may pass in a WSAPROTOCOL_INFO structure, which will describe the LSP's own entry. Before creating a socket from the lower provider, the lower provider's WSAPROTOCOL_INFO structure should be substituted for it.
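The matching step might look something like the following portable sketch. It is not taken from the sample LSP: layered_entry_t and match_layered_entry are hypothetical, the constants carry the usual Winsock numeric values under local names so the block needs no socket headers, and the only wildcard rule modeled is a protocol of zero. Other combinations that Winsock may accept are ignored here.

```c
#include <assert.h>
#include <stddef.h>

/* Local names for the usual Winsock values, so no socket headers are needed. */
enum {
    X_AF_INET     = 2,
    X_SOCK_STREAM = 1,
    X_SOCK_DGRAM  = 2,
    X_IPPROTO_TCP = 6,
    X_IPPROTO_UDP = 17
};

/* Simplified view of one layered entry (iAddressFamily, iSocketType, iProtocol). */
typedef struct {
    int af, type, protocol;
} layered_entry_t;

/* Return the index of the first layered entry that satisfies the request,
   or -1 if none does. A protocol of 0 means "default for this af/type". */
int match_layered_entry(const layered_entry_t *entries, size_t count,
                        int af, int type, int protocol)
{
    size_t i;

    for (i = 0; i < count; i++) {
        if (entries[i].af != af || entries[i].type != type)
            continue;
        if (protocol == 0 || entries[i].protocol == protocol)
            return (int)i;
    }
    return -1;
}
```

With entries for TCP and UDP over IPv4, a request for (AF_INET, SOCK_STREAM, 0) resolves to the TCP entry, whose saved function table is then used to call the lower WSPSocket.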

Once a socket is created from the underlying provider, there are two options. The first is to simply return that socket handle. The problem with this is there will be no way to modify or monitor data sent or received on that socket. In the next section, we will go into detail on why this is the case. The second option is to return a “dummy” handle. The LSP then associates this dummy handle with the handle that the lower provider returned. Now whenever the application calls a Winsock API with our dummy handle, it is routed into our LSP, at which point the LSP finds the lower provider's handle associated with the dummy handle and calls the same Winsock API of the lower provider with the lower provider handle.

These dummy handles are created by calling WPUCreateSocketHandle. Note that the LSP cannot call this API directly. Instead, the function pointer to this API is provided in the WSPUPCALLTABLE passed into the LSP's WSPStartup routine. The prototype is

SOCKET WPUCreateSocketHandle( DWORD dwCatalogEntryId, DWORD_PTR dwContext, LPINT lpErrno );

The first parameter is the catalog ID of the LSP's layered protocol chain. The second parameter is any data blob that you wish to associate with the SOCKET handle returned. The last parameter indicates the error code in case this API call fails.

Typically, the LSP creates a socket from the lower provider and then allocates a data structure that contains context information for this socket. In the sample LSP the following context structure is used:

typedef struct _SOCK_INFO
{
    SOCKET ProviderSocket;          // Lower provider socket handle
    SOCKET LayeredSocket;           // App's socket handle
    DWORD  dwOutstandingAsync;      // Count of outstanding async operations
    BOOL   bClosing;                // Has the app closed the socket?
    volatile LONG RefCount;         // Reference count
    DWORD  BytesSent;               // Byte counts
    DWORD  BytesRecv;
    HANDLE hIocp;                   // Associated with an IOCP?
    HWND   hWnd;                    // Window (if any) associated with socket
    UINT   uMsg;                    // Message for socket events
    CRITICAL_SECTION   SockCritSec; // Protect access to this object
    struct _PROVIDER  *Provider;    // Which provider this belongs to?
    struct _SOCK_INFO *prev,        // Used to link these structures together
                      *next;
} SOCK_INFO;

This is a lot of information but it is necessary for a robust LSP. The first field is the socket handle that the underlying provider returned. The second field is the handle returned from WPUCreateSocketHandle. When we call WPUCreateSocketHandle we pass the address of a SOCK_INFO structure as the context information. The majority of the remaining fields deal with handling socket I/O, which is discussed in the next section.
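To make the handle-to-context association concrete, here is a toy handle table in portable C. It only mimics the observable behavior described in this section (create a handle bound to a context pointer, query the pointer back, close the handle) and says nothing about how Winsock implements these upcalls internally. All of the mock_ names, the table size, and the 0x300 base value are assumptions made for the illustration.

```c
#include <assert.h>
#include <stddef.h>

#define TABLE_SIZE   8
#define HANDLE_BASE  0x300          /* arbitrary starting handle value        */
#define BAD_HANDLE   (-1)

static void *g_context[TABLE_SIZE]; /* NULL means the slot is free            */

/* Mock of WPUCreateSocketHandle: bind a non-NULL context to a new handle. */
int mock_create_handle(void *context)
{
    int i;

    if (context == NULL)
        return BAD_HANDLE;
    for (i = 0; i < TABLE_SIZE; i++) {
        if (g_context[i] == NULL) {
            g_context[i] = context;
            return HANDLE_BASE + i;
        }
    }
    return BAD_HANDLE;              /* table full                             */
}

/* Mock of WPUQuerySocketHandleContext: NULL plays the role of an error. */
void *mock_query_context(int handle)
{
    int i = handle - HANDLE_BASE;

    if (i < 0 || i >= TABLE_SIZE)
        return NULL;
    return g_context[i];
}

/* Mock of WPUCloseSocketHandle: free the slot so its value can be reused. */
int mock_close_handle(int handle)
{
    int i = handle - HANDLE_BASE;

    if (i < 0 || i >= TABLE_SIZE || g_context[i] == NULL)
        return -1;
    g_context[i] = NULL;
    return 0;
}
```

Note that once a handle is closed, the next create call in this model hands out the same numeric value bound to a different context, which is why an LSP has to be careful about when it closes a handle that other threads may still be using.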

We can now construct the basic outline of the LSP's WSPSocket API. It would look like the following example:

SOCKET WSPAPI WSPSocket(
    int                 af,
    int                 type,
    int                 protocol,
    LPWSAPROTOCOL_INFOW lpProtocolInfo,
    GROUP               g,
    DWORD               dwFlags,
    LPINT               lpErrno)
{
    PROVIDER  *Provider = NULL;
    SOCK_INFO *Context  = NULL;
    SOCKET     ProviderSocket,
               LayeredSocket = INVALID_SOCKET;  // returned if anything fails

    // Validate the arguments first

    // Find the PROVIDER structure for the layered protocol entry
    // that matches the given arguments - set as Provider

    // Substitute lpProtocolInfo with the lower provider's if it
    // is supplied.

    ProviderSocket = Provider->NextProcTable.lpWSPSocket(
            af,
            type,
            protocol,
            lpProtocolInfo,
            g,
            dwFlags,
            lpErrno
            );
    if (ProviderSocket != INVALID_SOCKET)
    {
        Context = AllocateContext();    // Allocate a SOCK_INFO struct

        LayeredSocket = MainUpCallTable.lpWPUCreateSocketHandle(
                Provider->LayeredProvider.ProtocolChain.ChainEntries[0],
                (DWORD_PTR) Context,
                lpErrno
                );
        if (LayeredSocket == INVALID_SOCKET)
        {
            // Handle failure (close ProviderSocket, free Context)
        }
        else
        {
            Context->LayeredSocket  = LayeredSocket;
            Context->ProviderSocket = ProviderSocket;
        }
    }
    return LayeredSocket;
}

There are a couple of significant fields in the SOCK_INFO structure that should be discussed. The bClosing field indicates whether the application has called WSPCloseSocket on the dummy socket handle. If there are any outstanding I/O operations, then the LSP must not free the socket's context information until all the I/O has completed (most likely with errors). Also, a reference count is maintained (via RefCount) so that if the calling application is multi-threaded and one thread is using the socket and another thread closes the socket, the LSP will not free the SOCK_INFO structure underneath the first thread (and cause an access violation).

The LSP must implement each of the SPI functions listed in Table 12-2. For those functions that do not create socket handles (that is, everything but WSPSocket, WSPAccept, and WSPJoinLeaf) but take a socket handle as a parameter, it is necessary to translate the application's socket handle into the underlying handle. Remember that a SOCK_INFO structure was associated with each dummy socket handle. This context information can be retrieved by calling WPUQuerySocketHandleContext. Again, this function is found in the WSPUPCALLTABLE structure. This API is defined as

int WSPAPI WPUQuerySocketHandleContext( SOCKET s, PDWORD_PTR lpContext, LPINT lpErrno );

For example, let's take a look at how an LSP might implement the WSPGetSockOpt function.

int WSPAPI WSPGetSockOpt(
    SOCKET    s,
    int       level,
    int       optname,
    char FAR *optval,
    LPINT     optlen,
    LPINT     lpErrno)
{
    SOCK_INFO *lpContext = NULL;
    int        rc = NO_ERROR;

    // Look up the SOCK_INFO associated with the application's handle
    rc = MainUpCallTable.lpWPUQuerySocketHandleContext(
            s,
            (PDWORD_PTR) &lpContext,
            lpErrno
            );
    if (rc == SOCKET_ERROR)
    {
        *lpErrno = WSAENOTSOCK;
    }
    else
    {
        // Forward the call using the lower provider's socket handle
        rc = lpContext->Provider->NextProcTable.lpWSPGetSockOpt(
                lpContext->ProviderSocket,
                level,
                optname,
                optval,
                optlen,
                lpErrno
                );
    }
    return rc;
}

In this example, we first query for the context information. If it cannot be found, we return the error WSAENOTSOCK. Otherwise, we simply call the underlying provider's WSPGetSockOpt function with the correct socket handle. Note that in a real implementation when the socket context is looked up, the reference count would be incremented and before returning from the SPI function the reference count would be decremented. In the sample LSP, two helper routines are defined to do this. They are FindAndLockSocketContext and UnlockSocketContext, which are listed below.

SOCK_INFO *FindAndLockSocketContext(SOCKET s, int *lpErrno)
{
    SOCK_INFO *SocketContext = NULL;
    int        ret;

    EnterCriticalSection(&gCriticalSection);
    ret = MainUpCallTable.lpWPUQuerySocketHandleContext(
            s,
            (PDWORD_PTR) &SocketContext,
            lpErrno
            );
    if (ret == SOCKET_ERROR)
    {
        SocketContext = NULL;
    }
    else
    {
        InterlockedIncrement(&SocketContext->RefCount);
    }
    LeaveCriticalSection(&gCriticalSection);

    return SocketContext;
}

void UnlockSocketContext(SOCK_INFO *context)
{
    EnterCriticalSection(&gCriticalSection);
    InterlockedDecrement(&context->RefCount);
    LeaveCriticalSection(&gCriticalSection);
}

In this code sample, gCriticalSection is a global CRITICAL_SECTION object for the entire DLL. By calling FindAndLockSocketContext before using any socket within an SPI function (for any WSP function or any support function that needs to query for the socket context), we ensure that multi-threaded applications that close sockets in the middle of other operations will not cause an access violation. Of course, it is important to ensure that a corresponding call to UnlockSocketContext occurs before returning from the SPI function.

The last consideration for socket creation comes when the application closes a socket handle. First, query for the socket context of the supplied handle. Note that this action will cause the socket's reference count to be incremented by one (meaning that if the reference count is greater than one, another thread is accessing the structure). The next step is to close the underlying provider's socket handle, which is contained in the ProviderSocket field of the SOCK_INFO structure. This is necessary so that any outstanding I/O operations will complete with the proper error (discussed in the next section).

However, if the socket context indicates that there is outstanding asynchronous I/O (via the dwOutstandingAsync field of the context information) or if the reference count is greater than zero, then we cannot close the dummy socket handle yet. Instead, we mark the socket context structure as closing (by setting bClosing to TRUE). If we closed the handle immediately and the application then created another socket, the handle value could be re-used, which can cause subtle and hard-to-find problems. For example, consider the case in which two threads are accessing a socket whose handle value is 0x300. If one thread closes the socket and the second thread is about to access it, the socket is closed and the context information removed. Then a third thread creates a new socket and is assigned the handle value 0x300. The thread that was about to access the socket now looks up the context and is returned this new socket's context. At this point, the new socket may be in the wrong state (for example, not connected when it should be) or could even be a socket of the wrong protocol. Whatever API uses this socket will most likely fail with a very unexpected error code such as WSAENOTCONN or WSAEOPNOTSUPP.

Only when the reference count indicates no other thread is accessing the socket context information and the outstanding operation count is zero can the dummy socket be closed and the context information be freed. After asynchronous I/O has completed and when the reference count is decremented, the bClosing field of the socket context should be checked. If it is TRUE, it indicates that the application has closed the socket and the dummy handle needs to be closed when it is safe to do so.
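The deferred-close rule can be modeled in a few lines of portable C. This is a single-threaded sketch under stated assumptions, not the sample's implementation: ctx_t and the ctx_ functions are hypothetical, the released flag marks the point where a real LSP would call WPUCloseSocketHandle and free the SOCK_INFO, and the locking that a multi-threaded LSP needs around these counters is left out.

```c
#include <assert.h>
#include <stdbool.h>

typedef struct {
    long     ref_count;          /* RefCount: threads inside an SPI call      */
    unsigned outstanding_async;  /* dwOutstandingAsync                        */
    bool     closing;            /* bClosing                                  */
    bool     released;           /* sketch only: where a real LSP would close
                                    the dummy handle and free the context     */
} ctx_t;

/* Release the context only when the app has closed the socket and nothing
   else can still touch it. */
static void release_if_idle(ctx_t *c)
{
    if (c->closing && c->ref_count == 0 && c->outstanding_async == 0)
        c->released = true;
}

void ctx_lock(ctx_t *c)            { c->ref_count++; }
void ctx_unlock(ctx_t *c)          { c->ref_count--; release_if_idle(c); }
void ctx_async_started(ctx_t *c)   { c->outstanding_async++; }
void ctx_async_completed(ctx_t *c) { c->outstanding_async--; release_if_idle(c); }

/* Called on the close path while the closing thread still holds its lock. */
void ctx_mark_closing(ctx_t *c)    { c->closing = true; }
```

If one thread holds the context with an overlapped operation pending while another closes the socket, the context in this model is released only after both threads have unlocked it and the pending operation has completed.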

Once it is determined safe to close the socket, this is done with the WPUCloseSocketHandle API, which is another function contained in the WSPUPCALLTABLE structure. This function is defined as

int WSPAPI WPUCloseSocketHandle( IN SOCKET s, OUT LPINT lpErrno );

Finally, remember that the functions WSPAccept and WSPJoinLeaf also return socket handles and the same steps just described for WSPSocket apply. Once a new socket handle is returned from the lower provider, an application socket is created with WPUCreateSocketHandle, context information is associated with it, and this application socket is returned to the caller. However, in some instances the WSPJoinLeaf SPI function does not create a new socket (more on this later).

Handling I/O

The last major task in creating an LSP is handling the various types of I/O an application might initiate on a socket. Remember from Chapter 5 that there are a number of I/O models an application may use: blocking sockets, select, WSAAsyncSelect, WSAEventSelect, overlapped I/O, and completion ports. As we mentioned previously, if an LSP wishes to modify or monitor data sent or received, it must create its own socket handles with WPUCreateSocketHandle and must handle all possible types of I/O that may occur on the socket. In this section we'll look at each of the I/O models and discuss what steps must be taken for each to work.

Before getting into each of the I/O models, it is worthwhile to mention some common rules that apply to all types of I/O. First, if an LSP needs to modify the send buffers, it should not modify the data within the application's buffer—it should make its own copy. Also, LSPs should not behave contrary to the type of I/O initiated. If an application has put a socket into non-blocking mode, the LSP should not block when handling operations that would normally fail with WSAEWOULDBLOCK.

Blocking and Non-blocking

For the most basic I/O models, blocking and non-blocking sockets, there really isn't much to do. For those SPI functions that send and receive data, all the LSP needs to do is translate the socket handle to the provider's socket handle and call the lower provider's function. For example, the WSPSend function for an LSP would look like the following code:

int WSPAPI WSPSend(
    SOCKET          s,
    LPWSABUF        lpBuffers,
    DWORD           dwBufferCount,
    LPDWORD         lpNumberOfBytesSent,
    DWORD           dwFlags,
    LPWSAOVERLAPPED lpOverlapped,
    LPWSAOVERLAPPED_COMPLETION_ROUTINE lpCompletionRoutine,
    LPWSATHREADID   lpThreadId,
    LPINT           lpErrno
    )
{
    SOCK_INFO *SocketContext = NULL;
    int        ret = SOCKET_ERROR;

    // Get the context info
    SocketContext = FindAndLockSocketContext(s, lpErrno);
    if (SocketContext == NULL)
    {
        *lpErrno = WSAENOTSOCK;
        return SOCKET_ERROR;
    }
    if (lpOverlapped == NULL)   // Make sure this is not overlapped
    {
        SetBlockingProvider(SocketContext->Provider);
        ret = SocketContext->Provider->NextProcTable.lpWSPSend(
                SocketContext->ProviderSocket,
                lpBuffers,
                dwBufferCount,
                lpNumberOfBytesSent,
                dwFlags,
                NULL,
                NULL,
                lpThreadId,
                lpErrno
                );
        SetBlockingProvider(NULL);
    }
    // Overlapped sends are handled separately (see "Overlapped I/O")

    UnlockSocketContext(SocketContext);
    return ret;
}

This is a very straightforward process: find the socket context and call the lower provider. For compatibility with 16-bit Winsock, we do have to keep track of which provider is blocking in case the application calls WSACancelBlockingCall, which is what the SetBlockingProvider function does. It saves off the address of our PROVIDER structure, which is currently issuing a blocking call for that thread. If the application calls WSACancelBlockingCall, all we have to do is call the blocking lower layer's WSPCancelBlockingCall. The SetBlockingProvider routine uses thread local storage to save the pointer to the current PROVIDER issuing a blocking call.
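The per-thread bookkeeping can be sketched in portable C as follows. The sample's actual storage mechanism is not shown in this chapter, so the C11 _Thread_local keyword is used here as a stand-in for whatever thread local storage API it uses. The provider_t type, the cancel_blocking_call function pointer (a simplified placeholder for the lower layer's WSPCancelBlockingCall entry), and demo_cancel are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified provider record; the function pointer stands in for the lower
   provider's WSPCancelBlockingCall entry in its procedure table. */
typedef struct provider {
    int (*cancel_blocking_call)(struct provider *self);
    int cancel_count;               /* demo only: how often cancel was called */
} provider_t;

/* One slot per thread: which provider is currently inside a blocking call. */
static _Thread_local provider_t *t_blocking_provider = NULL;

/* Called just before and just after a blocking call into a lower provider. */
void set_blocking_provider(provider_t *p)
{
    t_blocking_provider = p;
}

/* Models the LSP's own cancel entry point: forward to whichever lower
   provider this thread is blocked in, if any. */
int lsp_cancel_blocking_call(void)
{
    provider_t *p = t_blocking_provider;

    if (p == NULL)
        return -1;                  /* no blocking call in progress here      */
    return p->cancel_blocking_call(p);
}

/* Demo lower-layer cancel routine that just counts invocations. */
int demo_cancel(provider_t *self)
{
    self->cancel_count++;
    return 0;
}
```

Because the slot is per thread, a cancel request on one thread never reaches a provider that a different thread happens to be blocked in.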

Select and WSPSelect

When an application uses the select API to wait for events on a set of sockets, things get a bit complicated. The select API will map to the SPI function WSPSelect and requires some work before passing the call down to the lower provider. There are three FD_SET structures passed in that reference the layered sockets and not the underlying provider's sockets. Because of this, the socket context for each socket contained in the FD_SETs must be obtained and a new FD_SET built that contains the lower provider's sockets.

The following code shows how to translate the readfds FD_SET passed into WSPSelect.

int WSPAPI WSPSelect(
    int          nfds,
    fd_set FAR * readfds,
    fd_set FAR * writefds,
    fd_set FAR * exceptfds,
    const struct timeval FAR * timeout,
    LPINT        lpErrno)
{
    SOCK_INFO *SocketContext = NULL;
    fd_set     ReadFds, WriteFds, ExceptFds;
    int        ret, HandleCount;
    u_int      count, i;

    // Simple structure to quickly map LSP sockets to provider sockets
    struct
    {
        SOCKET LayeredSocket;
        SOCKET ProviderSocket;
    } Read[FD_SETSIZE], Write[FD_SETSIZE], Except[FD_SETSIZE];

    // Translate LSP handles into provider handles
    if (readfds)
    {
        FD_ZERO(&ReadFds);
        for (i = 0; i < readfds->fd_count; i++)
        {
            SocketContext = FindAndLockSocketContext(
                    (Read[i].LayeredSocket = readfds->fd_array[i]),
                    lpErrno
                    );
            Read[i].ProviderSocket = SocketContext->ProviderSocket;
            FD_SET(Read[i].ProviderSocket, &ReadFds);
            UnlockSocketContext(SocketContext);
        }
    }
    // Do the same for writefds
    // Do the same for exceptfds

    SetBlockingProvider(SocketContext->Provider);
    ret = SocketContext->Provider->NextProcTable.lpWSPSelect(
            nfds,
            (readfds   ? &ReadFds   : NULL),
            (writefds  ? &WriteFds  : NULL),
            (exceptfds ? &ExceptFds : NULL),
            timeout,
            lpErrno
            );
    SetBlockingProvider(NULL);

    HandleCount = ret;

    // Map the signaled provider handles back to the LSP handles
    if (readfds)
    {
        count = readfds->fd_count;
        FD_ZERO(readfds);
        for (i = 0; (i < count) && (HandleCount > 0); i++)
        {
            if (MainUpCallTable.lpWPUFDIsSet(Read[i].ProviderSocket, &ReadFds))
            {
                FD_SET(Read[i].LayeredSocket, readfds);
                HandleCount--;
            }
        }
    }
    // Do the same for writefds
    // Do the same for exceptfds

    return ret;
}

For this to work, a mapping is maintained between the layered sockets passed into select and its corresponding provider socket. This is necessary because after the lower provider's WSPSelect is called, the LSP has to return only those layered provider sockets that were signaled.

The first step is to go through each handle in the FD_SET and find its context information. The mapping of the provider socket to the layered socket is maintained in the Read array. We then have a second FD_SET, ReadFds, which contains the underlying provider's socket handles and which we pass into the lower provider's WSPSelect function. Upon return, the return value tells us how many handles were signaled. Then it is a matter of determining which of the provider handles passed in were signaled. This is done by calling the helper function WPUFDIsSet for each provider socket passed in. If it is set, we take the associated layered socket and set it into the readfds FD_SET passed into the function so that upon return from the LSP's WSPSelect the application has the correct handles signaled. This same process has to be performed for writefds and exceptfds. Of course, the sample does not perform error handling, nor does it handle the case when a timeout value is supplied. See the sample LSP for the full implementation.
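The two translation passes can be exercised without Winsock by modeling an fd_set as a counted array, which is essentially what the Windows fd_set is. Everything in this sketch is hypothetical (hset_t, pair_t, to_provider_set, to_layered_set, and the demo_lookup mapping that stands in for the socket context query); the call to the lower provider's select is left to the caller, who supplies the set of signaled provider handles.

```c
#include <assert.h>

#define SET_MAX 64                 /* plays the role of FD_SETSIZE            */

typedef struct {
    unsigned count;
    int      handle[SET_MAX];
} hset_t;                          /* simplified fd_set: counted handle array */

typedef struct { int layered, provider; } pair_t;

static int set_contains(const hset_t *s, int h)
{
    unsigned i;
    for (i = 0; i < s->count; i++)
        if (s->handle[i] == h)
            return 1;
    return 0;
}

/* Demo context lookup: layered handles 0x300..0x3FF map to provider handle
   layered + 0x1000; anything else is unknown. */
int demo_lookup(int layered)
{
    return (layered >= 0x300 && layered <= 0x3FF) ? layered + 0x1000 : -1;
}

/* Pass 1: build the set for the lower provider and remember each pair.
   Returns 0 on success, -1 if a handle is unknown or the set is too large. */
int to_provider_set(const hset_t *app, int (*lookup)(int),
                    pair_t *pairs, hset_t *lower)
{
    unsigned i;

    if (app->count > SET_MAX)
        return -1;
    lower->count = 0;
    for (i = 0; i < app->count; i++) {
        int p = lookup(app->handle[i]);
        if (p < 0)
            return -1;
        pairs[i].layered  = app->handle[i];
        pairs[i].provider = p;
        lower->handle[lower->count++] = p;
    }
    return 0;
}

/* Pass 2: rewrite the application's set so it holds only the layered
   handles whose provider handle was signaled. */
void to_layered_set(hset_t *app, const pair_t *pairs, unsigned npairs,
                    const hset_t *signaled)
{
    unsigned i;

    app->count = 0;
    for (i = 0; i < npairs; i++)
        if (set_contains(signaled, pairs[i].provider))
            app->handle[app->count++] = pairs[i].layered;
}
```

Returning -1 for an unknown handle is just one of the two policies an LSP can choose; the alternative of passing the handle down unmodified would replace that early return.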

The WPUFDIsSet function is another helper function passed to the LSP in the WSPUPCALLTABLE structure. The function is defined as

int WSPAPI WPUFDIsSet( IN SOCKET s, IN fd_set FAR *fdset );

This function behaves exactly as the FD_ISSET macro seen in Chapter 5.

There is one major issue frequently encountered when implementing an LSP's WSPSelect: what to do if one of the socket handles passed in an FD_SET is unknown. If the LSP queries for the socket context of a given handle and it fails, should the LSP indicate an error (such as WSAENOTSOCK) or simply pass that socket handle to the lower provider unmodified? The unique problem with the WSPSelect API is that it is the only Winsock function that can take multiple socket handles in a single call. For all other Winsock functions, the system knows exactly which provider should handle that API call because there is just one socket handle passed as a parameter.

Even though the Winsock specification explicitly states that only sockets from the same provider may be passed into select, many applications ignore this (including Microsoft Internet Explorer) and frequently pass down both TCP and UDP handles together. The base Microsoft providers do not verify that all socket handles are from the same provider. In addition, the Microsoft providers will correctly handle sockets from multiple providers in a single select call. This is a problem with LSPs because an LSP may be layered over just a single entry such as TCP. In this case, the LSP's WSPSelect is invoked with FD_SETs that contain their own sockets plus sockets from other providers (such as a UDP socket from the Microsoft provider). When the LSP is translating the socket handles and comes upon the UDP handle, the context query will fail. At this point, it may return an error (WSAENOTSOCK) or pass the socket down unmodified. If an error is returned, then for the case of an LSP layered only over UDP/IPv4 (or TCP/IPv4), Internet Explorer will no longer function. A workaround is to always install the LSP over all providers for a given address family (such as for IPv4, install over TCP/IPv4, UDP/IPv4, and RAW/IPv4). No Microsoft application or service currently passes socket handles from multiple address families into a single select call, although LSASS on Windows NT 4.0 used to pass IPX and IPv4 sockets together (this has been fixed in the latest service packs for Windows NT 4.0).

WSAAsyncSelect

Handling sockets that register for event notification via WSAAsyncSelect also requires some additional help. As you recall from Chapter 5, an application registers for notification on certain events that will be posted to the given window. The problem here is that the WPARAM parameter posted to the application's window contains the socket handle. This is bad because the LSP will translate the handle passed into its WSPAsyncSelect and call the lower provider's function with the translated socket and the remaining parameters. As a result, when an event is posted, it is posted directly to the application's window handler and the WPARAM parameter contains the lower provider's socket and not the LSP-created socket.

To handle this case correctly, the LSP must create its own hidden window on which to receive events from the lower provider. Then within the LSP's window handler, the socket can be translated back to the application socket and posted to the application's window handler. The LSP's WSPAsyncSelect must save the window handle and message that is associated with the application socket. This information is saved in the socket context (for example, the SOCK_INFO structure of the sample LSP). The following code shows how this is handled:

int WSPAPI WSPAsyncSelect(
    SOCKET       s,
    HWND         hWnd,
    unsigned int wMsg,
    long         lEvent,
    LPINT        lpErrno)
{
    SOCK_INFO *SocketContext;
    HWND       hWorkerWindow;
    int        ret = SOCKET_ERROR;

    SocketContext = FindAndLockSocketContext(s, lpErrno);
    if (SocketContext != NULL)
    {
        // Remember where the application wants its notifications
        SocketContext->hWnd = hWnd;
        SocketContext->uMsg = wMsg;

        // Get the handle to our hidden window
        if ((hWorkerWindow = GetWorkerWindow()) != NULL)
        {
            SetBlockingProvider(SocketContext->Provider);
            ret = SocketContext->Provider->NextProcTable.lpWSPAsyncSelect(
                    SocketContext->ProviderSocket,
                    hWorkerWindow,
                    WM_SOCKET,
                    lEvent,
                    lpErrno);
            SetBlockingProvider(NULL);
        }
        UnlockSocketContext(SocketContext);
    }
    return ret;
}

In this code, the socket context is found and the window handle and message are saved. Then the hidden asynchronous window that the LSP created is returned via the GetWorkerWindow call. This routine simply creates the window and thread to handle the events (see ASYNCSELECT.CPP for the full implementation). Then the lower provider is called with the lower provider socket, except that we supply the window handle of our LSP helper window instead.

The code for our hidden window handler is simple:

static LRESULT CALLBACK AsyncWndProc(
    HWND   hWnd,
    UINT   uMsg,
    WPARAM wParam,
    LPARAM lParam)
{
    SOCK_INFO *si;

    if (uMsg == WM_SOCKET)
    {
        if ((si = GetCallerSocket(NULL, wParam)) != NULL)
        {
            MainUpCallTable.lpWPUPostMessage(
                    si->hWnd,
                    si->uMsg,
                    si->LayeredSocket,
                    lParam);
            return 0;
        }
    }
    return DefWindowProc(hWnd, uMsg, wParam, lParam);
}

Here we look for only the socket notifications. The only challenge is to map the provider socket (indicated as wParam) back to the LSP-created socket, which is what the GetCallerSocket function does (defined in SOCKINFO.CPP). As you recall, the PROVIDER structure contains a linked list of all the SOCK_INFO structures each provider created. GetCallerSocket walks all of these linked lists in search of the SOCK_INFO that contains the given lower provider socket handle. This is necessary because there is no other convenient way of mapping provider sockets back to LSP sockets.

Once that is found, the helper function WPUPostMessage is called to post the event to the application's window with the correct socket handle. Remember, the window handle and message were saved earlier when the application called WSAAsyncSelect on the handle. This function is located in the WSPUPCALLTABLE and is defined as

BOOL WSPAPI WPUPostMessage(     IN HWND hWnd,     IN UINT Msg,     IN WPARAM wParam,     IN LPARAM lParam     );
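The reverse lookup that the window procedure depends on can be sketched as a plain linked-list search. This is a portable model of the idea rather than the code in SOCKINFO.CPP: sock_node_t, provider_t, and find_by_provider_socket are hypothetical, the handles are plain integers, and the locking a real LSP would need while walking the lists is omitted.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified SOCK_INFO: just the two handles and the list link. */
typedef struct sock_node {
    int provider_socket;        /* handle from the lower provider          */
    int layered_socket;         /* handle the application sees             */
    struct sock_node *next;
} sock_node_t;

/* Simplified PROVIDER: only the list of sockets it created. */
typedef struct {
    sock_node_t *socket_list;
} provider_t;

/* Search every provider's socket list for the node that owns the given
   lower provider handle. Returns NULL if no list contains it. */
sock_node_t *find_by_provider_socket(const provider_t *providers, size_t count,
                                     int provider_socket)
{
    size_t i;

    for (i = 0; i < count; i++) {
        sock_node_t *n;
        for (n = providers[i].socket_list; n != NULL; n = n->next) {
            if (n->provider_socket == provider_socket)
                return n;
        }
    }
    return NULL;
}
```

The search is linear in the total number of sockets, which is usually acceptable for window-message notifications; an LSP expecting many sockets might keep a hash table keyed by provider handle instead.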

WSAEventSelect

This socket model requires no work on the LSP's part. When the select events are signaled, a simple event handle is used—no socket handles are returned, so no extra socket translation is needed. For example, the application calls WSAEventSelect with a socket, event, and event mask. Within the LSP's WSPEventSelect, the handle is translated and passed to the lower provider with the same event and event mask. When a requested event occurs on the socket, the lower provider sets the application's event to be signaled, after which the application calls a send or receive function as described previously in the section “Blocking and Non-blocking.”

Unless the LSP is required to intercept these event notifications (as determined by the LSP's actual purpose), there is no need to substitute our own event handle to wait for notification from the lower layer. If you did, once your substituted event was signaled, the LSP would perform the necessary computation and then signal the application's event (which must be saved in the socket context) with WSASetEvent.

Overlapped I/O

Handling overlapped I/O is another complicated issue that depends on what platforms the LSP is to be installed on. The easiest and most elegant method is to handle all application-initiated overlapped I/O by using an I/O completion port regardless of whether the application is using events, callbacks, or completion ports. However, if the LSP is to be run on Windows 95, Windows 98, or Windows Me, this is impossible. The sample LSP provider handles both cases.

In the case of Windows NT and I/O completion ports, the LSP creates a completion port and a worker thread that services the completion notifications by calling GetQueuedCompletionStatus. When an application makes an overlapped I/O call, the LSP first checks to see if the lower provider handle has been associated with the LSP's completion port yet. This information is contained in the socket context information as the hIocp field. If the lower provider socket has not yet been associated, this field is NULL; otherwise, it contains the handle of the LSP's completion port.
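The "associate once, on first use" check can be sketched in portable C. In this model the association call is a counter-bumping mock standing in for the Win32 CreateIoCompletionPort call; passing the socket context as the completion key is an assumption of the sketch, not something the text specifies, and sock_ctx_t, mock_associate, and ensure_associated are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    int   provider_socket;
    void *iocp;                  /* hIocp: NULL until the socket is associated */
} sock_ctx_t;

static int g_associate_calls = 0;   /* demo only: counts association attempts  */

/* Stand-in for associating a socket with a completion port. It always
   succeeds here; a real call can fail and must be checked. */
static int mock_associate(int provider_socket, void *port, sock_ctx_t *key)
{
    (void)provider_socket;
    (void)port;
    (void)key;
    g_associate_calls++;
    return 0;
}

/* Associate the lower provider socket with the LSP's port the first time an
   overlapped operation is issued on it. Returns 0 on success, -1 on failure. */
int ensure_associated(sock_ctx_t *ctx, void *lsp_port)
{
    if (ctx->iocp != NULL)
        return 0;                /* already associated on an earlier call      */
    if (mock_associate(ctx->provider_socket, lsp_port, ctx) != 0)
        return -1;
    ctx->iocp = lsp_port;        /* remember the port so we never repeat this  */
    return 0;
}
```

Calling ensure_associated at the top of every overlapped SPI function is harmless in this model, because the association is attempted only while the stored handle is still NULL.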

Once the provider socket is associated with the LSP's completion port, the I/O is posted on the lower provider's socket handle. Once it has completed, the LSP worker thread will receive the notification and the LSP can complete the application's request. After the LSP receives completion notification, the application's overlapped I/O is completed so that the application will receive notification either via event, asynchronous procedure call, or its own completion port.

Each Winsock SPI function that can be made in an overlapped fashion (including Microsoft extension functions) requires special handling. First, the LSP must keep track of the WSAOVERLAPPED structure the application passed into the function. It maintains useful information such as indicating I/O in progress, error codes, and bytes transferred. To perform this function, the LSP defines its own WSAOVERLAPPED structure to maintain information about each overlapped I/O operation posted to the lower provider's socket. This structure is defined as

typedef struct _WSAOVERLAPPEDPLUS
{
    WSAOVERLAPPED  ProviderOverlapped;  // passed to lower provider
    PROVIDER      *Provider;            // lower provider info
    SOCK_INFO     *SockInfo;            // socket info for this op
    SOCKET         CallerSocket;        // app (LSP) socket
    SOCKET         ProviderSocket;      // lower provider socket
    HANDLE         Iocp;                // LSP completion port
    int            Error;               // error code?
    union                               // Arguments to operation
    {
        ACCEPTEXARGS        AcceptExArgs;
        TRANSMITFILEARGS    TransmitFileArgs;
        CONNECTEXARGS       ConnectExArgs;
        TRANSMITPACKETSARGS TransmitPacketsArgs;
        DISCONNECTEXARGS    DisconnectExArgs;
        WSARECVMSGARGS      WSARecvMsgArgs;
        RECVARGS            RecvArgs;
        RECVFROMARGS        RecvFromArgs;
        SENDARGS            SendArgs;
        SENDTOARGS          SendToArgs;
        IOCTLARGS           IoctlArgs;
    };
#define LSP_OP_IOCTL               1     // WSPIoctl
#define LSP_OP_RECV                2     // WSPRecv
#define LSP_OP_RECVFROM            3     // WSPRecvFrom
#define LSP_OP_SEND                4     // WSPSend
#define LSP_OP_SENDTO              5     // WSPSendTo
#define LSP_OP_TRANSMITFILE        6     // TransmitFile
#define LSP_OP_ACCEPTEX            7     // AcceptEx
#define LSP_OP_CONNECTEX           8     // ConnectEx
#define LSP_OP_DISCONNECTEX        9     // DisconnectEx
#define LSP_OP_TRANSMITPACKETS    10     // TransmitPackets
#define LSP_OP_WSARECVMSG         11     // WSARecvMsg
    int             Operation;           // Type of operation this is
    LPWSATHREADID   lpCallerThreadId;    // Caller thread
    LPWSAOVERLAPPED lpCallerOverlapped;  // App's WSAOVERLAPPED struct
    LPWSAOVERLAPPED_COMPLETION_ROUTINE   lpCallerCompletionRoutine;
    _WSAOVERLAPPEDPLUS                *next;
} WSAOVERLAPPEDPLUS, * LPWSAOVERLAPPEDPLUS;

As you can see, a lot of information is maintained for each overlapped operation that the application initiates. We won't go into detail about all of these fields because many of them are self-explanatory. Instead, let's walk through what needs to occur when handling an overlapped call. The steps are:

Allocate an LSP-overlapped context structure (for example, the WSAOVERLAPPEDPLUS object shown in the last listing).
Save the caller's WSAOVERLAPPED pointer in the lpCallerOverlapped field.
If a completion routine is supplied, save it as lpCallerCompletionRoutine.
Mark the caller's WSAOVERLAPPED structure as pending by setting the Internal field to WSS_OPERATION_IN_PROGRESS (defined in WS2SPI.H).
Make sure the lower provider's socket is associated with the LSP's completion port.
Call the same SPI function in the lower provider with the lower provider's socket and the ProviderOverlapped field of the WSAOVERLAPPEDPLUS structure for this operation.
Return SOCKET_ERROR and set the error code to WSA_IO_PENDING.
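
The steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the sample's code: Overlapped and OverlappedPlus are trimmed stand-ins for WSAOVERLAPPED and WSAOVERLAPPEDPLUS, LowerPostFn is a hypothetical placeholder for the lower provider's SPI entry point, PostOverlapped is a made-up helper name, and the numeric constants mirror the values of SOCKET_ERROR, WSA_IO_PENDING, and WSS_OPERATION_IN_PROGRESS so that the sketch builds without the Windows headers.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <utility>

constexpr int kSocketError = -1;              // SOCKET_ERROR
constexpr int kIoPending = 997;               // WSA_IO_PENDING
constexpr std::uintptr_t kInProgress = 0x103; // WSS_OPERATION_IN_PROGRESS

struct Overlapped {                           // stand-in for WSAOVERLAPPED
    std::uintptr_t Internal;
    std::uintptr_t InternalHigh;
    std::uint32_t Offset;
    std::uint32_t OffsetHigh;
    void* hEvent;
};

using CompletionRoutine = void (*)();         // simplified signature

struct OverlappedPlus {                       // trimmed WSAOVERLAPPEDPLUS
    Overlapped ProviderOverlapped;            // passed to the lower provider
    int Operation;
    Overlapped* lpCallerOverlapped;
    CompletionRoutine lpCallerCompletionRoutine;
};

// Hypothetical stand-in for the lower provider's SPI function.
using LowerPostFn = int (*)(Overlapped* providerOv, int* lpErrno);

int PostOverlapped(Overlapped* callerOv, CompletionRoutine routine, int op,
                   LowerPostFn lowerPost,
                   std::unique_ptr<OverlappedPlus>& inFlight, int* lpErrno) {
    assert(callerOv && lowerPost && lpErrno);
    std::unique_ptr<OverlappedPlus> ctx(new OverlappedPlus()); // step 1
    ctx->lpCallerOverlapped = callerOv;                         // step 2
    ctx->lpCallerCompletionRoutine = routine;                   // step 3
    ctx->Operation = op;
    callerOv->Internal = kInProgress;                           // step 4
    // Step 5 (completion port association) is assumed done by the caller.
    int lowerErr = 0;
    int rc = lowerPost(&ctx->ProviderOverlapped, &lowerErr);    // step 6
    if (rc == kSocketError && lowerErr != kIoPending) {
        *lpErrno = lowerErr;      // genuine failure: nothing is in flight
        return kSocketError;
    }
    inFlight = std::move(ctx);    // must stay alive until completion
    *lpErrno = kIoPending;        // step 7
    return kSocketError;
}
```

Note that the sketch reports pending even when the lower provider succeeds immediately, and it hands ownership of the context to the caller so that the completion thread can release it later.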

These are the minimum steps required. The sample LSP performs a few extra steps. First, it saves the caller's parameters, such as buffer pointers and flags. These are saved in the unnamed union within the WSAOVERLAPPEDPLUS structure. The union contains a structure for each overlapped-enabled Winsock function, each with fields corresponding to the parameter list of its respective Winsock function. The sample LSP doesn't use the saved parameters, but an LSP that performs a specific task may need to. One important item to note is that some of the pointer parameters supplied can be stack-based, so the LSP cannot just capture the pointer values. For example, WSASend, WSASendTo, WSARecv, and WSARecvFrom take an array of WSABUF structures that contain the send or receive buffers. This array can be stack-based, which means that as soon as the Winsock call returns, the application may return from the calling function (or free the memory, if the array was dynamically allocated). The LSP must copy the buffer pointers into its own allocated WSABUF structures.
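
As a minimal sketch of that copy, assuming a stand-in WsaBuf type with the same two fields as WSABUF and a hypothetical helper named CopyBufferArray, the LSP could capture the descriptor array like this. Only the array of length and pointer pairs is copied; the data buffers they point to still belong to the application.

```cpp
#include <cassert>
#include <vector>

struct WsaBuf {          // mirrors the two fields of WSABUF
    unsigned long len;   // length of the buffer in bytes
    char* buf;           // pointer to the application's data buffer
};

// Copies the caller's (possibly stack-based) descriptor array into storage
// owned by the LSP so it remains valid after the caller's frame is gone.
std::vector<WsaBuf> CopyBufferArray(const WsaBuf* buffers, unsigned long count) {
    assert(buffers != nullptr || count == 0);
    if (count == 0) {
        return std::vector<WsaBuf>();
    }
    return std::vector<WsaBuf>(buffers, buffers + count);
}
```

The returned vector would typically be stored alongside the per-operation context and released when the operation completes.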

Once the overlapped I/O has been posted to the lower provider, it's simply a matter of waiting for the LSP's completion thread to receive notification for that operation. When the completion thread receives notification, the pointer to the WSAOVERLAPPED structure for the operation returned from GetQueuedCompletionStatus is actually our WSAOVERLAPPEDPLUS structure. The following three steps need to be performed to finish this operation.

Call the lower provider's WSPGetOverlappedResult to get bytes transferred, flags, and the appropriate error code in case of a failure.
Update the caller's WSAOVERLAPPED structure with Offset equal to the error (if any), OffsetHigh to the flags returned (if any), and InternalHigh to the bytes transferred.
Complete the application's overlapped request using either WPUQueueApc or WPUCompleteOverlappedRequest depending on whether a completion function was supplied.
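
The second and third steps might look roughly like the sketch below. It assumes the error, flags, and byte count have already been obtained from the lower provider in the first step. Overlapped is a stand-in for WSAOVERLAPPED, and the Notify enum and FinishOverlapped name are hypothetical; in a real LSP the two outcomes would correspond to calling WPUQueueApc or WPUCompleteOverlappedRequest.

```cpp
#include <cassert>
#include <cstdint>

struct Overlapped {                   // stand-in for WSAOVERLAPPED
    std::uintptr_t Internal;
    std::uintptr_t InternalHigh;
    std::uint32_t Offset;
    std::uint32_t OffsetHigh;
    void* hEvent;
};

using CompletionRoutine = void (*)(); // simplified signature

// Which upcall the LSP should use to notify the application.
enum class Notify { QueueApc, CompleteOverlappedRequest };

Notify FinishOverlapped(Overlapped* callerOv, CompletionRoutine routine,
                        std::uint32_t error, std::uint32_t flags,
                        std::uint32_t bytes) {
    assert(callerOv != nullptr);
    callerOv->Offset = error;         // error code, if any
    callerOv->OffsetHigh = flags;     // flags returned, if any
    callerOv->InternalHigh = bytes;   // bytes transferred
    // A supplied completion routine must run as an APC in the caller's
    // thread; otherwise the plain overlapped request is completed.
    return routine ? Notify::QueueApc : Notify::CompleteOverlappedRequest;
}
```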

The last step is what notifies the application that its I/O operation has completed. If a completion routine was supplied, the LSP needs to execute that function. This is performed by the WPUQueueApc function, which is a field of the WSPUPCALLTABLE structure and is defined as

int WSPAPI WPUQueueApc(
    IN LPWSATHREADID lpThreadId,
    IN LPWSAUSERAPC lpfnUserApc,
    IN DWORD_PTR dwContext,
    OUT LPINT lpErrno
    );

The first parameter is the thread ID of the application's thread that initiated this I/O because the completion routine must fire within the context of that thread. If you recall, this is one of the parameters saved in the WSAOVERLAPPEDPLUS structure when the application initiated the I/O. The second parameter is the application's completion function to call. The dwContext is the caller's original WSAOVERLAPPED structure, and lpErrno returns an error if WPUQueueApc fails.

If the application did not specify a completion routine and supplied only a WSAOVERLAPPED structure, the LSP completes the I/O with WPUCompleteOverlappedRequest. It's curious to note that this function is not a member of the WSPUPCALLTABLE. Instead it is contained in WS2_32.DLL and is called normally. This function is defined as

int WSPAPI WPUCompleteOverlappedRequest (
    SOCKET s,
    LPWSAOVERLAPPED lpOverlapped,
    DWORD dwError,
    DWORD cbTransferred,
    LPINT lpErrno
);

The parameter list is easy. The SOCKET parameter is the application's socket and lpOverlapped is its WSAOVERLAPPED structure. dwError is the error if the call failed (otherwise, it should be NO_ERROR). cbTransferred is the number of bytes transferred in the operation. The lpErrno parameter returns the error code if the WPUCompleteOverlappedRequest call fails.

You will notice that an overlapped operation that the LSP handles always returns SOCKET_ERROR with WSA_IO_PENDING, even though the operation could succeed immediately when the LSP calls the lower provider. The LSP does not report immediate success because, regardless of whether the operation succeeds immediately, notification will always be posted to the completion queue. Always processing completion notifications in the worker thread keeps the code a bit cleaner and is perfectly legal according to the Winsock specification. Care must be taken to ensure that the calling application receives only one notification per I/O operation. The sample LSP always returns pending and waits for the completion thread to receive the notification before completing the request.

Handling overlapped I/O on Windows 95, Windows 98, and Windows Me is a bit more challenging. There are two possible approaches. First, the LSP can issue the overlapped I/O to the lower layer and use events for completion notification. The drawback to this, as we saw in Chapter 5, is a single thread can only wait on MAXIMUM_WAIT_OBJECTS event handles (which is currently 64). The other method is to use completion functions, which is easier to implement.

When the calling application issues an overlapped request, the LSP builds a WSAOVERLAPPEDPLUS structure as we described earlier and then places this object in a queue. For this model, we still use a worker thread whose purpose is to wait for overlapped requests to be placed in the queue. Once the worker thread is notified of available work items, it removes an item from the queue and actually performs the requested overlapped operation (by calling the lower provider). It is important that these overlapped operations are executed in the context of the LSP thread and not an application thread. The calling thread must be in an alertable wait state for the completion functions to execute. Because the calling application should not have to be aware of whether the Winsock provider is layered, it most likely will not go into an alertable wait unless the application is using completion functions (which it may). As a result, the LSP's worker thread executes the overlapped requests and, when not servicing work items, remains in an alertable wait state.

Note that when the LSP issues the overlapped I/O with a completion function, the completion function supplied is the LSP's, not the application's. Once the LSP's completion function fires, the LSP will post the completion to the application via whatever notification mechanism the application supplied (such as signaling the event or firing the completion function).

Winsock Extension Functions

LSPs that create their own sockets must also handle the Winsock-specific extension functions that take socket handles as parameters. These include AcceptEx, TransmitFile, ConnectEx, DisconnectEx, TransmitPackets, and WSARecvMsg. This is done within the LSP's WSPIoctl function. When an application loads a Microsoft-specific function, it will call WSAIoctl with the ioctl code SIO_GET_EXTENSION_FUNCTION_POINTER. The LSP simply has to determine which function is being loaded via the InBuffer parameter, which contains the GUID for the requested function. Once that is done, the LSP returns the address of its own extension function. This extension function will then translate all the socket handles and load the extension function of the lower layer, which will be invoked by the LSP. This works even if the application uses the TransmitFile and AcceptEx functions exported directly from MSWSOCK.DLL because those functions simply end up calling WSAIoctl with SIO_GET_EXTENSION_FUNCTION_POINTER.

The sample LSP implements its own extension functions in EXTENSION.CPP. The implementation of these functions is the same as it is for the other SPI functions. The LSP must translate the handle, validate arguments as necessary, and handle the possibility of overlapped I/O. The code for WSPIoctl is contained in SPI.CPP, and you'll notice that its first step is to check whether the ioctl code is SIO_GET_EXTENSION_FUNCTION_POINTER.
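
The GUID lookup inside WSPIoctl can be sketched as below. This is a simplified illustration rather than the sample's code: Guid is a stand-in for the Windows GUID type, only two of the extension functions are shown, and LspAcceptEx, LspTransmitFile, and LookupExtension are hypothetical names whose signatures are reduced to a placeholder. The two GUID values and the ioctl code are believed to match WSAID_ACCEPTEX, WSAID_TRANSMITFILE, and SIO_GET_EXTENSION_FUNCTION_POINTER as published in the Winsock headers, but they should be taken from MSWSOCK.H and WINSOCK2.H in real code.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

struct Guid {                         // stand-in for the Windows GUID type
    std::uint32_t Data1;
    std::uint16_t Data2;
    std::uint16_t Data3;
    std::uint8_t Data4[8];
};

const Guid kWsaidAcceptEx = {0xb5367df1, 0xcbac, 0x11cf,
    {0x95, 0xca, 0x00, 0x80, 0x5f, 0x48, 0xa1, 0x92}};
const Guid kWsaidTransmitFile = {0xb5367df0, 0xcbac, 0x11cf,
    {0x95, 0xca, 0x00, 0x80, 0x5f, 0x48, 0xa1, 0x92}};

constexpr std::uint32_t kSioGetExtensionFunctionPointer = 0xC8000006;

// Placeholder signature; the real extension functions take many parameters.
using ExtensionFn = const char* (*)();
const char* LspAcceptEx() { return "LSP AcceptEx"; }
const char* LspTransmitFile() { return "LSP TransmitFile"; }

bool SameGuid(const Guid& a, const Guid& b) {
    return std::memcmp(&a, &b, sizeof(Guid)) == 0;
}

// Returns the LSP's own function for the requested GUID, or nullptr when the
// request is not one this sketch knows how to answer.
ExtensionFn LookupExtension(std::uint32_t ioctlCode, const void* inBuffer,
                            std::uint32_t inLength) {
    if (ioctlCode != kSioGetExtensionFunctionPointer ||
        inBuffer == nullptr || inLength < sizeof(Guid)) {
        return nullptr;
    }
    Guid requested;
    std::memcpy(&requested, inBuffer, sizeof(Guid));
    if (SameGuid(requested, kWsaidAcceptEx)) return &LspAcceptEx;
    if (SameGuid(requested, kWsaidTransmitFile)) return &LspTransmitFile;
    return nullptr;
}
```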

Miscellaneous Requirements

This section is devoted to the miscellaneous tasks that an LSP must perform to behave properly. In this section, we'll cover each service provider API in which an LSP must perform a special action.

WSPGetSockOpt

When the calling application calls the LSP's WSPGetSockOpt with SO_PROTOCOL_INFOA or SO_PROTOCOL_INFOW, the LSP should return its own protocol info structure and not translate the handle to pass to the lower provider. If that were the case, the call would return the lower provider's WSAPROTOCOL_INFO structure instead of the LSP's. Note that both the ANSI and UNICODE versions must be supported, so the LSP may have to perform the appropriate string conversions on the returned structure.
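
A minimal sketch of that routing decision follows. The Route enum and RouteGetSockOpt are hypothetical names, and the constants mirror the values of SOL_SOCKET, SO_PROTOCOL_INFOA, and SO_PROTOCOL_INFOW so that the sketch builds without the Winsock headers. The ANSI and UNICODE conversion of the returned structure is left out.

```cpp
#include <cassert>

constexpr int kSolSocket = 0xffff;        // SOL_SOCKET
constexpr int kSoProtocolInfoA = 0x2004;  // SO_PROTOCOL_INFOA
constexpr int kSoProtocolInfoW = 0x2005;  // SO_PROTOCOL_INFOW

enum class Route { AnswerFromLsp, ForwardToLower };

// Decides whether WSPGetSockOpt should be answered with the LSP's own
// protocol info or translated and passed down to the lower provider.
Route RouteGetSockOpt(int level, int optname) {
    const bool wantsProtocolInfo =
        level == kSolSocket &&
        (optname == kSoProtocolInfoA || optname == kSoProtocolInfoW);
    return wantsProtocolInfo ? Route::AnswerFromLsp : Route::ForwardToLower;
}
```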

WSPSetSockOpt

After an application calls AcceptEx, it typically calls setsockopt with SO_UPDATE_ACCEPT_CONTEXT. The argument passed to WSPSetSockOpt is the socket handle of the accepted socket. The LSP must translate that handle as well as the listening socket handle before passing the call to the lower provider.

WSPIoctl

There are a couple of ioctl codes that an LSP must handle differently. We've already mentioned that if an LSP is implementing its own extension functions (which it must if returning its own handles), it must capture the SIO_GET_EXTENSION_FUNCTION_POINTER command. In addition, it must capture the SIO_QUERY_TARGET_PNP_HANDLE command. The handles created by WPUCreateSocketHandle are not true plug-and-play handles and cannot receive notifications. As a result, applications can use SIO_QUERY_TARGET_PNP_HANDLE to obtain the base provider's socket handle. The LSP should return the lower provider's socket handle in the return buffer.
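
Answering SIO_QUERY_TARGET_PNP_HANDLE could be sketched like this. Socket is a stand-in for SOCKET, IoctlResult and AnswerPnpHandleQuery are hypothetical names, and the numeric constants mirror the values of SIO_QUERY_TARGET_PNP_HANDLE and WSAEFAULT. Failing with WSAEFAULT when the output buffer is too small is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

using Socket = std::uintptr_t;   // plays the role of SOCKET

constexpr std::uint32_t kSioQueryTargetPnpHandle = 0x48000018;
constexpr int kWsaEFault = 10014;                 // WSAEFAULT

enum class IoctlResult { NotHandled, Done, Failed };

// Writes the lower provider's socket handle into the caller's output buffer
// when the ioctl code asks for the plug-and-play target handle.
IoctlResult AnswerPnpHandleQuery(std::uint32_t code, Socket lowerSocket,
                                 void* outBuffer, std::uint32_t outLength,
                                 std::uint32_t* bytesReturned, int* lpErrno) {
    assert(bytesReturned != nullptr && lpErrno != nullptr);
    if (code != kSioQueryTargetPnpHandle) {
        return IoctlResult::NotHandled;   // some other ioctl: normal path
    }
    if (outBuffer == nullptr || outLength < sizeof(Socket)) {
        *lpErrno = kWsaEFault;            // buffer cannot hold a handle
        return IoctlResult::Failed;
    }
    std::memcpy(outBuffer, &lowerSocket, sizeof(Socket));
    *bytesReturned = static_cast<std::uint32_t>(sizeof(Socket));
    return IoctlResult::Done;
}
```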

WSPJoinLeaf

The WSAJoinLeaf function is a bit odd. Depending on the protocol, the return value is either a new socket handle (as in ATM) or the same handle passed in as the s parameter (as in IP multicasting). See Chapter 9 for more information about WSAJoinLeaf and its behavior with the various multicast-enabled protocols. Currently, IPv4 and IPv6 are the only protocols that do not create new socket handles when WSAJoinLeaf is called. If your LSP is to be layered over IP, it should take this into account. Otherwise, if it created new handles along with their context information, these structures would be leaked because the calling application will call closesocket on just one of the handles.
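
One way to take this into account is to compare the handle returned by the lower provider with the one passed down, as in the sketch below. Socket, CreateHandleFn, and MapJoinLeafResult are hypothetical names; the callback stands in for whatever the LSP does to create a new application handle and socket context.

```cpp
#include <cassert>
#include <cstdint>

using Socket = std::uintptr_t;                       // plays the role of SOCKET
constexpr Socket kInvalidSocket = ~static_cast<Socket>(0);  // INVALID_SOCKET

// Hypothetical hook that creates a new app-visible handle and its context.
using CreateHandleFn = Socket (*)(Socket lowerSocket);

// Maps the lower provider's WSPJoinLeaf result to the handle the LSP returns.
Socket MapJoinLeafResult(Socket appSocket, Socket lowerSocket,
                         Socket lowerResult, CreateHandleFn createAppHandle) {
    assert(createAppHandle != nullptr);
    if (lowerResult == kInvalidSocket) {
        return kInvalidSocket;           // the join failed: nothing to map
    }
    if (lowerResult == lowerSocket) {
        return appSocket;                // IP multicast: same socket, no new context
    }
    return createAppHandle(lowerResult); // ATM-style: genuinely new socket
}
```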

WSPAddressToString and WSPStringToAddress

These functions are unique because they do not take a socket parameter. Instead, the WSAPROTOCOL_INFOW structure of the LSP entry that matches the given address is passed in. The LSP should find the provider layered beneath the supplied WSAPROTOCOL_INFOW structure and call that provider's corresponding SPI function. The only rule is if the underlying provider is not a base provider, the WSAPROTOCOL_INFOW structure should be passed unmodified. Otherwise, if the underlying provider is a base provider, the LSP should substitute the base provider's WSAPROTOCOL_INFOW structure.
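
The substitution rule can be sketched as a small selection function. ProtocolInfo is a heavily trimmed, hypothetical stand-in for WSAPROTOCOL_INFOW, InfoForLowerCall is a made-up helper name, and the sketch assumes a chain length of 1 (the value of BASE_PROTOCOL) identifies a base provider.

```cpp
#include <cassert>
#include <cstdint>

constexpr int kBaseProtocol = 1;          // BASE_PROTOCOL chain length

struct ProtocolInfo {                     // trimmed WSAPROTOCOL_INFOW
    std::uint32_t dwCatalogEntryId;
    int ChainLen;                         // ProtocolChain.ChainLen
};

// Picks the structure to hand to the lower provider's WSPAddressToString or
// WSPStringToAddress: the base provider's own entry when the lower provider
// is a base provider, otherwise the caller's structure unmodified.
const ProtocolInfo& InfoForLowerCall(const ProtocolInfo& fromCaller,
                                     const ProtocolInfo& lowerProvider) {
    return lowerProvider.ChainLen == kBaseProtocol ? lowerProvider
                                                   : fromCaller;
}
```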

Debugging an LSP

Developing an LSP is a complicated task in which one mistake will probably break all applications accessing Winsock for the protocols the LSP is layered over. In the case of IP, critical services such as LSASS will fail. If this does happen, booting into safe mode and uninstalling the LSP will restore the Winsock catalog to normal. Also, it is a good idea to smoke test the LSP before rebooting the system. Internet Explorer is always a good test application (when the LSP is layered over IP). Otherwise, it may be necessary to write a small suite of test applications to verify the LSP's functionality.

For tracking down minor problems with applications, printing debug messages to the debugger can be invaluable. The sample LSP we've provided uses OutputDebugString in several places; it also has the ability to turn on verbose debugging by defining DEBUG and DEBUGSPEW for the project. Using message boxes for debug messages is a bad idea because during the boot process several system services can load the DLL before the user interface subsystem is fully initialized, which will cause the LSP DLL to fail during load.

For especially difficult problems, it is often necessary to use a debugger to determine the point of failure. For interactive user applications, the Visual Studio debugger and the NT Symbolic Debugger (NTSD), a text-mode debugger available with the Platform SDK, are both excellent choices. In general, tracing the steps of socket creation through the various APIs called on that socket will track down the problem. For NTSD, this is accomplished by enabling “break on load” (the NTSD command is sxe ld) for each DLL loaded until the LSP DLL is loaded. At this point, breakpoints may be set for the LSP's functions of interest (such as WSPStartup, WSPSocket, and WSPConnect).

If problems occur with system services such as LSASS during boot, debugging is much more complicated. This requires a kernel mode debugger to be attached to the machine running the LSP. Then it is possible to attach NTSD to the failing system service and pipe the NTSD console to the kernel debugger running on the second machine. For information about using and setting up the various types of debuggers, consult the Microsoft Developer Network (MSDN) online at http://msdn.microsoft.com.

LSP Sample

Throughout this discussion we have referred to the sample LSP on the CD in the directory LSP. In this section, we'll briefly describe each file of the project as well as how to install the LSP. The following is a list of files and what they implement.

ASYNCSELECT.CPP Implements helper routines used for handling WSAAsyncSelect. This includes creating the hidden window for receiving events from the lower provider as well as the window procedure that services those notifications.
EXTENSION.CPP Implements all of the Microsoft-specific Winsock extensions available, such as AcceptEx, TransmitFile, TransmitPackets, ConnectEx, DisconnectEx, and WSARecvMsg.
INSTLSP.CPP Implements the installation and removal code. This file is compiled into an .EXE that will install and/or remove the LSP from the Winsock catalog.
OVERLAP.CPP Implements handling overlapped I/O for the LSP. For Windows NT, this includes creating the completion port as well as the worker thread for handling completion notifications. For Windows 95, Windows 98, and Windows Me, this includes establishing a work item queue and a worker thread that services I/O placed within the queue.
PROVIDER.CPP Implements common routines for enumerating the Winsock catalog as well as defining the GUID under which the LSP is installed. These routines are used by both the LSP DLL and the installation utility (INSTLSP.CPP).
SOCKINFO.CPP Implements common routines for looking up associated socket context structures for sockets that the LSP creates. This file also contains functions for allocating and freeing SOCK_INFO structures in addition to inserting and deleting them from the PROVIDER structures (which maintain a list of all sockets that provider created).
SPI.CPP This is the “guts” of the LSP. It defines all of the WSP* functions, including WSPStartup.