12.5 PID and MID Revealed

Simply put:

a PID identifies a client process,
a [PID, MID] pair identifies a thread within a process.

That's the idea, anyway. The client provides values for these fields when it sends a request to the server, and the server is supposed to echo the values back in the response. That way, the client can match the reply to the original request.

Some systems (such as Windows and OS/2) multiplex all of the SMB traffic between a client and a server over a single TCP connection. If the client OS is multi-tasking, there may be several active SMB sessions running concurrently, so there may be several requests outstanding at any given time. The SMB conversations are all intertwined, so the client needs a way to sort out the replies and hand them off to the correct thread within the correct process (see Figure 12.3).

Figure 12.3. Multiplexing SMB over a single TCP connection

Instead of opening individual TCP connections, one per process, some systems multiplex all SMB traffic to a given server over a single connection.
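To make the sorting-out step a little more concrete, here is a minimal sketch of how a client might track outstanding requests and match each reply to its sender using the echoed PID and MID. The names `pending_req`, `pending_add()`, and `pending_match()`, the table size, and the `waiter` handle are all invented for illustration; a real client would have its own bookkeeping and would need locking around the table.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_PENDING 64   /* arbitrary table size for this sketch */

/* One outstanding request, keyed by the [PID, MID] pair it was sent with. */
typedef struct
  {
  int      in_use;   /* nonzero if this slot holds a request      */
  uint16_t pid;      /* client process identifier                 */
  uint16_t mid;      /* multiplex identifier (thread, by habit)   */
  void    *waiter;   /* hypothetical handle for the waiting thread */
  } pending_req;

static pending_req pending[MAX_PENDING];

/* Record a request just before it is sent.
 * Returns 0 on success, or -1 if the table is full.
 */
int pending_add( uint16_t pid, uint16_t mid, void *waiter )
  {
  int i;

  for( i = 0; i < MAX_PENDING; i++ )
    {
    if( !pending[i].in_use )
      {
      pending[i].in_use = 1;
      pending[i].pid    = pid;
      pending[i].mid    = mid;
      pending[i].waiter = waiter;
      return( 0 );
      }
    }
  return( -1 );
  }

/* Called when a reply arrives.  Look up the request that carries the
 * PID and MID echoed by the server, free its slot, and return the
 * waiter handle.  Returns NULL if no request matches.
 */
void *pending_match( uint16_t pid, uint16_t mid )
  {
  int i;

  for( i = 0; i < MAX_PENDING; i++ )
    {
    if( pending[i].in_use
     && pending[i].pid == pid
     && pending[i].mid == mid )
      {
      pending[i].in_use = 0;
      return( pending[i].waiter );
      }
    }
  return( NULL );
  }
```

The receive loop would call `pending_match()` with the two values copied out of each reply header and wake whichever thread the returned handle refers to.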

The PID field is also used to maintain the semantics of local file I/O. Think about a simple program, like the one in Listing 12.1, which opens a file in read-only mode and dumps the contents. Consider, in particular, the call to the open() function, which returns a file descriptor. File descriptors are maintained on a per-process basis; that is, each process has its own private set. The descriptor is an integer used by the operating system to identify an internal record that keeps track of lots of information about the open file, such as:

Is the file open for reading, writing, or both?
What is the current file pointer offset within the file?
Do we have any locks on the file?

Listing 12.1 Quick dump

    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <sys/types.h>
    #include <sys/stat.h>
    #include <fcntl.h>

    #define bSIZE 1024

    int main( int argc, char *argv[] )
      /* ---------------------------------------------------- **
       * Copy file contents to stdout.
       * ---------------------------------------------------- **
       */
      {
      int      fd_in;
      int      fd_out;
      ssize_t  count;
      char     bufr[bSIZE];

      if( argc != 2 )
        {
        (void)fprintf( stderr,
                       "Usage: %s <filename>\n", argv[0] );
        exit( EXIT_FAILURE );
        }

      fd_in = open( argv[1], O_RDONLY );
      if( fd_in < 0 )
        {
        perror( "open()" );
        exit( EXIT_FAILURE );
        }

      fd_out = fileno( stdout );
      do
        {
        count = read( fd_in, bufr, bSIZE );
        if( count > 0 )
          count = write( fd_out, bufr, (size_t)count );
        } while( bSIZE == count );

      if( count < 0 )
        {
        perror( "read()/write()" );
        exit( EXIT_FAILURE );
        }

      (void)close( fd_in );
      exit( EXIT_SUCCESS );
      } /* main */

Now take all of that and stretch it out across a network. The files physically reside on the server and information about locks, offsets, etc. must be kept on the server side. The process that has opened the files, however, resides on the client and all of the file status information is relevant within the context of that process. That brings us back to what we said before: The PID identifies a client process. It lets the server keep track of client context, and associate it correctly with the right customer when the requests come rolling in.
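As a rough illustration of that idea, the sketch below shows one way a server might file its per-open-file records under the PID of the client process that opened them. Everything here is hypothetical: the `open_file` record, the `fid` handle, the table size, and the `ctx_open()` and `ctx_lookup()` functions are made up for the example and are not taken from any real server.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_OPEN 32   /* arbitrary table size for this sketch */

/* Server-side record for one open file (hypothetical layout). */
typedef struct
  {
  int      in_use;     /* nonzero if this record is active          */
  uint16_t pid;        /* client process that opened the file       */
  uint16_t fid;        /* made-up handle handed back to the client  */
  int      can_read;   /* open for reading?                         */
  int      can_write;  /* open for writing?                         */
  uint64_t offset;     /* current file pointer offset               */
  } open_file;

static open_file open_files[MAX_OPEN];

/* Create a record on behalf of client process <pid>.
 * Returns a pointer to the record, or NULL if the table is full.
 */
open_file *ctx_open( uint16_t pid, int can_read, int can_write )
  {
  uint16_t i;

  for( i = 0; i < MAX_OPEN; i++ )
    {
    if( !open_files[i].in_use )
      {
      open_files[i].in_use    = 1;
      open_files[i].pid       = pid;
      open_files[i].fid       = i;
      open_files[i].can_read  = can_read;
      open_files[i].can_write = can_write;
      open_files[i].offset    = 0;
      return( &open_files[i] );
      }
    }
  return( NULL );
  }

/* Find the record for <fid>, but only if it belongs to <pid>.
 * A request from a different client process gets NULL.
 */
open_file *ctx_lookup( uint16_t pid, uint16_t fid )
  {
  if( fid >= MAX_OPEN )
    return( NULL );
  if( !open_files[fid].in_use || open_files[fid].pid != pid )
    return( NULL );
  return( &open_files[fid] );
  }
```

The point of the PID check in `ctx_lookup()` is that the offset and access mode stay private to the process that opened the file, just as a local file descriptor would.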

Further complicating things, some clients support multiple threads running within a process. Threads share context (memory, file descriptors, etc.) with their sister threads within the same process, but each thread may generate SMB traffic all on its own. The MID field is used to make sure that server replies get back to the thread that sent the request. The server really doesn't do much with the MID. It just echoes it back to the client, so the client could, in fact, make whatever use it wanted of the MID field. Using it as a thread identifier is probably the most practical thing to do.

There is an important rule which the client should obey with regard to the MID and PID fields: only one SMB request should ever be outstanding per [PID, MID] pair per connection. The reason for this rule is that the client will generally need to know the result of a request before sending the next request, especially if an error occurred. The problems which might result should this rule be broken probably depend upon the server, but defensive programming practices would suggest avoiding trouble.
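One simple way to stay on the right side of that rule is to give each thread a small record for its [PID, MID] pair and refuse to send while a request is still outstanding. The following is only a sketch under that assumption; `smb_slot`, `slot_begin()`, and `slot_end()` are names invented here, and since each thread is assumed to own its own slot no locking is shown.

```c
#include <stdint.h>

/* Per-thread bookkeeping for one [PID, MID] pair on one connection. */
typedef struct
  {
  uint16_t pid;    /* client process identifier                */
  uint16_t mid;    /* multiplex identifier for this thread     */
  int      busy;   /* nonzero while a request is outstanding   */
  } smb_slot;

/* Claim the slot before sending a request.
 * Returns 0 if it is safe to send, or -1 if an earlier request
 * for this pair has not been answered yet.
 */
int slot_begin( smb_slot *slot )
  {
  if( slot->busy )
    return( -1 );
  slot->busy = 1;
  return( 0 );
  }

/* Release the slot once the reply (or error) has been handled. */
void slot_end( smb_slot *slot )
  {
  slot->busy = 0;
  }
```

A thread would call `slot_begin()` before each send and `slot_end()` after it has examined the result, which also guarantees that it sees any error before issuing the next request.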

12.5.1 `EXTRA.PidHigh` Dark Secrets Uncovered

Earlier on we promised to cover the EXTRA.PidHigh field. Well, a promise is a promise...

The PidHigh field is supposed to be a PID extension, allowing the use of 32-bit rather than 16-bit values as process identifiers. As with all extensions, however, there is the basic problem of backward compatibility.

In this case, trouble shows up if (and only if) the client supports 32-bit process IDs but the server does not. In that situation, the client must have a mechanism for mapping 32-bit process IDs to 16-bit values that can fit into the PID field. It doesn't need to be an elaborate mapping scheme, and it is unlikely that there will be 64K client processes talking to the same server at the same time, so it should be a simple problem to solve.
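Here is one possible shape for such a mapping, offered as a sketch rather than a recommendation. It keeps a small table of 32-bit local process IDs and uses the table index as the 16-bit value placed in the PID field. The `pid_map` type, its size, and the `pid_map_get()` and `pid_map_release()` functions are assumptions made for the example.

```c
#include <stdint.h>

#define MAP_SIZE 256   /* arbitrary; must not exceed 65536 */

/* Table of 32-bit local PIDs; the slot index is the 16-bit wire PID.
 * A zero-filled pid_map is an empty map.
 */
typedef struct
  {
  uint32_t local_pid[MAP_SIZE];
  int      used[MAP_SIZE];
  } pid_map;

/* Return the 16-bit value to use for <local_pid>, assigning a free
 * slot if this process has not been seen before.
 * Returns -1 if the table is full.
 */
int pid_map_get( pid_map *m, uint32_t local_pid )
  {
  int i;
  int free_slot = -1;

  for( i = 0; i < MAP_SIZE; i++ )
    {
    if( m->used[i] && m->local_pid[i] == local_pid )
      return( i );                 /* already mapped */
    if( !m->used[i] && free_slot < 0 )
      free_slot = i;               /* remember first free slot */
    }

  if( free_slot < 0 )
    return( -1 );

  m->used[free_slot]      = 1;
  m->local_pid[free_slot] = local_pid;
  return( free_slot );
  }

/* Forget the mapping for <local_pid>, e.g. when the process exits. */
void pid_map_release( pid_map *m, uint32_t local_pid )
  {
  int i;

  for( i = 0; i < MAP_SIZE; i++ )
    {
    if( m->used[i] && m->local_pid[i] == local_pid )
      m->used[i] = 0;
    }
  }
```

A linear search is plenty for a table this small. The same process always gets the same 16-bit value for as long as its mapping exists, which is what the server needs in order to keep its context straight.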

Since that mapping mechanism needs to be in place in order for the client to work with servers that don't support the PidHigh field, there's no reason to use 32-bit process IDs at all. In testing, it appears as though the PidHigh field is, in fact, always zero (except in some obscure security negotiations that are still not completely understood). Best bet, leave it zero.