Appendix B -- Reacting to Service Failures

In Chapter 14, I created a service that monitors other services, servers, and databases on a network. During the discussion of network monitoring, I noted that having an operations person periodically check the status of various servers was much like a computer program that polled, or looked for input periodically in a somewhat inefficient way. I then described the MonitorService program that, in effect, did the polling for the operations people and interrupted them when needed.

Microsoft Windows 2000 offers one additional way to monitor a service that is closer to a true interrupt model. Using the new Win32 API function ChangeServiceConfig2, you can tell Windows 2000 what it should do if a service fails. The meaning of a service failing is straightforward and somewhat limited: for purposes of ChangeServiceConfig2, a service has failed if it terminates without reporting a SERVICE_STOPPED status. Thus, the method of monitoring service failure covered in this appendix is of no value if the entire server has failed. Nonetheless, ChangeServiceConfig2 can be useful for monitoring services that fail in the way the function defines failure.

Let's take a look at ChangeServiceConfig2.

 BOOL ChangeServiceConfig2(SC_HANDLE hService,
     DWORD dwInfoLevel,
     LPVOID lpInfo);

The first parameter is a handle to the service, opened using OpenService or CreateService. The second parameter, dwInfoLevel, can be one of two constants:

  • SERVICE_CONFIG_DESCRIPTION allows the description for the specified service to be changed. In this case, the third parameter, lpInfo, points to a SERVICE_DESCRIPTION structure. This structure contains a single member, lpDescription.
  • SERVICE_CONFIG_FAILURE_ACTIONS allows the actions taken when the service fails (as defined above) to be changed. In this case, lpInfo points to a SERVICE_FAILURE_ACTIONS structure, which is defined as follows:

 typedef struct _SERVICE_FAILURE_ACTIONS {
     DWORD dwResetPeriod;
     LPTSTR lpRebootMsg;
     LPTSTR lpCommand;
     DWORD cActions;
     SC_ACTION *lpsaActions;
 } SERVICE_FAILURE_ACTIONS;

The first member, dwResetPeriod, indicates the number of seconds that must elapse without a failure before the failure count is reset to 0. Setting dwResetPeriod to INFINITE means that the failure count is never reset. If you choose to have the server reboot as the result of a failure (using the method described below), lpRebootMsg is the message broadcast to all users before the reboot. If lpRebootMsg is NULL, the reboot message is not changed, and if it points to an empty string, the reboot message is deleted and no message will be broadcast.

You can also choose to have the command specified in lpCommand run when a service fails. If lpCommand is NULL, the command is unchanged, and if lpCommand points to an empty string, the command is removed.

The last two members of the structure are cActions and lpsaActions. You can specify any number of actions to take place when a failure occurs, with the number of actions specified in cActions. The Service Control Manager (SCM) uses the array of SC_ACTION structures pointed to by lpsaActions to determine what action to perform when a failure takes place. The offset within the array indicates which action should take place. For example, when the service fails for the first time, the SCM performs the action described by the SC_ACTION structure at offset 0 (the number of failures minus one). The second failure causes the SCM to look at the action at offset 1, and so on. The SC_ACTION structure is defined as follows:

 typedef struct _SC_ACTION {
     SC_ACTION_TYPE Type;
     DWORD Delay;
 } SC_ACTION;

The first member, Type, can be one of the following values:

SC_ACTION_NONE No action is taken.
SC_ACTION_REBOOT Reboot the server.
SC_ACTION_RESTART Restart the service.
SC_ACTION_RUN_COMMAND Run a command.

The second member of SC_ACTION is Delay, which specifies the time to wait, in milliseconds, before performing the action. A delay is generally specified when rebooting the server or restarting the service. Before a reboot, the delay gives users a chance to save their work; before a restart, it may give the system time to recover before the restart is attempted.
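
The offset rule described earlier (the Nth failure selects the element at offset N minus one) can be modeled in a few lines. The sketch below is only a model of that lookup, not SCM code: ActionType, Action, and SelectAction are hypothetical names introduced for illustration. The clamp to the last element reflects the documented behavior that the SCM repeats the final action once the failure count exceeds cActions.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for illustration. They mirror the SC_ACTION members
// described above but are not the Win32 declarations.
enum class ActionType { None, Restart, Reboot, RunCommand };

struct Action {
    ActionType type;
    unsigned long delayMs;  // corresponds to SC_ACTION::Delay, in milliseconds
};

// Returns the action for the Nth failure (failureCount is 1-based).
// Element [N - 1] is used; once N exceeds the array length, the last
// element is repeated. An empty array means no action is taken.
Action SelectAction(const std::vector<Action>& actions, std::size_t failureCount) {
    if (actions.empty() || failureCount == 0) {
        return Action{ActionType::None, 0};
    }
    std::size_t index = failureCount - 1;
    if (index >= actions.size()) {
        index = actions.size() - 1;
    }
    return actions[index];
}
```

With a two-element array, the first failure selects element 0, and the second and every later failure select element 1 until the failure count is reset.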

One approach would be to allow the first failure to run a command, perhaps running a program notifying an administrator of a failure and attempting to restart the service from within the program run on the command line. The command could also trigger a message to some network monitoring software, such as the MonitorService example from Chapter 14. A second failure might cause a reboot of the server if the service that failed was critical.
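
The two-step approach above might be configured along the following lines. This is a sketch under assumptions, not a tested deployment: FillTwoStepPolicy and ApplyTwoStepPolicy are hypothetical helpers, the command path and reboot message are placeholders, the one-day reset period is arbitrary, and the service handle is assumed to have been opened with SERVICE_CHANGE_CONFIG access. The stand-in declarations in the #else branch exist only so the structure-filling code compiles off Windows.

```cpp
#ifdef _WIN32
#include <windows.h>
#else
// Stand-ins so the sketch also compiles off Windows. On Windows the real
// declarations come from <windows.h>.
typedef unsigned int DWORD;
enum SC_ACTION_TYPE {
    SC_ACTION_NONE = 0, SC_ACTION_RESTART = 1,
    SC_ACTION_REBOOT = 2, SC_ACTION_RUN_COMMAND = 3
};
struct SC_ACTION { SC_ACTION_TYPE Type; DWORD Delay; };
struct SERVICE_FAILURE_ACTIONSA {
    DWORD dwResetPeriod; char* lpRebootMsg; char* lpCommand;
    DWORD cActions; SC_ACTION* lpsaActions;
};
#endif

// Hypothetical helper: fills in a two-step policy.
//   1st failure: run `command` immediately.
//   2nd failure: reboot after one minute.
// `actions` must outlive `sfa`, which only stores a pointer to it.
void FillTwoStepPolicy(SERVICE_FAILURE_ACTIONSA& sfa, SC_ACTION (&actions)[2],
                       char* command, char* rebootMsg) {
    actions[0].Type = SC_ACTION_RUN_COMMAND;
    actions[0].Delay = 0;              // milliseconds
    actions[1].Type = SC_ACTION_REBOOT;
    actions[1].Delay = 60 * 1000;      // milliseconds

    sfa.dwResetPeriod = 24 * 60 * 60;  // seconds (assumed value: one day)
    sfa.lpRebootMsg = rebootMsg;
    sfa.lpCommand = command;
    sfa.cActions = 2;
    sfa.lpsaActions = actions;
}

#ifdef _WIN32
// Hypothetical helper: applies the policy to an already opened service handle.
bool ApplyTwoStepPolicy(SC_HANDLE hService) {
    char command[] = "C:\\Tools\\NotifyAdmin.exe";  // placeholder program
    char rebootMsg[] = "A critical service failed. This server will reboot.";
    SC_ACTION actions[2];
    SERVICE_FAILURE_ACTIONSA sfa;
    FillTwoStepPolicy(sfa, actions, command, rebootMsg);
    return ChangeServiceConfig2A(hService, SERVICE_CONFIG_FAILURE_ACTIONS,
                                 &sfa) != FALSE;
}
#endif
```

Note that dwResetPeriod is expressed in seconds while Delay is expressed in milliseconds. Because the SCM repeats the last action in the array for later failures, a policy that ends in SC_ACTION_REBOOT keeps rebooting on each subsequent failure until the count resets, so it is worth choosing the reset period with care. Configuring a reboot action also requires the caller to hold the SE_SHUTDOWN_NAME privilege.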

Another new Win32 API function, QueryServiceConfig2, retrieves information about the service description or failure behavior. Its prototype is:

 BOOL QueryServiceConfig2(SC_HANDLE hService,
     DWORD dwInfoLevel,
     LPBYTE lpBuffer,
     DWORD cbBufSize,
     LPDWORD pcbBytesNeeded);

The meanings of the first three parameters are similar to the meanings of the parameters for ChangeServiceConfig2, except that lpBuffer receives the information rather than supplying it. The fourth parameter, cbBufSize, is the size in bytes of the buffer pointed to by lpBuffer. The final parameter, pcbBytesNeeded, is a pointer to a DWORD that will contain the size the buffer must be if QueryServiceConfig2 fails with GetLastError returning ERROR_INSUFFICIENT_BUFFER. Because the size of the data returned in lpBuffer is variable and cannot be determined in advance, one strategy is to call the function one time with a zero-length buffer to get the length required and then again using the length returned in pcbBytesNeeded.

For example, the following code gets the length required, allocates a buffer of the correct size, and then calls QueryServiceConfig2 with a properly sized buffer:

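 DWORD dwRequired = 0;
 SERVICE_FAILURE_ACTIONS *pActions = NULL;

 hService = OpenService(hSCM, "MonitorService", SERVICE_ALL_ACCESS);

 // First call: the zero-length buffer makes the call fail and report
 // the size required in dwRequired.
 if (!QueryServiceConfig2(hService, SERVICE_CONFIG_FAILURE_ACTIONS,
         NULL, 0, &dwRequired) &&
     GetLastError() == ERROR_INSUFFICIENT_BUFFER)
 {
     pActions = (SERVICE_FAILURE_ACTIONS *)new BYTE[dwRequired];
     if (pActions)
     {
         // Second call: the buffer is now large enough.
         if (QueryServiceConfig2(hService, SERVICE_CONFIG_FAILURE_ACTIONS,
                 (LPBYTE)pActions, dwRequired, &dwRequired))
         {
             // Do something...
         }
         // Release the buffer with the type it was allocated as.
         delete [] (BYTE *)pActions;
     }
 }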


Inside Server-Based Applications
ISBN: 1572318171
Year: 1999
Pages: 91
