Creating a Service Level Agreement: An Example

   


The final section in this chapter describes a hypothetical scenario that requires a service level agreement to satisfy the needs of the business. It is intended to demonstrate the kinds of issues that must be addressed and to provide an example of the complete process.

Scenario

CoverMe, Inc., is a company that provides life insurance services to the general public. The sales personnel can receive orders and queries via a number of channels:

  • Over the telephone

  • From the Internet Web site (these are then forwarded by email)

  • Via direct email

  • In writing by mail (confirmation letters, policy cancellation requests, and so on)

The sales department comprises 25 staff, all of whom have a personal computer connected to the LAN and WAN. Each PC runs an in-house written application to access the sales and inquiry database. The database is held on a host computer system running Solaris 7 and the relational database management software. The host system is physically housed in the computer room, in the basement of the building. The application is GUI-based and makes use of client/server technology.

For some time, the sales staff have been complaining to the manager that it is taking too long to access a customer's record and that they are losing business as a result. This is becoming more evident when dealing with customers on the telephone. The staff members are having to apologize for the delay, which is sometimes greater than two minutes; some customers are even hanging up the phone. The problem is not so critical when dealing with queries or orders from other channels because the customer is not being kept waiting.

Several calls have been logged with the help desk system regarding the slow response time of the application, but the response time has not improved, and some of the calls registered are still "open."

At a recent management meeting, the issue was highlighted. The system manager had no knowledge of the problem and agreed to carry out investigations, which resulted in a slight improvement in performance. This was due to the rescheduling of other background work that was running on the host system: it was moved outside core business hours (which are between 09:00 and 18:30) in an attempt to reduce the load on the system during the working day.

Senior management has now become involved and has suggested that a service level agreement be drawn up between the IT service provider and the sales department. The reasons for this are detailed here:

  • A performance problem is costing the company in lost business. This is not quantifiable, but it could potentially affect repeat business from existing customers.

  • The IT department cannot produce sufficient supporting data to demonstrate that the systems are providing the required level of service.

  • The sales department has not made clear its objectives and priorities.

  • There appears to be a breakdown in communication between the help desk and the IT department.

  • The business requires a definition of what constitutes an acceptable level of service. It also requests that the necessary monitoring tools be put in place to measure the performance against defined targets. The information is to be used to identify where improvements can be made. A cost/benefit analysis can then be carried out to ensure that any expenditure will provide value for money.

The Service Level Agreement

The scenario described in the previous section clearly shows the need for a service level agreement. This section outlines the step-by-step process of creating a mutually acceptable agreement.

Step 1: Obtain the Agreement

First, before any progress can be made, the senior management from the IT department and the sales department agree that there is a requirement for a service level agreement to be put in place. It is clear that the service being provided by IT is insufficient for the sales team and also that the sales department needs to clearly identify what its expectations are and to prioritize them accordingly.

After a brief meeting involving the two senior managers, the system manager and the sales manager, it is accepted by all parties that the agreement is the only way forward. From the IT department's perspective, the creation of the agreement may have a positive outcome in that deficiencies that are deemed unacceptable can be used as justification for securing additional funding to improve performance. The benefit is recognized as being mainly in the area of telephone sales, where business will not be lost as a result of slow system response times.

Step 2: Assemble the Team

The initial team is selected and comprises the following:

  • The IT department: This involves the system manager and a senior system administrator. Input will be required from a database administrator, the PC support section, and the network management team. A member of the development services department will also be required to discuss the workings of the application that is causing the problem.

  • The sales department: This involves the sales manager and a telesales supervisor. Input will be required from members of the sales team, one from telephone sales and one from Internet-related sales.

  • The help desk manager: This person will be required to attend at some point in the discussions to describe any existing service level agreements between his section and the business, and to be a part of this one. The relevance in this case is the link between the help desk and the IT department.

Step 3: Prepare for Providing Services

The system manager looks at the service being provided to ensure that any services that he is using are guaranteed, or at least subject to an agreement of their own. If this is not the case, then he is not in a position to be able to guarantee a level of service to anyone else. For this example, the provision of network services and PC support services must be addressed. Host system administration, host system management, and database administration are already his direct responsibility. He also must be sure that any service being provided can be measured. Checking with the networking team reveals that a new system for end-to-end monitoring has recently been installed and can cater to his needs.

The available data covers performance between computer systems, on both the LAN and the WAN. Performance figures are available for both the database itself and the Solaris server. That leaves only the PC on the user's desktop, where the application runs and for which no performance data is available. Performance monitoring is enabled on a selection of the PCs to allow information to be gathered. Initial investigations also reveal that the development services section has received problem reports via the help desk, but these have not yet been acted on due to a shortage of resources and other priorities.

The sales team must identify its expectations of the service being provided and also provide clear details of the priorities that exist. Team members decide that the telephone queries must be given the highest priority because there are customers waiting, so the fastest response is required for this type of query or order. Obviously, the team would like an instantaneous response time, but this is not a realistic expectation.

The team carries out some timing tests to identify as accurately as possible the current response times over a period of time, taking the average figure as the current level of service. This figure is calculated as being 1 minute, 35 seconds. The results of the users' tests are shown in Figure 3.4. It is agreed that this is an unacceptable level of service. The level that is deemed to be acceptable is between three and seven seconds. There is concern that the level of improvement being requested will be too great, so the sales manager talks to the accounting department, which often must run similar types of queries. Discussions reveal that for a simple query to display a single customer's record, the response time is usually less than five seconds. The sales team requirement (between three and seven seconds) is agreed as being acceptable and reasonable.

Figure 3.4. The graphical test report clearly shows that the current system is consistently performing well below the expected level.


Internet- and email-related queries and orders are not as critical as telephone orders, although the slow response times of the system are causing the sales team to build up a backlog of requests and orders. Requests received in writing present the same problems as Internet orders. Response times are the same as with telephone orders; the only difference is the priority. A drastically improved response time is needed for the team to function effectively.
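The timing tests carried out by the sales team in this step amount to averaging a set of measured response times and comparing the result with the requested level. The following is a minimal sketch of that calculation. The individual timings are invented for illustration (they are not the data behind Figure 3.4) and were chosen only so that they average to the 1 minute, 35 seconds reported above; the function names are likewise hypothetical.

```python
from statistics import mean

# Upper end of the response-time band requested by the sales team (from the text).
TARGET_MAX_S = 7.0

# Hypothetical stopwatch timings in seconds. These are illustrative values,
# NOT the measurements shown in Figure 3.4.
sample_timings_s = [88, 102, 97, 91, 99, 93]


def summarize(timings_s):
    """Average the timings and compare the average with the target ceiling."""
    average_s = mean(timings_s)
    return {
        "average_s": average_s,
        "within_target": average_s <= TARGET_MAX_S,
        "times_over_ceiling": average_s / TARGET_MAX_S,
    }


def format_min_sec(seconds):
    """Render a duration in seconds as 'M min SS s'."""
    minutes, secs = divmod(round(seconds), 60)
    return f"{minutes} min {secs:02d} s"


summary = summarize(sample_timings_s)
print("Current level of service:", format_min_sec(summary["average_s"]))
print("Within target:", summary["within_target"])
```

With these sample values the average is well over ten times the seven-second ceiling, which is the kind of gap that prompted the sales manager to check the requested level against the accounting department's experience.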

Step 4: Define the Terms and the Scope of the Agreement

The service level agreement is given a start date, in one month's time, and an end date, in two years and one month's time. The two-year contract is agreed to based on the fact that technology is changing rapidly, so it definitely could not be any longer. There is no guarantee that the SLA will still be valid even after two years, but with the first three months allowing for tuning of the agreement parameters as practical experience is gained, two years is deemed to be a reasonable estimate of the life expectancy. It is generally agreed that an agreement of shorter duration would be of less value to both sides. Additionally, all parties agree that if technology changes significantly enough to affect the value of the agreement, then it will be revisited.

The sales team begins by explaining the process that is carried out when using the system and that the query on the database is looking for only a single customer record. The priority of the telephone queries is also explained, as is the steady buildup of a backlog. Team members mention that the system response time has become steadily worse over the last 12 months. Initially, it was seen as acceptable, probably at about 10 to 15 seconds, because it was a dramatic improvement over the previous system. At that time, approximately 5,000 customer records were held on the database; there are now 50,000, a tenfold increase in 12 months. An acceptable level of service is defined as the response time being between three and seven seconds.

The system manager identifies the service being provided as overlapping between more than one area of responsibility. This, by definition, makes management of the end-to-end process more difficult. He outlines the technical issues concerning the path that is taken by the interchange of data between the client (the user's PC) and the host server (the Solaris server containing the database).

The help desk manager describes the process for dealing with the calls relating to the application. The trouble tickets produced have been passed to development services, the formal owner of the application. At the least, this explains why the system manager had no knowledge of the problem! The help desk manager agrees to revisit his own section's SLA so that a more clearly defined follow-up procedure is invoked, along with a clear escalation procedure. The system manager notes this as an external dependency, which is beyond the control of either party and is the subject of a different service level agreement.

A member of the development services department describes in technical detail how the application works. It is a graphical client/server application, and it appears that its design may be the cause of the problem. Usually, with this kind of application, a query submitted from the client (the user's PC) would run on the host server (a much more powerful machine) and only the query results (in this case, the customer record) would be transferred back to the PC. The amount of data being retrieved is trivial and should take only a second or two. The design of the application means that instead of doing this, the entire table is being transferred to the local PC, the query is being run there, and the rest of the data is being discarded. It becomes apparent to all parties that the fundamental design of the application is at the root of all the problems being encountered and has caused progressively greater problems as the size of the database has increased.
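The difference between the two designs can be sketched in a few lines. The example below is an illustration only: it uses Python's built-in sqlite3 module with an in-memory database as a stand-in for the real database server, and the table name, column names, and function names are all hypothetical. Because everything runs in one process, it counts the rows handed to the client rather than measuring network time.

```python
import sqlite3

# In-memory database standing in for the host server (an assumption; the
# text does not name the database product or its schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO customers (id, name) VALUES (?, ?)",
    ((i, f"customer-{i}") for i in range(1, 50_001)),  # 50,000 records, as in the text
)


def lookup_client_side(conn, customer_id):
    """Design described above: pull the whole table, then filter on the PC."""
    rows = conn.execute("SELECT id, name FROM customers").fetchall()
    match = [row for row in rows if row[0] == customer_id]
    return match[0], len(rows)


def lookup_server_side(conn, customer_id):
    """Usual design: the server runs the query and returns only the result."""
    rows = conn.execute(
        "SELECT id, name FROM customers WHERE id = ?", (customer_id,)
    ).fetchall()
    return rows[0], len(rows)


record_a, transferred_a = lookup_client_side(conn, 42_000)
record_b, transferred_b = lookup_server_side(conn, 42_000)
print("Rows sent to the client:", transferred_a, "versus", transferred_b)
```

Both functions return the same record, but the first moves every row in the table to the client to find it. The amount of data transferred in that design grows with the table, which fits the observation that response times worsened as the database grew.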

A member of the PC support services department notes that the PCs on the users' desktops currently have 48MB of memory. For the type of application they are running, they should have an absolute minimum of 64MB, with 128MB being preferred.

Step 5: Document the Agreement

The system manager will not be prepared to agree to the request for the level of service specified by the sales team without additional memory being installed in the users' PCs and the rework of the application so that it performs in an efficient manner. Only then will it be possible to ascertain an acceptable level of service that is practical and achievable. The current situation does not allow him to guarantee a system response time of less than 1 minute 40 seconds, which is clearly unacceptable to the sales department.

The result in this case is that no agreement can be reached without additional expenditure. The installation of the memory can go ahead because it is a relatively small cost, but the development work required must be scheduled and funded. The SLA negotiations are suspended while the work is carried out. Subsequent testing reveals a substantial service improvement; the response time is within the specified tolerances.

Negotiations recommence, and both parties fully understand what is being provided and the tolerances that exist. The service level agreement can now be formally documented and agreed.

Step 6: Agree on Service Level Indicators

At this stage, indicators of nonperformance are identified and agreed to. The following indicators resulted from discussions:

  • Response times for the sales team should not exceed 10 seconds. Nonperformance is recorded only when at least 10% of all queries in a 30-minute period exceed this limit. (The 10% threshold allows for occasional blips, which are not indicative of a problem.) The monitoring system provides the necessary management information to measure end-to-end performance.

  • PC failures are to be rectified within two hours, by replacing either the entire unit or a specific component. The help desk trouble tickets are to provide the supporting information for this indicator.

  • The application must be accessible and usable for 99.5% of the year (during working hours). Availability figures produced on a monthly basis are to be used to measure performance against this indicator.
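Two of these indicators reduce to simple arithmetic, sketched below under stated assumptions. The 10-second limit, the 10% threshold, and the 99.5% target come from the indicators above. Treating "working hours" as the core business hours of 09:00 to 18:30 and assuming 250 working days per year are assumptions made for this illustration, as are the function names.

```python
RESPONSE_LIMIT_S = 10.0       # from the indicator above
BREACH_FRACTION = 0.10        # from the indicator above
AVAILABILITY_TARGET = 0.995   # from the indicator above

# Assumptions, not stated in the agreement:
WORKING_HOURS_PER_DAY = 9.5   # 09:00 to 18:30, treated as "working hours"
WORKING_DAYS_PER_YEAR = 250   # hypothetical figure


def window_breached(response_times_s):
    """True if at least 10% of the queries in one 30-minute window were too slow.

    The caller is expected to pass the response times (in seconds) of all
    queries observed within a single 30-minute period.
    """
    if not response_times_s:
        return False
    slow = sum(1 for t in response_times_s if t > RESPONSE_LIMIT_S)
    return slow / len(response_times_s) >= BREACH_FRACTION


def allowed_downtime_hours():
    """Working hours per year the application may be unavailable at the target."""
    total_hours = WORKING_HOURS_PER_DAY * WORKING_DAYS_PER_YEAR
    return total_hours * (1 - AVAILABILITY_TARGET)


print(window_breached([2.0] * 19 + [12.0]))      # 1 slow query out of 20
print(window_breached([2.0] * 18 + [12.0] * 2))  # 2 slow queries out of 20
print(round(allowed_downtime_hours(), 3), "hours per year")
```

Under these assumptions, a 99.5% target leaves a little under 12 working hours of downtime per year, which gives both parties a concrete figure to check the monthly availability reports against.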

Step 7: Determine Corrective Action

No financial penalty will be levied under the SLA for nonperformance. Instead, any nonperformance exceptions are to be highlighted in a service level exception report, to be distributed to senior management.

Step 8: Review and Refine the SLA

The SLA will be subjected to weekly reviews for the first three months. Subsequent reviews are to be conducted monthly.

Refinements and modifications to the agreement can be made only by mutual agreement of both groups.
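The review schedule in this step, combined with the two-year term from Step 4, can be laid out as a list of dates. The sketch below is one possible reading: the start date is hypothetical, "three months" is taken as three calendar months, and the helper names are invented for this example.

```python
from datetime import date, timedelta

# Hypothetical start date; the agreement only says "in one month's time".
HYPOTHETICAL_START = date(2001, 3, 1)


def add_months(d, months):
    """Shift a date by whole months, capping the day at 28 so it is always valid."""
    index = d.year * 12 + (d.month - 1) + months
    return date(index // 12, index % 12 + 1, min(d.day, 28))


def review_dates(start, term_months=24, weekly_months=3):
    """Weekly reviews for the first three months, then monthly until the end date."""
    end = add_months(start, term_months)
    weekly_until = add_months(start, weekly_months)

    dates = []
    current = start + timedelta(weeks=1)
    while current <= weekly_until:
        dates.append(current)
        current += timedelta(weeks=1)

    month = weekly_months + 1
    while add_months(start, month) <= end:
        dates.append(add_months(start, month))
        month += 1
    return dates


schedule = review_dates(HYPOTHETICAL_START)
print(len(schedule), "reviews, first on", schedule[0], "and last on", schedule[-1])
```

A list like this can be circulated with the agreement so that both groups know in advance when each review falls.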

Some Publicly Available SLA Samples

It is often valuable to be able to see real-world examples. They can provide an extra dimension of perspective that might otherwise be missed. The following list shows a number of publicly available examples that can be viewed on the World Wide Web and that may serve as useful references:

  • http://cast.stanford.edu/services/

  • http://help.oit.unc.edu/ctc/sla/tsld001.htm

  • http://etc.nih.gov/pages/etcservicelevelagreement.html

  • http://www.ucsf.edu/its/policy/slasum.html




Solaris System Management
Solaris System Management (New Riders Professional Library)
ISBN: 073571018X
Year: 2001
Pages: 101
Authors: John Philcox
