APPLICATION TO THE DATA


In this section, we describe the data and apply the MPE estimation criterion. In the initial application of the MPE criterion, one cost rate was estimated to be zero. Here we concentrate on the case in which all estimated cost rates must be positive and show how the MPE criterion is modified for that case.

The data for this study are from Dyson and Thanassoulis (1988) and are reproduced in the first six columns of Table 1. These data were collected for a set of property tax collection offices, called Rates Departments, in the London Boroughs and Metropolitan Districts. A more complete description of the data is given in Thanassoulis, Dyson, and Foster (1987). Total annual costs for these offices (units), measured in units of 100,000, were collected along with activity driver levels for four activities. The first three activities (collection of non-council hereditaments, rate rebates generated, and summonses issued and distress warrants obtained) were measured in units of 10,000, 1,000, and 1,000, respectively. The fourth, the net present value of non-council rates collected, was measured in units of 10,000. This last driver (called an output by Dyson and Thanassoulis, and by Thanassoulis et al.) was included to reflect the additional administrative effort exerted to ensure the timely payment of large revenue-producing cases.

Table 1: British rates departments data based on Dyson and Thanassoulis (1988). Efficiency rating based on model MDE-2 with preemptive positive weights modification


Rates Department | Total Costs | Non-council Hereditaments | Rate Rebates Granted | Summonses & Distress Warrants | NPV of Non-council Rates | Efficiency Rating
Lewisham | 9.13 | 7.53 | 34.11 | 21.96 | 3.84 | 0.7881
Brent | 13.60 | 8.30 | 23.27 | 35.97 | 8.63 | 0.6920
Stockport | 5.76 | 10.91 | 13.39 | 11.53 | 4.93 | 1.0000
Bradford | 11.24 | 16.62 | 36.82 | 27.55 | 9.52 | 1.0000
Leeds | 15.57 | 22.81 | 95.78 | 23.61 | 12.27 | 1.0000
City of London | 5.65 | 1.78 | 0.16 | 1.31 | 39.01 | 0.9641
Liverpool | 21.60 | 15.11 | 70.96 | 54.22 | 10.81 | 0.7577
Walsall | 8.57 | 7.92 | 48.69 | 14.03 | 5.92 | 0.8361
Rotherham | 6.01 | 7.07 | 36.30 | 5.45 | 2.94 | 0.7926
Wakefield | 8.02 | 8.86 | 43.61 | 13.77 | 4.27 | 0.8631
Lambeth | 9.93 | 9.00 | 36.85 | 20.66 | 8.15 | 0.8122
Sunderland | 7.90 | 8.28 | 45.22 | 6.19 | 5.33 | 0.7492
Solihull | 5.15 | 6.76 | 18.70 | 10.62 | 3.54 | 0.8958
Redbridge | 6.42 | 8.98 | 13.60 | 12.32 | 3.75 | 0.8110
Calderdale | 5.94 | 7.69 | 25.91 | 8.24 | 2.48 | 0.7994
Haringey | 8.68 | 7.23 | 16.97 | 17.58 | 6.27 | 0.6864
Barking & Dagenham | 4.86 | 3.36 | 23.67 | 4.30 | 2.48 | 0.6076
Newcastle-upon-Tyne | 10.33 | 8.56 | 30.54 | 17.77 | 8.01 | 0.6985
Manchester | 21.97 | 12.23 | 92.02 | 29.53 | 14.76 | 0.6230
Wolverhampton | 9.70 | 7.67 | 41.16 | 13.27 | 4.50 | 0.6649
Trafford | 6.34 | 8.17 | 16.61 | 8.26 | 5.05 | 0.7466
Tameside | 7.70 | 7.88 | 15.75 | 14.50 | 3.03 | 0.6808
St Helens | 5.99 | 5.67 | 27.55 | 5.24 | 3.41 | 0.5188
Sutton | 5.20 | 6.92 | 12.61 | 4.30 | 3.04 | 0.6556
Rochdale | 6.36 | 7.35 | 23.51 | 5.74 | 4.21 | 0.8471
Barnsley | 8.87 | 6.46 | 38.10 | 9.65 | 3.09 | 0.5974
Kirklees | 10.71 | 13.64 | 23.86 | 14.63 | 4.63 | 0.6876
Oldham | 6.49 | 7.68 | 17.97 | 8.27 | 2.76 | 0.6766
Sheffield | 15.32 | 15.34 | 55.42 | 16.36 | 12.53 | 0.6905
Havering | 7.00 | 8.37 | 14.92 | 9.88 | 4.33 | 0.6915
Dudley | 10.50 | 9.61 | 37.91 | 13.49 | 5.04 | 0.6563
Sefton | 10.88 | 10.65 | 36.96 | 14.25 | 4.84 | 0.6617
Bexley | 8.52 | 8.97 | 24.67 | 11.84 | 3.75 | 0.6669
Gateshead | 7.61 | 6.11 | 31.73 | 7.66 | 2.87 | 0.6031
Wigan | 10.91 | 9.78 | 42.73 | 12.17 | 4.66 | 0.6363
Kensington & Chelsea | 9.72 | 7.71 | 5.90 | 14.60 | 9.25 | 0.5646
Coventry | 12.63 | 11.08 | 41.59 | 16.42 | 5.65 | 0.6289
Sandwell | 11.51 | 9.07 | 28.49 | 16.28 | 5.96 | 0.5898
Bury | 6.22 | 6.63 | 14.67 | 7.70 | 3.08 | 0.6294
South Tyneside | 5.29 | 3.96 | 20.42 | 1.96 | 1.84 | 0.4808
Salford | 8.78 | 6.56 | 31.72 | 8.60 | 4.83 | 0.5783
Hackney | 13.50 | 4.77 | 26.47 | 20.88 | 4.17 | 0.4434
Camden | 12.60 | 6.68 | 30.28 | 9.09 | 19.45 | 0.5478
Hillingdon | 8.10 | 8.10 | 9.71 | 8.53 | 7.50 | 0.5821
Tower Hamlets | 9.67 | 6.00 | 19.46 | 10.71 | 8.03 | 0.5187
Barnet | 12.37 | 11.25 | 28.50 | 12.53 | 6.74 | 0.5604
Bolton | 9.50 | 8.67 | 23.54 | 8.99 | 3.66 | 0.5411
Ealing | 11.47 | 10.30 | 15.58 | 13.74 | 6.46 | 0.5388
Bromley | 11.78 | 12.22 | 14.33 | 10.10 | 5.02 | 0.5039
Wandsworth | 12.57 | 10.43 | 18.31 | 16.39 | 3.92 | 0.5098
Birmingham | 50.26 | 32.33 | 150.00 | 45.10 | 19.58 | 0.3565
Enfield | 12.70 | 9.50 | 22.39 | 14.90 | 5.80 | 0.5030
Southwark | 13.30 | 7.53 | 21.99 | 14.66 | 8.32 | 0.4608
Knowsley | 5.60 | 3.73 | 12.21 | 5.39 | 2.84 | 0.4786
Islington | 11.75 | 5.20 | 13.28 | 13.62 | 7.10 | 0.4079
North Tyneside | 8.47 | 6.15 | 19.45 | 6.51 | 3.30 | 0.4587
Kingston-upon-Thames | 8.36 | 5.96 | 17.11 | 4.66 | 3.08 | 0.4107
Hounslow | 11.07 | 7.25 | 16.34 | 8.69 | 6.62 | 0.4274
Richmond-upon-Thames | 10.38 | 7.76 | 16.44 | 6.01 | 3.31 | 0.3941
Hammersmith & Fulham | 11.83 | 5.35 | 12.41 | 12.24 | 4.57 | 0.3622
Newham | 12.71 | 6.32 | 13.63 | 8.53 | 5.16 | 0.3268
Merton | 11.19 | 6.58 | 10.90 | 3.52 | 3.46 | 0.2839
Mean | 10.45 | 8.81 | 29.41 | 13.33 | 6.41 | 0.6314
Standard Deviation | 6.21 | 4.50 | 23.54 | 9.38 | 5.57 | 0.1699

Thanassoulis et al. (1987) briefly discussed the possible disaggregation of the cost pools. They indicated that this would have been possible to some extent but decided against it for several reasons. First, these costs represented the cost of real resources used and available for management deployment. Second, they felt that the increased number of variables that would result might tend to decrease the discriminating power of the data envelopment analysis (DEA) method they were studying. Finally, and importantly, it was felt that the disaggregated data were less reliable. This concern is well founded, particularly in the present context; Datar and Gupta (1994) have shown that disaggregation can actually increase errors.

The method used in both Dyson and Thanassoulis (1988) and Thanassoulis et al. (1987) was a modification of data envelopment analysis (DEA), another efficiency estimation technique. It is worthwhile to discuss DEA briefly in regard to the above and other relevant points. An introduction to DEA is contained in Charnes, Cooper, Lewin, and Seiford (1994). There are several DEA models, but we limit our coverage to the one used in Dyson and Thanassoulis. If we use the a_r notation for the output weights in the DEA model M1 of Dyson and Thanassoulis (page 564), we obtain the model(s):
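In outline, writing Y_rj = y_rj / x_j for the level of driver r at unit j scaled by that unit's total cost, and taking the standard single-input, output-weighted DEA ratio form as the basis (a sketch; the displayed version in the original sources may differ in detail), the per-unit problem is

\[
\text{M1}(j_0):\qquad \max_{a}\;\; \sum_{r=1}^{4} a_r Y_{r j_0}
\quad\text{subject to}\quad
\sum_{r=1}^{4} a_r Y_{rj} \le 1,\;\; j = 1,\dots,n, \qquad a_r \ge 0,\;\; r = 1,\dots,4,
\]

where n = 62 is the number of departments in Table 1.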

Unlike the one-pass solution of model MPE, model M1(j_o) is solved for each unit, j_o, in turn. The solutions a_r(j_o) therefore depend on which unit is featured in the objective function. Typically, some of the a_r(j_o) values will be zero for various units. In the present cost rates context, the following interpretation can be given to the M1(j_o) DEA model. If unit j_o were allowed to choose the activities and drivers applicable to the entire group of units, then the M1(j_o) solution chooses them so as to give that unit the most favorable efficiency score. The principal difficulty here is that no consensus is achieved on the most efficient cost rates: with the M1(j_o) DEA model, each unit can select the cost rates that make it look most favorable, as the sketch below illustrates. Dyson and Thanassoulis (1988) call this phenomenon weights flexibility. Their work was motivated, in part, by the desire to modify the M1(j_o) DEA technique to limit this flexibility, and it provides a more extensive discussion of this DEA limitation.
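To make the per-unit character of the model concrete, the following sketch solves the linear program above once for each unit with scipy.optimize.linprog. The matrix name Y and the small demonstration array are placeholders rather than the study's data; the point is only that each unit receives its own weight vector, which is the weights flexibility at issue.

    import numpy as np
    from scipy.optimize import linprog

    def dea_m1_scores(Y):
        """Solve the M1(j_o) linear program once per unit.

        Y is an (n_units, n_activities) array of scaled driver levels Y_rj = y_rj / x_j.
        Returns each unit's efficiency score and its own weight vector a_r(j_o).
        """
        n_units, n_activities = Y.shape
        scores = np.empty(n_units)
        weights = np.empty((n_units, n_activities))
        for j0 in range(n_units):
            c = -Y[j0]                                   # linprog minimizes, so negate
            res = linprog(c, A_ub=Y, b_ub=np.ones(n_units),
                          bounds=[(0, None)] * n_activities, method="highs")
            scores[j0] = -res.fun
            weights[j0] = res.x
        return scores, weights

    # Small illustrative example with two hypothetical units and two activities:
    Y_demo = np.array([[0.8, 2.0],
                       [1.2, 1.5]])
    scores, weights = dea_m1_scores(Y_demo)
    print(scores)   # each unit's most favorable efficiency score
    print(weights)  # the weight vectors generally differ from unit to unit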

Except in unusual circumstances, activity cost rates, a_r, can only be positive. In this section, we treat this requirement as a preemptive priority. For the solution values, a_r*, to be strictly positive, it is necessary that they be basic variables in an optimal solution. This may occur of its own accord. Otherwise, one or more may be non-basic and therefore have value zero. The standard LP solution report provides reduced costs. For a variable with an optimal value of zero, the reduced cost indicates the smallest amount by which that variable's objective function coefficient must be increased in order for the variable to become positive in an optimal solution. A sketch of this adjustment follows.
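The sketch below uses scipy.optimize.linprog rather than the SAS/IML routine used in the study, and it assumes the MPE linear program takes the form: maximize sum_r a_r (sum_j Y_rj) subject to sum_r a_r Y_rj <= 1 for every unit j and a_r >= 0, consistent with the objective form discussed later in this section. The file name and the reliance on the solver's reduced-cost report are illustrative assumptions, not part of the original analysis.

    import numpy as np
    from scipy.optimize import linprog

    def solve_mpe(Y, c=None):
        """Base MPE LP: maximize sum_r c_r a_r subject to Y a <= 1 and a >= 0.

        By default c_r = sum_j Y[j, r], the assumed MPE objective coefficients.
        """
        n_units, n_act = Y.shape
        if c is None:
            c = Y.sum(axis=0)
        return linprog(-c, A_ub=Y, b_ub=np.ones(n_units),
                       bounds=[(0, None)] * n_act, method="highs")

    # Y is a placeholder for the (62 x 4) array of derived data Y_rj = y_rj / x_j;
    # "rates_departments_Y.csv" is a hypothetical file name.
    Y = np.loadtxt("rates_departments_Y.csv", delimiter=",")
    res = solve_mpe(Y)
    a_star = res.x
    zero_rates = np.where(np.isclose(a_star, 0.0))[0]
    if zero_rates.size == 1:
        r = zero_rates[0]
        # With the HiGHS backend (SciPy >= 1.7), res.lower.marginals reports the
        # reduced costs of the variables; raising the objective coefficient of a_r
        # by this amount lets it enter the optimal basis, as described in the text.
        c_mod = Y.sum(axis=0).copy()
        c_mod[r] += res.lower.marginals[r]
        a_star = solve_mpe(Y, c=c_mod).x
    efficiency = Y @ a_star   # score of each unit under the common rates (cf. Table 1)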

The last column of Table 1 gives the efficiency scores obtained by model MPE with preemptive positive costs. Descriptive statistics have been added as supplemental information. The MPE model was solved using SAS/IML software (1995), which includes a linear programming call function. The initial solution assigned a zero optimal value only to a_1*. (The full solution for this case was a_1* = 0, a_2* = 0.0882, a_3* = 0.2671, a_4* = 0.0664.) Thus, it was deemed necessary to implement the preemptive positive weights modification. The reduced cost for variable a_1 was given as 1.220440. The objective coefficient was 55.52898. Therefore, the modified procedure required increasing the coefficient of a_1 to 56.747. The resulting estimates were as follows:

The corresponding efficiency scores of the units are shown in the last column of Table 1. Table 2 gives descriptive statistics for the Y_rj data.

Table 2: Descriptive statistics for the derived data, Y_rj

Variable | Y_1 | Y_2 | Y_3 | Y_4
Mean | 0.8948 | 2.8492 | 1.2622 | 0.6665
Standard Deviation | 0.3013 | 1.3959 | 0.5290 | 0.8320
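The derived data are simply each unit's driver levels scaled by its total cost, so the figures in Table 2 can be reproduced along the following lines. This is a sketch with placeholder arrays; the standard-deviation convention used in Table 2 is an assumption.

    import numpy as np

    # Placeholder arrays: x holds total costs and y holds the four driver columns of
    # Table 1 (one row per department); all 62 rows would be used in practice.
    x = np.array([9.13, 13.60, 5.76])
    y = np.array([[7.53, 34.11, 21.96, 3.84],
                  [8.30, 23.27, 35.97, 8.63],
                  [10.91, 13.39, 11.53, 4.93]])

    Y = y / x[:, None]               # derived data Y_rj = y_rj / x_j
    print(Y.mean(axis=0))            # column means, as in Table 2
    print(Y.std(axis=0, ddof=1))     # standard deviations (sample convention assumed)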


While not true for the example presented here, it is possible that more than one cost rate is initially estimated as zero, or that the reduced cost itself is zero, when strict positivity is required. In that case, it is necessary to proceed as follows. Suppose an auxiliary variable m and the new constraints a_r ≥ m, one for each activity r, are adjoined to the MPE model. If the optimal value of m is positive, then so are all the cost rates. Let λ be a non-negative parameter chosen by the analyst and consider the modified objective function given by
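In outline (using the Y_rj = y_rj / x_j notation and writing n for the number of units), the modified objective adds the λm term to the original MPE objective:

\[
\max_{a,\,m}\;\; \sum_{r=1}^{4} a_r \Bigl(\sum_{j=1}^{n} Y_{rj}\Bigr) \;+\; \lambda m .
\]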

When the value of λ is zero, the objective function is the same as the original one with m* = 0. We have the following theorem, whose proof is given in the Appendix.

Theorem 1: Let z*(λ), a_r*(λ), and m*(λ) be the solution of the MPE(λ) model:
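Here MPE(λ) denotes, in sketch form, the linear program obtained by adjoining m and the constraints just described:

\[
\begin{aligned}
\text{MPE}(\lambda):\qquad \max_{a,\,m}\;\; & \sum_{r=1}^{4} a_r \Bigl(\sum_{j=1}^{n} Y_{rj}\Bigr) + \lambda m\\
\text{subject to}\;\; & \sum_{r=1}^{4} a_r Y_{rj} \le 1, \qquad j = 1,\dots,n,\\
& a_r \ge m, \qquad r = 1,\dots,4,\\
& m \ge 0.
\end{aligned}
\]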

Then (1) z*(λ) is monotone non-decreasing in λ, and z*(λ) → ∞ as λ → ∞; and (2) Σ_r a_r*(λ) (Σ_j Y_rj) is monotone non-increasing in λ.

We propose the solution for positive weights to be the one corresponding to the greatest lower bound of the λ values for which m*(λ) > 0. This may be estimated by trial and error, as sketched below. We develop a rationale for such positive weights procedures by writing the MPE model objective function in the form Σ_r a_r (Σ_j Y_rj). If maximization of this objective function yields a_1* = 0, then evidently the coefficient Σ_j Y_1j is too small, in some sense, relative to the other coefficients. It may be noted that the Y_rj = y_rj / x_j data are in the nature of reciprocal costs. When the sum of these is too small, the implication is that their reciprocals, the x_j / y_rj, are on average too large. This is suggestive of inefficiency with respect to the r-th activity. In such a case, assignment of a zero optimal cost estimate would mask this kind of inefficiency. By using the reduced cost adjustment, or by adding the λm term to the objective function, a compensation is made for coefficients that are apparently too small in that sense.
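The trial-and-error search for the smallest λ giving m*(λ) > 0 might be automated along the following lines. The MPE(λ) linear program is stated as in the outline above; the array Y is a placeholder for the derived data, and the bisection tolerances are arbitrary choices rather than part of the original procedure.

    import numpy as np
    from scipy.optimize import linprog

    def solve_mpe_lambda(Y, lam):
        """MPE(lambda) LP in the sketched form; variables are (a_1, ..., a_R, m)."""
        n_units, n_act = Y.shape
        c = -np.concatenate([Y.sum(axis=0), [lam]])               # linprog minimizes
        A_units = np.hstack([Y, np.zeros((n_units, 1))])           # sum_r a_r Y_rj <= 1
        A_link = np.hstack([-np.eye(n_act), np.ones((n_act, 1))])  # m - a_r <= 0
        A_ub = np.vstack([A_units, A_link])
        b_ub = np.concatenate([np.ones(n_units), np.zeros(n_act)])
        res = linprog(c, A_ub=A_ub, b_ub=b_ub,
                      bounds=[(0, None)] * (n_act + 1), method="highs")
        return res.x[:-1], res.x[-1]                               # (a*, m*)

    def smallest_lambda_with_positive_m(Y, lam_hi=1.0, tol=1e-4):
        """Bisection estimate of the smallest lambda for which m*(lambda) > 0."""
        lam_lo = 0.0
        # Grow the upper end until it yields strictly positive m* (assumes one exists).
        while solve_mpe_lambda(Y, lam_hi)[1] <= 1e-10:
            lam_hi *= 2.0
        while lam_hi - lam_lo > tol:
            mid = 0.5 * (lam_lo + lam_hi)
            if solve_mpe_lambda(Y, mid)[1] > 1e-10:
                lam_hi = mid
            else:
                lam_lo = mid
        return lam_hi, solve_mpe_lambda(Y, lam_hi)[0]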

Some comparisons with the results of Dyson and Thanassoulis (1988) can now be discussed. As part of their study, a regression through the origin was obtained. The coefficients of that regression model can be interpreted as average cost rates for these activities. The results were as follows:

It will be noted that these average rates are uniformly higher than the presently estimated rates in (3.4), giving a measure of face validity. That is, it is necessary that the cost rates of the most efficient units be lower than the average cost rates for all the units. Also, the four departments rated as most efficient by the present method are the same as those indicated by the Dyson and Thanassoulis (1988) approach.
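A regression through the origin of total cost on the four driver levels can be computed, in outline, by ordinary least squares with no intercept term, as in the sketch below. The arrays are placeholders standing in for the full Table 1 data; the published coefficients are those reported by Dyson and Thanassoulis, not the output of this sketch.

    import numpy as np

    # Placeholder arrays: y_drivers holds the four driver columns of Table 1 (one row
    # per department) and x_cost holds total costs; all 62 rows would be used in practice.
    y_drivers = np.array([[7.53, 34.11, 21.96, 3.84],
                          [8.30, 23.27, 35.97, 8.63],
                          [10.91, 13.39, 11.53, 4.93]])
    x_cost = np.array([9.13, 13.60, 5.76])

    # Regression through the origin: least squares with no intercept column.
    coeffs, *_ = np.linalg.lstsq(y_drivers, x_cost, rcond=None)
    print(coeffs)  # interpretable as average cost rates for the four activities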

It may be observed that the preliminary regression step also gives information on the question of positivity of the cost rates. The positivity of the regression coefficient for the first activity gives further evidence bearing on a_1*. That coefficient is positive and significant. (The significance level was not specified in Dyson and Thanassoulis.) Since the units are assumed to be comparable, it appears unlikely that one or a few could perform activity one at no cost while the typical unit does incur cost for the activity. If a particular unit could produce an activity at zero cost (a_r* = 0) while the average unit does incur cost for the activity, then it must have a radically superior process not actually comparable with the others. Similarly, the regression model also validates activity four as influential on costs. The next section discusses model aptness.
