A WORD OF CAUTION ON χ²


Chi-square is a convenient measure of association between two factors when the factors are not quantitative. It indicates the degree to which the frequencies in a cross-tabulation of the two factors deviate from what they would be if no interrelation existed between the factors. The computed chi-square has a specific level of statistical significance that you can look up in a standard table.
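
For reference, the usual Pearson form of the statistic for a cross-tabulation with r rows and c columns can be written as below. The symbols O, E, R, C, and N are notation introduced here for illustration and do not come from the text: O is an observed cell frequency, E is the frequency expected if no interrelation existed, R and C are row and column totals, and N is the grand total.

```latex
\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}},
\qquad
E_{ij} = \frac{R_i \, C_j}{N},
\qquad
\mathrm{df} = (r - 1)(c - 1)
```

The degrees of freedom, df, select the row of the standard table in which the significance level is looked up.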

Suppose we ask 300 testers to rate two brands of a product (A and B) in terms of both overall preference and preference regarding "comfort." By a convenient coincidence, the "comfort" preference divides exactly evenly, with 100 preferring A, 100 preferring B, and 100 having no preference.

                     Overall preference
Comfort level        A        B        Total

A                    70       30       100
No preference        55       45       100
B                    40       60       100

Total                165      135      300
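
As a rough check, the statistic for the cross-tabulation above can be computed by hand. The sketch below assumes the plain Pearson statistic with no continuity correction; the function and variable names are illustrative only. Computed this way, the value comes out near 18.2, a little below the 18.4 quoted in this section, and the conclusion (significance well beyond 99% on 2 degrees of freedom) is the same either way.

```python
import math

# Observed counts: rows are comfort preference (A, no preference, B),
# columns are overall preference (A, B).
observed = [
    [70, 30],
    [55, 45],
    [40, 60],
]


def expected_counts(table):
    """Cell counts expected if the row and column factors were unrelated."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def pearson_chi_square(table):
    """Sum of (observed - expected)^2 / expected over every cell."""
    expected = expected_counts(table)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )


expected = expected_counts(observed)
chi2 = pearson_chi_square(observed)
dof = (len(observed) - 1) * (len(observed[0]) - 1)

# For exactly 2 degrees of freedom the upper-tail probability has the
# closed form exp(-chi2 / 2), so no lookup table or library is needed.
p_value = math.exp(-chi2 / 2)

print(f"expected counts per row: {expected[0]}")
print(f"chi-square = {chi2:.2f}, df = {dof}, p = {p_value:.5f}")
```

Every row has the same expected counts (55 and 45) because each comfort group contains 100 testers; only the first and third rows deviate from them.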

Clearly, a strong association exists between the preference on "comfort" and overall preference; chi-square is 18.4, indicating a significance level of 99%+. However, let us assume that we suspect the results and, upon further investigation, find that we recorded the data in the wrong rows. The table should have looked like this:

                     Overall preference
Comfort level        A        B        Total

A                    70       30       100
B                    55       45       100
No preference        40       60       100

Total                165      135      300

Now, let us see what we have. It still looks like a strong association for A but not for B, so we should have a lower chi-square, right?

No. Chi-square is still 18.4. As long as the numbers stay the same, it does not matter how they are labeled. Like the scarecrow in The Wizard of Oz, chi-square does not have a brain. It is merely an algorithm, a mechanical process based on numbers regardless of what they represent. By itself, it can never take the place of a regression or correlation because it cannot describe the relationship; it can only gauge its statistical significance, entirely regardless of logic or sense.
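
The label-blindness described above can be illustrated with a short sketch, again assuming the plain Pearson statistic and using illustrative names. The function receives only the counts, so the row labels never enter the calculation; reordering the rows does not change the result either.

```python
from itertools import permutations


def pearson_chi_square(table):
    """Pearson chi-square for a table given as a sequence of rows of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    total = 0.0
    for row, row_total in zip(table, row_totals):
        for observed, col_total in zip(row, col_totals):
            expected = row_total * col_total / grand_total
            total += (observed - expected) ** 2 / expected
    return total


# The same three rows of counts appear in both tabulations; only the
# labels attached to the second and third rows differ.
rows = [[70, 30], [55, 45], [40, 60]]

# Try every ordering of the rows, which covers every possible way of
# attaching the labels A, B, and "no preference" to them.
values = [pearson_chi_square(ordering) for ordering in permutations(rows)]
spread = max(values) - min(values)

print(f"{len(values)} orderings, chi-square spread = {spread:.2e}")
```

The spread is zero apart from floating-point rounding, which is the point of the passage: the statistic measures how far the counts depart from independence, not which category produced which count.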

Chi-square is nonparametric. To describe a relationship in numerical terms, we need numerical values, that is, parameters. If we arbitrarily assign a value of +1 to preference for A and -1 to preference for B (and, presumably, 0 to no preference), we can compute a correlation coefficient of r = +.246 for the original tabulation and exactly half that for the corrected distribution. The parametric regression/correlation, unlike chi-square, is affected by the way the rows and columns are labeled because each label has a specific value.
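
The two correlation coefficients can be reproduced with a minimal sketch. It assumes "no preference" is scored 0, a choice that reproduces the quoted r = +.246; the function name and data layout are illustrative only.

```python
import math

# Scores for the categories. The 0 for "no preference" is an assumption.
A, B, NO_PREFERENCE = +1, -1, 0
OVERALL_SCORES = (A, B)  # column order: overall preference A, then B


def correlation_from_table(rows):
    """Pearson r from rows of (comfort_score, [count_overall_A, count_overall_B])."""
    n = sum_x = sum_y = sum_xx = sum_yy = sum_xy = 0
    for x, counts in rows:
        for y, count in zip(OVERALL_SCORES, counts):
            n += count
            sum_x += count * x
            sum_y += count * y
            sum_xx += count * x * x
            sum_yy += count * y * y
            sum_xy += count * x * y
    mean_x, mean_y = sum_x / n, sum_y / n
    covariance = sum_xy / n - mean_x * mean_y
    var_x = sum_xx / n - mean_x ** 2
    var_y = sum_yy / n - mean_y ** 2
    return covariance / math.sqrt(var_x * var_y)


# Same counts in both tables; only the scores attached to rows 2 and 3 change.
original = [(A, [70, 30]), (NO_PREFERENCE, [55, 45]), (B, [40, 60])]
corrected = [(A, [70, 30]), (B, [55, 45]), (NO_PREFERENCE, [40, 60])]

r_original = correlation_from_table(original)
r_corrected = correlation_from_table(corrected)

print(f"original  r = {r_original:+.3f}")
print(f"corrected r = {r_corrected:+.3f}")
```

The variances are identical in both tables because the marginal totals do not change; only the covariance halves, so the coefficient halves with it.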

So chi-square is a very useful index when we cannot assign values, but it is very easy to misuse it; it does not have a brain, so the analyst has to use his or her own brain to interpret it correctly.




Six Sigma and Beyond: Statistics and Probability, Volume III
ISBN: 1574443127
Year: 2003
Pages: 252