AKA | Hypothesis Testing (Correlation) |
Classification | Decision Making (DM) |
The correlation analysis (hypothesis testing) procedure measures the strength of the relationship or correlation (if any) between two variables or data sets of interest. A scatter diagram is usually prepared first to show the approximate correlation visually, before the correlation coefficient is calculated.
To measure the strength of a relationship (correlation) between two variables of interest.
To calculate the correlation coefficient in order to accept or reject the stated null hypothesis (H0), or, in other words, to test whether or not a statistically significant relationship exists between two variables.
Select and define problem or opportunity | |
→ | Identify and analyze causes or potential change |
Develop and plan possible solutions or change | |
Implement and evaluate solution or change | |
→ | Measure and report solution or change results |
Recognize and reward team efforts |
1 | Research/statistics |
Creativity/innovation | |
2 | Engineering |
Project management | |
Manufacturing | |
Marketing/sales | |
Administration/documentation | |
Servicing/support | |
3 | Customer/quality metrics |
Change management |
before
Data Collection Strategy
Sampling Method
Descriptive Statistics
Scatter Diagram
Standard Deviation
after
Information Needs Analysis
Trend Analysis
Response Matrix Analysis
SWOT analysis
Presentation
Sufficient supporting information is presented here to provide a good overview of the hypothesis testing procedure using a correlation test to illustrate the sequential steps involved to arrive at a decision. It is suggested, however, that the reader refer to a text on statistics for additional information and examples.
This is the recommended eight-step procedure for testing a null hypothesis (H0):
(Note: Pearson's r, the product-moment correlation coefficient, is used for this example).
Data Source: Errors made in document processing
Variable X = number of documents processed per day
Variable Y = number of errors per day
Research and null hypothesis (H1 - H0)
H1: There is a statistically significant relationship (correlation) between an increase in documents processed and an increase in errors per day.
H0: There is no statistically significant relationship (correlation) between an increase in documents processed and an increase in errors per day, measured at the .05 level of significance using Pearson's product-moment correlation test.
Test used: Simple two-tailed Pearson product-moment (PPM) correlation test.
Level of significance used: .05
Degrees of freedom: 10 (n - 2, with n = 12 pairs in our example).
Test result: r = .853
Critical value: .576 (See Pearson's Table in the Appendix, Table E.)
Decision: Reject the H0. (If the absolute value of the test result is higher than the critical value, the H0 is rejected. Here .853 is higher than .576, so the test result is in the rejection region under the curve.)
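The last steps of the procedure reduce to a short comparison. The following is a minimal sketch in Python using the values from this example. The function name `decide` is hypothetical, and the critical value is assumed to be read from the critical values table rather than computed.

```python
def decide(r: float, n_pairs: int, critical_value: float) -> tuple[int, bool]:
    """Return (degrees of freedom, whether H0 is rejected) for a two-tailed test."""
    df = n_pairs - 2  # degrees of freedom for a correlation test
    # Two-tailed test: compare the magnitude of r with the critical value.
    reject_h0 = abs(r) > critical_value
    return df, reject_h0


# Values from the document-processing example.
df, reject_h0 = decide(r=0.853, n_pairs=12, critical_value=0.576)
print(f"df = {df}, reject H0: {reject_h0}")
```

With the example values, the sketch reports 10 degrees of freedom and rejects the H0, matching the decision above.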
Pearson's product-moment equations:
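A standard deviation-score form of Pearson's r, consistent with the worksheet columns described in STEP 3, is sketched below. The notation is an assumption: N is the number of pairs, x and y are the deviation scores (each score minus its column average), and S_x and S_y are standard deviations computed with N in the denominator. If the standard deviations are computed with N - 1 instead, N - 1 replaces N in the first form.

```latex
r = \frac{\sum xy}{N \, S_x \, S_y}
  = \frac{\sum xy}{\sqrt{\left(\sum x^2\right)\left(\sum y^2\right)}},
\qquad
S_x = \sqrt{\frac{\sum x^2}{N}}, \quad
S_y = \sqrt{\frac{\sum y^2}{N}}
```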
Critical Values Table for Correlation Coefficient
Levels of significance are shown in the column headings (.20 to .001).

No. of Pairs | Degrees of Freedom (df) | .20 | .10 | .05 | .01 | .001 |
---|---|---|---|---|---|---|
3 | 1 | .951 | .988 | .997 | 1.000 | 1.000 |
4 | 2 | .800 | .900 | .950 | .990 | .999 |
5 | 3 | .687 | .805 | .878 | .959 | .991 |
6 | 4 | .608 | .729 | .811 | .917 | .974 |
7 | 5 | .551 | .669 | .755 | .875 | .951 |
8 | 6 | .507 | .621 | .707 | .834 | .925 |
9 | 7 | .472 | .582 | .666 | .798 | .898 |
10 | 8 | .443 | .549 | .632 | .765 | .872 |
11 | 9 | .419 | .521 | .602 | .735 | .847 |
12 | 10 | .398 | .497 | .576 | .708 | .823 |
13 | 11 | .380 | .476 | .553 | .684 | .801 |
14 | 12 | .365 | .457 | .532 | .661 | .780 |
15 | 13 | .351 | .441 | .514 | .641 | .760 |
16 | 14 | .338 | .426 | .497 | .623 | .742 |
17 | 15 | .327 | .412 | .482 | .606 | .725 |
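Reading the table can be sketched as a simple lookup. The example below copies only the .05 column and keys it by degrees of freedom (number of pairs minus 2). The names `CRITICAL_R_05` and `critical_r_05` are hypothetical and serve only as an illustration.

```python
# Critical values of r at the .05 level of significance, keyed by degrees of
# freedom (copied from the .05 column of the table above).
CRITICAL_R_05 = {
    1: 0.997, 2: 0.950, 3: 0.878, 4: 0.811, 5: 0.755,
    6: 0.707, 7: 0.666, 8: 0.632, 9: 0.602, 10: 0.576,
    11: 0.553, 12: 0.532, 13: 0.514, 14: 0.497, 15: 0.482,
}


def critical_r_05(n_pairs: int) -> float:
    """Look up the .05-level critical value for a given number of pairs."""
    df = n_pairs - 2
    if df not in CRITICAL_R_05:
        raise ValueError(f"df = {df} is outside the table (1 to 15)")
    return CRITICAL_R_05[df]


# 12 pairs give 10 degrees of freedom, as in the example.
print(critical_r_05(12))
```

For the 12 pairs in the example, the lookup returns the same critical value of .576 used in the decision step.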
STEP 1 Data has been collected to check whether there is any correlation between documents processed and errors found in processing. See example Errors Made in Document Processing: Is There a Statistically Significant Correlation?
STEP 2 A scatter diagram is prepared as shown in this example.
Note: Refer to scatter diagram in this book for additional information.
STEP 3 Prepare a table for calculating the correlation coefficient r. Insert the data (docs and errors) into columns X and Y as shown.
Calculate the average of column X and the average of column Y.
Subtract the average of X from each X score to get small x, the deviation score.
Subtract the average of Y from each Y score to get small y, the deviation score.
Square small x to get x².
Square small y to get y².
Multiply small x by small y to get xy.
Total column xy and insert the sum into the r equation.
Note: Refer to standard deviation in this handbook to calculate the standard deviation Sx and Sy.
STEP 4 Complete the calculations to get r, the correlation coefficient. Refer to the hypothesis testing steps as outlined in notes and key points on the previous page.
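The worksheet in STEP 3 and the calculation in STEP 4 can be sketched in Python as follows. This is an illustration under stated assumptions, not the handbook's own worksheet: the function name `pearson_r` is hypothetical, the data are made up for the example, and r is computed as the total of column xy divided by the square root of the product of the x² and y² totals, which is algebraically the same as dividing by N times Sx times Sy when the standard deviations use N in the denominator.

```python
import math


def pearson_r(xs: list[float], ys: list[float]) -> float:
    """Pearson's r computed with the deviation-score worksheet from STEP 3."""
    if len(xs) != len(ys) or len(xs) < 3:
        raise ValueError("need at least 3 paired observations")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Deviation scores: small x and small y.
    dev_x = [x - mean_x for x in xs]
    dev_y = [y - mean_y for y in ys]
    # Column totals for xy, x squared, and y squared.
    sum_xy = sum(dx * dy for dx, dy in zip(dev_x, dev_y))
    sum_x2 = sum(dx * dx for dx in dev_x)
    sum_y2 = sum(dy * dy for dy in dev_y)
    if sum_x2 == 0 or sum_y2 == 0:
        raise ValueError("r is undefined when a variable has no variation")
    return sum_xy / math.sqrt(sum_x2 * sum_y2)


# Hypothetical data for illustration only (not the handbook's 12 pairs).
docs = [1, 2, 3, 4, 5]
errors = [2, 4, 5, 4, 5]
print(round(pearson_r(docs, errors), 3))
```

The resulting r is then compared with the critical value for the matching degrees of freedom, as described in the hypothesis testing steps.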
Errors Made in Document Processing: Is There a Statistically Significant Correlation?