Chapter 51: Randomized Blocks and Two-Way ANOVA


Overview

  • I am trying to analyze the effectiveness of my sales force. The problem is that in addition to a sales representative’s effectiveness, the amount that a representative sells depends on the district to which he or she is assigned. How can I incorporate the district assignments of my representatives into my analysis?

  • Based on my knowledge of sales representatives and districts, how can I forecast sales? How accurate are my sales forecasts?

  • How can I determine whether varying the price and the amount of advertising affects the sales of a video game? How can I determine whether price and advertising interact significantly?

  • How can I interpret the effects of price and advertising on sales when there is the absence of significant interaction between price and advertising?

In many sets of data, two factors can influence a dependent variable. Here are some examples.

Open table as spreadsheet

Factors

Dependent variable

Sales representative and district assignment

Sales

Product price and advertising expenditure

Sales

Temperature and pressure

Production yield

Surgeon and brand of stent used

Health of patient after open-heart surgery

When two factors might influence a dependent variable, randomized blocks or two-way analysis of variance (ANOVA) can easily be used to determine which, if any, of the factors have a significant influence on the dependent variable. With two-way ANOVA, you can also determine whether two factors exhibit a significant interaction. For example, suppose we are trying to predict sales by using product price and advertising budget. Price and advertising interact significantly if the effect of advertising depends on the product price.

In a randomized block model, we observe each possible combination of factors exactly once. You can’t test for interactions in a randomized block design. In a two-way ANOVA model, we observe each combination of factors the same number of times (call it k). In this case, k must be greater than 1. In a two-way ANOVA model, you can easily test for interactions.

  • I am trying to analyze the effectiveness of my sales force. The problem is that in addition to a sales representative’s effectiveness, the amount that a representative sells depends on the district to which he or she is assigned. How can I incorporate the district assignments of my representatives into my analysis?

  • Suppose we want to determine how a sales representative and the sales district to which the representative is assigned influence product sales. To answer the question in this example, we can have each of four sales reps spend a month selling in each of five sales districts. The resulting sales are given in the Randomized Blocks worksheet in the file Two-wayanova.xlsx, shown in Figure 51-1. For example, Rep 1 sold 20 units during the month she was assigned to District 4.

    image from book
    Figure 51-1: Data for the randomized blocks example

  • This model is called a two-way ANOVA without replication because two factors (district and sales representative) can potentially influence sales, and we have only a single instance pairing each representative with each district. This model is also called a randomized block design because we’d like to randomize (chronologically) the assignment of representatives to districts. In other words, we’d like to ensure that the month during which Rep 1 is assigned to District 1 is equally likely to be the first, second, third, fourth, or fifth month. This randomization hopefully lessens the effect of time (a representative presumably becomes better over time) on our analysis; in a sense, we are “blocking” the effect of districts when we try to compare sales representatives.

  • To analyze this data in Microsoft Office Excel 2007, click Data Analysis on the Data tab, and then select the Anova: Two-Factor Without Replication option. Then fill in the dialog box as shown in Figure 51-2.

    image from book
    Figure 51-2: Anova: Two-Factor Without Replication dialog box for setting up a randomized blocks model

  • We use the following information to set up our analysis:

    • Our input range data is in cells C5:G10.

    • I’ve checked Labels because the first row of the input range contains labels.

    • I entered B12 as the upper-left cell of our output range.

    • The alpha value is not important. You can use the default value.

  • The output we obtain is shown in Figure 51-3. (The results in cells G12:G24 were not created by the Excel Data Analysis feature. I entered formulas in these cells, as I’ll explain later in the chapter.)

    image from book
    Figure 51-3: Randomized blocks output

  • To determine whether the row factor (districts) or column factor (sales representatives) has a significant effect on sales, just look at the p-value. If the p-value for a factor is low (less than .15), the factor has a significant effect on sales. The row p-value (.0000974) and column p-value (.024) are both less than .15, so both the district and the representative have a significant effect on sales.

  • Based on my knowledge of sales representatives and districts, how can I forecast sales? How accurate are my sales forecasts?

  • How should we predict product sales? We can predict sales during a month by using equation 1, shown here:

     Predicted sales=Overall average+(Rep effect)+(District effect)

  • In this equation, Rep effect equals 0 if the sales rep factor is not significant. If the sales rep factor is significant, Rep effect equals the mean for the given rep minus the overall average. Likewise, District effect equals 0 if the district factor is not significant. If the district factor is significant, District effect equals the mean for the given district minus the overall average.

  • I computed the overall average (17.6) in cell G12 by using the formula AVERAGE(D6:G10). The representative and district effects are computed by copying from cell G15 to G16:G24 the formula E15–$G$12. As an example, you can compute predicted sales by Rep 4 in District 2 as 17.6–2.85+3.6=18.35. This value is computed in cell D38 (see Figure 51-4) with the formula G12+G16+G24. If the district effect was significant and the sales representative effect was not, our predicted sales for Rep 4 in District 2 would be 17.6– 2.85=14.75.

    image from book
    Figure 51-4: Forecast for sales in District 2 by sales Rep 4

  • As in one-way ANOVA, the standard deviation of our forecast errors is the square root of the mean square error shown in cell E31. I computed this standard deviation in cell E32 with the formula SQRT(E31). Thus, we are 95 percent sure that if Rep 4 is assigned to District 2, monthly sales will be between 18.35–2(3.99)=10.37 and 18.35+2(3.99)=26.33. These limits are computed in cell D39 and D40 with the formulas D38–2*E32 and D38+2*E32, respectively.

  • How can I determine whether varying the price and the amount of advertising affects the sales of a video game? How can I determine whether price and advertising interact significantly?

  • When you have more than one observation for each combination of the row and column factors, you have a two-factor ANOVA with replication. To perform this sort of analysis, Excel requires that you have the same number of observations for each row-and-column combination.

  • In addition to testing for the significance of the row and column factors, we can also test for significant interaction between them. For example, if we want to understand how price and advertising affect sales, an interaction between price and advertising would indicate that the effect of an advertising change would depend on the price level (or equivalently, the effect of a price change would depend on the advertising level). A lack of interaction between price and advertising would mean that the effect of a price change would not depend on the level of advertising.

  • As an example of two-factor ANOVA with replication, suppose we want to determine how price and advertising level affect the monthly sales of a video game. In the Two Way ANOVA No Interaction worksheet in the file Twowayanova.xlsx, we have the data shown in Figure 51-5. During the three months in which we had low advertising and a medium price, for example, we sold 21, 20, and 16 units.

    image from book
    Figure 51-5: Video game sales data; no interaction

  • Notice that for each price/advertising combination, we have exactly three observations. In cell D1, I’ve computed the overall average (25.037) of all observations with the formula AVERAGE(D4:F12). In cells G4, G7, and G10, I computed the effect for each level of advertising. For example, the effect of having a low level of advertising equals the average for low advertising minus the overall average. In cell G4, I computed the low advertising effect of –5.59 with the formula AVERAGE(D4:F6)–$D$1. In a similar fashion, I computed the effect of each price level by copying from D13 to E13:F13 the formula AVERAGE(D4:D12)–$D$1.

  • To analyze this data, click Data Analysis on the Data tab, and then select Anova: Two-Factor With Replication in the Data Analysis dialog box. Fill in the dialog box as shown in Figure 51-6.

    image from book
    Figure 51-6: Anova: Two-Factor With Replication dialog box for running a two-factor ANOVA with replication

  • We use the following information to set up our analysis:

    • Our input range data, including labels, is in C3:F12. In two-way ANOVA with replication, Excel requires a label for each level of the column effect in the first row of each column in the input range. Thus, we entered Low, Medium, and High in cells D3:F3 to indicate the possible price levels. Excel also requires a label for each level of the row effect in the first column of the input range. These labels must appear in the row that marks the beginning of the data for each level. Thus we placed labels corresponding to low, medium, and high levels of advertising in cells C4, C7, and C10.

    • In the Rows Per Sample box, I’ve entered 3 because we have three replications for each combination of price and advertising level.

    • The upper-left cell of our output range is B14.

  • The only important portion of the output is the ANOVA table, which is shown in Figure 51-7.

    image from book
    Figure 51-7: Two-way ANOVA with replication output; no interaction

  • As with randomized blocks, an effect (including interactions) is significant if it has a p-value that’s less than .15. We find that Sample (this is the row for advertising effect) and Price (shown in the row labeled Columns) are highly significant and also that there is no significant interaction. (The interaction p-value is .79!) Therefore, we can conclude that price and advertising influence sales and that the effect of advertising on sales does not depend on the price level. Figure 51-8 graphically demonstrates the fact that price and advertising do not exhibit a significant interaction.

    image from book
    Figure 51-8: Price and advertising do not interact in this data set.

  • Notice that as advertising increases, sales increase at roughly the same rate, whether the price level is low, medium, or high.

  • How can I interpret the effects of price and advertising on sales when there is the absence of significant interaction between price and advertising?

  • In the absence of a significant interaction, we can forecast sales in a two-factor ANOVA with replication in the same way that we do in a two-factor ANOVA without replication. Here’s the equation we use (equation 2):

     Predicted sales=Overall average+[Row or advertising effect(if significant)]+[Column or price effect(if significant)]

  • Our analysis assumes that price and advertising are the only factors that affect sales. If sales are highly seasonal, seasonality would need to be incorporated into the analysis. (Seasonality will be discussed in Chapter 53, “Winter’s Method,” and Chapter 54, “Forecasting in the Presence of Special Events.” For example, when price is high and advertising is medium, our predicted sales are given by 25.04+(–1.814)+(–8.7)=14.52. (See cell E54 in Figure 51-9.) In Figure 51-5, shown earlier, you can see that we found the overall average is equal to 25.037, the medium advertising effect equals –1.814, and the high price effect=–8.704.

    image from book
    Figure 51-9: Forecasts for sales with high price and medium advertising

  • The standard deviation of our forecast errors equals the square root of our mean squared within error.

    image from book

  • We are 95 percent sure that our forecast is accurate within 10.92 units. In other words, we are 95 percent sure that sales during a month with high price and medium advertising will be between 3.60 and 25.43 units.

  • In the Two Way ANOVA with Interaction worksheet, I’ve changed the data from the previous example to the data shown in Figure 51-10. After running our analysis for a two-factor ANOVA with replication, we obtain the results shown in Figure 51-11 on the next page.

    image from book
    Figure 51-10: Sales data with interaction between price and advertising

    image from book
    Figure 51-11: Output for the two-factor ANOVA with interaction

  • In this data set, we find the p-value for interaction is .001. When we see a low p-value (less than .15) for interaction, we do not even check p-values for row and column factors! We simply forecast sales for any price and advertising combination to equal the mean of the three observations involving that price and advertising combination. For example, our best forecast for sales during a month with high advertising and medium price is:

    image from book

  • The standard deviation of our forecast errors is again the square root of the mean square within

    image from book

  • Thus we are 95 percent sure that our sales forecast is accurate within 8.26 units.

  • Figure 51-12 illustrates why this data exhibits a significant interaction between price and advertising. For a low and medium price, increased advertising increases sales, but if price is high, increased advertising has no effect on sales. This explains why we cannot use equation 2 to forecast sales when a significant interaction is present. After all, how can we talk about an advertising effect when the effect of advertising depends on the price?

    image from book
    Figure 51-12: Price and advertising exhibit a significant interaction in this set of data.




Microsoft Press - Microsoft Office Excel 2007. Data Analysis and Business Modeling
MicrosoftВ® Office ExcelВ® 2007: Data Analysis and Business Modeling (Bpg -- Other)
ISBN: 0735623961
EAN: 2147483647
Year: 2007
Pages: 200

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net