Fatskills
Practice. Master. Repeat.
Study Guide: Intro to Business Statistics: Analysis of Variance ANOVA - TwoWay ANOVA, Main Effects Interaction Effects
Source: https://www.fatskills.com/business-analytics/chapter/intro-to-business-statistics-busstats-analysis-of-variance-anova-twoway-anova-main-effects-interaction-effects

Intro to Business Statistics: Analysis of Variance (ANOVA) - Two-Way ANOVA, Main Effects and Interaction Effects

By Fatskills Exam Guides Team — the exam nerds behind 28,500+ quizzes and 2.1M practice questions across 500+ global exams.

What This Is

Two-way ANOVA (Analysis of Variance) is a statistical method for analyzing the effects of two categorical independent variables (factors) on a continuous dependent variable. It tests three things: the main effect of each factor, and whether the two factors interact, that is, whether the effect of one factor depends on the level of the other. For instance, a retail chain might ask whether average daily sales differ by product type (e.g., electronics, clothing), by store location (e.g., urban, suburban), and whether the product-type difference changes from one location to another.

Key Formulas & Symbols

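  • $F_A = MSA / MSE$, $F_B = MSB / MSE$, $F_{AB} = MSAB / MSE$, where each F-statistic divides the mean square for one effect (factor A, factor B, or the A×B interaction) by MSE = Mean Square Error.
  • $MSA = SSA / (a - 1)$, $MSB = SSB / (b - 1)$, $MSAB = SSAB / ((a - 1)(b - 1))$, where $a$ = number of levels in factor A and $b$ = number of levels in factor B.
  • $MSE = SSE / (N - ab)$, where $N$ = total sample size. (The divisor $N - a - b + 1$ applies only to a model fitted without an interaction term.)
  • $SSA = \sum_i n_{i.} (\bar{x}_{i.} - \bar{x}_{..})^2$, where SSA = Sum of Squares for factor A, $n_{i.}$ = number of observations at level $i$ of factor A, $\bar{x}_{i.}$ = mean for factor A level $i$, and $\bar{x}_{..}$ = grand mean.
  • $SSB = \sum_j n_{.j} (\bar{x}_{.j} - \bar{x}_{..})^2$, where SSB = Sum of Squares for factor B, $n_{.j}$ = number of observations at level $j$ of factor B, and $\bar{x}_{.j}$ = mean for factor B level $j$.
  • $SSAB = \sum_i \sum_j n_{ij} (\bar{x}_{ij} - \bar{x}_{i.} - \bar{x}_{.j} + \bar{x}_{..})^2$, where SSAB = Sum of Squares for the interaction, $n_{ij}$ = sample size in cell $(i, j)$, and $\bar{x}_{ij}$ = mean of cell $(i, j)$.
  • $SSE = \sum_i \sum_j \sum_k (x_{ijk} - \bar{x}_{ij})^2$, where SSE = Sum of Squares for Error and $x_{ijk}$ = the $k$-th data point in cell $(i, j)$.
  • $SSA + SSB + SSAB + SSE = SST$, where $SST = \sum_i \sum_j \sum_k (x_{ijk} - \bar{x}_{..})^2$ = Total Sum of Squares. This decomposition, and the weighted forms of SSA, SSB, and SSAB above, hold for a balanced design (equal $n_{ij}$ in every cell).
  • $\bar{x}_{ij} = (\sum_k x_{ijk}) / n_{ij}$, where $\bar{x}_{ij}$ = sample mean of cell $(i, j)$, i.e., level $i$ of factor A combined with level $j$ of factor B.
  • $\bar{x}_{..} = (\sum_i \sum_j \sum_k x_{ijk}) / N$, where $\bar{x}_{..}$ = grand mean, the numerator is the sum of all individual data points, and $N$ = total sample size.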
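
The sums of squares can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: it assumes a balanced design with at least two observations per cell, it includes the interaction sum of squares SSAB, and the function names (`avg`, `two_way_anova`) and the small 2×2 data set are made up for the example.

```python
def avg(values):
    """Arithmetic mean of an iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


def two_way_anova(cells):
    """Sums of squares, degrees of freedom and F-statistics for a balanced design.

    cells[i][j] is the list of observations for level i of factor A
    and level j of factor B. Every cell must hold the same number n >= 2.
    """
    a, b, n = len(cells), len(cells[0]), len(cells[0][0])
    if n < 2 or any(len(cell) != n for row in cells for cell in row):
        raise ValueError("balanced design with n >= 2 per cell is assumed")
    total_n = a * b * n

    cell_mean = [[avg(cells[i][j]) for j in range(b)] for i in range(a)]
    # With equal cell sizes, level means are plain averages of cell means.
    a_mean = [avg(cell_mean[i]) for i in range(a)]
    b_mean = [avg(cell_mean[i][j] for i in range(a)) for j in range(b)]
    grand = avg(a_mean)

    ss_a = b * n * sum((m - grand) ** 2 for m in a_mean)
    ss_b = a * n * sum((m - grand) ** 2 for m in b_mean)
    ss_ab = n * sum(
        (cell_mean[i][j] - a_mean[i] - b_mean[j] + grand) ** 2
        for i in range(a) for j in range(b)
    )
    ss_e = sum(
        (x - cell_mean[i][j]) ** 2
        for i in range(a) for j in range(b) for x in cells[i][j]
    )

    df = {"A": a - 1, "B": b - 1, "AB": (a - 1) * (b - 1), "E": total_n - a * b}
    ss = {"A": ss_a, "B": ss_b, "AB": ss_ab, "E": ss_e}
    ms_e = ss["E"] / df["E"]
    f = {k: (ss[k] / df[k]) / ms_e for k in ("A", "B", "AB")}
    return {"SS": ss, "df": df, "F": f}


# Hypothetical 2x2 data, two observations per cell (illustration only).
example = [
    [[4, 6], [8, 10]],    # A1: B1, B2
    [[6, 8], [14, 16]],   # A2: B1, B2
]
result = two_way_anova(example)
print(result["SS"])  # SSA = 32, SSB = 72, SSAB = 8, SSE = 8
print(result["F"])   # F_A = 16, F_B = 36, F_AB = 4
```

For this toy data the four sums of squares add up to 120, which equals the total sum of squares computed directly from the eight observations, so the decomposition can be checked by hand.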

Step-by-Step Procedure

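  1. State hypotheses: Formulate null and alternative hypotheses for each main effect and for the interaction effect. For example, with three levels of A and two levels of B: for factor A, $H_0: \mu_{A1} = \mu_{A2} = \mu_{A3}$; for factor B, $H_0: \mu_{B1} = \mu_{B2}$; for the interaction, $H_0: \gamma_{A1B1} = \gamma_{A1B2} = \gamma_{A2B1} = \gamma_{A2B2} = \gamma_{A3B1} = \gamma_{A3B2} = 0$, where $\gamma$ denotes the interaction effect of a cell. The alternative $H_a$ in each case is that the null fails: at least one $\mu_{Ai} \neq \mu_{Ai'}$, at least one $\mu_{Bj} \neq \mu_{Bj'}$, or at least one $\gamma_{AiBj} \neq 0$.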
  2. Choose test: Select the F-test for two-way ANOVA.
  3. Compute test statistic: Calculate one F-statistic per effect (factor A, factor B, and the interaction) by dividing that effect's mean square by MSE, using the formulas above.
  4. Find p-value or critical value: Determine the p-value associated with the F-statistic or find the critical F-value from the F-distribution table.
  5. Compare to $\alpha$: Compare the p-value to the significance level $\alpha$ (commonly 0.05), or compare the F-statistic to the critical F-value.
  6. Conclude: Reject the null hypothesis if the p-value < $\alpha$ (equivalently, if the F-statistic > critical F-value); this indicates a statistically significant effect.
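
Steps 4 to 6 can be sketched in Python as follows. The code assumes an F-statistic and its degrees of freedom have already been computed. To stay dependency-free it evaluates the upper tail of the F-distribution with a hand-written regularized incomplete beta function (a standard continued-fraction evaluation); in practice a statistics library would normally supply this. All function names here (`f_p_value`, `decide`, and the helpers) are illustrative, not part of any library.

```python
import math


def _nonzero(v, tiny=1e-300):
    """Guard against division by zero inside the continued fraction."""
    return v if abs(v) > tiny else tiny


def _beta_cf(a, b, x, max_iter=200, eps=3e-14):
    """Continued fraction for the incomplete beta function (modified Lentz)."""
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 / _nonzero(1.0 - qab * x / qap)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        aa = m * (b - m) * x / ((qam + m2) * (a + m2))        # even step
        d = 1.0 / _nonzero(1.0 + aa * d)
        c = _nonzero(1.0 + aa / c)
        h *= d * c
        aa = -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2))  # odd step
        d = 1.0 / _nonzero(1.0 + aa * d)
        c = _nonzero(1.0 + aa / c)
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h


def _reg_inc_beta(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    log_front = (math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                 + a * math.log(x) + b * math.log(1.0 - x))
    front = math.exp(log_front)
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _beta_cf(a, b, x) / a
    return 1.0 - front * _beta_cf(b, a, 1.0 - x) / b


def f_p_value(f_stat, df_effect, df_error):
    """Upper-tail probability P(F >= f_stat) for an F(df_effect, df_error) variable."""
    x = df_error / (df_error + df_effect * f_stat)
    return _reg_inc_beta(df_error / 2.0, df_effect / 2.0, x)


def decide(f_stat, df_effect, df_error, alpha=0.05):
    """Return (p_value, reject_null) for one effect."""
    p = f_p_value(f_stat, df_effect, df_error)
    return p, p < alpha


# Hypothetical effect with F = 4.0 on (2, 12) degrees of freedom.
p, reject = decide(4.0, 2, 12)
print(round(p, 4), reject)  # p is about 0.0467, so the null is rejected at 0.05
```

The same `decide` call would be repeated for factor A, factor B, and the interaction, each with its own F-statistic and numerator degrees of freedom.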

Common Mistakes

  • Mistake: Failing to check the assumptions of two-way ANOVA (normality, equal variances, independence).
  • Correction: Verify the data meets the assumptions before proceeding with the analysis.
  • Mistake: Misinterpreting the interaction effect as a main effect or vice versa.
  • Correction: Clearly distinguish between the interaction effect and main effects in the interpretation.
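  • Mistake: Using the wrong degrees of freedom when calculating the mean squares and looking up the F-distribution.
  • Correction: Use the correct degrees of freedom for each term: df_A = a - 1, df_B = b - 1, df_AB = (a - 1)(b - 1), and df_E = N - ab when the model includes the interaction (df_E = N - a - b + 1 only when no interaction term is fitted).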
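
A quick way to avoid the degrees-of-freedom mistake is to compute every term and confirm that they add up to N - 1. The helper below is a small sketch; the name `two_way_df` is made up for illustration. It assumes error degrees of freedom of N - ab when an interaction term is fitted (which requires more than one observation per cell) and N - a - b + 1 when it is not.

```python
def two_way_df(a, b, total_n, interaction=True):
    """Degrees of freedom for a two-way ANOVA with a and b factor levels.

    With an interaction term the error df is N - ab; without one it is
    N - a - b + 1. In both cases the parts must sum to N - 1.
    """
    df = {"A": a - 1, "B": b - 1}
    if interaction:
        df["AB"] = (a - 1) * (b - 1)
        df["E"] = total_n - a * b
    else:
        df["E"] = total_n - a - b + 1
    if df["E"] <= 0:
        raise ValueError("not enough observations to estimate the error term")
    df["Total"] = total_n - 1
    # Sanity check: the component df must add up to the total df.
    assert sum(v for k, v in df.items() if k != "Total") == df["Total"]
    return df


# Three levels of A, two levels of B, three observations per cell (N = 18).
print(two_way_df(3, 2, 18))
# {'A': 2, 'B': 1, 'AB': 2, 'E': 12, 'Total': 17}
```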

Quick Practice Problems

  1. A company wants to know if the average sales of its two products (A and B) differ across three regions (North, South, East). The data is as follows:
Region   Product A   Product B
North    100         120
North    110         130
North    105         125
South    90          100
South    95          105
South    92          102
East     115         135
East     120         140
East     118         138

What is the p-value for the interaction effect?

Answer: approximately 0.047. The interaction sum of squares is SSAB = 100 with 2 degrees of freedom (MSAB = 50), and SSE ≈ 150.67 with 12 degrees of freedom (MSE ≈ 12.56), so F = 50 / 12.56 ≈ 3.98. For an F-distribution with (2, 12) degrees of freedom this gives p ≈ 0.047.

  2. A marketing firm wants to know if the average response to two advertising channels (TV and Radio) differs across three age groups (18-24, 25-34, 35-44). The data is as follows:
Age Group   TV   Radio
18-24       10   12
18-24       11   13
18-24       9    11
25-34       8    10
25-34       9    11
25-34       7    9
35-44       6    8
35-44       7    9
35-44       5    7

What is the F-statistic for the main effect of advertising channel (TV vs. Radio)?

Answer: 18. The channel means are 8 (TV) and 10 (Radio) around a grand mean of 9, so the channel sum of squares is 9(1)² + 9(1)² = 18 with 1 degree of freedom. SSE = 12 with 12 degrees of freedom, so MSE = 1 and F = 18 / 1 = 18.

  3. A quality control team wants to know if the average defect rate of three production lines (A, B, C) differs across two shifts (Morning, Afternoon). The data is as follows:
Shift       Line A   Line B   Line C
Morning     0.05     0.03     0.04
Morning     0.06     0.04     0.05
Morning     0.07     0.05     0.06
Afternoon   0.08     0.06     0.07
Afternoon   0.09     0.07     0.08
Afternoon   0.10     0.08     0.09

What is the p-value for the main effect of Shift?

Answer: p < 0.001. The shift means are 0.05 (Morning) and 0.08 (Afternoon) around a grand mean of 0.065, so the shift sum of squares is 0.00405 with 1 degree of freedom. SSE = 0.0012 with 12 degrees of freedom, so MSE = 0.0001 and F = 40.5, which is well above the 0.001 critical value of about 18.64 for (1, 12) degrees of freedom.

Last-Minute Cram Sheet

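  1. F-statistic: F = (mean square of the effect) / MSE, computed separately for factor A (MSA), factor B (MSB), and the interaction (MSAB), where MSE = Mean Square Error.
  2. Degrees of freedom: df_A = a - 1, df_B = b - 1, df_AB = (a - 1)(b - 1), df_E = N - ab (N - a - b + 1 only if no interaction term is fitted).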
  3. Assumptions: Normality, equal variances, independence.
  4. Interaction effect: The extent to which the effect of one factor on the outcome depends on the level of the other factor.
  5. Main effects: The effect of each factor on the outcome, averaged over the levels of the other factor.
  6. p-value: The probability of observing the data (or more extreme) if the null hypothesis is true.
  7. $\alpha$: The significance level (commonly 0.05).
  8. Critical F-value: The F-value that separates the rejection region from the non-rejection region.
  9. p-value is NOT the probability that $H_0$ is true – it’s the probability of observing the data (or more extreme) if $H_0$ is true.
  10. F-statistic is a ratio of mean squares (variance estimates), not a ratio of means.