Study Guide: Intro to Business Statistics: Chi-Square Tests: Chi-Square Test of Independence, Contingency Tables, Expected Frequencies, Degrees of Freedom
Source: https://www.fatskills.com/business-analytics/chapter/intro-to-business-statistics-busstats-chi-square-tests-chisquare-test-of-independence-contingency-tables-expected-frequencies-degrees-of-freedom

By Fatskills Exam Guides Team — the exam nerds behind 28,500+ quizzes and 2.1M practice questions across 500+ global exams.

What This Is

The Chi-Square Test of Independence is a statistical method used to determine whether there is a significant association between two categorical variables whose counts are cross-tabulated in a contingency table. For example, a retail chain wants to know if there's a relationship between the type of product (e.g., electronics, clothing, home goods) and sales level (e.g., low, medium, or high average daily sales). Both variables must be categories: a numeric measure such as average daily sales has to be grouped into levels first, and each cell of the table holds a count. The test then checks whether the observed frequencies differ significantly from the frequencies that would be expected if the two variables were independent.

Key Formulas & Symbols

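  • χ² = Σ [(observed frequency - expected frequency)^2 / expected frequency], summed over every cell of the table, where observed frequency = the count actually recorded in a cell and expected frequency = the count that cell would hold if the two variables were independent.
  • expected frequency = (row total × column total) / grand total.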
  • df = (r-1) × (c-1) where r = number of rows, c = number of columns in the contingency table.
  • χ² distribution: a theoretical distribution used to find the probability of a χ² value at least as large as the one observed. Its shape depends on the degrees of freedom.
  • p-value: the probability of a test statistic at least as extreme as the one observed, assuming the null hypothesis is true.
  • α: the significance level, chosen before the test (commonly 0.05).
  • H₀: the null hypothesis (e.g., no association between the two variables).
  • H₁: the alternative hypothesis (e.g., an association between the two variables).
  • χ² critical value: the χ² value that separates the rejection region from the non-rejection region.
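
The formulas above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names and the 2 × 3 table of counts are hypothetical and chosen only so the arithmetic is easy to follow.

```python
def expected_frequencies(observed):
    """Expected count per cell: (row total * column total) / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_square_statistic(observed):
    """Sum of (observed - expected)^2 / expected over every cell."""
    expected = expected_frequencies(observed)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )


def degrees_of_freedom(observed):
    """df = (r - 1) * (c - 1) for an r x c table."""
    return (len(observed) - 1) * (len(observed[0]) - 1)


# Hypothetical counts: rows = sales channel, columns = product type.
table = [[30, 20, 10],
         [20, 30, 40]]

print(expected_frequencies(table))            # [[20.0, 20.0, 20.0], [30.0, 30.0, 30.0]]
print(round(chi_square_statistic(table), 2))  # 16.67
print(degrees_of_freedom(table))              # 2
```

Each column total is 50 and the row totals are 60 and 90, so every expected count in the first row is 60 × 50 / 150 = 20 and every one in the second row is 90 × 50 / 150 = 30.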

Step-by-Step Procedure

  1. State hypotheses: Define the null and alternative hypotheses (e.g., H₀: product type and sales level are independent, H₁: product type and sales level are associated).
  2. Choose test: Select the Chi-Square Test of Independence as the appropriate statistical method.
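  3. Compute test statistic: Find the expected frequency for each cell as (row total × column total) / grand total, then calculate the χ² value from the observed and expected frequencies.
  4. Find p-value or critical value: Calculate df = (r-1) × (c-1), then use the χ² distribution with that df to find either the p-value of the test statistic or the critical value for the chosen α.
  5. Compare: Compare the p-value to the significance level (α), or compare the χ² test statistic to the critical value. The two approaches always lead to the same decision.
  6. Conclude: If the p-value is less than α (equivalently, if the χ² test statistic exceeds the critical value), reject the null hypothesis and conclude that there's a significant association between the two variables. Otherwise, fail to reject the null hypothesis: the data do not provide sufficient evidence of an association.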
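
The procedure above can be sketched end to end using only the Python standard library. This is a hedged illustration: the function names and the table of counts are hypothetical, and the upper-tail probability is computed from the closed-form expressions that exist for integer degrees of freedom rather than from a statistics package.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable, integer df >= 1."""
    if x <= 0:
        return 1.0
    y = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-y) times a finite sum of y^k / k!.
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= y / k
            total += term
        return math.exp(-y) * total
    # Odd df: start from the df = 1 result and add one term per extra 2 df.
    total = math.erfc(math.sqrt(y))
    for k in range(1, (df - 1) // 2 + 1):
        total += math.exp(-y) * y ** (k - 0.5) / math.gamma(k + 0.5)
    return total


def chi_square_independence(observed, alpha=0.05):
    """Steps 3 to 6: test statistic, df, p-value, and the decision at level alpha."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            exp = row_totals[i] * col_totals[j] / n
            chi2 += (obs - exp) ** 2 / exp
    df = (len(observed) - 1) * (len(col_totals) - 1)
    p_value = chi2_sf(chi2, df)
    return {"chi2": chi2, "df": df, "p_value": p_value, "reject_h0": p_value < alpha}


# Hypothetical counts: rows = sales channel, columns = product type.
result = chi_square_independence([[30, 20, 10], [20, 30, 40]])
print(result)
```

For this made-up table the statistic is about 16.67 with df = 2, so the p-value is far below 0.05 and H₀ is rejected.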

Common Mistakes

  • Mistake: Misinterpreting the p-value as the probability that the null hypothesis is true.
  • Correction: The p-value is the probability of a test statistic at least as extreme as the one observed, calculated on the assumption that the null hypothesis is true. It does not give the probability that the null hypothesis is true.
  • Mistake: Failing to check the assumptions of the Chi-Square Test of Independence (e.g., independence of observations, expected frequencies of at least 5).
  • Correction: Verify that the observations are independent (each subject is counted in exactly one cell) and that the expected frequency, not the observed frequency, is at least 5 in each cell. If some expected frequencies are too small, combine categories or collect more data.
  • Mistake: Using the χ² distribution with an incorrect degrees of freedom.
  • Correction: Calculate the degrees of freedom using the formula (r-1) × (c-1) and use the correct χ² distribution with the calculated degrees of freedom.
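
The expected-frequency check can be automated before running the test. A minimal sketch, assuming the common rule of thumb of at least 5 per cell; the function name, the threshold parameter, and the counts are hypothetical.

```python
def low_expected_cells(observed, minimum=5):
    """Return (row, column, expected) for each cell whose expected count is below minimum."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    flagged = []
    for i, r in enumerate(row_totals):
        for j, c in enumerate(col_totals):
            expected = r * c / n
            if expected < minimum:
                flagged.append((i, j, expected))
    return flagged


# Hypothetical 2 x 2 table with one small row.
print(low_expected_cells([[3, 7], [12, 28]]))  # [(0, 0, 3.0)]
```

Here the first row total is 10 and the first column total is 15 out of 50 observations, so that cell's expected count is 10 × 15 / 50 = 3, which falls below the threshold.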

Quick Practice Problems

  1. A marketing firm wants to know if there's a relationship between the type of advertising (e.g., print, online, TV) and the response rate. The contingency table shows the following frequencies:

     Advertising Type    Responses (count)
     Print               20
     Online              15
     TV                  10

Calculate the χ² value.

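Answer: A test of independence cannot be computed from this table as given. It lists one count per advertising type, so there are no row and column totals from which to form expected frequencies; a second column (e.g., non-responses per advertising type) is needed. The value χ² = 2.44 given with this problem cannot be reproduced from the three counts shown. For comparison, a goodness-of-fit test against equal expected counts of 15 each gives χ² = (20-15)^2/15 + (15-15)^2/15 + (10-15)^2/15 ≈ 3.33 with df = 2.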


  2. A quality control engineer wants to know if there's a relationship between the manufacturing process (e.g., machine A, machine B) and the defect rate. The contingency table shows the following frequencies:

     Manufacturing Process    Defects (count)
     Machine A                10
     Machine B                15

Determine the p-value.

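Answer: As given, the table has a single column, so expected frequencies for a test of independence cannot be formed; counts of non-defective units per machine are needed. The p-value of 0.12 given with this problem cannot be reproduced from the two counts shown. For comparison, a goodness-of-fit test against equal expected counts of 12.5 each gives χ² = (10-12.5)^2/12.5 + (15-12.5)^2/12.5 = 1.00 with df = 1, for a p-value of about 0.32.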


  3. A retail chain wants to know if there's a relationship between the type of product (e.g., electronics, clothing, home goods) and the average daily sales. The contingency table shows the following frequencies:

     Product Type    Average Daily Sales
     Electronics     1000
     Clothing        500
     Home Goods      200

Determine the critical value.

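Answer: χ² critical value = 9.21 for df = 2 at α = 0.01; at α = 0.05 the critical value for df = 2 is 5.99. Explanation: The critical value depends only on the degrees of freedom and α, not on the observed counts. Calculate the degrees of freedom using the formula (r-1) × (c-1), then look up the critical value in the χ² distribution with that df. Note that df = 2 requires a second variable with two categories (a 3 × 2 table); the single column shown would give (3-1) × (1-1) = 0.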
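
For the special case df = 2, the χ² upper-tail probability is exp(-x/2), so the critical value has the closed form -2 × ln(α). The short sketch below (the function name is illustrative) uses that identity to show which critical value goes with which significance level. For any other df, a χ² table or statistical software is needed.

```python
import math


def chi2_critical_df2(alpha):
    """Critical value for df = 2 only: solves exp(-x / 2) = alpha for x."""
    return -2.0 * math.log(alpha)


for alpha in (0.05, 0.01):
    print(f"alpha = {alpha}: critical value = {chi2_critical_df2(alpha):.2f}")
# alpha = 0.05: critical value = 5.99
# alpha = 0.01: critical value = 9.21
```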

Last-Minute Cram Sheet

  1. χ² = Σ [(observed frequency - expected frequency)^2 / expected frequency], summed over every cell, where observed frequency = observed count in a cell and expected frequency = (row total × column total) / grand total.
  2. df = (r-1) × (c-1) where r = number of rows, c = number of columns in the contingency table.
  3. χ² distribution: a theoretical distribution used to find the probability of a χ² value at least as large as the one observed.
  4. p-value: the probability of a test statistic at least as extreme as the one observed, assuming the null hypothesis is true.
  5. α: the significance level, chosen before the test (commonly 0.05).
  6. H₀: the null hypothesis (e.g., no association between the two variables).
  7. H₁: the alternative hypothesis (e.g., an association between the two variables).
  8. χ² critical value: the χ² value that separates the rejection region from the non-rejection region.
  9. ⚠️ p-value is NOT the probability that H₀ is true; it is the probability of a test statistic at least as extreme as the one observed, assuming H₀ is true.
  10. ⚠️ Check the assumptions of the Chi-Square Test of Independence before testing: independent observations and an expected frequency of at least 5 in every cell.