Study Guide: How to Solve: Hypothesis Testing (One- and Two-Tailed Tests, p-Values, Critical Regions)
Source: https://www.fatskills.com/gcse-math/chapter/how-to-solve-hypothesis-testing-one-and-two-tailed-tests-p-values-critical-regions

How to Solve: Hypothesis Testing (One- and Two-Tailed Tests, p-Values, Critical Regions)

By Fatskills Exam Guides Team — the exam nerds behind 28,500+ quizzes and 2.1M practice questions across 500+ global exams.

GCSE / A-Level Maths


Introduction

Mastering hypothesis testing lets you use maths to weigh the evidence on whether a new drug works, a coin is biased, or a school’s exam results are truly improving. A test never proves a claim outright; it tells you whether the data are unlikely enough under H₀ to reject it. On your GCSE/A-Level exam, this topic is worth 10-15% of your stats paper, and one wrong step can cost you 5+ marks. Today, you’ll learn the exact method to solve any hypothesis test question, step by step.


What You Need To Know First

Before starting, you must understand:

  1. Probability distributions (binomial, normal) and how to calculate probabilities.
  2. Significance levels (α): what 5% or 1% means in context.
  3. Critical values: how to find them from tables or your calculator.


Key Vocabulary

Term | Plain-English Definition | Quick Example
Null Hypothesis (H₀) | The default assumption: "nothing unusual is happening." | H₀: "The coin is fair (p = 0.5)."
Alternative Hypothesis (H₁) | The claim we’re testing: "something unusual is happening." | H₁: "The coin is biased (p ≠ 0.5)."
Significance Level (α) | The probability threshold for "unlikely." If the p-value ≤ α, we reject H₀. | α = 5% means we reject H₀ if the result is among the most extreme 5% of outcomes expected under H₀.
p-value | The probability of getting a result at least as extreme as the one observed, assuming H₀ is true. | If p-value = 0.03, there’s a 3% chance of seeing a result at least this extreme if H₀ is true.
Critical Region | The set of values of the test statistic that would lead us to reject H₀. | For a 5% two-tailed test, the critical region is the top 2.5% and bottom 2.5% of the distribution under H₀.
Test Statistic | The number calculated from your sample (e.g., number of successes). | If 12 out of 20 coin flips are heads, the test statistic is 12.

Formulas To Know

Formula | Variables | Notes
p-value = P(X ≥ x) or P(X ≤ x) | X = random variable, x = observed test statistic | MEMORISE THIS: Use ≥ for upper-tailed and ≤ for lower-tailed. For two-tailed, double the smaller of the two tail probabilities.
Critical value = invCDF(α) | α = significance level | Given on exam sheet: Use binomial/normal tables or calculator (e.g., invNorm). For a two-tailed test, use α/2 in each tail.
z = (x̄ – μ) / (σ/√n) | x̄ = sample mean, μ = mean under H₀, σ = standard deviation, n = sample size | MEMORISE THIS: Standardised test statistic for normal distributions (A-Level only).
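
To make the critical-value formula concrete, here is a minimal Python sketch. It assumes the standard library's `statistics.NormalDist` as a stand-in for a calculator's invNorm; the function name `normal_critical_values` and its `tail` argument are invented for this illustration only.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist()


def normal_critical_values(alpha, tail):
    """Return the z critical value(s) for a given significance level.

    tail is "lower", "upper" or "two" (a hypothetical convention for this sketch).
    """
    if tail == "lower":
        return (standard_normal.inv_cdf(alpha),)
    if tail == "upper":
        return (standard_normal.inv_cdf(1 - alpha),)
    if tail == "two":
        # Split alpha equally between the two tails.
        upper = standard_normal.inv_cdf(1 - alpha / 2)
        return (-upper, upper)
    raise ValueError("tail must be 'lower', 'upper' or 'two'")


for alpha, tail in [(0.05, "lower"), (0.05, "two"), (0.01, "two")]:
    values = [round(z, 3) for z in normal_critical_values(alpha, tail)]
    print(alpha, tail, values)
```

For a 5% lower-tailed test this gives roughly -1.645, and for a 5% two-tailed test roughly ±1.960, matching the values usually read from normal tables.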

Step-by-Step Method

Step 1: Define H₀ and H₁

  • H₀: Always includes "=" (e.g., p = 0.5, μ = 10).
  • H₁: Depends on the question:
      • One-tailed (upper): H₁: p > 0.5 or μ > 10.
      • One-tailed (lower): H₁: p < 0.5 or μ < 10.
      • Two-tailed: H₁: p ≠ 0.5 or μ ≠ 10.

Step 2: State the Significance Level (α)

  • Given in the question (e.g., 5%, 1%).
  • If not given, assume 5%.

Step 3: Calculate the Test Statistic

  • For binomial: Number of successes (e.g., 12 heads out of 20).
  • For normal: Sample mean (x̄) or standardised z-score.

Step 4: Find the p-value or Critical Region

  • p-value method:
      • Calculate P(X ≥ test statistic) for upper-tailed or P(X ≤ test statistic) for lower-tailed. For two-tailed, double the smaller of these two tail probabilities.
  • Critical region method:
      • Find the critical value(s) from tables (e.g., for α = 5%, two-tailed, critical values are the 2.5th and 97.5th percentiles).
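
The critical region method for a binomial test can be sketched in Python as follows. This is an illustration under stated assumptions, not a prescribed exam method: the helper names `binom_pmf`, `upper_critical_value` and `lower_critical_value` are made up here, and the search simply adds up tail probabilities until they exceed the allowed tail area.

```python
from math import comb


def binom_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def upper_critical_value(n, p, tail_alpha):
    """Smallest c with P(X >= c) <= tail_alpha (None if no value qualifies)."""
    tail, critical = 0.0, None
    for c in range(n, -1, -1):  # start at the far right and move inwards
        tail += binom_pmf(c, n, p)
        if tail > tail_alpha:
            break
        critical = c
    return critical


def lower_critical_value(n, p, tail_alpha):
    """Largest c with P(X <= c) <= tail_alpha (None if no value qualifies)."""
    tail, critical = 0.0, None
    for c in range(0, n + 1):  # start at the far left and move inwards
        tail += binom_pmf(c, n, p)
        if tail > tail_alpha:
            break
        critical = c
    return critical


# Fair coin, 20 flips, 5% upper-tailed test: critical region is X >= 15.
print(upper_critical_value(20, 0.5, 0.05))
# Same coin, 5% two-tailed test (2.5% per tail): X <= 5 or X >= 15.
print(lower_critical_value(20, 0.5, 0.025), upper_critical_value(20, 0.5, 0.025))
```

Because the binomial distribution is discrete, the actual probability of landing in the critical region is usually a little below α (here about 2.07% in each tail).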

Step 5: Compare p-value to α or Test Statistic to Critical Region

  • p-value ≤ α → Reject H₀.
  • Test statistic in critical region → Reject H₀.

Step 6: Write Your Conclusion

  • If you reject H₀: "There is sufficient evidence at the [α]% level to suggest [H₁]."
  • If you do NOT reject H₀: "There is insufficient evidence at the [α]% level to suggest [H₁]."
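
Steps 5 and 6 amount to a single comparison followed by a sentence template. The sketch below shows one way to express that in Python; the function name `conclude` and its `claim` argument are hypothetical, and the wording simply mirrors the templates above.

```python
def conclude(p_value, alpha, claim):
    """Return the decision and an exam-style conclusion sentence.

    claim is H1 written in words, e.g. "the coin is biased towards heads".
    """
    level = f"{alpha * 100:g}%"  # e.g. 0.05 -> "5%"
    if p_value <= alpha:
        return (f"Reject H0. There is sufficient evidence at the {level} "
                f"level to suggest {claim}.")
    return (f"Do not reject H0. There is insufficient evidence at the {level} "
            f"level to suggest {claim}.")


print(conclude(0.0577, 0.05, "the coin is biased towards heads"))
```

Note that the conclusion is always phrased in terms of H₁ and the context of the question, never as "H₀ is true".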

Worked Examples

Example 1 – Basic (Binomial, One-Tailed)

Question: A coin is flipped 20 times, landing heads 14 times. Test at the 5% level whether the coin is biased towards heads.

Solution:

  1. H₀: p = 0.5 (fair coin). H₁: p > 0.5 (biased towards heads).
  2. Significance level: α = 5%.
  3. Test statistic: 14 heads.
  4. p-value: P(X ≥ 14) where X ~ B(20, 0.5). From tables: P(X ≥ 14) = 1 – P(X ≤ 13) = 1 – 0.9423 = 0.0577.
  5. Compare: 0.0577 > 0.05 → Do NOT reject H₀.
  6. Conclusion: "There is insufficient evidence at the 5% level to suggest the coin is biased towards heads."

What we did and why:

  • We used a one-tailed test because the question asked "biased towards heads" (not just "biased").
  • The p-value (0.0577) was just above 5%, so we couldn’t reject H₀.
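
The table value in Example 1 can be checked exactly. The sketch below is only a verification aid, assuming Python's standard `math.comb` and `fractions.Fraction`; the helper name `upper_tail_fair_coin` is invented for this example.

```python
from fractions import Fraction
from math import comb

n = 20  # number of flips


def upper_tail_fair_coin(x):
    """Exact P(X >= x) for X ~ B(20, 0.5), as a fraction."""
    favourable = sum(comb(n, k) for k in range(x, n + 1))
    return Fraction(favourable, 2**n)


p_14 = upper_tail_fair_coin(14)  # the observed result
p_15 = upper_tail_fair_coin(15)  # one more head than observed

print(round(float(p_14), 4))  # 0.0577, just above 0.05
print(round(float(p_15), 4))  # 0.0207, below 0.05
```

This shows how close the result was: 14 heads is not significant at the 5% level, but 15 heads would have been.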


Example 2 – Medium (Binomial, Two-Tailed)

Question: A dice is rolled 30 times, showing a 6 on 8 occasions. Test at the 1% level whether the dice is biased.

Solution:

  1. H₀: p = 1/6 (fair dice). H₁: p ≠ 1/6 (biased).
  2. Significance level: α = 1% → 0.5% in each tail (two-tailed).
  3. Test statistic: 8 sixes.
  4. p-value: 2 × P(X ≥ 8) where X ~ B(30, 1/6). We use the upper tail because 8 is above the expected number of sixes (30 × 1/6 = 5).
      • P(X ≥ 8) = 1 – P(X ≤ 7) = 1 – 0.8863 = 0.1137.
      • p-value = 2 × 0.1137 = 0.2274.
  5. Compare: 0.2274 > 0.01 → Do NOT reject H₀.
  6. Conclusion: "There is insufficient evidence at the 1% level to suggest the dice is biased."

What we did and why:

  • Two-tailed test because the question asked "biased" (not "biased towards 6").
  • We doubled the tail probability because extreme results could be in either tail.
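
The two-tailed calculation in Example 2 can be reproduced with a few lines of Python. This is a sketch that assumes the "double the smaller tail" convention used in this guide; the helper name `binom_pmf` is made up for the illustration.

```python
from math import comb


def binom_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n, p0, observed = 30, 1 / 6, 8  # 30 rolls, fair-dice probability, 8 sixes

upper = sum(binom_pmf(k, n, p0) for k in range(observed, n + 1))  # P(X >= 8)
lower = sum(binom_pmf(k, n, p0) for k in range(0, observed + 1))  # P(X <= 8)

# Two-tailed p-value: double the smaller tail, capped at 1.
p_value = min(1.0, 2 * min(upper, lower))

print(round(upper, 4), round(p_value, 4))
print("Reject H0" if p_value <= 0.01 else "Do not reject H0")
```

Here the upper tail is the smaller one, since 8 sixes is more than the 5 expected from a fair dice.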


Example 3 – Exam-Style (Normal, One-Tailed)

Question: A factory claims its lightbulbs last 1000 hours on average. A sample of 50 bulbs has a mean lifetime of 990 hours with a standard deviation of 30 hours. Test at the 5% level whether the bulbs last less than claimed.

Solution:

  1. H₀: μ = 1000 (bulbs last 1000 hours). H₁: μ < 1000 (bulbs last less).
  2. Significance level: α = 5%.
  3. Test statistic: x̄ = 990, n = 50, σ = 30 (the sample standard deviation is used as an estimate of σ).
      • z = (990 – 1000) / (30/√50) = -10 / 4.243 ≈ -2.36.
  4. p-value: P(Z < -2.36) = 0.0091 (from normal tables).
  5. Compare: 0.0091 < 0.05 → Reject H₀.
  6. Conclusion: "There is sufficient evidence at the 5% level to suggest the bulbs last less than 1000 hours."

What we did and why:

  • Normal distribution because we are testing a sample mean, and with a sample size above 30 the sample mean is approximately normal (Central Limit Theorem).
  • One-tailed test because the question asked "less than."
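
Example 3 can be checked without tables. The sketch below assumes the standard library's `statistics.NormalDist` in place of normal tables and, like the worked solution, treats the sample standard deviation as σ; the variable names are chosen for this illustration.

```python
from math import sqrt
from statistics import NormalDist

sample_mean, claimed_mean = 990, 1000  # hours
sample_sd, n = 30, 50                  # sd used as an estimate of sigma

# Standardised test statistic: z = (x-bar - mu) / (sigma / sqrt(n)).
z = (sample_mean - claimed_mean) / (sample_sd / sqrt(n))

# Lower-tailed test, so the p-value is P(Z < z).
p_value = NormalDist().cdf(z)

print(round(z, 2))        # -2.36
print(round(p_value, 4))  # about 0.0092 with unrounded z; tables at -2.36 give 0.0091
print("Reject H0" if p_value <= 0.05 else "Do not reject H0")
```

The small difference from the table value comes only from rounding z to two decimal places before looking it up; the decision is the same either way.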


Common Mistakes

Mistake | Why it Happens | Correct Approach
Using the wrong tail | Confusing "greater than" with "less than." | Read H₁ carefully: ">" = upper-tailed, "<" = lower-tailed, "≠" = two-tailed.
Forgetting to double p-value | Only calculating one tail for a two-tailed test. | For two-tailed tests, double the one-tail probability (or halve α for critical regions).
Mixing up H₀ and H₁ | Writing H₀ as "p > 0.5" instead of "p = 0.5." | H₀ always has "="; H₁ has ">", "<", or "≠".
Using the wrong distribution | Using binomial when normal is needed (or vice versa). | Check the type of data, not just the sample size: a count of successes in a fixed number of trials → binomial; a sample mean of measurements → normal.
Ignoring the significance level | Comparing p-value to 0.05 when α = 1%. | Always check α; if not given, assume 5%.

Exam Traps

Trap | How to Spot it | How to Avoid it
"Test at the 10% level" | The question gives an unusual α (e.g., 10%, 2%). | Circle α immediately; don’t assume 5%.
"Two-tailed but disguised" | The question says "different" or "changed" (not "greater/less"). | H₁ will have "≠"; double the one-tail probability or halve α.
"Critical region vs. p-value" | The question asks for the critical region but you calculate p-value (or vice versa). | Read the question carefully. If it says "critical region," find the critical value(s).

1-Minute Recap

Here’s what you need to remember the night before your exam:

  1. H₀ is always "="; H₁ is ">", "<", or "≠".
  2. One-tailed test? Compare the p-value to α. Two-tailed? Double the one-tail probability or halve α.
  3. p-value ≤ α? Reject H₀. Test statistic in critical region? Reject H₀.
  4. Always write a conclusion in context; examiners love this!
  5. Check the distribution: binomial for a count of successes in a fixed number of trials, normal for a sample mean.

Now go ace that exam!