
AP Statistics (AP Stats): Unbiased Estimators and Variability





What This Is

Unbiased estimators and variability are foundational concepts in statistical inference. An unbiased estimator is a statistic (like a sample mean or sample proportion) whose values, averaged over all possible random samples, equal the true population parameter. Variability describes how much a sample statistic (e.g., a mean or proportion) changes from sample to sample because of random sampling. Both ideas underpin confidence intervals and hypothesis tests, which are major AP exam topics.

Real-world example: A factory tests whether a new machine produces fewer defective lightbulbs. It takes a sample of 200 bulbs and finds 8% defective. Is this sample proportion an unbiased estimate of the true defect rate? How much might it vary from the true rate?
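To make the two questions in the lightbulb example concrete, here is a minimal simulation sketch. It assumes a hypothetical true defect rate of 0.08 (in practice the true rate is unknown) and assumes each sample of 200 bulbs is random; the function name simulate_p_hats and the seed are illustrative choices, not part of the guide.

```python
import random
import statistics

def simulate_p_hats(true_p, n, reps, seed=1):
    """Draw `reps` random samples of size n; return each sample's defect proportion."""
    rng = random.Random(seed)
    p_hats = []
    for _ in range(reps):
        defects = sum(1 for _ in range(n) if rng.random() < true_p)
        p_hats.append(defects / n)
    return p_hats

# Hypothetical true defect rate (an assumption for the simulation).
TRUE_P = 0.08
N = 200

p_hats = simulate_p_hats(TRUE_P, n=N, reps=5000)

center = statistics.mean(p_hats)    # lands near TRUE_P when p-hat is unbiased
spread = statistics.stdev(p_hats)   # sample-to-sample variability of p-hat
theory = (TRUE_P * (1 - TRUE_P) / N) ** 0.5

print(f"mean of p-hats: {center:.4f}")
print(f"SD of p-hats:   {spread:.4f} (formula value: {theory:.4f})")
```

If the sample proportion is unbiased, the mean of the simulated p-hats should sit very close to 0.08, and their standard deviation should be close to the formula value of about 0.019.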


Key Terms & Formulas

  • Unbiased estimator: A statistic whose sampling distribution has a mean equal to the population parameter it estimates. Example: The sample mean (x̄) is an unbiased estimator of the population mean (μ).
  • Bias: Systematic over- or under-estimation of a parameter. Example: Using the sample range to estimate the population standard deviation introduces bias.
  • Variability of a statistic: Measured by the standard error (SE), which quantifies how much the statistic varies from sample to sample.
  • SE for p̂ (sample proportion): SE = √[p̂(1−p̂)/n] (used in confidence intervals for p).
  • SE for x̄ (sample mean): SE = s/√n (used in t-intervals for μ when σ is unknown).
  • Central Limit Theorem (CLT): For large n (typically n ≥ 30), the sampling distribution of x̄ is approximately normal, even if the population isn’t.
  • 10% Condition: When sampling without replacement, n ≤ 0.10N (to ensure independence).
  • Large Counts Condition (for proportions): np ≥ 10 and n(1−p) ≥ 10 (to ensure normality of p̂).
  • Normal/Large Sample Condition (for means): Either the population is normal or n ≥ 30 (for CLT).
  • Calculator command for normal probabilities: normalcdf(lower, upper, μ, σ) (e.g., normalcdf(-1E99, 1.96, 0, 1) for P(Z < 1.96)).
  • Calculator command for t-critical values: invT(area to left, df) (e.g., invT(0.975, 19) for a 95% t-interval with df = 19).
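The standard error formulas and the normalcdf command above can be checked with a short Python sketch. The helper names se_proportion, se_mean, and normalcdf are hypothetical stand-ins written for this illustration (the last one only imitates the calculator command), and the inputs are the numbers used elsewhere in this guide.

```python
from math import sqrt
from statistics import NormalDist

def se_proportion(p_hat, n):
    """Standard error of a sample proportion: sqrt(p_hat * (1 - p_hat) / n)."""
    return sqrt(p_hat * (1 - p_hat) / n)

def se_mean(s, n):
    """Standard error of a sample mean when sigma is estimated by s: s / sqrt(n)."""
    return s / sqrt(n)

def normalcdf(lower, upper, mu=0.0, sigma=1.0):
    """Area under the Normal(mu, sigma) curve between lower and upper."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(upper) - dist.cdf(lower)

print(round(se_proportion(0.08, 200), 4))   # 0.0192
print(round(se_mean(200, 50), 2))           # 28.28
print(round(normalcdf(-1e99, 1.96), 4))     # 0.975
```

Python's standard library has no t-distribution, so invT is not reproduced here; t* values would still come from the calculator or a table.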

Step-by-Step / Process Flow

How to solve an FRQ about unbiased estimators and variability (e.g., confidence intervals or hypothesis tests):

  1. Identify the parameter and statistic:
    • Parameter: What are you estimating? (e.g., p = true proportion of defective bulbs, μ = true mean blood pressure).
    • Statistic: What’s your sample estimate? (e.g., p̂ = 0.08, x̄ = 120 mmHg).

  2. Check conditions for inference:
    • Random: Data come from a random sample or randomized experiment.
    • 10% Condition: If sampling without replacement, n ≤ 0.10N.
    • Normal/Large Sample:
      • For proportions: np̂ ≥ 10 and n(1−p̂) ≥ 10.
      • For means: Population is normal or n ≥ 30 (CLT).

  3. Calculate the standard error (SE):
    • For proportions: SE = √[p̂(1−p̂)/n].
    • For means: SE = s/√n (if σ is unknown).

  4. Construct a confidence interval (if asked):
    • Formula: Statistic ± (critical value)(SE).
    • For p: p̂ ± z* × √[p̂(1−p̂)/n].
    • For μ: x̄ ± t* × (s/√n) (use the t-distribution with df = n − 1 when σ is unknown).

  5. Interpret in context:
    • “We are 95% confident that the true proportion of defective bulbs is between 4.2% and 11.8%.” (This is the 95% interval for p̂ = 0.08 and n = 200.)
    • Never say “the parameter will be in the interval” or “95% of samples will give this interval.”
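The process above can be sketched in code for the lightbulb example (p̂ = 0.08, n = 200). This is an illustrative sketch only: the function name one_proportion_z_interval is hypothetical, the population size is treated as unknown, and the Random condition has to be judged from how the data were collected, since no calculation can confirm it.

```python
from math import sqrt
from statistics import NormalDist

def one_proportion_z_interval(p_hat, n, confidence=0.95, population_size=None):
    """Return (condition checks, (lower, upper)) for a one-proportion z-interval."""
    successes = n * p_hat
    failures = n * (1 - p_hat)
    conditions = {
        # Large Counts: both observed counts should be at least 10.
        "large_counts": successes >= 10 and failures >= 10,
        # 10% Condition can only be checked when the population size N is known.
        "ten_percent": None if population_size is None else n <= 0.10 * population_size,
    }
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # about 1.96 for 95%
    se = sqrt(p_hat * (1 - p_hat) / n)
    margin = z_star * se
    return conditions, (p_hat - margin, p_hat + margin)

conditions, (low, high) = one_proportion_z_interval(0.08, 200)
print(conditions)
print(f"95% CI for p: ({low:.3f}, {high:.3f})")   # (0.042, 0.118)
```

Passing population_size (for example, the number of bulbs produced that day, if known) would let the sketch report the 10% Condition as True or False instead of None.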

Common Mistakes

  • Mistake: Forgetting to check the 10% Condition when sampling without replacement.
  • Correction: Always verify n ≤ 0.10N (e.g., sampling 50 students from a school of 500 gives 50 ≤ 0.10 × 500 = 50, which only just passes; sampling 60 students would fail).

  • Mistake: Using z instead of t for means when σ is unknown.
  • Correction: Use the t-distribution for means unless σ is given (rare on the AP exam). Why? The t-distribution accounts for the extra variability from estimating σ with s.

  • Mistake: Misinterpreting the confidence level (e.g., “There’s a 95% chance the interval contains the true mean”).
  • Correction: Say, “If we took many samples and constructed intervals this way, about 95% would capture the true mean.” The parameter is fixed; the interval varies.

  • Mistake: Assuming the sampling distribution is normal without checking conditions.
  • Correction: For proportions, verify np ≥ 10 and n(1−p) ≥ 10. For means, check n ≥ 30 or normality.

  • Mistake: Confusing standard deviation (σ) with standard error (SE).
  • Correction: σ measures spread in the population; SE measures spread in the sampling distribution of the statistic.
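The "many samples" reading of the confidence level can be illustrated with a small simulation. The sketch below assumes a hypothetical true proportion of 0.40 and samples of size 500 (values chosen for illustration, not taken from this guide), builds a 95% z-interval from each simulated sample, and counts how often the interval captures the fixed true value.

```python
import random
from math import sqrt
from statistics import NormalDist

def coverage_rate(true_p, n, reps, confidence=0.95, seed=2):
    """Share of simulated z-intervals for a proportion that capture true_p."""
    rng = random.Random(seed)
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    hits = 0
    for _ in range(reps):
        # One random sample of size n; p_hat changes from sample to sample.
        p_hat = sum(rng.random() < true_p for _ in range(n)) / n
        margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
        # true_p is fixed; only the interval moves.
        if p_hat - margin <= true_p <= p_hat + margin:
            hits += 1
    return hits / reps

rate = coverage_rate(true_p=0.40, n=500, reps=2000)
print(f"share of intervals capturing p: {rate:.3f}")
```

The printed share should come out close to 0.95, which is what the confidence level describes: the long-run success rate of the method, not a probability about any single interval.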

AP Exam Insights

  • Tricky Distinction: The AP exam loves to test whether a statistic is unbiased. Example: The sample median is not an unbiased estimator of the population mean (unless the population is symmetric).
  • Common FRQ Setup: You’ll often be given a scenario (e.g., “A sample of 50 students has a mean SAT score of 1200 with s = 200”) and asked to:
    • Check conditions.
    • Calculate a confidence interval.
    • Interpret the interval in context.
  • Calculator Pitfall: Using normalcdf for t-distributions (or vice versa). Remember: normalcdf is for z-scores; tcdf is for t-scores.
  • Variability Focus: Expect questions about how sample size (n) affects SE (e.g., “If n quadruples, how does SE change?” Answer: SE is halved).
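Two of the points above can be checked numerically: the effect of quadrupling n on the SE, and the interval for the SAT scenario (n = 50, x̄ = 1200, s = 200). The sketch assumes a 95% confidence level, which the scenario does not state, and hard-codes t* ≈ 2.0096 for df = 49 (the value invT(0.975, 49) returns) because Python's standard library has no t-distribution.

```python
from math import sqrt

def se_mean(s, n):
    """Standard error of the sample mean: s / sqrt(n)."""
    return s / sqrt(n)

# Quadrupling n (50 -> 200) with the same s halves the standard error.
se_50 = se_mean(200, 50)
se_200 = se_mean(200, 200)
print(round(se_200 / se_50, 3))   # 0.5

# SAT scenario from the guide; the 95% level is an assumption.
T_STAR_DF49 = 2.0096              # t* for 95% confidence with df = 49, from invT or a t-table
x_bar, s, n = 1200, 200, 50
margin = T_STAR_DF49 * se_mean(s, n)
print(f"95% CI for the mean: ({x_bar - margin:.1f}, {x_bar + margin:.1f})")   # (1143.2, 1256.8)
```

Because n sits under a square root, cutting the SE in half always takes four times the sample size, which is why this question shows up so often.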

Quick Check Questions

  1. Multiple Choice: A random sample of 100 voters finds that 58% support a policy. Which of the following is the standard error of the sample proportion?
    (A) √(0.58 × 0.42 / 100)
    (B) √(0.58 × 0.42) / 100
    (C) 0.58 / √100
    (D) √(0.58 × 0.42 × 100)
    Answer: (A). SE for p̂ = √[p̂(1−p̂)/n].

  2. FRQ Part: A factory claims its new machine produces at most 2% defective items. A sample of 400 items has 12 defects. Is the sample proportion an unbiased estimator of the true defect rate? Explain.
    Answer: Yes, provided the sample was randomly selected; in that case p̂ is an unbiased estimator of p. (Unbiasedness doesn’t depend on the sample size or the value of p̂.)

  3. Multiple Choice: Which of the following is not a condition for constructing a confidence interval for a population mean?
    (A) The sample is random.
    (B) The population is normally distributed.
    (C) The sample size is at least 30.
    (D) The 10% condition is satisfied.
    Answer: (B). The population doesn’t need to be normal if n ≥ 30 (CLT). (Likewise, n ≥ 30 is not needed when the population is known to be normal; the condition is “normal population or n ≥ 30.”)
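The reasoning behind the last answer (a non-normal population is acceptable once n ≥ 30) can be illustrated with a simulation. The sketch below assumes a hypothetical, strongly right-skewed population (exponential with mean 1) and compares its skewness with the skewness of sample means for n = 30; the skewness helper is written for this illustration only.

```python
import random
import statistics

def skewness(values):
    """Average cubed z-score: near 0 for symmetric data, positive for right skew."""
    m = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - m) / sd) ** 3 for v in values) / len(values)

rng = random.Random(3)

# Hypothetical right-skewed population: exponential with mean 1.
population_draws = [rng.expovariate(1.0) for _ in range(20000)]

# Sampling distribution of x-bar for samples of size n = 30.
sample_means = [
    statistics.fmean([rng.expovariate(1.0) for _ in range(30)])
    for _ in range(5000)
]

skew_pop = skewness(population_draws)
skew_means = skewness(sample_means)
print(f"skewness of individual values: {skew_pop:.2f}")
print(f"skewness of sample means (n = 30): {skew_means:.2f}")
```

The sample means should be far less skewed than the individual values, which is the CLT at work. Some skew remains at n = 30, so the n ≥ 30 guideline is a rule of thumb rather than a guarantee.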

Last-Minute Cram Sheet

  1. Unbiased estimator: Statistic = parameter on average (e.g., x̄ for μ, p̂ for p).
  2. SE for p̂: √[p̂(1−p̂)/n] (use z* for CI).
  3. SE for x̄: s/√n (use t* for CI if σ unknown).
  4. Conditions for inference: Random, 10% (if sampling w/o replacement), Normal/Large Sample.
  5. Large Counts for p: np ≥ 10 and n(1−p) ≥ 10.
  6. CLT for x̄: n ≥ 30 → sampling distribution ≈ normal.
  7. Calculator: normalcdf for z, tcdf for t, invT for t* critical values.
  8. 10% Condition: Always check when sampling without replacement!
  9. z vs t: Use z for proportions, t for means (unless σ is given).
  10. Interpret CI: “We are [C%] confident the true [parameter] is between [lower] and [upper].”