Study Guide: AP Statistics (AP Stats): Scatterplots (Direction, Form, Strength, Unusual Features)
Source: https://www.fatskills.com/ap-statistics/chapter/ap-stats-ap-statistics-scatterplots-direction-form-strength-unusual-features

AP Statistics (AP Stats): Scatterplots (Direction, Form, Strength, Unusual Features)

By Fatskills Exam Guides Team — the exam nerds behind 28,500+ quizzes and 2.1M practice questions across 500+ global exams.

What This Is

A scatterplot is a graphical display of the relationship between two quantitative variables. It helps us assess direction (positive/negative), form (linear/nonlinear), strength (weak/moderate/strong), and unusual features (outliers, clusters, influential points). On the AP exam, you’ll analyze scatterplots to describe associations, check conditions for regression, and interpret correlation. Real-world example: A real estate agent wants to predict home prices based on square footage—does a larger home generally cost more, and how strong is that relationship?


Key Terms & Formulas

  • Scatterplot: A graph of paired (x, y) data points used to assess the relationship between two quantitative variables.
  • Direction:
      • Positive association: As x increases, y tends to increase.
      • Negative association: As x increases, y tends to decrease.
  • Form:
      • Linear: Points roughly follow a straight line.
      • Nonlinear: Points follow a curve (e.g., quadratic, exponential).
  • Strength: How closely points follow a pattern (weak, moderate, strong).
  • Correlation (r): Measures the direction and strength of a linear relationship between two quantitative variables. Ranges from -1 to 1.
      • r > 0 → Positive association
      • r < 0 → Negative association
      • |r| ≈ 1 → Strong linear relationship
      • |r| ≈ 0 → Weak or no linear relationship
  • Outlier: A point that falls outside the overall pattern of the data.
  • Influential point: A point that, if removed, dramatically changes the correlation or regression line. Influential points are often outliers or points with extreme x-values.
  • Lurking variable: A variable not included in the study that may influence the relationship between x and y.
  • Calculator commands (TI-84):
      • LinReg(ax+b) (STAT → CALC → 4): Computes the least-squares regression line and r (turn DiagnosticOn in 2nd → 0 → CATALOG first).
      • STAT PLOT (2nd → Y=): Set up a scatterplot (choose the first plot type, enter x and y lists).
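
For readers who want to see what the calculator is computing, here is a minimal Python sketch of the correlation r using the standard sum-of-products formula. The function name `correlation` and the home-price numbers are made up for illustration; they are not from the AP exam or the TI-84.

```python
from math import sqrt

def correlation(xs, ys):
    """Pearson correlation r for two equal-length lists of paired data."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    # Note: this divides by zero if all x-values or all y-values are equal.
    return sxy / sqrt(sxx * syy)

# Hypothetical data: square footage (hundreds) and price (thousands of dollars).
sqft = [10, 12, 15, 18, 20, 24]
price = [150, 170, 210, 240, 275, 310]

r = correlation(sqft, price)
print(f"r = {r:.3f}")  # a value close to 1: strong, positive, linear
```

Because r is built from deviations divided by their overall spread, it has no units and always lands between -1 and 1.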

Step-by-Step / Process Flow

How to analyze a scatterplot on the AP exam (FRQ or MC):

  1. Describe the relationship:
      • Direction: Positive, negative, or none?
      • Form: Linear, curved, or no clear pattern?
      • Strength: Weak, moderate, or strong? (Use r if given.)
      • Unusual features: Outliers, clusters, or influential points?

  2. Check for linearity (if regression is involved):
      • Does the scatterplot look roughly linear? If not, linear regression may not be appropriate.

  3. Interpret correlation (r):
      • State the sign (positive/negative) and strength (weak/moderate/strong).
      • Example: "There is a strong, positive, linear relationship between study time and test scores (r = 0.85)."

  4. Identify unusual points:
      • Outliers: Points far from the pattern.
      • Influential points: Points that change the regression line significantly if removed.

  5. Avoid causation claims:
      • Never say "x causes y" unless the data come from a controlled experiment. Use phrases like "associated with" or "linked to."
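
The interpretation step can be sketched as a small Python helper that turns a value of r into an AP-style sentence. The function name `describe_association` and the cutoffs 0.4 and 0.7 are assumptions chosen for illustration; the AP exam does not define official cutoffs, and the sentence is only appropriate when the scatterplot looks roughly linear.

```python
def describe_association(r, x_name, y_name, weak_cutoff=0.4, strong_cutoff=0.7):
    """Build a context sentence from r. The cutoffs are illustrative, not official."""
    if not -1 <= r <= 1:
        raise ValueError("r must be between -1 and 1")
    if r == 0:
        return f"There is no linear association between {x_name} and {y_name} (r = 0)."
    size = abs(r)
    if size < weak_cutoff:
        strength = "weak"
    elif size < strong_cutoff:
        strength = "moderate"
    else:
        strength = "strong"
    direction = "positive" if r > 0 else "negative"
    # "association" rather than "causes": the wording avoids a causal claim.
    return (f"There is a {strength}, {direction}, linear association "
            f"between {x_name} and {y_name} (r = {r}).")

print(describe_association(0.85, "study time", "test scores"))
print(describe_association(-0.45, "hours of sleep", "test scores"))
```

The helper deliberately covers only direction, strength, and context. Form and unusual features still have to be judged from the plot itself.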

Common Mistakes

  • Mistake: Saying a relationship is "strong" just because it’s linear. Correction: Strength depends on how tightly points cluster around the line, not just the form. Use r to quantify strength.

  • Mistake: Ignoring nonlinear patterns and forcing a linear interpretation. Correction: If the scatterplot curves, don’t use linear regression—mention the nonlinear form instead.

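  • Mistake: Confusing r with slope. Correction: r measures the strength and direction of the linear relationship and has no units, while the slope (b in ŷ = a + bx) measures the predicted change in y per one-unit increase in x. Note that the TI-84's LinReg(ax+b) labels the slope a, not b.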

  • Mistake: Claiming causation from correlation. Correction: Correlation ≠ causation! Always mention lurking variables (e.g., "Ice cream sales and drowning deaths are correlated, but hot weather is a lurking variable.").

  • Mistake: Forgetting to check for influential points. Correction: Always look for points that could distort the regression line—especially if they’re far from the rest of the data in the x-direction.
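
To make the last mistake concrete, the following Python sketch fits a least-squares line with and without one point that sits far from the rest in the x-direction. The data and the helper name `least_squares` are invented for illustration only.

```python
def least_squares(xs, ys):
    """Return (slope, intercept) of the least-squares line for y on x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Made-up data: five points with almost no trend...
xs = [1, 2, 3, 4, 5]
ys = [2.1, 1.9, 2.0, 2.2, 1.8]
# ...plus one point far from the others in the x-direction.
extreme = (15, 9.0)

slope_without, _ = least_squares(xs, ys)
slope_with, _ = least_squares(xs + [extreme[0]], ys + [extreme[1]])

print(f"slope without the extreme point: {slope_without:.2f}")
print(f"slope with the extreme point:    {slope_with:.2f}")
```

In this made-up data set, a single point moves the slope from roughly flat to clearly positive, which is what "influential" means in practice.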


AP Exam Insights

  • Frequent FRQ setups:
  • "Describe the relationship between [x] and [y] in context."
  • "Is a linear model appropriate? Justify your answer."
  • "Identify any outliers or influential points and explain their impact."
  • "Interpret the correlation coefficient in context."

  • Tricky distinctions:

  • Correlation vs. causation: The AP exam loves testing this. Always say "associated" unless it’s an experiment.
  • Strength vs. slope: A steep slope doesn’t mean a strong correlation (e.g., r = 0.3 with a steep slope is still weak).
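  • Outliers vs. influential points: Not all outliers are influential, and a point with an extreme x-value can be influential even when it lies close to the fitted line. Test a point by asking whether removing it would noticeably change the line or r.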

  • Calculator pitfalls:

  • DiagnosticOn must be enabled to see r and r² in LinReg.
  • Mixing up x and y lists in STAT PLOT or LinReg will give wrong results.
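
Two of the points above (strength vs. slope, and swapped lists) can be checked numerically. This Python sketch uses invented data and a hypothetical helper `summary`. It suggests that swapping the lists changes the regression slope but not r, and that changing the units of y changes the slope but not r.

```python
from math import sqrt

def summary(xs, ys):
    """Return (slope of y on x, correlation r) for paired lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sxx, sxy / sqrt(sxx * syy)

x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

slope_xy, r_xy = summary(x, y)            # lists entered correctly
slope_yx, r_yx = summary(y, x)            # lists swapped by mistake
y_scaled = [100 * v for v in y]           # same data, y in smaller units
slope_scaled, r_scaled = summary(x, y_scaled)

print(f"correct lists: slope = {slope_xy:.2f}, r = {r_xy:.3f}")
print(f"swapped lists: slope = {slope_yx:.2f}, r = {r_yx:.3f}")
print(f"rescaled y:    slope = {slope_scaled:.2f}, r = {r_scaled:.3f}")
```

The slope depends on which variable is the response and on the units, so a steep slope says nothing by itself about how tightly the points cluster around the line.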

Quick Check Questions

  1. Multiple Choice: A scatterplot shows a clear curved pattern with no outliers. Which of the following is the most appropriate conclusion? (A) There is a strong, positive, linear relationship. (B) A linear model is not appropriate for these data. (C) The correlation coefficient is close to 1. (D) Removing an outlier would improve the fit.

Answer: (B) A linear model is not appropriate for these data. Explanation: The scatterplot is curved, so linear regression shouldn’t be used.
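
As a side note on why option (C) cannot be concluded from a curved plot, this Python sketch computes r for two invented data sets that both follow y = x² exactly. The helper name `correlation` and the data are assumptions for illustration.

```python
from math import sqrt

def correlation(xs, ys):
    """Pearson correlation r for two equal-length lists of paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# A full parabola, symmetric about x = 0.
x_sym = [-3, -2, -1, 0, 1, 2, 3]
y_sym = [x ** 2 for x in x_sym]

# Only the rising half of the same curve.
x_half = [1, 2, 3, 4, 5, 6]
y_half = [x ** 2 for x in x_half]

r_sym = correlation(x_sym, y_sym)
r_half = correlation(x_half, y_half)

print(f"symmetric parabola: r = {r_sym:.3f}")
print(f"rising half only:   r = {r_half:.3f}")
```

Both data sets are perfectly curved, yet one gives r near 0 and the other gives r near 1. The value of r alone cannot reveal the form, so the scatterplot has to be checked first.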

  2. FRQ Part: A study records the number of hours students sleep (x) and their test scores (y). The correlation is r = -0.45. (a) Interpret r in context. (b) Does this prove that getting more sleep causes lower test scores? Explain.

Answer: (a) There is a moderate, negative, linear relationship between hours of sleep and test scores. As sleep increases, test scores tend to decrease (or vice versa). (b) No, correlation does not imply causation. There could be lurking variables (e.g., stress, study habits) affecting both sleep and test scores.


Last-Minute Cram Sheet

  1. Direction: Positive (upward trend), negative (downward trend), or none.
  2. Form: Linear (straight), nonlinear (curved), or no pattern.
  3. Strength: Weak (r ≈ 0), moderate (r ≈ ±0.5), strong (r ≈ ±1).
  4. Correlation (r): Measures linear strength/direction (-1 to 1).
  5. Outlier: Point far from the pattern.
  6. Influential point: Point that substantially changes the regression line or r if removed (often an outlier or a point with an extreme x-value).
  7. Correlation ≠ causation! Always mention lurking variables.
  8. Calculator: LinReg(ax+b) gives r (turn DiagnosticOn first!).
  9. Check scatterplot form before regression—don’t force a line on a curve.
  10. Interpret r in context: Sign (direction), strength, and linearity.

Good luck, you’ve got this!