Study Guide: AP Statistics (AP Stats): Residuals and Residual Plots – Checking Linearity
Source: https://www.fatskills.com/ap-statistics/chapter/ap-stats-ap-statistics-residuals-and-residual-plots-checking-linearity

By Fatskills Exam Guides Team — the exam nerds behind 28,500+ quizzes and 2.1M practice questions across 500+ global exams.

What This Is

Residuals are the differences between observed values and predicted values from a regression line. Residual plots help assess whether a linear model is appropriate for the data. On the AP exam, you’ll use residuals to check linearity, a key condition for inference in regression. Real-world example: A real estate agent wants to predict home prices based on square footage. If the residual plot shows a curved pattern, a linear model may not be the best fit, and predictions could be inaccurate.


Key Terms & Formulas

  • Residual (e): e = y – ŷ, where y is the observed value and ŷ is the predicted value from the regression line.
  • Residual plot: A scatterplot of residuals (e) vs. the explanatory variable (x) or predicted values (ŷ). Used to check linearity.
  • Linear regression model: ŷ = a + bx, where a is the y-intercept, b is the slope, and ŷ is the predicted value.
  • Least-squares regression line (LSRL): The line that minimizes the sum of squared residuals (Σe²).
  • Linearity condition: The relationship between x and y should be approximately linear (checked via residual plot).
  • TI-84: Residuals list: After running LinReg(a+bx), residuals are stored in RESID (access via 2nd → STAT (LIST) → NAMES → RESID).
  • TI-84: Residual plot: STAT PLOT → Plot1 → Xlist: L1, Ylist: RESID → ZoomStat.
  • Random scatter in residual plot: Indicates linearity is reasonable.
  • Patterned residual plot (e.g., curve, funnel): Suggests a nonlinear model may be better.
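  • Outliers in residual plot: Points with unusually large residuals. They are worth investigating but are not automatically influential; a high-leverage point can pull the line toward itself and end up with a small residual.
  • Standardized residual: Residual divided by its standard error (used to identify outliers; absolute values above 2 or 3 are concerning).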
  • R² (coefficient of determination): Proportion of variation in y explained by the regression line (not directly about linearity but often tested alongside residuals).
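
The formulas above can be checked by hand on a small data set. The following Python sketch is only an illustration, not part of the AP exam or the TI-84 workflow: the data values are invented and `lsrl` is a hypothetical helper name. It computes a and b for ŷ = a + bx, then the residuals e = y – ŷ.

```python
# Invented data: square footage (in hundreds) and price (in $1000s).
x = [10, 12, 15, 18, 20]
y = [150, 170, 210, 240, 270]

def lsrl(x, y):
    """Return (a, b) for the least-squares line y-hat = a + b*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sxy / sxx            # slope
    a = y_bar - b * x_bar    # y-intercept
    return a, b

a, b = lsrl(x, y)

# Residual = observed minus predicted.
y_hat = [a + b * xi for xi in x]
residuals = [yi - yh for yi, yh in zip(y, y_hat)]

for xi, e in zip(x, residuals):
    print(f"x={xi}  residual={e:+.2f}")
```

For any least-squares line with an intercept, the residuals sum to zero (up to rounding), which is why a residual plot is centered on the horizontal line at 0.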

Step-by-Step / Process Flow

How to check linearity using residuals (AP FRQ-style):

  1. Run the regression:
     • Enter data into L1 (x) and L2 (y).
     • Use STAT → CALC → 8:LinReg(a+bx) L1, L2, Y1 to store the regression equation in Y1.

  2. Store and plot residuals:
     • After running LinReg, residuals are automatically stored in RESID.
     • Set up a residual plot: 2nd → Y= (STAT PLOT) → Plot1 → Xlist: L1, Ylist: RESID → ZoomStat.

  3. Interpret the residual plot:
     • Good linearity: Residuals are randomly scattered around y = 0 with no clear pattern.
     • Nonlinearity: Residuals show a curved pattern (e.g., U-shape or inverted U).
     • Heteroscedasticity (unequal variance): Residuals fan out or funnel (violates the equal variance condition).

  4. Conclude on linearity:
     • If the residual plot shows random scatter, the linearity condition is met.
     • If not, consider a transformation (e.g., log, square root) or a nonlinear model.

  5. Report findings in context:
     • Example: "The residual plot shows a random scatter around 0, so a linear model is appropriate for predicting home prices from square footage."
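
Off the calculator, the same steps (fit, compute residuals, look for a pattern) can be sketched in a few lines of Python. This is an illustration under stated assumptions: the hours/score data are invented, and `fit_line` and `sign_pattern` are hypothetical helper names. Reading the signs of the residuals in order of x is a crude stand-in for eyeballing the plot, not a formal test.

```python
# Invented data: hours studied (x) and test score (y), with diminishing returns.
hours = [1, 2, 3, 4, 5, 6, 7, 8]
scores = [52, 63, 71, 77, 81, 84, 86, 87]

def fit_line(x, y):
    """Least-squares intercept a and slope b for y-hat = a + b*x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    b = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
         / sum((xi - x_bar) ** 2 for xi in x))
    return y_bar - b * x_bar, b

def sign_pattern(residuals):
    """One '+' or '-' per residual, in the order given."""
    return "".join("+" if e > 0 else "-" for e in residuals)

# Step 1: run the regression.
a, b = fit_line(hours, scores)

# Step 2: compute residuals (observed minus predicted).
residuals = [yi - (a + b * xi) for xi, yi in zip(hours, scores)]

# Step 3: look for a pattern. Long runs of one sign suggest curvature.
pattern = sign_pattern(residuals)
print(pattern)
for xi, e in zip(hours, residuals):
    print(f"x={xi}  residual={e:+.2f}")
```

For this data the residuals are negative at both ends and positive in the middle, an inverted-U pattern, so step 4 would conclude that a linear model is not appropriate for predicting score from hours studied.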

Common Mistakes

  • Mistake: Confusing residuals with errors. Correction: Residuals are observed errors (y – ŷ), while errors (ε) are theoretical and unobservable. Always use residuals for plots.

  • Mistake: Ignoring the scale of the residual plot. Correction: Set the window to fit the residuals (ZoomStat does this). A vertical scale much wider than the range of the residuals flattens the points toward 0 and can hide curvature.

  • Mistake: Assuming a linear model is always best if R² is high. Correction: R² measures strength of fit, not linearity. A high R² with a curved residual plot still violates linearity.

  • Mistake: Forgetting to check for outliers in the residual plot. Correction: Large residuals (far from 0) may indicate influential points that distort the regression line.

  • Mistake: Using x vs. y instead of residuals vs. x for the plot. Correction: The residual plot must be residuals vs. x (or ŷ) to assess linearity.
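
The R² mistake is easy to demonstrate numerically. In this Python sketch (an illustration with made-up data where y = x² exactly; the variable names are arbitrary), a line fits with a high R², yet the residuals trace a clear U shape.

```python
# Made-up, perfectly curved data: y = x^2 for x = 1..9.
x = list(range(1, 10))
y = [xi ** 2 for xi in x]

n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
syy = sum((yi - y_bar) ** 2 for yi in y)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

b = sxy / sxx                        # slope
a = y_bar - b * x_bar                # y-intercept
r_squared = sxy ** 2 / (sxx * syy)   # in simple linear regression, R^2 = r^2

# Residual = observed minus predicted.
residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]

print(f"R^2 = {r_squared:.3f}")
print([round(e, 2) for e in residuals])
```

R² comes out near 0.95, but the residuals are positive at both ends and negative in the middle. The high R² says the line captures most of the variation in y; the U shape says the linearity condition still fails.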


AP Exam Insights

  • Frequent FRQ setup: You’ll be given a scatterplot and regression output (or asked to generate it) and must:
      • Create a residual plot.
      • Interpret it to assess linearity.
      • Justify whether a linear model is appropriate.
  • Tricky distinction: Residual plots check linearity, not strength of relationship. A weak linear relationship can still have a random residual plot.
  • Calculator pitfall: RESID always holds the residuals from the most recent regression and is overwritten each time one is run. Rerun LinReg(a+bx) L1, L2, Y1 on the current data before plotting; the Y1 argument also stores the equation.
  • Common trap: The AP exam may show a residual plot with a slight curve but ask if linearity is "reasonable." Answer based on the overall pattern—minor deviations are often acceptable.

Quick Check Questions

  1. Multiple Choice: A residual plot for a regression of y on x shows a clear U-shaped pattern. Which of the following is the most appropriate conclusion? (A) The linearity condition is met. (B) A linear model is appropriate. (C) A nonlinear model may fit the data better. (D) The residuals are randomly scattered.

Answer: (C) A U-shaped pattern suggests the relationship is nonlinear, so a linear model may not be appropriate.

  2. FRQ Part: A student runs a regression to predict test scores (y) from hours studied (x). The residual plot is shown below (imagine a random scatter around 0). Does the residual plot suggest that a linear model is appropriate? Justify your answer.

Answer: Yes, the residual plot shows a random scatter around 0 with no clear pattern, indicating that the linearity condition is met for predicting test scores from hours studied.

  3. Multiple Choice: Which of the following is not a purpose of a residual plot? (A) To check for linearity. (B) To identify outliers. (C) To determine the slope of the regression line. (D) To assess equal variance.

Answer: (C) The residual plot does not determine the slope; it assesses model fit.


Last-Minute Cram Sheet

  1. Residual formula: e = y – ŷ (observed – predicted).
  2. Residual plot: Residuals vs. x (or ŷ); random scatter = good linearity.
  3. TI-84 residual plot: STAT PLOT → Xlist: L1, Ylist: RESID → ZoomStat.
  4. Linearity condition: Check residual plot for patterns (curves = bad).
  5. Outliers in residuals: Large residuals (far from 0) may be influential.
  6. Heteroscedasticity: Residuals fan out/funnel → unequal variance (bad).
  7. Don’t confuse R² with linearity: High R² ≠ linear relationship.
  8. Keep RESID current: RESID holds the residuals from the most recent regression, so rerun LinReg(a+bx) L1, L2, Y1 on the current data before plotting.
  9. Residual plot ≠ scatterplot: Must plot residuals vs. x, not y vs. x.
  10. AP FRQ tip: Always interpret residual plots in context (e.g., "The residual plot suggests a linear model is reasonable for predicting [response variable] from [explanatory variable].").