By Fatskills Exam Guides Team — the exam nerds behind 28,500+ quizzes and 2.1M practice questions across 500+ global exams.
The coefficient of determination (r²) measures the proportion of variability in the response variable (y) that can be explained by the linear relationship with the explanatory variable (x). It’s a key tool for assessing how well a least-squares regression line (LSRL) fits the data. On the AP exam, you’ll need to interpret r² in context, compare models, and explain its meaning in real-world scenarios (e.g., predicting house prices from square footage, explaining test scores based on study hours, or modeling crop yield from rainfall).
SST (Total Sum of Squares): Sum of squared deviations of the observed y-values from their mean, ( \sum (y - \bar{y})^2 ).
Interpretation of r²: "About (r² × 100)% of the variability in [response variable] is explained by the linear relationship with [explanatory variable]."
Correlation coefficient (r): Measures strength and direction of a linear relationship. ( r^2 = (\text{correlation})^2 ).
Residual (e): ( e = y - \hat{y} ) (observed – predicted). Used to assess model fit.
LSRL (Least-Squares Regression Line): ( \hat{y} = a + bx ), where ( b = r \cdot \frac{s_y}{s_x} ) and ( a = \bar{y} - b\bar{x} ).
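To see how the terms above fit together, here is a minimal Python sketch. The data are made up for illustration, and SSE (the sum of squared residuals) is a name introduced here rather than one defined in this guide. It builds the LSRL from ( b = r \cdot \frac{s_y}{s_x} ) and ( a = \bar{y} - b\bar{x} ), then checks that 1 - SSE/SST gives the same value as squaring r.

```python
from math import sqrt

# Hypothetical data: x = hours studied, y = quiz score (illustration only)
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Sample standard deviations (divide by n - 1)
s_x = sqrt(sum((xi - x_bar) ** 2 for xi in x) / (n - 1))
s_y = sqrt(sum((yi - y_bar) ** 2 for yi in y) / (n - 1))

# Correlation coefficient r
r = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / ((n - 1) * s_x * s_y)

# LSRL: y-hat = a + b*x
b = r * s_y / s_x
a = y_bar - b * x_bar

# Residuals: observed minus predicted
y_hat = [a + b * xi for xi in x]
residuals = [yi - yh for yi, yh in zip(y, y_hat)]

# SST: total variability in y; SSE: variability left over after the line is fit
sst = sum((yi - y_bar) ** 2 for yi in y)
sse = sum(e ** 2 for e in residuals)
r_squared = 1 - sse / sst

print(f"LSRL: y-hat = {a:.2f} + {b:.2f}x")
print(f"r = {r:.4f}, r^2 from r = {r ** 2:.4f}, r^2 from SSE/SST = {r_squared:.4f}")
```

The two r² values agree for any least-squares line with a single explanatory variable, which is why the calculator can report r² without showing SST.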
Calculator command (TI-84):
STAT-CALC-8:LinReg(a+bx)
Store RegEQ: Y1
r² value: Displayed in the output once diagnostics are turned on (DiagnosticOn in the CATALOG); also available via VARS-Statistics-EQ-r².
Conditions for regression inference (LINER: Linear, Independent, Normal, Equal variance, Random):
Random: Data comes from a random sample or experiment.
Hypothesis test for slope (β):
How to interpret r² in an FRQ:
1. Identify variables:
Explanatory (x) and response (y) variables in context.
Example: x = hours studied, y = test score.
Find r² with LinReg(a+bx).
Example output: r² = 0.72.
Interpret r² in context:
Avoid: "72% of the data fits the model" (incorrect).
Compare models (if asked):
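Example: If r² = 0.85 for Model A and r² = 0.60 for Model B, Model A explains a larger share of the variability in the response (85% vs. 60%), provided both models predict the same response variable.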
Check conditions (if inference is required):
Verify LINER conditions (especially linearity and equal variance).
Conclude in context:
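The interpret-and-compare steps above can be sketched in Python. Everything here is an illustration rather than part of the AP exam or this guide: the data are invented, and `r_squared` and `interpret` are hypothetical helper names. The sketch fits two competing explanatory variables to the same response and writes the interpretation sentence for each.

```python
def r_squared(x, y):
    """Return r^2 for the least-squares line predicting y from x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy ** 2 / (sxx * syy)


def interpret(r2, response, explanatory):
    """Build the in-context interpretation sentence for a given r^2."""
    return (f"About {r2 * 100:.0f}% of the variability in {response} is explained "
            f"by the linear relationship with {explanatory}.")


# Hypothetical data for five students (illustration only)
scores = [62, 70, 71, 80, 88]
hours_studied = [1, 2, 3, 4, 5]
hours_slept = [6, 8, 5, 7, 8]

# Same response variable, two candidate explanatory variables
r2_study = r_squared(hours_studied, scores)
r2_sleep = r_squared(hours_slept, scores)

print(interpret(r2_study, "test scores", "hours studied"))
print(interpret(r2_sleep, "test scores", "hours slept"))

better = "hours studied" if r2_study > r2_sleep else "hours slept"
print(f"The model using {better} explains more of the variability in test scores.")
```

A higher r² only says which line accounts for more variability; it does not by itself confirm that a linear model is appropriate.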
Mistake: Saying r² measures the strength of the relationship (like r). Correction: r² measures the proportion of variability explained by the model. Use r for strength/direction.
Mistake: Interpreting r² as a percentage of data points that fit the model. Correction: r² is about variability, not individual points. Say: "X% of the variability in y is explained by x."
Mistake: Ignoring units or context in interpretation. Correction: Always name the response and explanatory variables. Weak: "r² = 0.64." Better: "64% of the variability in house prices is explained by square footage."
Mistake: Assuming a high r² means causation. Correction: r² only measures association. Correlation ≠ causation (e.g., ice cream sales and drowning deaths are correlated, but neither causes the other).
Mistake: Forgetting to check LINER conditions before making inferences. Correction: Always verify conditions if the question involves hypothesis tests or confidence intervals for slope.
Connecting r² to residual plots (e.g., "Does a high r² guarantee a good fit?").
Tricky distinctions:
Residuals vs. r²: Even with a high r², residuals might show patterns (e.g., curvature), indicating a poor linear fit.
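As a quick check on that last point, the sketch below fits a line to data that are exactly quadratic (made-up values, chosen so the curvature is obvious). The r² comes out high, yet the residuals are positive at both ends and negative in the middle, which is the U-shaped pattern that signals a poor linear fit.

```python
# Made-up data that follow y = x^2 exactly, so the true relationship is curved
x = list(range(1, 11))
y = [xi ** 2 for xi in x]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

sxx = sum((xi - x_bar) ** 2 for xi in x)
syy = sum((yi - y_bar) ** 2 for yi in y)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Least-squares line y-hat = a + b*x
b = sxy / sxx
a = y_bar - b * x_bar

r_squared = sxy ** 2 / (sxx * syy)
residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]

print(f"r^2 = {r_squared:.3f}")
for xi, e in zip(x, residuals):
    # The sign pattern (+ ... - ... +) is what a residual plot would show as a U
    print(f"x = {xi:2d}  residual = {e:6.1f}")
```

On the exam, the same reasoning applies to a residual plot: look for a pattern before trusting a high r².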
Calculator pitfalls:
Misinterpreting the output: The TI-84 output lists both r² and r. Report the value labeled "r²," not the one labeled "r."
Common FRQ setups:
Answer: B. r² measures the proportion of variability in the response variable explained by the explanatory variable.
Answer: a) "81% of the variability in house prices is explained by the linear relationship with square footage." b) No. r² measures explained variability, not the accuracy of individual predictions. Some houses may still be over/underpriced.
Answer: B. A U-shaped pattern indicates a nonlinear relationship, so r² may overstate the linear fit.