By Fatskills Exam Guides Team — the exam nerds behind 28,500+ quizzes and 2.1M practice questions across 500+ global exams.
Correlation (r) measures the strength and direction of a linear relationship between two quantitative variables, and it always falls between -1 and 1. It's essential on the AP exam because it's the foundation for regression, residual analysis, and interpreting relationships in data. For example, a researcher might study whether study hours (x) and exam scores (y) are positively correlated, meaning that students who study more tend to have higher scores. However, correlation alone does not prove causation (e.g., ice cream sales and drowning deaths are correlated, but one doesn't cause the other; hot weather drives both).
Formula (for reference, not memorization): \[ r = \frac{1}{n-1} \sum_{i=1}^{n} \left( \frac{x_i - \bar{x}}{s_x} \right) \left( \frac{y_i - \bar{y}}{s_y} \right) \]
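To see what the formula is doing, here is a minimal Python sketch (nothing you need on the exam). It standardizes each x and y, multiplies the z-scores pairwise, and divides the sum by n - 1. The data values and the function name `correlation` are made up for illustration.

```python
from statistics import mean, stdev

def correlation(xs, ys):
    """Pearson's r as the sum of z-score products divided by n - 1."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)  # sample standard deviations (n - 1)
    z_products = [
        ((x - x_bar) / s_x) * ((y - y_bar) / s_y)
        for x, y in zip(xs, ys)
    ]
    return sum(z_products) / (n - 1)

# Hypothetical data: study hours and quiz scores for five students.
hours = [1, 2, 3, 4, 5]
scores = [2, 4, 5, 4, 5]

r = correlation(hours, scores)
print(round(r, 4))  # 0.7746
```

Points where x and y are both above (or both below) their means contribute positive products, which is why a positive association pushes r toward 1.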
Calculator command (TI-84): LinReg(a+bx) L1, L2, Y1 (stores the regression equation in Y1 and reports r and r² if DiagnosticOn has been run).
Enable diagnostics: 2nd → 0 (Catalog) → DiagnosticOn → ENTER.
Coefficient of determination (r²): The proportion of variation in y explained by the linear relationship with x.
Example: If r = 0.8, then r² = 0.64, so 64% of the variability in y is explained by the linear relationship with x.
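For a least-squares line, r² can also be written with sums of squares, which makes the "proportion of variation explained" reading explicit. The notation below (ŷ_i for the predicted value at x_i) is standard but is an addition to this guide, shown next to the arithmetic from the example:

\[ r^2 = 1 - \frac{\sum (y_i - \hat{y}_i)^2}{\sum (y_i - \bar{y})^2}, \qquad r = 0.8 \;\Rightarrow\; r^2 = 0.64, \qquad r = -0.8 \;\Rightarrow\; r^2 = 0.64 \]

Squaring throws away the sign, so r² tells you nothing about direction. To get r back from r², take the square root and attach the sign of the slope.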
Lurking variable: A hidden variable that influences both x and y, creating a false appearance of causation.
Example: Shoe size and reading ability in children are correlated, but age is the lurking variable.
Extrapolation: Using a regression line to predict y for x-values outside the range of the data. Dangerous! (AP loves testing this.)
Residual: Observed y – Predicted y (y – ŷ). A pattern in residuals suggests a nonlinear relationship.
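A short Python sketch of the residual idea, assuming made-up data that is deliberately curved (y = x²). The helper name `least_squares` is hypothetical. It fits the least-squares line and lists observed minus predicted for each point.

```python
from statistics import mean

def least_squares(xs, ys):
    """Return intercept a and slope b of the least-squares line y-hat = a + b*x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b

# Hypothetical curved data: y is exactly x squared.
xs = [1, 2, 3, 4, 5]
ys = [1, 4, 9, 16, 25]

a, b = least_squares(xs, ys)
# Residual = observed y minus predicted y.
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
print(residuals)  # [2.0, -1.0, -2.0, -1.0, 2.0]
```

The residuals go positive, negative, positive in a U shape. That curved pattern is the signal that a straight line is the wrong model, even though they still sum to zero.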
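Outlier in regression: A point that falls outside the overall pattern, typically one with a large residual. A point whose x-value is far from the mean of x has high leverage. Either kind can strongly influence r and the regression line.
Influential point: A point that, if removed, dramatically changes the regression line or r. Outliers and high-leverage points are often influential.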
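To see how much one point can matter, here is a small Python sketch with invented data: five points with a positive trend plus one high-leverage point far to the right. The function name `correlation` and all values are assumptions for illustration.

```python
from math import sqrt

def correlation(xs, ys):
    """Pearson's r computed from sums of squared and cross deviations."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical data; the last point (20, 1) sits far from the other x-values.
xs = [1, 2, 3, 4, 5, 20]
ys = [2, 4, 5, 4, 5, 1]

r_with = correlation(xs, ys)                 # all six points
r_without = correlation(xs[:-1], ys[:-1])    # drop the high-leverage point

print(round(r_without, 2))  # about 0.77
print(round(r_with, 2))     # about -0.63
```

Removing a single point flips the correlation from moderately negative to fairly strong positive, which is exactly what "influential" means.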
Correlation vs. causation: Correlation does not imply causation! Just because two variables are correlated doesn’t mean one causes the other.
Conditions for correlation and regression (LINER): Linear, Independent, Normal, Equal variance, Random.
2nd → Y= (Stat Plot) → Plot1 → On → Type: Scatter → Xlist: L1, Ylist: L2 → Zoom → 9 (ZoomStat)
Describe the direction (positive/negative), form (linear/nonlinear), and strength (weak/moderate/strong).
Calculate r and r².
Stat → CALC → 8: LinReg(a+bx) → Enter L1, L2 → Calculate
Interpret r²: "About (100 × r²)% of the variation in y is explained by the linear relationship with x."
Check LINER conditions.
Stat → EDIT → L3 = RESID → 2nd → Y= → Plot2 → Histogram → ZoomStat
Stat → EDIT → L3 = RESID → 2nd → Y= → Plot1 → Scatter → Ylist: L3 → ZoomStat
Random: Were the data collected through random sampling or a randomized experiment?
Interpret the slope (if regression is involved).
Example: If the regression equation is ŷ = 2.5 + 0.8x (where x = study hours, y = exam score), the slope (0.8) means "For each additional hour studied, the exam score is predicted to increase by 0.8 points, on average."
Discuss limitations.
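To make the slope interpretation from the steps above concrete, here is a tiny Python sketch using the example equation with intercept 2.5 and slope 0.8. The function name `predict_score` is hypothetical. It shows that one extra hour always moves the prediction by exactly the slope.

```python
def predict_score(hours):
    """Predicted exam score from the example line: y-hat = 2.5 + 0.8 * hours."""
    return 2.5 + 0.8 * hours

# Compare predictions one study hour apart.
change = predict_score(6) - predict_score(5)
print(round(change, 2))  # 0.8
```

The same 0.8 shows up for any pair of x-values one hour apart. Keep in mind that this is a predicted average change, not a guarantee for any individual student, and it is only trustworthy for x-values inside the range of the data.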
A study finds that the correlation between daily screen time (hours) and sleep duration (hours) is r = -0.45. Which of the following is the best interpretation of this value?
(A) Increasing screen time causes a decrease in sleep duration.
(B) There is a moderate negative linear relationship between screen time and sleep duration.
(C) 45% of the variation in sleep duration is explained by screen time.
(D) For each additional hour of screen time, sleep duration decreases by 0.45 hours.
Answer: (B). Explanation: r measures the strength and direction of a linear relationship, not causation, so (A) is wrong. r² = (-0.45)² = 0.2025, so about 20% (not 45%) of the variation is explained, and (C) is wrong. (D) describes a slope, not r.
A researcher collects data on the number of hours students spend studying for an exam (x) and their exam scores (y). The regression output is shown below:
r = 0.82
(a) Interpret the value of r in context.
(b) Interpret the slope of the regression line in context.
(c) The researcher claims that studying more causes higher exam scores. Is this claim justified? Why or why not?
Answers:
(a) There is a strong positive linear relationship between study hours and exam scores.
(b) For each additional hour studied, the exam score is predicted to increase by 4.8 points, on average.
(c) No, the claim is not justified. Correlation does not imply causation. There may be lurking variables (e.g., prior knowledge, sleep, IQ) that explain the relationship.