Fatskills
Study Guide: Intro to Business Statistics: Correlation and Regression - Correlation Coefficient, Pearson r Calculation Interpretation Hypothesis Test Coefficient of Determination R²
Source: https://www.fatskills.com/business-analytics/chapter/intro-to-business-statistics-busstats-correlation-and-regression-correlation-coefficient-pearson-r-calculation-interpretation-hypothesis-test-coefficient-of-determination-r%C2%B2

By Fatskills Exam Guides Team — the exam nerds behind 28,500+ quizzes and 2.1M practice questions across 500+ global exams.

What This Is

The correlation coefficient, also known as Pearson's r, measures the strength and direction of a linear relationship between two continuous variables. For example, a retail chain wants to know if there's a relationship between the average temperature in a region and the average daily sales of its stores. By calculating the correlation coefficient, the chain can determine whether warmer temperatures tend to go with higher sales, although correlation alone does not show that temperature causes the change.

Key Formulas & Symbols

  • Pearson's r = Σ[(xi - x̄)(yi - ȳ)] / (√[Σ(xi - x̄)²] * √[Σ(yi - ȳ)²]) where xi = individual x value, x̄ = sample mean of x, yi = individual y value, ȳ = sample mean of y.
  • r² = 1 - [Σ(yi - ŷi)² / Σ(yi - ȳ)²] where r² is the coefficient of determination and ŷi is the value predicted for yi by the least-squares line. With a single predictor, this equals the square of Pearson's r.
  • t = r * √[(n - 2) / (1 - r²)] where t is the test statistic, r is the correlation coefficient, n is the sample size.
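  • p-value = 2 * P(T > |t|) where T follows a t-distribution with n - 2 degrees of freedom and t is the observed test statistic. The p-value is the probability of observing a test statistic at least this extreme if the null hypothesis is true.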
  • Null Hypothesis (H₀): ρ = 0 where ρ is the population correlation coefficient.
  • Alternative Hypothesis (H₁): ρ ≠ 0 where ρ is the population correlation coefficient.
  • Critical Value (t): depends on the degrees of freedom (n - 2), where n is the sample size, and on the significance level α.
  • Coefficient of Determination (r²): measures the proportion of the variance in one variable that is predictable from the other variable.
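
The formulas for r and r² can be sketched in a few lines of Python. This is a minimal illustration using only the standard library. The function names and the data values are hypothetical and not part of this guide. It computes r from the deviation-from-mean formula, then computes r² two ways (squaring r, and 1 - SSE/SST from a least-squares line) to show that they agree.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r from the deviation-from-mean formula."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("xs and ys must have the same length (at least 2)")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    # Undefined (division by zero) if either variable is constant.
    return s_xy / (math.sqrt(s_xx) * math.sqrt(s_yy))

def r_squared_from_fit(xs, ys):
    """r² as 1 - SSE/SST for the least-squares line of y on x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    sst = sum((y - y_bar) ** 2 for y in ys)
    return 1 - sse / sst

# Hypothetical illustration data (not from the guide).
temps = [1, 2, 3, 4, 5]
sales = [2, 4, 5, 4, 5]

r = pearson_r(temps, sales)
print(round(r, 4))                                  # 0.7746
print(round(r ** 2, 4))                             # 0.6
print(round(r_squared_from_fit(temps, sales), 4))   # 0.6
```

Here about 60% of the variance in the hypothetical sales values is predictable from temperature.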

Step-by-Step Procedure

  1. State hypotheses: H₀: ρ = 0 (no correlation) vs. H₁: ρ ≠ 0 (correlation exists).
  2. Choose test: Use the t-test for correlation coefficient.
  3. Compute test statistic: t = r * √[(n - 2) / (1 - r²)].
  4. Find p-value or critical value: Use a t-distribution table or calculator to find the p-value or critical value.
  5. Compare to α: Compare the p-value to α (commonly 0.05) or compare |t| to the critical value.
  6. Conclude: If p-value < α or |t| > critical value, reject H₀ and conclude that a statistically significant correlation exists. Otherwise, fail to reject H₀.
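
The procedure above can be sketched in Python as follows. This is a rough standard-library illustration, not a definitive implementation. The function names, the numerical integration used for the p-value (Simpson's rule on the t density), and the example values r = 0.8 and n = 12 are all assumptions made for this sketch.

```python
import math

def t_pdf(u, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log(1 + u * u / df))

def two_tailed_p(t_stat, df, steps=2000):
    """Two-tailed p-value, 2 * P(T > |t|), via Simpson's rule (steps must be even)."""
    upper = abs(t_stat)
    if upper == 0:
        return 1.0
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    half_central_area = total * h / 3      # P(0 < T < |t|)
    # Extremely small p-values are only resolved to roughly 1e-12 here.
    return max(0.0, 1.0 - 2.0 * half_central_area)

def correlation_t_test(r, n, alpha=0.05):
    """Steps 3 to 6: test statistic, p-value, and the reject/fail-to-reject decision."""
    if n <= 2 or not -1 < r < 1:
        raise ValueError("need n > 2 and -1 < r < 1")
    df = n - 2
    t_stat = r * math.sqrt(df / (1 - r ** 2))
    p_value = two_tailed_p(t_stat, df)
    return t_stat, p_value, p_value < alpha

# Hypothetical example: r = 0.8 observed in a sample of n = 12.
t_stat, p_value, reject = correlation_t_test(0.8, 12)
print(round(t_stat, 2))   # 4.22
print(p_value)            # about 0.002
print(reject)             # True
```

In practice a statistics library would usually handle this step. For instance, `scipy.stats.pearsonr` returns both r and a two-sided p-value directly from the raw data.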

Common Mistakes

  • Mistake: Misinterpreting the p-value as the probability that H₀ is true.
  • Correction: The p-value is the probability of observing the test statistic (or more extreme) if H₀ is true. It does not directly measure the probability of H₀ being true.
  • Mistake: Using the Z-test when the population standard deviation is unknown.
  • Correction: Use the t-test when the population standard deviation is unknown.
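  • Mistake: Failing to check the assumptions behind the t-test for r (a linear relationship, independent observations, approximately normal variables, and roughly constant spread around the line).
  • Correction: Check the assumptions, for example with a scatter plot, before conducting the t-test.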
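
One reason checking matters is that Pearson's r only captures linear relationships. The short sketch below uses hypothetical data in which y is completely determined by x (y = x²), yet r comes out as zero because the relationship is not linear. The helper function name is an assumption made for this example.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r from the deviation-from-mean formula."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / (math.sqrt(s_xx) * math.sqrt(s_yy))

# Hypothetical data: a perfect but curved (quadratic) relationship.
xs = [-2, -1, 0, 1, 2]
ys = [x ** 2 for x in xs]   # [4, 1, 0, 1, 4]

r = pearson_r(xs, ys)
print(r)   # 0.0
```

A scatter plot of these points would show the U-shaped pattern immediately, which is why plotting the data before testing is a useful habit.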

Quick Practice Problems

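  1. A marketing firm wants to know if there's a relationship between the number of hours spent watching TV and the number of hours spent browsing the internet. The correlation coefficient is 0.8. What is the p-value?

The p-value depends on the sample size, which the problem does not state. The t ≈ 4.2 used in the solution is consistent with n = 12, since t = 0.8 * √[(12 - 2) / (1 - 0.8²)] ≈ 4.22 with 10 degrees of freedom:

p-value = 2 * P(T > |t|) = 2 * P(T > 4.22) ≈ 0.002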

  2. A quality control team wants to know if there's a relationship between the temperature of a manufacturing process and the quality of the output. The correlation coefficient is 0.5. What is the coefficient of determination?

r² = (0.5)² = 0.25, so about 25% of the variance in one variable is predictable from the other.

  3. A sales manager wants to know if there's a relationship between the average price of a product and the average sales volume. The correlation coefficient is 0.9 for a sample of n = 100. What is the test statistic?

t = r * √[(n - 2) / (1 - r²)] = 0.9 * √[(100 - 2) / (1 - 0.9²)] = 0.9 * √(98 / 0.19) ≈ 20.4

Last-Minute Cram Sheet

  1. Pearson's r measures the strength and direction of a linear relationship.
  2. r² measures the proportion of the variance in one variable that is predictable from the other variable.
  3. The t-test with n - 2 degrees of freedom is used to test whether the population correlation coefficient is zero.
  4. p-value is the probability of observing the test statistic (or more extreme) if H₀ is true.
  5. H₀: ρ = 0 (no correlation) vs. H₁: ρ ≠ 0 (correlation exists).
  6. Critical value depends on the degrees of freedom (n - 2).
  7. Assumptions of the t-test for r: linearity, independence, approximate normality, and roughly constant variance.
  8. p-value is NOT the probability that H₀ is true; it's the probability of observing the data (or more extreme) if H₀ is true.
  9. Use the t-test when the population standard deviation is unknown.
  10. Check the assumptions before conducting the t-test.