Study Guide: Intro to Marketing Research: Correlation and Regression - Assumptions of Linear Regression, Linearity Independence Homoscedasticity Normality of Residuals
Source: https://www.fatskills.com/marketing-management/chapter/marketing-research-mktresearch-correlation-and-regression-assumptions-of-linear-regression-linearity-independence-homoscedasticity-normality-of-residuals

By Fatskills Exam Guides Team — the exam nerds behind 28,500+ quizzes and 2.1M practice questions across 500+ global exams.

What It Is

Assumptions of Linear Regression are the conditions under which a linear regression model gives trustworthy estimates of the relationship between a dependent variable and one or more independent variables. A classic example is Galton's (1886) study of the heights of parents and their children, which gave regression its name. Galton found that the children of unusually tall or unusually short parents tended to be closer to the average height than their parents were, a pattern now known as regression toward the mean. This matters for marketing decision-making because predictions and recommendations drawn from a regression are only as reliable as the assumptions behind it, so those assumptions should be checked before the model is used.

Key Terms & Concepts

  • Linearity: The relationship between the independent variable(s) and the dependent variable is linear, meaning that the expected value of the dependent variable changes by a constant amount for each one-unit change in an independent variable.
    • Example: A study on the relationship between the price of a product and its demand, where each $1 increase in price is associated with the same drop in units demanded, whatever the starting price.
  • Independence: Each observation is independent of the others, meaning that the value of one observation does not affect the value of another.
    • Example: A survey of customers where each respondent is randomly selected and does not influence the responses of other respondents.
  • Homoscedasticity: The variance of the residuals is constant across all levels of the independent variable(s).
    • Example: A study on the relationship between the amount of money spent on advertising and sales, where the variance of the residuals is the same for all levels of advertising spend.
  • Normality of Residuals: The residuals are normally distributed, meaning that they follow a bell-shaped curve.
    • Example: A study on the relationship between the number of hours worked and employee satisfaction, where the residuals are normally distributed.
  • Regression Equation: Y = β0 + β1X + ε, where Y is the dependent variable, X is the independent variable, β0 is the intercept, β1 is the slope, and ε is the error term.
    • Formula: Y = β0 + β1X + ε
  • Coefficient of Determination (R-squared): Measures the proportion of the variance in the dependent variable that is explained by the independent variable(s).
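    • Formula: R-squared = 1 - (Σ(e^2) / Σ((Y - Ȳ)^2)), where e = Y - Ŷ is the residual (observed minus predicted value) and Ȳ is the mean of Y
  • Standard Error of the Estimate: Measures the typical distance between the observed values and the regression line.
    • Formula: Standard Error of the Estimate = √(Σ(e^2) / (n - 2)), where e = Y - Ŷ is the residual and n is the number of observations (simple regression with one independent variable)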
  • Heteroscedasticity: The variance of the residuals is not constant across all levels of the independent variable(s).
    • Example: A study on the relationship between the amount of money spent on advertising and sales, where the variance of the residuals increases as the level of advertising spend increases.
  • Multicollinearity: The independent variables are highly correlated with each other.
    • Example: A study on the relationship between the price of a product and its demand, where product quality is also used as a predictor and is highly correlated with price.
  • Autocorrelation: The residuals are correlated with each other, typically across successive observations in time-ordered data.
    • Example: A study of weekly advertising spend and weekly sales, where the residual for one week is correlated with the residual for the previous week.
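
The regression equation, R-squared, and the standard error of the estimate defined above can be computed by hand for a simple regression. The following is a minimal sketch in plain Python, not a replacement for a statistics package; the function name fit_simple_ols and the advertising and sales figures are hypothetical and chosen only for illustration.

```python
from math import sqrt


def fit_simple_ols(x, y):
    """Fit Y = b0 + b1*X by least squares.

    Returns (b0, b1, r_squared, see), where see is the standard
    error of the estimate for a one-predictor model.
    """
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n

    # Slope = co-variation of X and Y divided by variation of X.
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_mean - b1 * x_mean

    # Residuals: observed value minus predicted value.
    residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    sse = sum(e ** 2 for e in residuals)        # unexplained variation
    sst = sum((yi - y_mean) ** 2 for yi in y)   # total variation around the mean

    r_squared = 1 - sse / sst
    see = sqrt(sse / (n - 2))
    return b0, b1, r_squared, see


# Hypothetical data: advertising spend and sales (arbitrary units).
ad_spend = [1, 2, 3, 4, 5]
sales = [2, 4, 5, 4, 5]

b0, b1, r_squared, see = fit_simple_ols(ad_spend, sales)
print(f"intercept={b0:.2f} slope={b1:.2f} R-squared={r_squared:.2f} SEE={see:.3f}")
```

For this made-up data set the fitted line is Y = 2.2 + 0.6X, so about 60% of the variation in sales is explained by advertising spend.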

Common Misunderstandings

  • Misunderstanding: Homoscedasticity means that every residual has the same value at all levels of the independent variable(s).
  • Correction: Homoscedasticity means that the variance (spread) of the residuals is constant across all levels of the independent variable(s); the individual residuals still differ from one observation to the next.
  • Misunderstanding: Non-normal residuals have no consequences for a linear regression model.
  • Correction: Non-normal residuals do not by themselves bias the coefficient estimates, but they can make significance tests and confidence intervals unreliable, especially in small samples.
  • Misunderstanding: Multicollinearity is not a problem in linear regression.
  • Correction: Multicollinearity can lead to unstable estimates and incorrect conclusions in linear regression.
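
One way to see how strongly two predictors overlap is to compute their correlation and the variance inflation factor (VIF). The sketch below assumes a model with exactly two predictors, in which case the VIF reduces to 1 / (1 - r^2); the function names and the price and quality figures are hypothetical. Commonly cited rules of thumb treat VIF values above roughly 5 or 10 as a warning sign, but these cut-offs are conventions, not strict limits.

```python
from math import sqrt


def pearson_correlation(a, b):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(a)
    a_mean = sum(a) / n
    b_mean = sum(b) / n
    sab = sum((ai - a_mean) * (bi - b_mean) for ai, bi in zip(a, b))
    saa = sum((ai - a_mean) ** 2 for ai in a)
    sbb = sum((bi - b_mean) ** 2 for bi in b)
    return sab / sqrt(saa * sbb)


def vif_two_predictors(a, b):
    """VIF for a model with exactly two predictors: 1 / (1 - r^2)."""
    r = pearson_correlation(a, b)
    return 1 / (1 - r ** 2)


# Hypothetical predictors: product price and a quality score.
price = [10, 12, 14, 16, 18]
quality = [3, 4, 4, 5, 6]

r = pearson_correlation(price, quality)
vif = vif_two_predictors(price, quality)
print(f"correlation={r:.3f} VIF={vif:.2f}")
```

Here the two predictors are almost perfectly correlated, so the model would struggle to separate the effect of price from the effect of quality.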

Quick Application / Identification

Scenario: A marketing manager wants to predict the sales of a new product based on its price and advertising spend. The data shows a linear relationship between the price and sales, but the variance of the residuals increases as the price increases. What assumption of linear regression is violated?

Answer: Homoscedasticity is violated, as the variance of the residuals is not constant across all levels of the price.

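Explanation: Because the spread of the residuals grows with price, the usual standard errors and significance tests cannot be trusted. The marketing manager should use an approach that accounts for heteroscedasticity, such as transforming the dependent variable (for example, taking the log of sales), using weighted least squares, or reporting heteroscedasticity-robust standard errors.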
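
The pattern in this scenario can be spotted with a quick numerical check: sort the residuals by price, split them into a lower and an upper half, and compare the average squared residual in each half. This is an informal sketch, not a formal statistical test; the function name residual_variance_ratio and the residual values are hypothetical.

```python
def residual_variance_ratio(x, residuals):
    """Ratio of mean squared residuals: upper half of x over lower half of x.

    A ratio far above 1 suggests the residual spread grows with x;
    a ratio near 1 is consistent with constant variance.
    """
    # Order the residuals by the value of the independent variable.
    ordered = [e for _, e in sorted(zip(x, residuals), key=lambda pair: pair[0])]
    half = len(ordered) // 2
    lower, upper = ordered[:half], ordered[-half:]

    lower_ms = sum(e ** 2 for e in lower) / len(lower)
    upper_ms = sum(e ** 2 for e in upper) / len(upper)
    return upper_ms / lower_ms


# Hypothetical prices and residuals from an already fitted model;
# the residuals fan out as price rises.
prices = [1, 2, 3, 4, 5, 6, 7, 8]
residuals = [0.5, -0.5, 0.6, -0.6, 2.0, -2.0, 3.0, -3.0]

ratio = residual_variance_ratio(prices, residuals)
print(f"variance ratio (high price / low price) = {ratio:.1f}")
```

A ratio this far above 1 matches the scenario: the residuals are much more spread out at high prices than at low prices.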

Last-Minute Revision

  • Linearity: Y = β0 + β1X + ε
  • Independence: Each observation is independent of the others
  • Homoscedasticity: Var(ε) = σ^2
  • Normality of Residuals: ε ~ N(0, σ^2)
  • Regression Equation: Y = β0 + β1X + ε
  • Coefficient of Determination (R-squared): R-squared = 1 - (Σ(e^2) / Σ((Y - Ȳ)^2)), where e = Y - Ŷ
  • Standard Error of the Estimate: Standard Error of the Estimate = √(Σ(e^2) / (n - 2))
  • Heteroscedasticity: Var(ε) ≠ σ^2 (the variance changes with X)
  • Multicollinearity: |Cor(X1, X2)| ≈ 1
  • Autocorrelation: Cor(ε_i, ε_{i-1}) ≠ 0
  • Homoscedasticity means constant variance of the residuals, not identical residuals.
  • Multicollinearity can lead to unstable estimates.
  • Autocorrelation can lead to unreliable standard errors and significance tests.
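
The autocorrelation condition in the list above can be checked with the Durbin-Watson statistic, which compares each residual with the one before it. Values near 2 suggest little first-order autocorrelation, values toward 0 suggest positive autocorrelation, and values toward 4 suggest negative autocorrelation. The sketch below is a minimal plain-Python version; the residual values are hypothetical and assumed to be in time order.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: sum of squared successive differences
    of the residuals divided by the sum of squared residuals."""
    diff_sq = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    total_sq = sum(e ** 2 for e in residuals)
    return diff_sq / total_sq


# Hypothetical weekly residuals that drift slowly from positive to negative,
# so each residual resembles the previous one (positive autocorrelation).
weekly_residuals = [1.0, 0.8, 0.6, 0.4, -0.2, -0.6, -0.8, -1.0]

dw = durbin_watson(weekly_residuals)
print(f"Durbin-Watson = {dw:.2f}")
```

A value this close to 0 indicates that neighbouring residuals move together, which is the situation described by Cor(ε_i, ε_{i-1}) ≠ 0.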