Fatskills
Practice. Master. Repeat.
Study Guide: Intro to Marketing Research: Correlation and Regression: Multiple Linear Regression Model (Y = β₀ + β₁X₁ + … + β_kX_k + ε), Coefficients, Adjusted R-squared, Multicollinearity, VIF
Source: https://www.fatskills.com/marketing-management/chapter/marketing-research-mktresearch-correlation-and-regression-multiple-linear-regression-model-y-%CE%B2%E2%82%80-%CE%B2%E2%82%81x%E2%82%81-%CE%B2-kx-k-%CE%B5-coefficients-adjusted-rsquared-multicollinearity-vif

By Fatskills Exam Guides Team — the exam nerds behind 28,500+ quizzes and 2.1M practice questions across 500+ global exams.

What It Is

Multiple Linear Regression (MLR) is a statistical method that models the relationship between one dependent variable (Y) and two or more independent variables (X₁, …, X_k). It estimates a coefficient for each independent variable, which lets marketers gauge the effect of each variable on the dependent variable while holding the others constant. A classic example of MLR in marketing is the study by Hauser and Wernerfelt (1989) on the relationship between advertising and sales. They used MLR to analyze the effect of advertising on sales for a sample of 100 brands in the US. Analyses of this kind matter for marketing decision-making because they help marketers judge how the advertising budget and media mix relate to sales outcomes.

Key Terms & Concepts

  • Multiple Linear Regression (MLR): A statistical method that models the relationship between one dependent variable (Y) and two or more independent variables (X₁, …, X_k). With a single independent variable it reduces to simple linear regression.
    • Formula: Y = β₀ + β₁X₁ + … + β_kX_k + ε
    • β₀: Intercept or constant term
    • β₁, …, β_k: Coefficients of each independent variable
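    • ε: Error term, the part of Y the model does not explain; the residuals of a fitted model are its estimates
  • Coefficients: The estimated weights (β) on the independent variables in the MLR equation.
    • Example: In the Hauser and Wernerfelt study, the coefficient for advertising was 0.5. If sales and advertising are both measured in logarithms, a 1% increase in advertising is associated with roughly a 0.5% increase in sales; with unlogged variables, a one-unit increase in advertising is instead associated with a 0.5-unit increase in sales.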
  • Adjusted R-squared: A measure of the goodness of fit of the MLR model, adjusted for the number of independent variables.
    • Formula: Adjusted R² = 1 - (n-1)/(n-k-1) * (1-R²)
    • n: Sample size
    • k: Number of independent variables
    • R²: Coefficient of determination
  • Multicollinearity: A condition where two or more independent variables are highly correlated, leading to unstable estimates of the coefficients.
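    • Example: If two independent variables, X₁ and X₂, are highly correlated (e.g., |r| > 0.9), their coefficient estimates may have large standard errors and change sharply when observations or variables are added or removed.
  • Variance Inflation Factor (VIF): A measure of multicollinearity for one independent variable X_j: the ratio of the variance of its estimated coefficient to the variance that coefficient would have if X_j were uncorrelated with the other independent variables.
    • Formula: VIF_j = 1 / (1 - R_j²)
    • R_j²: Coefficient of determination from regressing X_j on all the other independent variables
    • Rule of thumb: A VIF above about 5 to 10 is commonly treated as a sign of problematic multicollinearity.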
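  • Assumptions of MLR: The relationship is linear in the coefficients; the errors are independent, homoscedastic (constant variance), and approximately normally distributed; and no independent variable is an exact linear combination of the others (no perfect multicollinearity).
    • Example: If the errors are not normally distributed, the coefficient estimates remain unbiased, but t-tests and confidence intervals can be unreliable in small samples.
  • Heteroscedasticity: A condition where the variance of the error term is not constant across all levels of the independent variables.
    • Example: If the variance of the error term increases with the level of an independent variable, the coefficient estimates remain unbiased, but their standard errors, and therefore the significance tests, are unreliable.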
  • Outliers: Data points that are significantly different from the rest of the data.
    • Example: If an outlier is present in the data, it may affect the estimates of the coefficients and the goodness of fit of the model.
  • Residuals: The difference between the observed and predicted values of the dependent variable.
    • Example: Residuals can be used to detect outliers and assess the goodness of fit of the model.
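  • Cook's Distance: A measure of how much the fitted values change when a single data point is deleted; it combines the size of the point's residual with its leverage.
    • Formula: D_i = (e_i² / ((k+1) * s²)) * (h_ii / (1 - h_ii)²)
    • e_i: Residual of observation i
    • k: Number of independent variables
    • s²: Mean squared error, Σe_i² / (n-k-1), where n is the sample size
    • h_ii: Leverage of observation i
  • Leverage: A measure of how far an observation's independent-variable values lie from those of the other observations, and so of its potential to influence the coefficient estimates.
    • Formula: h_ii = i-th diagonal element of the hat matrix H = X(XᵀX)⁻¹Xᵀ
    • X: n × (k+1) matrix of the independent variables, including a column of ones for the intercept
    • Average leverage: (k+1)/n, where n is the sample size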
  • Partial Regression Coefficient: The change in the dependent variable for a one-unit change in the independent variable, holding all other independent variables constant.
    • Example: In the Hauser and Wernerfelt study, the partial regression coefficient for advertising was 0.5. With sales and advertising both measured in logarithms, a 1% increase in advertising is associated with roughly a 0.5% increase in sales, holding all other independent variables constant.
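Leverage and Cook's distance are normally computed from the hat matrix of the fitted model rather than from n and k alone. The following is a minimal NumPy sketch, assuming ordinary least squares with an intercept; the function name `leverage_and_cooks` and the data are hypothetical, chosen so that the last observation sits far from the others on X.

```python
import numpy as np

def leverage_and_cooks(X, y):
    """Return (leverage, Cook's distance) for each observation.

    X holds the independent variables only; the intercept column is added here.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    if X.ndim == 1:
        X = X[:, None]
    n = X.shape[0]
    A = np.column_stack([np.ones(n), X])   # design matrix with intercept
    p = A.shape[1]                          # k + 1 estimated coefficients
    H = A @ np.linalg.pinv(A)               # hat matrix: fitted = H @ y
    h = np.diag(H)                          # leverage h_ii
    resid = y - H @ y                       # observed minus predicted
    mse = resid @ resid / (n - p)           # s^2
    cooks = resid**2 / (p * mse) * h / (1.0 - h) ** 2
    return h, cooks

# Hypothetical data: seven points near y = 2x + 1, plus one distant point.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 20.0])
y = np.array([3.1, 4.9, 7.2, 8.8, 11.1, 12.9, 15.0, 30.0])

h, cooks = leverage_and_cooks(x, y)
print("leverage:", np.round(h, 3))
print("Cook's D:", np.round(cooks, 3))
```

In this sketch the distant point has both the highest leverage and the largest Cook's distance, even though its raw residual is not the largest: a high-leverage point pulls the fitted line toward itself.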

Common Misunderstandings

  • Misunderstanding: MLR is only a tool for exploratory data analysis.
  • Correction: MLR is used mainly for confirmatory analysis, that is, to test hypotheses and estimate the relationships between variables, although it can also support exploration.
  • Misunderstanding: The coefficients in MLR are always significant.
  • Correction: A coefficient in MLR is statistically significant only if its p-value is less than the chosen significance level (e.g., 0.05).
  • Misunderstanding: Multicollinearity is not a problem in MLR.
  • Correction: Multicollinearity can be a problem in MLR, leading to unstable estimates of the coefficients.

Quick Application / Identification

Scenario: A marketing manager wants to estimate the effect of advertising and sales promotions on sales for a new product. The data includes the following variables: advertising expenditure, sales promotions expenditure, and sales. Which statistical method should the manager use to analyze the data?

Answer: Multiple Linear Regression (MLR) is the appropriate method to analyze the data.

Explanation: MLR models the relationship between one dependent variable (sales) and two or more independent variables (here, advertising expenditure and sales promotions expenditure).
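As an illustration of this scenario, here is a minimal NumPy sketch that fits sales on advertising and promotions by ordinary least squares. The figures are hypothetical, and sales are generated from a known rule with no noise, so the fit should recover that rule; real data would not fit this cleanly.

```python
import numpy as np

# Hypothetical data: 8 periods of spend (in $1,000s).
advertising = np.array([10.0, 12.0, 15.0, 11.0, 18.0, 20.0, 14.0, 16.0])
promotions = np.array([5.0, 4.0, 6.0, 7.0, 5.0, 8.0, 6.0, 4.0])
# Sales follow a known rule (no noise) so the estimates can be checked by eye.
sales = 50.0 + 3.0 * advertising + 2.0 * promotions

# Design matrix: a column of ones for the intercept, then one column per X.
X = np.column_stack([np.ones_like(advertising), advertising, promotions])

# Ordinary least squares estimates of (β₀, β₁, β₂).
beta, *_ = np.linalg.lstsq(X, sales, rcond=None)

predicted = X @ beta
residuals = sales - predicted
r_squared = 1.0 - np.sum(residuals**2) / np.sum((sales - sales.mean()) ** 2)

print("intercept, advertising, promotions:", np.round(beta, 3))
print("R-squared:", round(float(r_squared), 3))
```

Each slope is read as the expected change in sales for a one-unit ($1,000) change in that variable, with the other variable held constant.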

Scenario: A marketing manager wants to identify the most important independent variable in an MLR model. Which measure should the manager use to identify the most important variable?

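Answer: The manager should compare the standardized partial regression coefficients (beta weights), because unstandardized coefficients depend on each variable's units.

Explanation: A partial regression coefficient measures the change in the dependent variable for a one-unit change in the independent variable, holding all other independent variables constant. Standardizing puts every variable on a standard-deviation scale, so the coefficients can be compared directly; under strong multicollinearity even this ranking is unreliable.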
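Raw coefficients depend on the units of each variable, so comparing them across variables can mislead. A minimal sketch of one common remedy, standardized coefficients, follows; the helper name `standardized_coefficients` and the data are hypothetical, with advertising deliberately recorded in dollars and promotions in thousands of dollars.

```python
import numpy as np

def standardized_coefficients(X, y):
    """Fit OLS on z-scored X and y; return one beta weight per column of X."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    Xz = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    yz = (y - y.mean()) / y.std(ddof=1)
    # No intercept column is needed: z-scored variables all have mean zero.
    betas, *_ = np.linalg.lstsq(Xz, yz, rcond=None)
    return betas

# Hypothetical data on mismatched scales.
advertising_usd = np.array([10000.0, 12000.0, 15000.0, 11000.0,
                            18000.0, 20000.0, 14000.0, 16000.0])
promotions_k = np.array([5.0, 4.0, 6.0, 7.0, 5.0, 8.0, 6.0, 4.0])
# Raw slopes are 0.003 (advertising) and 2.0 (promotions) by construction.
sales = 50.0 + 0.003 * advertising_usd + 2.0 * promotions_k

betas = standardized_coefficients(
    np.column_stack([advertising_usd, promotions_k]), sales
)
print("standardized (advertising, promotions):", np.round(betas, 3))
```

Here the raw slope for promotions (2.0) dwarfs the one for advertising (0.003), yet the standardized weight for advertising comes out larger, because advertising varies far more in its own units.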

Scenario: A marketing manager wants to check for multicollinearity in an MLR model. Which measure should the manager use to check for multicollinearity?

Answer: The manager should use the Variance Inflation Factor (VIF) to check for multicollinearity.

Explanation: The VIF measures the ratio of the variance of the regression coefficient to the variance of the coefficient if the independent variables were uncorrelated.
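In practice, a VIF is computed for each independent variable X_j by regressing it on the other independent variables and plugging the resulting R_j² into 1 / (1 - R_j²). A minimal NumPy sketch follows; the function name `vif` and the data are hypothetical, with x2 built to track x1 closely and x3 largely unrelated to both.

```python
import numpy as np

def vif(X):
    """Return one VIF per column of X (columns are the independent variables)."""
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    out = np.empty(k)
    for j in range(k):
        target = X[:, j]
        others = np.delete(X, j, axis=1)
        A = np.column_stack([np.ones(n), others])   # auxiliary regression
        coef, *_ = np.linalg.lstsq(A, target, rcond=None)
        resid = target - A @ coef
        r2_j = 1.0 - resid @ resid / np.sum((target - target.mean()) ** 2)
        out[j] = 1.0 / (1.0 - r2_j)
    return out

# Hypothetical predictors: x2 is x1 plus a small wobble; x3 is mostly unrelated.
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
x2 = x1 + np.array([0.1, -0.1, 0.2, -0.2, 0.1, -0.1, 0.2, -0.2])
x3 = np.array([3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0])

vifs = vif(np.column_stack([x1, x2, x3]))
print("VIF (x1, x2, x3):", np.round(vifs, 1))
```

The two near-duplicate variables should show very large VIFs while x3 stays close to 1. Note that if one column were an exact linear combination of the others, R_j² would equal 1 and the division would fail, which is the perfect-multicollinearity case.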

Last-Minute Revision

  • ⚠️ The assumptions of MLR include linearity, independent errors, normality of the errors, homoscedasticity, and no perfect multicollinearity.
  • The formula for the adjusted R-squared is: Adjusted R² = 1 - (n-1)/(n-k-1) * (1-R²)
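  • ⚠️ A coefficient in MLR is statistically significant only if its p-value is less than the chosen significance level (e.g., 0.05).
  • The Variance Inflation Factor (VIF) for independent variable X_j is calculated as: VIF_j = 1 / (1 - R_j²), where R_j² comes from regressing X_j on the other independent variables.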
  • ⚠️ Multicollinearity can be a problem in MLR, leading to unstable estimates of the coefficients.
  • The partial regression coefficient measures the change in the dependent variable for a one-unit change in the independent variable, holding all other independent variables constant.
  • Cook's Distance is a measure of the influence of each data point on the estimates of the coefficients.
  • Leverage is a measure of how unusual a data point's independent-variable values are, and so of its potential to influence the estimates of the coefficients.
  • The residual is the difference between the observed and predicted values of the dependent variable.
  • ⚠️ Outliers can affect the estimates of the coefficients and the goodness of fit of the model.
  • The sample size (n) and the number of independent variables (k) are used to calculate the adjusted R-squared; the VIF instead uses R_j², obtained by regressing each independent variable on the others.
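A quick worked check of the adjusted R-squared formula from the list above, as a small Python sketch. The function name `adjusted_r_squared` and the input values (R² = 0.80, n = 50) are hypothetical, picked only to show how the penalty grows with k.

```python
def adjusted_r_squared(r_squared, n, k):
    """Adjusted R² = 1 - (n-1)/(n-k-1) * (1-R²)."""
    if n - k - 1 <= 0:
        raise ValueError("need n > k + 1 observations")
    return 1.0 - (n - 1) / (n - k - 1) * (1.0 - r_squared)

# Same R² and sample size, different numbers of independent variables.
print(round(adjusted_r_squared(0.80, 50, 3), 4))    # 0.787
print(round(adjusted_r_squared(0.80, 50, 10), 4))   # 0.7487
```

With the same R², the model using ten independent variables is penalized more than the one using three, which is why adjusted R-squared is preferred when comparing models of different size.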

