7.1.3 MLR - Coefficient of Determination

Introduction

  • Objective: This session covers the coefficient of determination (\(R^2\)) in multiple linear regression: what it measures and how it is calculated.
  • Context: Understanding \(R^2\) is essential for evaluating the explanatory power of a regression model involving multiple predictors.

Explaining the Coefficient of Determination

  • Definition: \(R^2\) measures the proportion of the total variance in the dependent variable that is predictable from the independent variables.
  • Formula:
  • In multiple linear regression, as in simple linear regression, \(R^2\) is the ratio of the sum of squares due to regression (SSR) to the total sum of squares (SST): \[ R^2 = \frac{SSR}{SST} \]
  • SST represents the total variation around the mean of the dependent variable, while SSR indicates how much of that variation is explained by the model.
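
For reference, the two quantities in the ratio are usually written out as follows. The notation is an assumption, since the session summary does not fix symbols: \(y_i\) is the observed value for observation \(i\), \(\hat{y}_i\) is the value fitted by the regression, \(\bar{y}\) is the sample mean of the dependent variable, and \(n\) is the number of observations.

\[ SST = \sum_{i=1}^{n} (y_i - \bar{y})^2, \qquad SSR = \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2, \qquad R^2 = \frac{SSR}{SST} \]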

Calculation in Multiple Linear Regression

  • Components:
  • SSR (Sum of Squares due to Regression): Measures the variation explained by the regression model.
  • SSE (Sum of Squares due to Error): Measures the unexplained variation, which is the difference between SST and SSR: \(SSE = SST - SSR\), so equivalently \(R^2 = 1 - SSE/SST\).
  • Interpretation: For a least-squares model that includes an intercept, \(R^2\) ranges from 0 to 1; values closer to 1 mean the model explains a larger share of the variation in the dependent variable.
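
The components above can be checked numerically. The following is a minimal sketch, not the course's own computation: the six data rows are made up for illustration, and `solve` and `fit_ols` are hypothetical helper names. It fits a two-predictor least-squares model through the normal equations using only the Python standard library, then computes SST, SSR, SSE, and \(R^2\).

```python
# HYPOTHETICAL toy data (not from the course):
# annual income in $1000s and household size -> monthly spend.
income = [40.0, 55.0, 62.0, 75.0, 90.0, 110.0]
household = [1.0, 3.0, 2.0, 4.0, 3.0, 5.0]
spend = [1500.0, 2300.0, 2200.0, 3100.0, 3200.0, 4300.0]


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix [a | b]
    for col in range(n):
        # Swap in the row with the largest entry in this column for stability.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_ols(rows, y):
    """Least-squares coefficients (intercept first) from the normal equations X'X b = X'y."""
    x = [[1.0] + list(r) for r in rows]  # prepend the intercept column
    k = len(x[0])
    xtx = [[sum(xi[i] * xi[j] for xi in x) for j in range(k)] for i in range(k)]
    xty = [sum(xi[i] * yi for xi, yi in zip(x, y)) for i in range(k)]
    return solve(xtx, xty)


rows = list(zip(income, household))
coef = fit_ols(rows, spend)
fitted = [coef[0] + coef[1] * a + coef[2] * b for a, b in rows]

mean_y = sum(spend) / len(spend)
sst = sum((yi - mean_y) ** 2 for yi in spend)               # total variation
ssr = sum((fi - mean_y) ** 2 for fi in fitted)              # explained variation
sse = sum((yi - fi) ** 2 for yi, fi in zip(spend, fitted))  # unexplained variation
r2 = ssr / sst

print(f"SST={sst:.1f}  SSR={ssr:.1f}  SSE={sse:.1f}  R^2={r2:.4f}")
```

Up to floating-point error, SST equals SSR + SSE here because the model is fitted by least squares and includes an intercept; without either condition the decomposition, and the 0-to-1 range of \(R^2\), need not hold. In practice a statistics library would normally replace the hand-written solver.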

Application and Example

  • Practical Use: The session demonstrates how \(R^2\) is used to assess model fit when several predictors, such as annual income and household size, are used to explain a dependent variable such as monthly spend.
  • Numerical Example:
  • With \(SST = 114197923\), \(SSR = 91468852\), and \(SSE = 22729070\), \(R^2 = SSR/SST = 91468852 / 114197923 \approx 0.80\), meaning the model explains about 80% of the variability in monthly spend.
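
The arithmetic in this example can be reproduced directly. The short sketch below uses only the three sums of squares quoted above; the variable names are illustrative.

```python
# Sums of squares quoted in the numerical example.
sst = 114_197_923
ssr = 91_468_852
sse = 22_729_070

r2_from_ssr = ssr / sst      # explained share of the total variation
r2_from_sse = 1 - sse / sst  # one minus the unexplained share

print(f"R^2 via SSR/SST:     {r2_from_ssr:.4f}")
print(f"R^2 via 1 - SSE/SST: {r2_from_sse:.4f}")
print(f"SST - (SSR + SSE):   {sst - (ssr + sse)}")
```

Both routes give roughly 0.801, which the text rounds to 0.80. The quoted SSR and SSE add up to one less than the quoted SST, which is consistent with the figures having been rounded to whole numbers.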

Adjusted \(R^2\)

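  • Purpose: Adjusted \(R^2\) penalizes the number of predictors in the model. Ordinary \(R^2\) never decreases when a predictor is added, even an irrelevant one, so adjusted \(R^2\) gives a fairer basis for comparing models, especially when the number of predictors is large relative to the sample size.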
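
The session summary does not spell out the formula. A commonly used form, for \(n\) observations and \(k\) predictors in a model with an intercept, is \(1 - (1 - R^2)\frac{n - 1}{n - k - 1}\). The sketch below applies it to the \(R^2\) from the numerical example. The sample size `n = 50` is a hypothetical value chosen only for illustration, since the source does not state it, and `k = 2` assumes that annual income and household size are the only predictors.

```python
def adjusted_r_squared(r2: float, n: int, k: int) -> float:
    """Adjusted R^2 for a model with an intercept, n observations and k predictors."""
    if n <= k + 1:
        raise ValueError("need more observations than estimated coefficients (n > k + 1)")
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


r2 = 91_468_852 / 114_197_923  # R^2 from the numerical example
k = 2                          # ASSUMED predictors: annual income and household size
n = 50                         # HYPOTHETICAL sample size; not given in the source

adj = adjusted_r_squared(r2, n, k)
print(f"R^2 = {r2:.4f}, adjusted R^2 = {adj:.4f}")
```

Whenever \(R^2 < 1\) and \(k \ge 1\), the adjusted value is below the ordinary \(R^2\), and the gap widens as \(k\) grows relative to \(n\).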

Conclusion

  • Significance: \(R^2\) and adjusted \(R^2\) are key metrics for evaluating a multiple regression model: \(R^2\) summarizes how much of the variation in the dependent variable the model explains, and adjusted \(R^2\) does so while accounting for the number of predictors.
  • Next Steps: The course continues with further topics in regression analysis, building participants' ability to work with more complex models and draw sound conclusions from them.