7.1.3 MLR - Coefficient of Determination¶
Introduction¶
- Objective: This session delves into the coefficient of determination (\(R^2\)) in the context of multiple linear regression, explaining its significance and how it is calculated.
- Context: Understanding \(R^2\) is essential for evaluating the explanatory power of a regression model involving multiple predictors.
Explaining the Coefficient of Determination¶
- Definition: \(R^2\) measures the proportion of the total variance in the dependent variable that is predictable from the independent variables.
- Formula:
- In multiple linear regression, \(R^2\) is still calculated as the ratio of the sum of squares due to regression (SSR) to the total sum of squares (SST): [ R^2 = \frac{SSR}{SST} ]
- SST represents the total variation around the mean of the dependent variable, while SSR indicates how much of that variation is explained by the model.
Calculation in Multiple Linear Regression¶
- Components:
- SSR (Sum of Squares due to Regression): Measures the explained variation by the regression model.
- SSE (Sum of Squares due to Error): Measures the unexplained variation, which is the difference between SST and SSR.
- Interpretation: \(R^2\) values range from 0 to 1, where higher values indicate a model with greater explanatory power.
Application and Example¶
- Practical Use: Demonstrates how \(R^2\) is used to assess model fit in a scenario involving multiple predictors, such as annual income and household size, influencing a dependent variable like monthly spend.
- Numerical Example:
- With \(SST = 114197923\), \(SSR = 91468852\), and \(SSE = 22729070\), the \(R^2\) is calculated as approximately 0.80, suggesting that 80% of the variability in monthly spend is explained by the model.
Adjusted \(R^2\)¶
- Purpose: Adjusted \(R^2\) accounts for the number of predictors in the model, providing a more accurate measure of model fit, especially when the number of predictors is large.
Conclusion¶
- Significance: \(R^2\) and Adjusted \(R^2\) are critical metrics for evaluating the effectiveness of a multiple regression model, helping to determine how well the model captures the dynamics of the data.
- Next Steps: The course will continue to explore deeper aspects of regression analysis, enhancing the participants' ability to handle complex models and derive insightful conclusions from their analyses.