7.3.1 Applications with Examples - I¶
Introduction¶
- Objective: This session illustrates the application of multiple linear regression (MLR) using real-world data to determine factors influencing household debt.
- Context: Manjula Nayak from 99acres is analyzing a dataset from a household survey in Bengaluru to understand the determinants of household debt, specifically looking at household income and monthly payments.
Data Overview¶
- Dataset: Consists of approximately 500 observations from a household survey in Bengaluru.
- Variables: Includes household size, residential location, monthly income, and monthly payments.
- Goal: To identify if household income and monthly payments significantly determine household debt.
Methodology¶
- Regression Analysis Setup:
- Dependent Variable (Y): Household debt.
- Independent Variables (X1 and X2): First income and monthly payments, respectively.
- Data Columns: Debt (Column K), first income (Column H), and monthly payments (Column I).
Analytical Tools¶
- Tool Used: Analysis tool pack in Excel.
- Procedure: Conduct regression analysis with household debt as the dependent variable and first income and monthly payments as independent variables.
Statistical Analysis¶
- Residuals and Standardization:
- Calculation of residuals and standardized residuals to assess model fit and assumptions.
- Creation of plots for visual inspection of residual distribution and anomalies.
Results and Interpretation¶
- R-Squared Value: Increased from 0.31 (with only first income) to 0.45 in the MLR model, indicating an improvement in model fit with the addition of monthly payments.
- Adjusted R-Squared: Close to the R-squared value at approximately 0.448, indicating that the model adequately adjusts for the number of predictors.
- Standard Error: Decreased compared to the individual regression models, suggesting better precision with the inclusion of two independent variables.
ANOVA and Regression Output¶
- SSR and SSE: Noted significant sums of squares for regression and error, with SSR increasing in the MLR model compared to single-variable models.
- F-Statistic: Indicates the overall significance of the regression model with a very low p-value, leading to rejection of the null hypothesis that both beta coefficients are zero.
Coefficient Analysis¶
- Interpretation of Coefficients:
- B1 (First Income): A unit increase in first income, holding monthly payments constant, increases debt by 0.025.
- B2 (Monthly Payments): A unit increase in monthly payments, holding first income constant, increases debt by 2.09.
- Significance Tests: Both coefficients are significant with very low p-values, confirming their influence on household debt.
Conclusion¶
- Implications: The analysis demonstrates a significant relationship between household debt and both first income and monthly payments.
- Further Analysis: Suggests conducting further MLR with additional variables like utilities to explore other potential influences on debt.