4.7.1 Distribution of Sample Variance¶
Introduction¶
- Objective: Explore the sampling distribution of sample variance (s²) and its properties, particularly how it is calculated and its statistical significance.
Understanding Sample Variance¶
- Calculation: Sample variance (s²) is calculated by summing the squared deviations of each sample point from the sample mean (x̄), then dividing by (n-1) where n is the sample size.
- Rationale for n-1: Using n-1 (degrees of freedom) in the denominator corrects for bias in estimating population variance, as each data point contributes to the calculation of x̄.
Statistical Properties¶
- Distribution: Sample variance is a random variable with its own distribution, which depends on the underlying population distribution.
- Chi-Square Distribution: If the sample comes from a normally distributed population, (n-1)s²/σ² follows a chi-square distribution with (n-1) degrees of freedom, where σ² is the population variance.
Practical Implications¶
- Estimation Accuracy: The distribution allows for estimating how accurately s² approximates the population variance σ².
- Confidence Intervals: Using the chi-square distribution, we can construct confidence intervals for the population variance based on sample variance.
Application Example¶
- Scenario: In a customer satisfaction survey, understanding the variance in responses helps gauge the consistency of satisfaction across the sample.
- Calculation: Using the chi-square distribution to assess the probability that the sample variance is within a certain range of the true population variance.
Conclusion¶
- Significance: Knowledge of the distribution of sample variance is crucial for making informed decisions about the variability in data and understanding the reliability of variance estimates from samples.
- Limitations: Unlike the mean or proportion, there is no Central Limit Theorem for sample variance, making its distribution less straightforward to approximate without assuming normality in the underlying population.