4.7.1 Distribution of Sample Variance

Introduction

  • Objective: Explore the sampling distribution of the sample variance (s²): how s² is calculated, how it is distributed, and what that distribution says about the population variance.

Understanding Sample Variance

  • Calculation: The sample variance is s² = Σ(xᵢ − x̄)² / (n-1): sum the squared deviations of each sample point from the sample mean (x̄), then divide by (n-1), where n is the sample size.
  • Rationale for n-1: The deviations are measured from x̄, which is itself computed from the sample, rather than from the unknown population mean. They must sum to zero, so only n-1 of them are free to vary (the degrees of freedom). Dividing by n would underestimate the population variance on average; dividing by n-1 makes s² an unbiased estimator.
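The calculation above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the function name sample_variance and the data values are made up for the example and do not come from the text.

```python
import statistics

def sample_variance(data):
    """Sum of squared deviations from the sample mean, divided by n - 1."""
    n = len(data)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(data) / n
    squared_deviations = [(x - mean) ** 2 for x in data]
    return sum(squared_deviations) / (n - 1)

# Illustrative data (hypothetical values).
data = [2, 4, 4, 4, 5, 5, 7, 9]

s2 = sample_variance(data)
print("hand-rolled s^2:     ", s2)
print("statistics.variance: ", statistics.variance(data))   # also divides by n - 1
print("statistics.pvariance:", statistics.pvariance(data))  # divides by n instead
```

The last line divides by n and therefore comes out smaller than s², which is the downward bias that the n-1 denominator corrects.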

Statistical Properties

  • Distribution: Sample variance is a random variable with its own distribution, which depends on the underlying population distribution.
  • Chi-Square Distribution: If the observations are an independent random sample from a normally distributed population, (n-1)s²/σ² follows a chi-square distribution with (n-1) degrees of freedom, where σ² is the population variance.
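One way to check the chi-square result informally is by simulation. The sketch below draws repeated normal samples and compares the simulated mean and variance of (n-1)s²/σ² with the values a chi-square distribution with n-1 degrees of freedom should have, namely n-1 and 2(n-1). The population parameters, sample size, and number of repetitions are arbitrary choices for illustration.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

n = 10                 # sample size (illustrative)
mu, sigma = 50.0, 4.0  # hypothetical population mean and standard deviation
reps = 20000           # number of simulated samples

scaled = []
for _ in range(reps):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    s2 = statistics.variance(sample)        # divides by n - 1
    scaled.append((n - 1) * s2 / sigma**2)  # the chi-square pivot

sim_mean = statistics.fmean(scaled)
sim_var = statistics.variance(scaled)

# A chi-square variable with k degrees of freedom has mean k and variance 2k.
print(f"simulated mean     {sim_mean:.2f}   theoretical {n - 1}")
print(f"simulated variance {sim_var:.2f}   theoretical {2 * (n - 1)}")
```

Matching the first two moments does not prove the distributional claim, but a large gap would signal a mistake in the setup.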

Practical Implications

  • Estimation Accuracy: The sampling distribution quantifies how close s² is likely to be to the population variance σ². Under normality, s² is unbiased with variance 2σ⁴/(n-1), so its precision improves as n grows.
  • Confidence Intervals: For a normally distributed population, the chi-square distribution can be used to construct confidence intervals for σ² from the observed s².
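The confidence interval follows from inverting the chi-square statement about (n-1)s²/σ². The derivation below is a sketch under the normality assumption; the notation χ²_{p, n-1} for the p-quantile of the chi-square distribution with n-1 degrees of freedom and the confidence level 1-α are introduced here for convenience and are not taken from the text.

```latex
% Pivot: under normality, (n-1)s^2/\sigma^2 \sim \chi^2_{n-1}
\[
P\!\left( \chi^2_{\alpha/2,\,n-1} \le \frac{(n-1)s^2}{\sigma^2} \le \chi^2_{1-\alpha/2,\,n-1} \right) = 1 - \alpha
\]
% Solve both inequalities for \sigma^2 (the bounds swap because \sigma^2 is in the denominator)
\[
\frac{(n-1)s^2}{\chi^2_{1-\alpha/2,\,n-1}} \;\le\; \sigma^2 \;\le\; \frac{(n-1)s^2}{\chi^2_{\alpha/2,\,n-1}}
\]
% Illustrative numbers (hypothetical): n = 10, s^2 = 4, 95% confidence,
% with \chi^2_{0.025,\,9} \approx 2.700 and \chi^2_{0.975,\,9} \approx 19.023
\[
\frac{9 \cdot 4}{19.023} \approx 1.89 \;\le\; \sigma^2 \;\le\; \frac{9 \cdot 4}{2.700} \approx 13.33
\]
```

The interval is not symmetric around s² because the chi-square distribution is skewed to the right.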

Application Example

  • Scenario: In a customer satisfaction survey, understanding the variance in responses helps gauge the consistency of satisfaction across the sample.
  • Calculation: Assuming the responses are approximately normally distributed, use the chi-square distribution to compute the probability that the sample variance falls within a given range of the true population variance.
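A calculation of this kind might look like the sketch below, which assumes a normal population. It computes the probability that s² lands within a relative tolerance of σ², that is, P((1-r)σ² ≤ s² ≤ (1+r)σ²) = F((n-1)(1+r)) - F((n-1)(1-r)), where F is the chi-square CDF with n-1 degrees of freedom. To stay within the standard library, the CDF is evaluated with a series for the regularized incomplete gamma function; a statistics library routine such as scipy.stats.chi2.cdf could be substituted. The helper names chi2_cdf and prob_within, the sample sizes, and the 20% tolerance are hypothetical choices for the example.

```python
import math

def chi2_cdf(x, df):
    """P(X <= x) for X ~ chi-square(df), via the power series for the
    regularized lower incomplete gamma function P(df/2, x/2).
    Intended for moderate x and df; not a production-grade routine."""
    if x <= 0:
        return 0.0
    a, z = df / 2.0, x / 2.0
    term, total, k = 1.0, 1.0, 0
    while term > 1e-15 * total:
        k += 1
        term *= z / (a + k)   # z^k / ((a+1)(a+2)...(a+k))
        total += term
    return total * math.exp(-z + a * math.log(z) - math.lgamma(a + 1.0))

def prob_within(n, rel_tol):
    """P(|s^2 - sigma^2| <= rel_tol * sigma^2) for a normal sample of size n.
    The answer does not depend on sigma^2 itself."""
    df = n - 1
    return chi2_cdf(df * (1 + rel_tol), df) - chi2_cdf(df * (1 - rel_tol), df)

# Hypothetical survey sizes: chance that s^2 is within 20% of sigma^2.
for n in (30, 100, 400):
    print(n, round(prob_within(n, 0.20), 3))
```

The probability rises with n, which matches the intuition that larger surveys give more reliable variance estimates.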

Conclusion

  • Significance: Knowledge of the distribution of sample variance is crucial for making informed decisions about the variability in data and understanding the reliability of variance estimates from samples.
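  • Limitations: The chi-square result is exact only when the underlying population is normal, and it is sensitive to departures from normality. Unlike the mean or a proportion, the sample variance has no simple distribution-free approximation at moderate sample sizes: its large-sample normal approximation requires a finite fourth moment, depends on the population's kurtosis, and converges slowly.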
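The dependence on normality can be illustrated with a small simulation. The sketch below computes (n-1)s²/σ² for samples from a standard normal population and from an exponential population with rate 1 (both have σ² = 1), then compares the spread of the two sets of values with the chi-square value 2(n-1). The choice of the exponential distribution, the sample size, and the helper name scaled_variances are assumptions made for the example.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

def scaled_variances(draw, sigma2, n, reps):
    """Simulate (n-1) * s^2 / sigma^2 for `reps` samples of size n,
    where draw() returns one observation from the population."""
    out = []
    for _ in range(reps):
        sample = [draw() for _ in range(n)]
        out.append((n - 1) * statistics.variance(sample) / sigma2)
    return out

n, reps = 10, 20000

# Both populations have variance 1.
normal_stat = scaled_variances(lambda: random.gauss(0.0, 1.0), 1.0, n, reps)
expo_stat = scaled_variances(lambda: random.expovariate(1.0), 1.0, n, reps)

var_normal = statistics.variance(normal_stat)
var_expo = statistics.variance(expo_stat)

print(f"chi-square variance 2(n-1): {2 * (n - 1)}")
print(f"normal population:          {var_normal:.1f}")
print(f"exponential population:     {var_expo:.1f}")
```

In both cases the simulated mean stays near n-1 because s² is unbiased for any population with finite variance, but the spread for the skewed population is much larger than the chi-square value, so chi-square-based intervals would be too narrow there.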