Chi-square distribution
If Z is a random variable with a standardized normal distribution (mean zero and variance 1), then Z² has a chi-square distribution with 1 degree of freedom. If Z1, Z2, …, Zk are a set of independent and identically distributed standard normal variables, then the sum of their squares has a chi-square distribution with k degrees of freedom.
NOTE: Let X1, X2, …, Xk be independent and normally distributed random variables with mean µ and variance 1. Then each (Xi - µ) has mean zero and variance 1, and the sum of squares Σ (Xi - µ)² has a chi-square distribution with k degrees of freedom.
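The definition can be checked with a small simulation. The sketch below is only an illustration: the seed, the number of draws and the helper name `sum_of_squared_std_normals` are arbitrary choices, not part of the original text. It relies on the standard fact that a chi-square variable with k degrees of freedom has mean k and variance 2k.

```python
import random
import statistics

def sum_of_squared_std_normals(k, rng):
    """Draw k independent N(0, 1) values and return the sum of their squares."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(k))

rng = random.Random(0)   # fixed seed so the run is repeatable
k = 5                    # degrees of freedom
draws = [sum_of_squared_std_normals(k, rng) for _ in range(20_000)]

# A chi-square variable with k degrees of freedom has mean k and variance 2k,
# so the simulated values should land close to 5 and 10.
print("simulated mean:    ", round(statistics.mean(draws), 2))
print("simulated variance:", round(statistics.variance(draws), 2))
```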
EXAMPLES
1) Car speed is measured using a radar unit. In an urban area, the radar readings Xi are normally distributed with a mean of 25 mph and a standard deviation of 3 mph. A recent source claims that the standard deviation of his radar is 1 mph. Imagine your manager wishes to test the hypothesis H0: σ = 1 versus the alternative hypothesis H1: σ ≠ 1. To carry out this test, your manager has asked you to take five speed measurements using a test car that was programmed to travel at 25 mph. This was done, and the speed measurements you recorded were 25, 24, 27, 25, 26. Should the hypothesis H0: σ = 1 be rejected at the significance level α = 10%?
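Let's calculate the variance of X about the hypothesized mean µ = 25 (so the divisor is n = 5):
Σ (Xi - µ)² = (25 - 25)² + (24 - 25)² + (27 - 25)² + (25 - 25)² + (26 - 25)² = 0 + 1 + 4 + 0 + 1 = 6
The variance is 6/5 = 1.2
and the standard deviation is √1.2 ≈ 1.10.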
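You've found that the sum of squared deviations, Σ (Xi - µ)², equals 6. Under H0: σ = 1, this sum has a chi-square distribution with 5 degrees of freedom. For a significance test at α = 10% (5% in each tail) and 5 degrees of freedom, a chi-square distribution table gives the critical values 1.145 and 11.07.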
The calculated value of 6 is inside the acceptance region. H0: σ = 1 cannot be rejected.
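The arithmetic of this first test can be sketched in a few lines. This is an illustration, not a general-purpose routine: the function name `chi_square_stat_known_mean` is made up for the example, and the critical values are the table values for 5 degrees of freedom (1.145 and 11.07) typed in by hand rather than computed.

```python
def chi_square_stat_known_mean(readings, mu, sigma0):
    """Sum of squared deviations about the known mean, scaled by sigma0 squared."""
    return sum((x - mu) ** 2 for x in readings) / sigma0 ** 2

readings = [25, 24, 27, 25, 26]   # first sample from the text
mu, sigma0 = 25, 1                # known mean and the value of sigma under H0

# Table values for 5 degrees of freedom with 5% in each tail (alpha = 10%).
lower, upper = 1.145, 11.07

stat = chi_square_stat_known_mean(readings, mu, sigma0)
reject = not (lower <= stat <= upper)
print(f"statistic = {stat}, reject H0: {reject}")
```

Because σ = 1 under H0, dividing by σ² changes nothing here; the division is kept so the same function still makes sense for other hypothesized values of σ.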
2) Let's calculate the variance of X for a second sample of five measurements. You found that the sum of squared deviations, Σ (Xi - µ)², equals 63. For a significance test at α = 10% and 5 degrees of freedom, a chi-square distribution table gives the critical values 1.145 and 11.07.
Since Σ (Xi - µ)² = 63 > 11.07, the hypothesis that the radar has standard deviation σ = 1 is rejected at the 10% significance level. You have two samples and two different conclusions.
In fact, these two examples illustrate how random sampling variability can lead to different conclusions when testing a hypothesis with samples of the same size and the same significance level, from the same population in the same conditions. This occurs because sample statistics are not perfect estimates of their corresponding population parameters.
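Sampling variability can be made concrete with a simulation. The sketch below assumes, purely for illustration, that the readings really are normal with mean 25 and σ = 1, so H0 is true. It draws many samples of size 5 and counts how often the test still rejects H0. The seed, the number of trials and the helper name `rejects_h0` are arbitrary choices; the critical values are the table values for 5 degrees of freedom.

```python
import random

def rejects_h0(sample, mu, sigma0, lower, upper):
    """Apply the known-mean chi-square test and report whether H0 is rejected."""
    stat = sum((x - mu) ** 2 for x in sample) / sigma0 ** 2
    return stat < lower or stat > upper

rng = random.Random(1)         # fixed seed so the run is repeatable
mu, sigma = 25.0, 1.0          # H0 is true in this simulation
lower, upper = 1.145, 11.07    # table values, 5 degrees of freedom, alpha = 10%
trials = 20_000

rejections = sum(
    rejects_h0([rng.gauss(mu, sigma) for _ in range(5)], mu, sigma, lower, upper)
    for _ in range(trials)
)
rate = rejections / trials
print(f"H0 rejected in {rate:.1%} of samples")
```

The rejection rate should come out close to the significance level of 10%: even when H0 holds, roughly one sample in ten leads to rejection.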
3) But your manager argued, pointing out that radar readings in urban areas usually follow a normal distribution with a mean speed of 25 mph and a standard deviation of 3 mph. He suggests taking another five measurements and testing the hypothesis H0: σ = 1 against H1: σ ≠ 1 again. Following this suggestion, you record five additional speeds: 29, 27, 29, 26, and 29. Should the hypothesis H0: σ = 1 be rejected at a significance level α = 10%?
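Using the hypothesized mean µ = 25, the sum of squared deviations is Σ (Xi - µ)² = 16 + 4 + 16 + 1 + 16 = 53. The numerator of the fraction Σ (Xi - µ)² / σ² should fall between 1.145 and 11.07 to support the hypothesis H0: σ = 1, but 53 is unquestionably out. Therefore, you must reject the null hypothesis.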
Are you satisfied with your manager's suggestions? Perhaps you should look at the given examples from a different perspective.
To test H0: σ = 1, what if you use the sample mean instead of the hypothesized mean µ = 25 mph? Let's calculate the sample mean and the sample variance:
x̄ = (29 + 27 + 29 + 26 + 29) / 5 = 28
Σ (Xi - x̄)² = 1 + 1 + 1 + 4 + 1 = 8
s² = 8 / (5 - 1) = 2
By using the sample mean, the degrees of freedom are reduced to n - 1. Consequently, the sum of squared deviations follows a chi-square distribution with 4 degrees of freedom. Refer to the chi-square table to find the critical values for α = 10% with 4 degrees of freedom: 0.711 and 9.488.
Given that the sum of squared deviations, 8, lies between the critical values 0.711 and 9.488, H0: σ = 1 cannot be rejected at the 10% significance level. The sample is consistent with σ = 1, as claimed.
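This version of the test can be reproduced without a table. The sketch below is an illustration under stated assumptions: it uses the closed-form CDF of the chi-square distribution with 4 degrees of freedom, F(x) = 1 - exp(-x/2)(1 + x/2), and inverts it by bisection. The helper names `chi2_cdf_4df` and `chi2_quantile_4df` are made up for the example and work only for 4 degrees of freedom.

```python
import math
import statistics

def chi2_cdf_4df(x):
    """CDF of the chi-square distribution with 4 degrees of freedom (closed form)."""
    return 1.0 - math.exp(-x / 2.0) * (1.0 + x / 2.0)

def chi2_quantile_4df(p, lo=0.0, hi=100.0):
    """Invert the CDF by bisection; the CDF is increasing on [lo, hi]."""
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if chi2_cdf_4df(mid) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

readings = [29, 27, 29, 26, 29]
sigma0 = 1                                     # standard deviation under H0
xbar = statistics.mean(readings)               # sample mean
ss = sum((x - xbar) ** 2 for x in readings)    # squared deviations about xbar
stat = ss / sigma0 ** 2

# 5% in each tail for a two-sided test at alpha = 10%.
lower, upper = chi2_quantile_4df(0.05), chi2_quantile_4df(0.95)
print(f"mean = {xbar}, statistic = {stat}")
print(f"acceptance region: [{lower:.3f}, {upper:.3f}]")
print("reject H0:", not (lower <= stat <= upper))
```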
If you accept σ = 1, we can look at the mean. You've made the assumption that the radar readings follow a normal distribution with a mean μ = 25 and a standard deviation σ = 1. Then the sample mean of n = 5 readings follows a normal distribution with a mean μ = 25 and a standard deviation σ/√n = 1/√5 ≈ 0.447.
The random sample of 5 observations you've collected has mean 28. Hence, you are able to test the hypothesis that the mean of the radar readings is 25 against the alternative hypothesis that it is greater than 25 using a z-test.
To calculate the z-score, we use the formula z = (x̄ - μ) / (σ/√n) = (28 - 25) / (1/√5) ≈ 6.71.
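Given that the z-score is greater than 1.645 (the one-sided critical value at the 5% level, and therefore also greater than the 10% critical value of 1.282), you must reject the null hypothesis that the mean is 25 at the 10% level of significance.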
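The z-test can be sketched with Python's standard library, which provides the normal distribution as `statistics.NormalDist`. The code assumes, as the text does at this point, that σ = 1 is known and that the alternative is one-sided (mean greater than 25). The variable names are illustrative.

```python
import math
from statistics import NormalDist, mean

readings = [29, 27, 29, 26, 29]
mu0, sigma = 25, 1                       # values under the null hypothesis
n = len(readings)

# z-score of the sample mean: (xbar - mu0) divided by its standard error.
z = (mean(readings) - mu0) / (sigma / math.sqrt(n))
p_value = 1.0 - NormalDist().cdf(z)      # one-sided: H1 says the mean exceeds 25

print(f"z = {z:.2f}")
print(f"one-sided critical value at 10%: {NormalDist().inv_cdf(0.90):.3f}")
print(f"one-sided critical value at  5%: {NormalDist().inv_cdf(0.95):.3f}")
print(f"p-value = {p_value:.2e}")
```

The z-score is far beyond either critical value, so the choice between the 5% and 10% cut-offs does not affect the conclusion for this sample.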
Therefore, based on this sample of radar readings (29, 27, 29, 26, 29), it is reasonable to conclude that the new radar is not calibrated to a mean μ = 25 with a standard deviation σ = 1.
Now, you could think: would a radar with σ = 3 produce measurements as above? If σ = 3, σ² = 9. How will the numerator of Σ (Xi - µ)² / σ² behave? Remember: the chi-square distribution only tells what happens when E(X - µ)² = 1. But here E(X - µ)² = 9. What to do?
The new test statistic is:
χ² = (n - 1) s² / σ²
where s² comes from the sample and σ² comes from H0.
The degrees of freedom associated with the test statistic (for finding the critical value) are (n - 1). For this test to be valid, the population must be normally distributed. So, calculate
χ² = (5 - 1) × 2 / 9 = 8/9 ≈ 0.889
The chi-square test statistic equals 0.889, which is in the 95% region of acceptance: [0.4844, 11.1433]. H0: σ = 3 cannot be rejected.
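The last calculation can be sketched as follows, using the five readings 29, 27, 29, 26, 29 and H0: σ = 3. The interval bounds are the ones quoted above for 4 degrees of freedom, typed in by hand rather than computed; the variable names are illustrative.

```python
import statistics

readings = [29, 27, 29, 26, 29]
sigma0 = 3                               # standard deviation under H0
n = len(readings)

s2 = statistics.variance(readings)       # sample variance, divisor n - 1
stat = (n - 1) * s2 / sigma0 ** 2        # chi-square statistic with n - 1 df

lower, upper = 0.4844, 11.1433           # 95% region for 4 df, as quoted above
print(f"s^2 = {s2}, statistic = {stat:.3f}")
print("reject H0:", not (lower <= stat <= upper))
```

Dividing by σ² = 9 rescales the squared deviations so that, under H0, the statistic again follows a chi-square distribution.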
IMPORTANT: This is a statistics exercise; it does not describe radar calibration procedures. It is therefore useful to have a set of accepted standards and protocols for maintaining the quality of radars.