The distribution of the p-value under the null hypothesis.
Suppose we are sampling from a Normal population with mean \(\mu_0 = 50\) and standard deviation \(\sigma = 10\). We will sample 100 observations from this population and test the null hypothesis that \(\mu = 50\) against the alternative hypothesis that \(\mu \neq 50\). We will use a significance level of \(\alpha = 0.05\).
set.seed(1234)
x <- rnorm(100, 50, 10)
t.test(x, mu = 50)
One Sample t-test
data: x
t = -1.5607, df = 99, p-value = 0.1218
alternative hypothesis: true mean is not equal to 50
95 percent confidence interval:
46.43942 50.42534
sample estimates:
mean of x
48.43238
We note that the p-value can be extracted directly from the object returned by t.test, through its p.value component.
t.test(x, mu = 50)$p.value
[1] 0.1217758
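As a check on what t.test reports, the two-sided p-value can be recomputed by hand from the t statistic. The sketch below assumes the same seed and sample as above; the names `t_stat` and `p_manual` are introduced here for illustration only.

```r
set.seed(1234)
x <- rnorm(100, 50, 10)

n <- length(x)
# t statistic for H0: mu = 50
t_stat <- (mean(x) - 50) / (sd(x) / sqrt(n))
# two-sided p-value: probability of a |t| at least this large
# under a t distribution with n - 1 degrees of freedom
p_manual <- 2 * pt(-abs(t_stat), df = n - 1)

# TRUE if the manual value agrees with the one reported by t.test
all.equal(p_manual, t.test(x, mu = 50)$p.value)
```

Doubling the lower-tail probability works here because the t distribution is symmetric about 0.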
Now we will repeat this experiment 10000 times and record the p-value for each experiment. We examine the distribution of the p-value under the null hypothesis, \(H_0: \mu = 50\).
Answer: The p-value is uniformly distributed under the null hypothesis.
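The reason can be sketched in one line, assuming the test statistic has a continuous distribution under \(H_0\) (which holds for the t statistic used here). Write \(T\) for the test statistic and \(G\) for the distribution function of \(|T|\) under \(H_0\); these symbols are introduced here for illustration. The two-sided p-value is \(P = 1 - G(|T|)\), and \(G(|T|)\) is Uniform(0, 1) by the probability integral transform, so for any \(u \in [0, 1]\):

\[
\Pr(P \le u) = \Pr\bigl(G(|T|) \ge 1 - u\bigr) = 1 - (1 - u) = u .
\]

This is the distribution function of a Uniform(0, 1) random variable.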
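# simulate 10000 experiments in which H0: mu = 50 is true
p <- replicate(10000, {
  x <- rnorm(100, 50, 10)
  t.test(x, mu = 50)$p.value
})
# histogram of the p-values with a kernel density estimate
hist(p, prob = TRUE, breaks = 20)
lines(density(p))
# Uniform(0, 1) density for reference
curve(dunif(x), add = TRUE, col = "red")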
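One practical consequence of uniformity is that a test at level \(\alpha = 0.05\) rejects a true null hypothesis about 5% of the time. A minimal sketch of that check, re-running the simulation under \(H_0\); the seed and the name `p_null` are assumptions made for illustration.

```r
set.seed(2024)  # arbitrary seed, used only for reproducibility
p_null <- replicate(10000, {
  x <- rnorm(100, 50, 10)      # data generated under H0: mu = 50
  t.test(x, mu = 50)$p.value
})

# proportion of experiments that reject H0 at the 5% level
mean(p_null < 0.05)

# under uniformity, each tenth of [0, 1] should hold about 10% of the p-values
table(cut(p_null, breaks = seq(0, 1, by = 0.1), include.lowest = TRUE)) /
  length(p_null)
```

The rejection proportion will not equal 0.05 exactly, since it is itself an estimate based on 10000 simulated experiments.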
Now we will repeat this experiment 10000 times, this time drawing the samples from a population whose true mean is \(\mu_1 = 53\), and record the p-value for each experiment. We examine the distribution of the p-value for the same test of \(H_0: \mu = 50\) when the alternative hypothesis is in fact true.
When \(\mu_1 \neq 50\), the p-value is no longer uniformly distributed: it concentrates near 0, and more strongly so as \(\mu_1\) moves further from 50.
Answer: The p-value is not uniformly distributed under the alternative hypothesis.
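# simulate 10000 experiments in which H0: mu = 50 is false
p <- replicate(10000, {
  x <- rnorm(100, 53, 10) # mu_1 = 53
  t.test(x, mu = 50)$p.value
})
# histogram of the p-values with a kernel density estimate
hist(p, prob = TRUE, breaks = 20)
lines(density(p))
# Uniform(0, 1) density for reference
curve(dunif(x), add = TRUE, col = "red")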
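The share of p-values falling below \(\alpha = 0.05\) in this simulation estimates the power of the test, that is, the probability of rejecting \(H_0\) when the true mean is 53. The sketch below compares that estimate with the value from power.t.test in the stats package; the seed and the names `p_alt`, `emp_power` and `theo_power` are assumptions made for illustration.

```r
set.seed(2024)  # arbitrary seed, used only for reproducibility
p_alt <- replicate(10000, {
  x <- rnorm(100, 53, 10)      # true mean is 53, so H0: mu = 50 is false
  t.test(x, mu = 50)$p.value
})

# empirical power: share of experiments rejecting H0 at the 5% level
emp_power <- mean(p_alt < 0.05)

# theoretical power of the one-sample, two-sided t-test for the same setup
theo_power <- power.t.test(n = 100, delta = 3, sd = 10,
                           sig.level = 0.05, type = "one.sample")$power

c(empirical = emp_power, theoretical = theo_power)
```

The two values should be close but not identical, because the empirical figure carries simulation error.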