---
title: "p-value"
author: "Prof. Eric A. Suess"
format:
  html:
    self-contained: true
---

## The distribution of the *p-value* under the null hypothesis.

Suppose we are sampling from a Normal population with mean $\mu_0 = 50$ and standard deviation $\sigma = 10$.  We will sample 100 observations from this population and test the null hypothesis that $\mu = 50$ against the alternative hypothesis that $\mu \neq 50$.  We will use a significance level of $\alpha = 0.05$.

```{r}
set.seed(1234)
x <- rnorm(100, 50, 10)
t.test(x, mu = 50)
```

The p-value for the test can be extracted directly from the object returned by `t.test()`.

```{r}
t.test(x, mu = 50)$p.value

```
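To make clear what `t.test()` reports, the two-sided p-value can also be computed by hand from the one-sample $t$ statistic. This is a minimal sketch rather than a replacement for `t.test()`. The names `n`, `t_stat`, and `p_manual` are introduced here for illustration, and the sample is regenerated with the same seed so the chunk stands on its own.

```{r}
set.seed(1234)
x <- rnorm(100, 50, 10)

n <- length(x)

# One-sample t statistic for H0: mu = 50
t_stat <- (mean(x) - 50) / (sd(x) / sqrt(n))

# Two-sided p-value: twice the lower-tail area beyond -|t| with n - 1 df
p_manual <- 2 * pt(-abs(t_stat), df = n - 1)

# Compare with the value reported by t.test()
all.equal(p_manual, t.test(x, mu = 50)$p.value)
```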


Now we will repeat this experiment 10000 times and record the *p-value* for each experiment.  We examine the distribution of the p-value when the null hypothesis, $H_0: \mu = 50$, is true.

**Answer:** The p-value is uniformly distributed under the null hypothesis.
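
A short sketch of why this holds, assuming the test statistic $T$ has a continuous distribution under $H_0$. The symbols $T$, $G$, $P$, and $u$ are introduced here for illustration: $G$ is the CDF of $|T|$ under $H_0$ and $P$ is the two-sided p-value viewed as a random variable. By the probability integral transform, $G(|T|)$ is Uniform(0, 1) under $H_0$, so

$$
P = 1 - G(|T|), \qquad \Pr(P \le u \mid H_0) = \Pr\big(G(|T|) \ge 1 - u\big) = 1 - (1 - u) = u, \quad 0 \le u \le 1 .
$$

For the $t$-test, $G(a) = 2F(a) - 1$ for $a \ge 0$, where $F$ is the CDF of the $t$ distribution with $n - 1$ degrees of freedom, which gives the familiar $P = 2\,(1 - F(|T|))$.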

```{r}  
p <- replicate(10000, {
  x <- rnorm(100, 50, 10)
  t.test(x, mu = 50)$p.value
})

hist(p, prob = TRUE, breaks = 20)
lines(density(p))
curve(dunif(x), add = TRUE, col = "red")
```
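
If the p-value is uniform under $H_0$, then the proportion of simulated p-values below any level $\alpha$ should be close to $\alpha$ itself. The sketch below checks this numerically for a few levels. It reruns the simulation under the hypothetical name `p_null` so that it stands alone; `alphas` and `reject_rate` are likewise names chosen here for illustration.

```{r}
set.seed(1234)

# Simulate p-values when H0: mu = 50 is true
p_null <- replicate(10000, {
  x <- rnorm(100, 50, 10)
  t.test(x, mu = 50)$p.value
})

# Empirical rejection rate at several significance levels
alphas <- c(0.01, 0.05, 0.10)
reject_rate <- sapply(alphas, function(a) mean(p_null < a))

rbind(alpha = alphas, reject_rate = reject_rate)
```

Each rejection rate is an estimate of the Type I error rate at that level, so it should differ from $\alpha$ only by simulation noise.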

Now we will repeat this experiment 10000 times, changing the true mean to $\mu_1 = 53$, and record the *p-value* for each experiment.  We examine the distribution of the p-value for the test of $H_0: \mu = 50$ when the alternative hypothesis is true.

When the true mean $\mu_1$ differs from 50, the p-value is no longer uniformly distributed.  Its distribution concentrates near 0, and more strongly the farther $\mu_1$ is from 50.

**Answer:** The p-value is **not** uniformly distributed under the alternative hypothesis.

```{r}
p <- replicate(10000, {
  x <- rnorm(100, 53, 10)   # mu_1 = 53
  t.test(x, mu = 50)$p.value
})

hist(p, prob = TRUE, breaks = 20)
lines(density(p))
curve(dunif(x), add = TRUE, col = "red")

```
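
Under the alternative, the proportion of p-values below $\alpha = 0.05$ estimates the power of the test. As a rough check, the sketch below compares that simulated proportion with the value from `power.t.test()` in the `stats` package, using the same settings as above ($n = 100$, $\sigma = 10$, a shift of $53 - 50 = 3$). The names `p_alt`, `power_sim`, and `power_theory` are introduced here for illustration.

```{r}
set.seed(1234)

# Simulate p-values when the true mean is 53 but H0: mu = 50 is tested
p_alt <- replicate(10000, {
  x <- rnorm(100, 53, 10)
  t.test(x, mu = 50)$p.value
})

# Simulated power: share of experiments that reject H0 at alpha = 0.05
power_sim <- mean(p_alt < 0.05)

# Power from the noncentral t distribution for a one-sample, two-sided test
power_theory <- power.t.test(n = 100, delta = 3, sd = 10,
                             sig.level = 0.05,
                             type = "one.sample")$power

c(simulated = power_sim, theoretical = power_theory)
```

The two values should agree up to simulation noise. Moving $\mu_1$ farther from 50 pushes both toward 1.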