---
title: "p-value"
author: "Prof. Eric A. Suess"
format:
  html:
    self-contained: true
---

## Simulation of the Confidence level of a Confidence Interval.

Suppose we are sampling from a normal distribution with mean $\mu$ and standard deviation $\sigma$.  We want to estimate $\mu$ with a confidence interval.  We will use the sample mean $\bar{X}$ as our estimator.  We will use the sample standard deviation $s$ as our estimator of $\sigma$.  We will use the t-distribution to construct a confidence interval.  We will use the following formula to construct a confidence interval:

$$\bar{x} \pm t_{\alpha/2, n-1} \frac{s}{\sqrt{n}}$$

where $t_{\alpha/2, n-1}$ is the $\alpha/2$ quantile of the t-distribution with $n-1$ degrees of freedom.

We will simulate the Confidence level of this Confidence Interval.  We will do this by simulating many samples from a normal distribution with mean $\mu$ and standard deviation $\sigma$.  We will then construct a confidence interval for each sample.  We will then count the number of confidence intervals that contain $\mu$.  This will give us an estimate of the confidence level of the confidence interval.

We will use the following parameters:

* $\mu = 0$
* $\sigma = 1$
* $\alpha = 0.05$
* $n = 10$

We will simulate 10000 samples.

We will use the following R code to simulate the confidence level of the confidence interval.

```{r}
# Confidence level of a confidence interval
# Simulate many samples from a normal distribution
# Construct a confidence interval for each sample
# Count the number of confidence intervals that contain the mean
# This will give us an estimate of the confidence level of the confidence interval

# Parameters
mu <- 0
sigma <- 1
alpha <- 0.05
n <- 10
n.sim <- 10000

count <- 0
for (i in 1:n.sim) {
  # Simulate a sample from a normal distribution
  x <- rnorm(n, mu, sigma)
  # Construct a confidence interval
  ci <- mean(x) + qt(1 - alpha/2, n - 1) * sd(x) / sqrt(n) * c(-1, 1)
  # Count the number of confidence intervals that contain the mean
  if (ci[1] <= mu && mu <= ci[2]) {
    count <- count + 1
  }
}
  
# Estimate the confidence level of the confidence interval
count / n.sim

# The confidence level of the confidence interval is approximately 0.95
```

We rewrite the code using the `replicate` function.

```{r}
# Confidence level of a confidence interval
# Simulate many samples from a normal distribution
# Construct a confidence interval for each sample
# Count the number of confidence intervals that contain the mean
# This will give us an estimate of the confidence level of the confidence interval

# Parameters
mu <- 0
sigma <- 1
alpha <- 0.05
n <- 10
n.sim <- 10000

count <- sum(replicate(n.sim, {
  # Simulate a sample from a normal distribution
  x <- rnorm(n, mu, sigma)
  # Construct a confidence interval
  ci <- mean(x) + qt(1 - alpha/2, n - 1) * sd(x) / sqrt(n) * c(-1, 1)
  # Count the number of confidence intervals that contain the mean
  if (ci[1] <= mu && mu <= ci[2]) 1 else 0
}))

# Estimate the confidence level of the confidence interval
count / n.sim

# The confidence level of the confidence interval is approximately 0.95
```

Rewrite the code using the Tidyverse functions and nesting the samples in a list column of a tibble.

```{r}
library(tidyverse)

# Confidence level of a confidence interval
# Simulate many samples from a normal distribution
# Construct a confidence interval for each sample
# Count the number of confidence intervals that contain the mean
# This will give us an estimate of the confidence level of the confidence interval

# Parameters
mu <- 0
sigma <- 1
alpha <- 0.05
n <- 10
n.sim <- 10000

# Simulate many samples from a normal distribution
# Construct a confidence interval for each sample
# Count the number of confidence intervals that contain the mean
# This will give us an estimate of the confidence level of the confidence interval

count <- tibble(sim = 1:n.sim) |> 
  mutate(x = map(sim, ~rnorm(n, mu, sigma))) |> 
  mutate(ci = map(x, ~mean(.) + qt(1 - alpha/2, n - 1) * sd(.) / sqrt(n) * c(-1, 1))) |> 
  mutate(count = map_dbl(ci, ~if (.x[1] <= mu && mu <= .x[2]) 1 else 0)) |> 
  summarize(count = sum(count)) |> 
  pull(count)

# Estimate the confidence level of the confidence interval
count / n.sim

# The confidence level of the confidence interval is approximately 0.95

```

Now we use ggplot to visualize the simulation of the confidence intervals.

```{r}
library(tidyverse)

# Confidence level of a confidence interval
# Simulate many samples from a normal distribution
# Construct a confidence interval for each sample
# Count the number of confidence intervals that contain the mean
# This will give us an estimate of the confidence level of the confidence interval

# Parameters
mu <- 0
sigma <- 1
alpha <- 0.05
n <- 10
n.sim <- 100

# Simulate many samples from a normal distribution
# Construct a confidence interval for each sample
# Plot the confidence intervals on a plot, connecting the high and low values of the confidence interval

tibble(sim = 1:n.sim) |> 
  mutate(x = map(sim, ~rnorm(n, mu, sigma))) |> 
  mutate(ci = map(x, ~mean(.) + qt(1 - alpha/2, n - 1) * sd(.) / sqrt(n) * c(-1, 1))) |> 
  unnest(ci) |> 
  mutate(sim = factor(sim)) |> 
  ggplot(aes(x = sim, y = ci, group = sim)) +
  geom_line() +
  geom_point() +
  geom_hline(yintercept = mu, color = "red") +
  labs(x = "Simulation", y = "Confidence Interval", title = "Confidence Intervals for the Mean") +
  theme_bw()

```

# The confidence level of the confidence interval is approximately 0.95


## CI Simulation

```{r}
library(BSDA)
```


```{r}
CIsim(100, 30, 100, 10)
    # Simulates 100 samples of size 30 from 
    # a normal distribution with mean 100
    # and standard deviation 10.  From the
    # 100 simulated samples, 95% confidence
    # intervals for the Mean are constructed 
    # and depicted in the graph. 

CIsim(100, 30, 100, 10, type="Var")
    # Simulates 100 samples of size 30 from 
    # a normal distribution with mean 100
    # and standard deviation 10.  From the
    # 100 simulated samples, 95% confidence
    # intervals for the variance are constructed 
    # and depicted in the graph.
    
CIsim(100, 50, .5, type="Pi", conf.level=.90)     
    # Simulates 100 samples of size 50 from 
    # a binomial distribution where the population
    # proportion of successes is 0.5.  From the
    # 100 simulated samples, 90% confidence
    # intervals for Pi are constructed 
    # and depicted in the graph.  
```