Stat. 316: Sampling

Author

Prof. Eric A. Suess

Published

March 20, 2024

Central Limit Theorem (CLT)

The CLT states that when taking a random sample \(X_1, X_2, ..., X_n\) from any population with population mean \(\mu\) and finite population standard deviation \(\sigma\), the sample mean \(\bar{X}\) is approximately Normally distributed with mean \(\mu\) and standard deviation \(\sigma/ \sqrt{n}\) when \(n\) is large. If the population itself is Normal, this holds exactly for every \(n\).

Equivalently, in repeated sampling the z-score is approximately distributed standard normal.

\[ Z = \frac{\bar{X} - \mu}{\sigma/ \sqrt{n}} \]
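The approximation can be checked on a population that is clearly not Normal. The following is a minimal sketch, not part of the original notes: it assumes an Exponential population with rate 1 (so \(\mu = 1\) and \(\sigma = 1\)), a sample size of \(n = 30\), and an arbitrary seed. The names `rate`, `mu_exp`, `sigma_exp`, and `Z_exp` are chosen only for this illustration.

```r
set.seed(316)            # arbitrary seed, only for reproducibility
B <- 10000               # number of repeated samples

n <- 30                  # assumed sample size for this illustration
rate <- 1                # Exponential(rate = 1) population (an assumption)
mu_exp <- 1 / rate       # population mean of the Exponential
sigma_exp <- 1 / rate    # population standard deviation of the Exponential

Z_exp <- replicate(B, {
  x <- rexp(n, rate)                          # skewed, non-Normal data
  (mean(x) - mu_exp) / (sigma_exp / sqrt(n))  # z-score of the sample mean
})

# The z-scores should have mean near 0 and standard deviation near 1
c(mean = mean(Z_exp), sd = sd(Z_exp))

plot(density(Z_exp),
     main = "Standardized mean of 30 exponential rvs", xlab = "Z")
curve(dnorm(x), add = TRUE, col = "red")
```

Some right skew usually remains visible at moderate \(n\); increasing `n` should bring the density closer to the red standard normal curve.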

Simulation:

Suppose we repeatedly take samples of size \(n = 12\) from a Normal population with mean \(\mu = 1\) and standard deviation \(\sigma = 3\).

B <- 10000

n <- 12
mu <- 1
sigma <- 3

Z <- replicate(B, {
  x <- rnorm(n, mu, sigma)  # I need mu to simulate the data
  Xbar <- mean(x)           # Now assume I do not know mu
  (Xbar - mu) / (sigma/sqrt(n))
})

hist(Z)

plot(density(Z),
     main = "Standardized mean of 12 normal rvs", xlab = "Z"
)
curve(dnorm(x), add = TRUE, col = "red")
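A numerical check can complement the plots. The sketch below re-runs the same simulation with an arbitrary seed (an assumption added here for reproducibility) and compares the simulated z-scores with the standard normal: the share falling inside \(\pm 1.96\) should be close to 0.95, and the simulated quantiles should be close to those from `qnorm()`.

```r
set.seed(316)   # arbitrary seed, only for reproducibility
B <- 10000

n <- 12
mu <- 1
sigma <- 3

Z <- replicate(B, {
  x <- rnorm(n, mu, sigma)
  (mean(x) - mu) / (sigma / sqrt(n))
})

# Proportion of z-scores within +/- 1.96; expect a value near 0.95
mean(abs(Z) < 1.96)

# Simulated quantiles next to the standard normal quantiles
probs <- c(0.025, 0.5, 0.975)
rbind(simulated = quantile(Z, probs),
      theoretical = qnorm(probs))
```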

T distribution, Sampling Distribution

When \(\sigma\) is unknown, substitute the sample standard deviation \(S\) for the population standard deviation \(\sigma\).

Simulation:

Suppose we repeatedly take samples of size \(n = 12\) from a Normal population with mean \(\mu = 1\) and standard deviation \(\sigma = 3\).

In repeated sampling from a Normal population, the t-statistic follows a t distribution with \(n - 1\) degrees of freedom.

\[ T = \frac{\bar{X} - \mu}{S/ \sqrt{n}} \]
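A short sketch of where the \(n - 1\) degrees of freedom come from, assuming a Normal population. Here \(V\) is a symbol introduced only for this illustration. Dividing the numerator and denominator of \(T\) by \(\sigma/\sqrt{n}\) gives

\[ T = \frac{(\bar{X} - \mu)/(\sigma/ \sqrt{n})}{\sqrt{S^2/\sigma^2}} = \frac{Z}{\sqrt{V/(n-1)}}, \qquad V = \frac{(n-1)S^2}{\sigma^2} \sim \chi^2_{n-1} \]

For Normal data, \(Z\) is standard normal and independent of \(V\), and a standard normal divided by the square root of an independent chi-square over its degrees of freedom is, by definition, t distributed with those degrees of freedom.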

B <- 10000

n <- 12
mu <- 1
sigma <- 3

T <- replicate(B, {   # note: this masks T, R's shorthand for TRUE
  x <- rnorm(n, mu, sigma)
  Xbar <- mean(x)
  Xsd <- sd(x)
  SE <- Xsd / sqrt(n)
  (Xbar - mu) / SE
})

hist(T)

plot(density(T),
     main = "Studentized mean of 12 normal rvs", xlab = "T")
curve(dt(x, df = n-1), add = TRUE, col = "red")
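The practical consequence of replacing \(\sigma\) with \(S\) is heavier tails. The sketch below is an illustration under the same simulation settings, with an arbitrary seed and the hypothetical name `t_stat` used in place of `T`. It compares how often the statistic falls outside \(\pm 1.96\) with the tail probabilities from the t distribution with \(n - 1\) degrees of freedom and from the standard normal.

```r
set.seed(316)   # arbitrary seed, only for reproducibility
B <- 10000

n <- 12
mu <- 1
sigma <- 3

t_stat <- replicate(B, {
  x <- rnorm(n, mu, sigma)
  (mean(x) - mu) / (sd(x) / sqrt(n))   # sigma replaced by the sample sd
})

tail_sim  <- mean(abs(t_stat) > 1.96)    # simulated two-sided tail area
tail_t    <- 2 * pt(-1.96, df = n - 1)   # tail area under t with 11 df
tail_norm <- 2 * pnorm(-1.96)            # tail area under N(0, 1), about 0.05

c(simulated = tail_sim, t = tail_t, normal = tail_norm)

# Cutoff that leaves 5% in the tails of the t distribution, about 2.20
qt(0.975, df = n - 1)
```

The simulated tail area should sit near the t value and above the normal value, which is why the cutoff from `qt()` is larger than 1.96 for small samples.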