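The central limit theorem (CLT) states that when we take a random sample \(X_1, X_2, \ldots, X_n\) from any population with population mean \(\mu\) and finite population standard deviation \(\sigma\), the sample mean \(\bar{X}\) is approximately Normally distributed with mean \(\mu\) and standard deviation \(\sigma/\sqrt{n}\) when \(n\) is large. If the population itself is Normal, the result holds exactly for every \(n\).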
Equivalently, in repeated sampling the z-score below is (approximately) standard Normal.
\[ Z = \frac{\bar{X} - \mu}{\sigma/ \sqrt{n}} \]
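As a quick numeric check of the formula, the sketch below computes the standard error and the z-score for one simulated sample. The values \(n = 12\), \(\mu = 1\), and \(\sigma = 3\) are the ones used in the simulation in these notes; the seed and the variable names are arbitrary choices for illustration.

```r
set.seed(1)   # arbitrary seed, only for reproducibility
n <- 12
mu <- 1
sigma <- 3

x <- rnorm(n, mean = mu, sd = sigma)  # one sample of size n
se <- sigma / sqrt(n)                 # standard deviation of the sample mean
z <- (mean(x) - mu) / se              # z-score of this sample's mean

se   # 3 / sqrt(12), about 0.866
z
```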
Simulation:
Suppose we repeatedly take samples of size \(n = 12\) from a Normal population with mean \(\mu = 1\) and standard deviation \(\sigma = 3\).
    B <- 10000
    n <- 12
    mu <- 1
    sigma <- 3
    Z <- replicate(B, {
      x <- rnorm(n, mu, sigma)  # I need mu to simulate the data
      Xbar <- mean(x)           # Now assume I do not know mu
      (Xbar - mu) / (sigma / sqrt(n))
    })
    hist(Z)

    plot(density(Z), main = "Standardized mean of 12 normal rvs", xlab = "Z")
    curve(dnorm(x), add = TRUE, col = "red")
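The simulation above draws from a Normal population, where the result is exact for every \(n\). To see the approximation at work for a non-Normal population, here is a minimal sketch that assumes an Exponential population with rate 1 (so \(\mu = 1\) and \(\sigma = 1\)). The choice of population, the two sample sizes, and the helper name `sim_z` are illustrative assumptions, not part of the original example.

```r
set.seed(42)   # arbitrary seed for reproducibility
B <- 10000

# Standardized sample means from an Exponential(rate = 1) population,
# which has mean 1 and standard deviation 1.
sim_z <- function(n, B) {
  replicate(B, {
    x <- rexp(n, rate = 1)
    (mean(x) - 1) / (1 / sqrt(n))
  })
}

Z_small <- sim_z(5, B)    # small n
Z_large <- sim_z(100, B)  # larger n

plot(density(Z_large), ylim = c(0, 0.5),
     main = "Standardized mean, Exponential population", xlab = "Z")
lines(density(Z_small), lty = 2)
curve(dnorm(x), add = TRUE, col = "red")
legend("topright", legend = c("n = 100", "n = 5", "N(0, 1)"),
       lty = c(1, 2, 1), col = c("black", "black", "red"))
```

With these settings the \(n = 5\) curve should look visibly right-skewed, while the \(n = 100\) curve should sit much closer to the red standard Normal density.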
T distribution, Sampling Distribution
When \(\sigma\) is unknown, substitute the sample standard deviation \(S\) for the population standard deviation \(\sigma\).
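Replacing \(\sigma\) with an estimate adds variability, so the resulting statistic has heavier tails than the standard Normal; when the population is Normal it follows a t distribution with \(n - 1\) degrees of freedom. One way to see the size of the effect is to compare quantiles. The sketch below uses base R's `qt` and `qnorm`, with \(n = 12\) as in the simulations in these notes; the 97.5% level is chosen only as a familiar example.

```r
n <- 12
t_crit <- qt(0.975, df = n - 1)  # 97.5th percentile of t with 11 df
z_crit <- qnorm(0.975)           # 97.5th percentile of the standard Normal

c(t = t_crit, z = z_crit)        # roughly 2.20 versus 1.96
```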
Simulation:
Suppose we repeatedly take samples of size \(n = 12\) from a Normal population with mean \(\mu = 1\) and standard deviation \(\sigma = 3\).
In repeated sampling the statistic below follows a t distribution with \(n - 1\) degrees of freedom, not the standard Normal.
\[ T = \frac{\bar{X} - \mu}{S/ \sqrt{n}} \]
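    B <- 10000
    n <- 12
    mu <- 1
    sigma <- 3
    # Note: assigning to T masks R's built-in shorthand T for TRUE
    T <- replicate(B, {
      x <- rnorm(n, mu, sigma)
      Xbar <- mean(x)
      Xsd <- sd(x)            # sample standard deviation replaces sigma
      SE <- Xsd / sqrt(n)     # estimated standard error
      (Xbar - mu) / SE
    })
    hist(T)

    plot(density(T), main = "Standardized mean of 12 normal rvs", xlab = "T")
    curve(dt(x, df = n - 1), add = TRUE, col = "red")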
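To put a number on what the density plot shows, one option is to compare the simulated tail fraction with what the t and standard Normal distributions predict. This is a minimal sketch that reruns the simulation with an arbitrary seed; the cutoff of 1.96 and the name `Tstat` are illustrative choices.

```r
set.seed(123)  # arbitrary seed for reproducibility
B <- 10000
n <- 12
mu <- 1
sigma <- 3

Tstat <- replicate(B, {
  x <- rnorm(n, mu, sigma)
  (mean(x) - mu) / (sd(x) / sqrt(n))
})

cutoff <- 1.96
empirical <- mean(abs(Tstat) > cutoff)    # simulated two-sided tail fraction
from_t    <- 2 * pt(-cutoff, df = n - 1)  # predicted by t with n - 1 df
from_norm <- 2 * pnorm(-cutoff)           # predicted by the standard Normal, about 0.05

c(empirical = empirical, t = from_t, normal = from_norm)
```

With these settings the simulated fraction should sit close to the t prediction and noticeably above the Normal one.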