MLE Normal

Author

Prof. Eric A. Suess

Here we fit the MLE for the normal

\(X_1, X_2, ...,X_n\) iid \(Normal(\mu, \sigma^2)\), find the MMEs.

\(f(x|\mu,\sigma)= \frac{1}{\sigma \sqrt{2\pi}}e^{-\frac{(x-\mu)^2}{2\sigma^2}}\)

Consider a random sample from the normal distribution

n <- 30
mu <- 70
sigma <- 3
sigma2 <- 3^2

x <- rnorm(n, mu, sigma)

x.orig <- x

x.mean <- mean(x)
x.sd <- sd(x)
x.sigma <- ((n-1)/n)*x.sd

x.mean
[1] 70.46267
x.sd
[1] 2.886367
x.sigma
[1] 2.790154

make a histogram

use the density estimator since this is a continuous random variable

hist(x.orig, probability=T)
lines(density(x.orig))

MLE fitted normal

mu.hat.mle <- mean(x)
mu.hat.mle
[1] 70.46267
sigma.hat.mle <- sqrt((1/length(x))*sum((x-mean(x))^2))
sigma.hat.mle
[1] 2.837853
library(fitdistrplus)
Loading required package: MASS
Loading required package: survival
fit.norm <- fitdist(x.orig, "norm", method = "mle")
summary(fit.norm)
Fitting of the distribution ' norm ' by maximum likelihood 
Parameters : 
      estimate Std. Error
mean 70.462668  0.5181187
sd    2.837853  0.3663650
Loglikelihood:  -73.85959   AIC:  151.7192   BIC:  154.5216 
Correlation matrix:
     mean sd
mean    1  0
sd      0  1

plot the fitted model to the simulated data

denscomp(list(fit.norm), legendtext = c("Normal"))

cdfcomp(list(fit.norm), legendtext = c("Normal"))

qqcomp(list(fit.norm), legendtext = c("Normal"))

ppcomp(list(fit.norm), legendtext = c("Normal"))

exact confidence interval for mu

mu.ci.exact <- c(x.mean - 1.96*x.sd/sqrt(n), x.mean + 1.96*x.sd/sqrt(n))
mu.ci.exact
[1] 69.42980 71.49554

large sample confidence interval for mu

mu.ci.large <- c(mu.hat.mle - 1.96*sigma.hat.mle/sqrt(n), mu.hat.mle +
    1.96*sigma.hat.mle/sqrt(n))
mu.ci.large
[1] 69.44716 71.47818

parametric bootstrap the MLE

The AIC and BIC are defined as follows, with \(n\) the sample size and \(k\) the number of parameters in the model.

\[AIC = -2\log(L(\hat{\theta})) + 2k\] \[BIC = -2\log(L(\hat{\theta})) + k\log(n)\]

fit.norm <- fitdist(x.orig, "norm", method = "mle")
summary(fit.norm)
Fitting of the distribution ' norm ' by maximum likelihood 
Parameters : 
      estimate Std. Error
mean 70.462668  0.5181187
sd    2.837853  0.3663650
Loglikelihood:  -73.85959   AIC:  151.7192   BIC:  154.5216 
Correlation matrix:
     mean sd
mean    1  0
sd      0  1
b1 <- bootdist(fit.norm, niter=101)
print(b1)
Parameter values obtained with parametric bootstrap 
      mean       sd
1 70.85213 2.488993
2 70.56543 2.838896
3 70.56841 2.708783
4 70.73827 2.265646
5 71.27191 3.462995
6 71.77356 3.152377
plot(b1)

plot(b1, enhance=TRUE)

summary(b1)
Parametric bootstrap medians and 95% percentile CI 
       Median      2.5%     97.5%
mean 70.41450 69.461549 71.478334
sd    2.77378  2.078527  3.415867
quantile(b1)
(original) estimated quantiles for each specified probability (non-censored data)
            p=0.1    p=0.2   p=0.3    p=0.4    p=0.5    p=0.6    p=0.7    p=0.8
estimate 66.82581 68.07427 68.9745 69.74371 70.46267 71.18163 71.95084 72.85107
            p=0.9
estimate 74.09952
Median of bootstrap estimates
            p=0.1    p=0.2    p=0.3    p=0.4   p=0.5    p=0.6    p=0.7    p=0.8
estimate 66.83716 68.06072 68.93807 69.69861 70.4145 71.08974 71.86047 72.74163
            p=0.9
estimate 73.93848

two-sided 95 % CI of each quantile
          p=0.1    p=0.2    p=0.3    p=0.4    p=0.5    p=0.6    p=0.7    p=0.8
2.5 %  65.63042 67.05293 68.03640 68.80646 69.46155 70.01995 70.76205 71.52514
97.5 % 68.03967 69.10065 69.81496 70.67473 71.47833 72.20494 73.09006 74.13255
          p=0.9
2.5 %  72.57280
97.5 % 75.55012
CIcdfplot(b1, CI.output = "quantile")

Try fitting the Gamma model and compare.

fit.gamma <- fitdist(x.orig, "gamma", method = "mle")
summary(fit.gamma)
Fitting of the distribution ' gamma ' by maximum likelihood 
Parameters : 
        estimate Std. Error
shape 560.535627 144.370337
rate    7.956477   2.050171
Loglikelihood:  -73.88597   AIC:  151.7719   BIC:  154.5743 
Correlation matrix:
          shape      rate
shape 1.0000000 0.9995521
rate  0.9995521 1.0000000
b1 <- bootdist(fit.gamma, niter=101)
print(b1)
Parameter values obtained with parametric bootstrap 
     shape      rate
1 768.7199 10.922655
2 787.7607 11.081488
3 485.4683  6.940146
4 353.9997  5.023321
5 559.3497  7.953198
6 489.1456  7.003604
plot(b1)

plot(b1, enhance=TRUE)

summary(b1)
Parametric bootstrap medians and 95% percentile CI 
          Median      2.5%     97.5%
shape 603.111667 365.80645 912.94277
rate    8.471848   5.21425  13.04793
quantile(b1)
(original) estimated quantiles for each specified probability (non-censored data)
            p=0.1    p=0.2    p=0.3    p=0.4    p=0.5    p=0.6    p=0.7
estimate 66.66471 67.93443 68.85995 69.65742 70.40834 71.16464 71.97977
            p=0.8    p=0.9
estimate 72.94159 74.28957
Median of bootstrap estimates
            p=0.1    p=0.2    p=0.3    p=0.4    p=0.5    p=0.6    p=0.7
estimate 66.74114 68.02156 68.94469 69.70293 70.39044 71.12713 71.90729
            p=0.8    p=0.9
estimate 72.81991 74.16705

two-sided 95 % CI of each quantile
          p=0.1    p=0.2    p=0.3    p=0.4    p=0.5    p=0.6    p=0.7    p=0.8
2.5 %  65.33319 66.86202 67.88430 68.72539 69.38418 70.09700 70.86296 71.76849
97.5 % 68.07380 69.24629 70.10916 70.85212 71.55127 72.25501 73.07352 74.13034
          p=0.9
2.5 %  72.95977
97.5 % 75.61271
CIcdfplot(b1, CI.output = "quantile")