Fit MLEs

Author

Prof. Eric A. Suess

Exercises

Exercise 1

a.

\[f(x|\theta) = \theta x^{\theta−1}\]

where \(0 < x < 1\), \(0 < \theta\), and 0 otherwise.

The model is \(Beta(\theta, 1)\), where \(\theta\) is the parameter of interest.

library(fitdistrplus)

Loading required package: MASS

Loading required package: survival

n <- 100
theta <- 0.5

x <- rbeta(n, theta, 1)

plotdist(x)

fit <- fitdist(x, "beta", start = list(shape1 = 1), fix.arg = list(shape2 = 1))

summary(fit)

Fitting of the distribution ' beta ' by maximum likelihood 
Parameters : 
        estimate Std. Error
shape1 0.5368586 0.05368567
Fixed parameters:
       value
shape2     1
Loglikelihood:  24.0668   AIC:  -46.13359   BIC:  -43.52842

llplot(fit, pal.col = heat.colors(100), fit.show = TRUE)

fit$estimate

   shape1 
0.5368586

fit$sd

    shape1 
0.05368567

confint(fit)

           2.5 %    97.5 %
shape1 0.4316366 0.6420806

Compare with the math.

Estimate

theta.hat <- -mean(log(x))^{-1}
theta.hat

[1] 0.5368582

Estimate AV

av.hat <- theta.hat^2/n
av.hat

[1] 0.002882167

se.hat <- sqrt(av.hat)
se.hat

[1] 0.05368582

confint <- c(theta.hat - 1.96*se.hat, theta.hat + 1.96*se.hat)
confint

[1] 0.4316340 0.6420824

b.

\[f(x|\theta) = (\theta + 1) x^{−\theta−2}\]

where \(1 < x\), \(0 < \theta\), and 0 otherwise.

The model is \(Pareto( (\theta+1), 1)\), where \(\theta\) is the parameter of interest.

library(fitdistrplus)
library(Pareto)

n <- 100
theta <- 5

x <- rPareto(n, t = 1, alpha = (theta + 1))

plotdist(x)

fit <- fitdist(x, dPareto, start = list(alpha = 10), fix.arg = list(t = 1))

Warning in checkparamlist(arg_startfix$start.arg, arg_startfix$fix.arg, : Some
parameter names have no starting/fixed value but have a default value:
truncation.

Warning in fitdist(x, dPareto, start = list(alpha = 10), fix.arg = list(t =
1)): The pPareto function should have its first argument named: q as in base R

summary(fit)

Fitting of the distribution ' Pareto ' by maximum likelihood 
Parameters : 
      estimate Std. Error
alpha 5.779624  0.5779624
Fixed parameters:
  value
t     1
Loglikelihood:  58.13171   AIC:  -114.2634   BIC:  -111.6582

llplot(fit, pal.col = heat.colors(100), fit.show = TRUE)

fit$estimate

   alpha 
5.779624

fit$sd

    alpha 
0.5779624

confint(fit)

         2.5 %  97.5 %
alpha 4.646839 6.91241

Compare with the math.

Note: \(\theta = \alpha - 1\). So subtract 1 from the MLE and from the Confidence Interval to the the MLE and CI for \(\theta\).

Estimate

theta.hat <- (n/sum(log(x))) - 1

theta.hat

[1] 4.779624

Estimate AV

av.hat <- ((theta.hat+1)^2)/n
av.hat

[1] 0.3340406

se.hat <- sqrt(av.hat)
se.hat

[1] 0.5779624

confint <- c(theta.hat - 1.96*se.hat, theta.hat + 1.96*se.hat) 
confint

[1] 3.646818 5.912431

c.

\[f(x|\theta) = \theta^{2} x e^{−\theta x}\]

where \(0 < x\), \(0 < \theta\), and 0 otherwise.

The model is \(Gamma(2, \theta)\), where \(\theta\) is the parameter of interest.

library(fitdistrplus)

n <- 100
theta <- 5

x <- rgamma(n, shape = 2, rate = theta)

plotdist(x)

fit <- fitdist(x, "gamma", start = list(rate = 2), fix.arg = list(shape = 2))

summary(fit)

Fitting of the distribution ' gamma ' by maximum likelihood 
Parameters : 
     estimate Std. Error
rate 4.554833  0.3220753
Fixed parameters:
      value
shape     2
Loglikelihood:  -3.607613   AIC:  9.215227   BIC:  11.8204

llplot(fit, pal.col = heat.colors(100), fit.show = TRUE)

fit$estimate

    rate 
4.554833

fit$sd

     rate 
0.3220753

confint(fit)

        2.5 %   97.5 %
rate 3.923577 5.186089

Compare with the math.

Estimate

theta.hat <- 2/mean(x)

theta.hat

[1] 4.554833

Estimate AV

av.hat <- (theta.hat^2)/(2*n)
av.hat

[1] 0.1037325

se.hat <- sqrt(av.hat)
se.hat

[1] 0.3220753

confint <- c(theta.hat - 1.96*se.hat, theta.hat + 1.96*se.hat)
confint

[1] 3.923565 5.186100

d.

\[f(x|\theta) = \theta(1 − \theta)^{x−1}\]

for \(x = 1, 2,...\), \(0 < \theta < 1\), and 0 otherwise

Note that the model given is for

\(X =\) number of trials until the first success

\[ f(x|\theta) = \theta (1-\theta)^{x-1} \]

where \(x = 1, 2, 3, \ldots\).

But the model in R is for

\(Y = X - 1 =\) number of failures before the first success

\[ f(x|\theta) = \theta (1-\theta)^{x} \]

where \(x = 0, 1, 2, 3, \ldots\).

The model is \(Geometric(\theta)\), where \(\theta\) is the parameter of interest. Use the model in R.

library(fitdistrplus)

n <- 100
theta <- 0.5

x <- rgeom(n, theta)

plotdist(x)

fit <- fitdist(x, "geom", start = list(prob = 0.5))

summary(fit)

Fitting of the distribution ' geom ' by maximum likelihood 
Parameters : 
      estimate Std. Error
prob 0.4608293 0.03383775
Loglikelihood:  -149.7464   AIC:  301.4927   BIC:  304.0979

llplot(fit, pal.col = heat.colors(100), fit.show = TRUE)

fit$estimate

     prob 
0.4608293

fit$sd

      prob 
0.03383775

confint(fit)

         2.5 %    97.5 %
prob 0.3945085 0.5271501

Compare with the math.

Since the model is different, we need to adjust the math.

The MLE is \(\hat{\theta } = \frac{1}{\bar{x} + 1}\).

Estimate

theta.hat <- (mean(x) +  1)^{-1}
theta.hat

[1] 0.4608295

Estimate AV

av.hat <- (theta.hat^2)*(1-theta.hat)/n
av.hat

[1] 0.001145003

se.hat <- sqrt(av.hat)
se.hat

[1] 0.03383789

confint <- c(theta.hat - 1.96*se.hat, theta.hat + 1.96*se.hat)
confint

[1] 0.3945072 0.5271518