---
title: "Cramér–Rao Lower Bound — From Handwritten Notes (Revised)"
author: "Converted from IMG_1173–IMG_1176"
format:
  pdf:
    toc: true
    number-sections: true
    embed-resources: true
engine: knitr
execute:
  echo: false
---

# Assumptions (from IMG_1173)

We consider a regular parametric model with density $f(x\mid\theta)$ and log-likelihood $\log f(x\mid\theta)$, and observations $X_1,\ldots,X_n$ that are i.i.d. draws from $f(\cdot\mid\theta)$. The assumptions are the standard “regular case of estimation”:

1. Differentiability in the parameter:
   $$
   \frac{\partial}{\partial\theta} f(x\mid\theta) \ \text{exists}, \quad \theta \text{ lies in an open interval.}
   $$

2. The log-density is differentiable, and we may treat $f(X\mid\theta)$ and $\log f(X\mid\theta)$ as random variables whose expectations can be differentiated w.r.t. $\theta$. In particular, we can move the derivative inside the expectation:
   $$
   \frac{\partial}{\partial\theta}\, \mathbb{E}_\theta[ g(X,\theta) ] \,=\, \mathbb{E}_\theta\!\left[ \frac{\partial}{\partial\theta} g(X,\theta) \right],
   $$
   for $g$ equal to $f(\cdot\mid\theta)$ or $\log f(\cdot\mid\theta)$, whenever these objects exist.

3. Finite Fisher information:
   $$
   \mathbb{E}_\theta\!\left[\left(\frac{\partial}{\partial\theta} \log f(X\mid\theta)\right)^2\right] \,<\, \infty.
   $$

We will write the score for one observation as
$$
U_\theta(X) = \frac{\partial}{\partial\theta}\,\log f(X\mid\theta).
$$

# Theorem (from IMG_1174)

Let $T=t(X_1,\ldots,X_n)$ be an unbiased estimator of $\theta$, so $\mathbb{E}_\theta[T]=\theta$. Under Assumptions 1–3, the variance of $T$ satisfies the **Cramér–Rao lower bound**
$$
\operatorname{Var}_\theta(T) \,\ge\, \frac{1}{n\,\mathbb{E}_\theta\!\left[\left(\frac{\partial}{\partial\theta}\log f(X\mid\theta)\right)^2\right]} \,=\, \frac{1}{n\, I(\theta)},
$$
where $I(\theta)=\mathbb{E}_\theta[U_\theta(X)^2]$ is the Fisher information for a single observation.

# Proof — page 1 (setup)

Because $T$ is unbiased,
$$
\mathbb{E}_\theta[T] = \theta.
$$
Differentiate both sides w.r.t. $\theta$ and use the interchange of derivative and expectation together with the joint density $f(x_1,\ldots,x_n\mid\theta)$:
$$
1 = \int t\, \frac{\partial}{\partial\theta} f(x_1,\ldots,x_n\mid\theta)\,dx
  = \int t\, \frac{\partial}{\partial\theta}\log f(x_1,\ldots,x_n\mid\theta)\, f(x_1,\ldots,x_n\mid\theta)\,dx
  = \mathbb{E}_\theta\big[ T\, U_\theta(X_1,\ldots,X_n) \big],
$$
where $dx = dx_1\cdots dx_n$ and, by independence of the observations, the joint score is the sum of the marginal scores,
$$
U_\theta(X_1,\ldots,X_n)=\sum_{i=1}^n \frac{\partial}{\partial\theta}\log f(X_i\mid\theta).
$$

# Proof — page 2 (from IMG_1175)

Normalization of the density gives
$$
1 = \int f(x\mid\theta)\,dx
\quad \Rightarrow \quad
0 = \int \frac{\partial}{\partial\theta} f(x\mid\theta)\,dx
  = \int \frac{\partial}{\partial\theta}\log f(x\mid\theta)\, f(x\mid\theta)\,dx
  = \mathbb{E}_\theta[ U_\theta(X) ].
$$
The joint score is a sum of $n$ such terms, so it also has mean zero, and hence
$$
\operatorname{Cov}_\theta\big( T,\ U_\theta(X_1,\ldots,X_n) \big)
= \mathbb{E}_\theta[T\,U_\theta] - \mathbb{E}_\theta[T]\,\mathbb{E}_\theta[U_\theta]
= 1 - \theta\cdot 0 = 1.
$$
Moreover, because the $n$ marginal scores are independent with mean zero, the cross terms vanish and
$$
\operatorname{Var}_\theta\!\left(U_\theta(X_1,\ldots,X_n)\right) = \mathbb{E}_\theta\!\left[ U_\theta(X_1,\ldots,X_n)^2 \right] = n\, I(\theta) \,<\, \infty.
$$
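Before completing the proof, these two identities are easy to check numerically. The following R chunk is a minimal sketch: the $\operatorname{Exp}(\theta)$ model (rate parameterization), the sample size, the replication count, and the unbiased estimator $T=(n-1)/\sum_i X_i$ are all illustrative choices rather than material from the notes. In this model $U_\theta(x)=1/\theta-x$ and $I(\theta)=1/\theta^2$, so the joint score should have mean $0$, variance $n/\theta^2$, and covariance $1$ with $T$.

```{r}
#| echo: true
# Monte Carlo check of E[U] = 0, Var(U) = n * I(theta), Cov(T, U) = 1
# in an Exp(theta) model (rate theta): U(x) = 1/theta - x, I(theta) = 1/theta^2.
# T = (n - 1) / sum(X) is unbiased for theta (illustrative choice).
set.seed(1)
theta <- 2
n     <- 25
reps  <- 50000
U  <- numeric(reps)   # joint score, one value per replication
Tn <- numeric(reps)   # estimator, one value per replication
for (r in seq_len(reps)) {
  x     <- rexp(n, rate = theta)
  U[r]  <- sum(1 / theta - x)
  Tn[r] <- (n - 1) / sum(x)
}
c(mean_U  = mean(U),     # should be near 0
  var_U   = var(U),      # should be near n / theta^2 = 6.25
  cov_T_U = cov(Tn, U))  # should be near 1
```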
# Proof — page 3 (from IMG_1176, corrected last step)

Start with the **correlation** between $T$ and the joint score $U_\theta(X_1,\ldots,X_n)$:
$$
\rho\big(T, U_\theta\big)
\equiv
\frac{\operatorname{Cov}_\theta\!\left(T, U_\theta(X_1,\ldots,X_n)\right)}
{\sqrt{\operatorname{Var}_\theta(T)}\,\sqrt{\operatorname{Var}_\theta\!\left(U_\theta(X_1,\ldots,X_n)\right)}}.
$$
From the previous page we have $\operatorname{Cov}_\theta(T, U_\theta)=1$, hence
$$
\rho\big(T, U_\theta\big)
= \frac{1}{\sqrt{\operatorname{Var}_\theta(T)}\,\sqrt{\operatorname{Var}_\theta\!\left(U_\theta(X_1,\ldots,X_n)\right)}}.
$$
Because $|\rho|\le 1$,
$$
1 \ge \rho\big(T, U_\theta\big)^2
= \frac{1}{\operatorname{Var}_\theta(T)\;\operatorname{Var}_\theta\!\left(U_\theta(X_1,\ldots,X_n)\right)}.
$$
Invert both sides (all terms are positive) to obtain
$$
\operatorname{Var}_\theta(T)\;\operatorname{Var}_\theta\!\left(U_\theta(X_1,\ldots,X_n)\right) \ge 1.
$$
Substituting $\operatorname{Var}_\theta\!\left(U_\theta(X_1,\ldots,X_n)\right)=n I(\theta)$ yields
$$
\operatorname{Var}_\theta(T) \ge \frac{1}{n\, I(\theta)}.
$$
$\square$
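As a closing illustration of the bound, again a sketch with assumed models rather than material from the notes: in the $N(\theta,1)$ model, $I(\theta)=1$ and the sample mean is unbiased with $\operatorname{Var}_\theta(\bar X)=1/n$, so it attains the bound exactly; in the $\operatorname{Exp}(\theta)$ model, the unbiased estimator $T=(n-1)/\sum_i X_i$ has $\operatorname{Var}_\theta(T)=\theta^2/(n-2)$, strictly above the bound $\theta^2/n$.

```{r}
#| echo: true
# The bound in two illustrative models:
#  - N(theta, 1): I(theta) = 1; the sample mean attains the bound 1/n.
#  - Exp(theta):  I(theta) = 1/theta^2; T = (n - 1)/sum(X) is unbiased
#    with Var(T) = theta^2 / (n - 2) > theta^2 / n.
set.seed(2)
theta <- 2
n     <- 25
reps  <- 50000
T_norm <- replicate(reps, mean(rnorm(n, mean = theta)))
T_exp  <- replicate(reps, (n - 1) / sum(rexp(n, rate = theta)))
rbind(
  normal      = c(var_T = var(T_norm), crlb = 1 / n),
  exponential = c(var_T = var(T_exp),  crlb = theta^2 / n)
)
```

The comparison shows both faces of the theorem: the bound is attainable (the normal mean), but an unbiased estimator need not attain it (the exponential rate).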