---
title: "Cramér–Rao Lower Bound — From Handwritten Notes (Revised)"
author: "Converted from IMG_1173–IMG_1176"
format:
  pdf:
    toc: true
    number-sections: true
    embed-resources: true
engine: knitr
execute:
  echo: false
---

# Assumptions (from IMG_1173)

We consider a regular parametric model with density $f(x\mid\theta)$ and log-likelihood $\log f(x\mid\theta)$, and observations $X_1,\ldots,X_n$ that are i.i.d. draws from $f(\cdot\mid\theta)$. The assumptions are the standard “regular case of estimation”:

1. Differentiability in the parameter:
   $$
   \frac{\partial}{\partial\theta} f(x\mid\theta) \ \text{exists}, \quad \theta \text{ lies in an open interval.}
   $$

2. The log-density is differentiable, and we may treat $f(X\mid\theta)$ and $\log f(X\mid\theta)$ as random variables whose expectations can be differentiated w.r.t. $\theta$. In particular, we can move the derivative inside the expectation:
   $$
   \frac{\partial}{\partial\theta}\, \mathbb{E}_\theta[ g(X,\theta) ] \,=\, \mathbb{E}_\theta\!\left[ \frac{\partial}{\partial\theta} g(X,\theta) \right],
   $$
   for $g$ equal to $f(\cdot\mid\theta)$ or $\log f(\cdot\mid\theta)$, whenever these objects exist.

3. Finite Fisher information:
   $$
   \mathbb{E}_\theta\!\left[\left(\frac{\partial}{\partial\theta} \log f(X\mid\theta)\right)^2\right] \,<\, \infty.
   $$

We will write the score for one observation as
$$
U_\theta(X) = \frac{\partial}{\partial\theta}\,\log f(X\mid\theta).
$$

# Theorem (from IMG_1174)

Let $T=t(X_1,\ldots,X_n)$ be an unbiased estimator of $\theta$, so $\mathbb{E}_\theta[T]=\theta$. Under Assumptions 1–3, the variance of $T$ satisfies the **Cramér–Rao lower bound**
$$
\operatorname{Var}_\theta(T) \,\ge\, \frac{1}{n\,\mathbb{E}_\theta\!\left[\left(\frac{\partial}{\partial\theta}\log f(X\mid\theta)\right)^2\right]} \,=\, \frac{1}{n\, I(\theta)},
$$
where $I(\theta)=\mathbb{E}_\theta[U_\theta(X)^2]$ is the Fisher information for a single observation.

# Proof — page 1 (setup)

Because $T$ is unbiased,
$$
\mathbb{E}_\theta[T] = \theta.
$$
Differentiate both sides w.r.t. $\theta$ and use the interchange of derivative and expectation together with the joint density $f(x_1,\ldots,x_n\mid\theta)$:
$$
1 = \int t\, \frac{\partial}{\partial\theta} f(x_1,\ldots,x_n\mid\theta)\,dx
  = \int t\, \frac{\partial}{\partial\theta}\log f(x_1,\ldots,x_n\mid\theta)\, f(x_1,\ldots,x_n\mid\theta)\,dx
  = \mathbb{E}_\theta\big[ T\, U_\theta(X_1,\ldots,X_n) \big],
$$
where $dx = dx_1\cdots dx_n$ and, by independence of the observations, the joint score is the sum of the marginal scores,
$$
U_\theta(X_1,\ldots,X_n)=\sum_{i=1}^n \frac{\partial}{\partial\theta}\log f(X_i\mid\theta).
$$

# Proof — page 2 (from IMG_1175)

Normalization of the density gives
$$
1 = \int f(x\mid\theta)\,dx
\quad \Rightarrow \quad
0 = \int \frac{\partial}{\partial\theta} f(x\mid\theta)\,dx
  = \int \frac{\partial}{\partial\theta}\log f(x\mid\theta)\, f(x\mid\theta)\,dx
  = \mathbb{E}_\theta[ U_\theta(X) ].
$$
The joint score is a sum of $n$ such terms, so it also has mean zero, and hence
$$
\operatorname{Cov}_\theta\big( T,\ U_\theta(X_1,\ldots,X_n) \big)
= \mathbb{E}_\theta[T\,U_\theta] - \mathbb{E}_\theta[T]\,\mathbb{E}_\theta[U_\theta]
= 1 - \theta\cdot 0 = 1.
$$
Moreover, because the $n$ marginal scores are independent with mean zero, the cross terms vanish and
$$
\operatorname{Var}_\theta\!\left(U_\theta(X_1,\ldots,X_n)\right) = \mathbb{E}_\theta\!\left[ U_\theta(X_1,\ldots,X_n)^2 \right] = n\, I(\theta) \,<\, \infty.
$$
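Before completing the proof, these two identities are easy to check numerically. The following R chunk is a minimal sketch: the $\operatorname{Exp}(\theta)$ model (rate parameterization), the sample size, the replication count, and the unbiased estimator $T=(n-1)/\sum_i X_i$ are all illustrative choices rather than material from the notes. In this model $U_\theta(x)=1/\theta-x$ and $I(\theta)=1/\theta^2$, so the joint score should have mean $0$, variance $n/\theta^2$, and covariance $1$ with $T$.

```{r}
#| echo: true
# Monte Carlo check of E[U] = 0, Var(U) = n * I(theta), Cov(T, U) = 1
# in an Exp(theta) model (rate theta): U(x) = 1/theta - x, I(theta) = 1/theta^2.
# T = (n - 1) / sum(X) is unbiased for theta (illustrative choice).
set.seed(1)
theta <- 2
n     <- 25
reps  <- 50000
U  <- numeric(reps)   # joint score, one value per replication
Tn <- numeric(reps)   # estimator, one value per replication
for (r in seq_len(reps)) {
  x     <- rexp(n, rate = theta)
  U[r]  <- sum(1 / theta - x)
  Tn[r] <- (n - 1) / sum(x)
}
c(mean_U  = mean(U),     # should be near 0
  var_U   = var(U),      # should be near n / theta^2 = 6.25
  cov_T_U = cov(Tn, U))  # should be near 1
```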
# Proof — page 3 (from IMG_1176, corrected last step)

Start with the **correlation** between $T$ and the joint score $U_\theta(X_1,\ldots,X_n)$:
$$
\rho\big(T, U_\theta\big)
\equiv
\frac{\operatorname{Cov}_\theta\!\left(T, U_\theta(X_1,\ldots,X_n)\right)}
{\sqrt{\operatorname{Var}_\theta(T)}\,\sqrt{\operatorname{Var}_\theta\!\left(U_\theta(X_1,\ldots,X_n)\right)}}.
$$
From the previous page we have $\operatorname{Cov}_\theta(T, U_\theta)=1$, hence
$$
\rho\big(T, U_\theta\big)
= \frac{1}{\sqrt{\operatorname{Var}_\theta(T)}\,\sqrt{\operatorname{Var}_\theta\!\left(U_\theta(X_1,\ldots,X_n)\right)}}.
$$
Because $|\rho|\le 1$,
$$
1 \ge \rho\big(T, U_\theta\big)^2
= \frac{1}{\operatorname{Var}_\theta(T)\;\operatorname{Var}_\theta\!\left(U_\theta(X_1,\ldots,X_n)\right)}.
$$
Invert both sides (all terms are positive) to obtain
$$
\operatorname{Var}_\theta(T)\;\operatorname{Var}_\theta\!\left(U_\theta(X_1,\ldots,X_n)\right) \ge 1.
$$
Substituting $\operatorname{Var}_\theta\!\left(U_\theta(X_1,\ldots,X_n)\right)=n I(\theta)$ yields
$$
\operatorname{Var}_\theta(T) \ge \frac{1}{n\, I(\theta)}.
$$
$\square$
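As a closing illustration of the bound, again a sketch with assumed models rather than material from the notes: in the $N(\theta,1)$ model, $I(\theta)=1$ and the sample mean is unbiased with $\operatorname{Var}_\theta(\bar X)=1/n$, so it attains the bound exactly; in the $\operatorname{Exp}(\theta)$ model, the unbiased estimator $T=(n-1)/\sum_i X_i$ has $\operatorname{Var}_\theta(T)=\theta^2/(n-2)$, strictly above the bound $\theta^2/n$.

```{r}
#| echo: true
# The bound in two illustrative models:
#  - N(theta, 1): I(theta) = 1; the sample mean attains the bound 1/n.
#  - Exp(theta):  I(theta) = 1/theta^2; T = (n - 1)/sum(X) is unbiased
#    with Var(T) = theta^2 / (n - 2) > theta^2 / n.
set.seed(2)
theta <- 2
n     <- 25
reps  <- 50000
T_norm <- replicate(reps, mean(rnorm(n, mean = theta)))
T_exp  <- replicate(reps, (n - 1) / sum(rexp(n, rate = theta)))
rbind(
  normal      = c(var_T = var(T_norm), crlb = 1 / n),
  exponential = c(var_T = var(T_exp),  crlb = theta^2 / n)
)
```

The comparison shows both faces of the theorem: the bound is attainable (the normal mean), but an unbiased estimator need not attain it (the exponential rate).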