---
title: "Chebyshev and Markov Inequalities"
format:
  html:
    code-fold: true
    toc: true
    toc-depth: 3
    embed-resources: true
  pdf:
    toc: true
    toc-depth: 3
    include-in-header:
      text: |
        \usepackage{amsmath}
        \usepackage{amssymb}
        \usepackage{mathtools}
        \usepackage{bm}
---

# Probability Inequalities

## Markov's Inequality

For a nonnegative random variable $X$ and any $a > 0$, the right tail cannot carry arbitrarily much probability: the mean caps it. The precise statement is

$$
\Pr(X \ge a) \le \frac{\mathbb{E}[X]}{a}.
$$

**Proof idea.**

```{r}
#| echo: false
#| message: false
library(ggplot2)
library(grid)  # for arrow() and unit()

# ---- Parameters (shape > 1 so f(0) = 0) ----
shape <- 2.2
rate  <- 1.0
a     <- 1.2  # vertical split

# ---- Data ----
x_max <- qgamma(0.999, shape = shape, rate = rate)  # right endpoint for plotting
df    <- data.frame(x = seq(0, x_max, length.out = 1200))
df$fx <- dgamma(df$x, shape = shape, rate = rate)

df_left  <- subset(df, x <= a)  # region X < a
df_right <- subset(df, x >= a)  # region X >= a

# Helpful values for placing labels/arrows
y_max   <- max(df$fx)
y_lbl   <- 0.85 * y_max
x_lbl_L <- 0.5 * a
x_lbl_R <- a + 0.45 * (x_max - a)

# ---- Plot ----
p <- ggplot(df, aes(x, fx)) +
  geom_line(linewidth = 1) +
  geom_area(data = df_left,  aes(x, fx), alpha = 0.35) +
  geom_area(data = df_right, aes(x, fx), alpha = 0.15) +
  geom_vline(xintercept = a, linetype = "dashed", linewidth = 0.8) +
  labs(
    title    = "Gamma PDF with Split at a",
    subtitle = paste0("Gamma(shape = ", shape, ", rate = ", rate, ")"),
    x = "x",
    y = expression(f[X](x))
  ) +
  theme_classic(base_size = 13)

# ---- Annotations (labels + arrows) ----
p +
  annotate("text", x = x_lbl_L, y = y_lbl, label = "P(X < a)", parse = TRUE) +
  annotate("segment",
           x = x_lbl_L, xend = a * 0.85,
           y = y_lbl - 0.03 * y_max, yend = y_lbl - 0.20 * y_max,
           arrow = arrow(type = "closed", length = unit(0.18, "cm"))) +
  annotate("text", x = x_lbl_R, y = y_lbl, label = "P(X >= a)", parse = TRUE) +
  annotate("segment",
           x = x_lbl_R, xend = a + 0.15 * (x_max - a),
           y = y_lbl - 0.03 * y_max, yend = y_lbl - 0.20 * y_max,
           arrow = arrow(type = "closed", length = unit(0.18, "cm")))
```

\newpage

Define the indicator variable

$$
I =
\begin{cases}
1, & X \ge a, \\
0, & X < a.
\end{cases}
$$

By the **Law of Total Expectation**, conditioning on $I$ gives

$$
\mathbb{E}[X]
= \mathbb{E}\bigl[\mathbb{E}[X \mid I]\bigr]
= \mathbb{E}[X \mid I = 0]\,\Pr(X < a) + \mathbb{E}[X \mid I = 1]\,\Pr(X \ge a).
$$

Here the inner conditional expectation $\mathbb{E}[X \mid I]$ is a function of $I$, and the outer expectation averages it over the two values of $I$.

**Note:** $\Pr(X < a) \ge 0$ and $\mathbb{E}[X \mid I = 0] \ge 0$ since $X \ge 0$, so dropping the first term can only shrink the right-hand side:

$$
\mathbb{E}[X] \ge \mathbb{E}[X \mid I = 1]\,\Pr(X \ge a).
$$

When $I = 1$ we have $X \ge a$, so $\mathbb{E}[X \mid I = 1] \ge a$ and

$$
\mathbb{E}[X] \ge a\,\Pr(X \ge a).
$$

Therefore,

$$
\Pr(X \ge a) \le \frac{\mathbb{E}[X]}{a}.
$$

An equivalent parameterization sets $a = b\,\mathbb{E}[X]$ (with $b > 0$), which yields

$$
\Pr\!\bigl(X \ge b\,\mathbb{E}[X]\bigr) \le \frac{1}{b}.
$$

Some quick consequences (from the notes), writing $\mu = \mathbb{E}[X]$:

| $b$ | Bound on $\Pr(X \ge b\mu)$ |
|---:|:---|
| 1 | $\Pr(X \ge \mu) \le 1$ |
| 2 | $\Pr(X \ge 2\mu) \le \tfrac12$ |
| 3 | $\Pr(X \ge 3\mu) \le \tfrac13$ |
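As a quick numerical sanity check, the chunk below reuses the Gamma(shape = 2.2, rate = 1) distribution from the plot above (an illustrative choice; any nonnegative distribution would do) and compares the exact tail probabilities $\Pr(X \ge b\mu)$ against the Markov bounds $1/b$. The exact tails sit far below the bounds, a first hint at how loose Markov's inequality can be.

```{r}
#| message: false
# A small illustrative check: exact Gamma tails vs. the Markov bound 1/b.
shape <- 2.2
rate  <- 1.0
mu    <- shape / rate  # E[X] = shape / rate for a Gamma distribution
b     <- 1:3
exact <- pgamma(b * mu, shape = shape, rate = rate, lower.tail = FALSE)
data.frame(b            = b,
           exact_tail   = round(exact, 4),
           markov_bound = round(1 / b, 4))
```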
## Chebyshev's Inequality

For any random variable $X$ with finite mean $\mu$ and finite variance $\sigma^2$, the probability of a large deviation from the mean is bounded: for any $k > 0$,

$$
\Pr\!\bigl(|X - \mu| \ge k\,\sigma\bigr) \le \frac{1}{k^2}.
$$

**Proof via Markov.**

```{r}
#| echo: false
#| warning: false
#| message: false
library(tidyverse)

# --- Parameters ---
sigma <- 2    # standard deviation
k     <- 1.5  # multiple of sigma to mark
kp    <- k * sigma

# --- Data (Normal(0, sigma^2)) ---
x_max   <- max(qnorm(0.999, mean = 0, sd = sigma), 1.5 * kp)
pdf_tbl <- tibble(
  x  = seq(-x_max, x_max, length.out = 2000),
  fx = dnorm(x, mean = 0, sd = sigma)
)

# --- Plot ---
ggplot(pdf_tbl, aes(x, fx)) +
  geom_line(linewidth = 1) +
  geom_vline(xintercept = c(-kp, kp), linetype = "dashed", linewidth = 0.8) +
  scale_x_continuous(
    breaks = c(-kp, 0, kp),
    labels = c(expression(-k*sigma), "0", expression(k*sigma))
  ) +
  labs(
    title    = "Symmetric PDF Centered at Zero",
    subtitle = bquote(X %~% N(0, sigma^2) ~ ", marks at " ~ -k*sigma ~ " and " ~ k*sigma),
    x = "x",
    y = expression(f[X](x))
  ) +
  theme_classic(base_size = 13)
```

Let $Y = (X - \mu)^2$, which is nonnegative. By Markov's inequality with $a = k^2 \sigma^2$,

$$
\Pr\!\bigl(Y \ge k^2 \sigma^2\bigr) \le \frac{\mathbb{E}[Y]}{k^2 \sigma^2} = \frac{\sigma^2}{k^2 \sigma^2} = \frac{1}{k^2}.
$$

Since $\{Y \ge k^2 \sigma^2\} = \{|X - \mu| \ge k \sigma\}$, the Chebyshev bound follows.

Equivalently, writing $k = b$ gives

$$
\Pr\!\bigl(|X - \mu| \ge b\,\sigma\bigr) \le \frac{1}{b^2},
$$

and hence

$$
\Pr\!\bigl(|X - \mu| < b\,\sigma\bigr) \ge 1 - \frac{1}{b^2}.
$$

Typical values (as in the notes):

| $b$ | $\Pr(|X-\mu| \ge b\sigma)$ | $\Pr(|X-\mu| < b\sigma)$ |
|---:|:---:|:---:|
| 1 | $\le 1$ | $\ge 0$ |
| 2 | $\le \tfrac14$ | $\ge \tfrac34$ |
| 3 | $\le \tfrac19$ | $\ge \tfrac89$ |

## One-point probability interval (single draw)

From Chebyshev's inequality,

$$
\Pr\!\bigl(\mu - b\sigma < X < \mu + b\sigma\bigr) \ge 1 - \frac{1}{b^2}.
$$

This rearranges the absolute-deviation statement into a two-sided interval around $\mu$.

## Confidence interval for the mean with a single observation $(n = 1)$

Using the same bound,

$$
\Pr\!\bigl(|X - \mu| < b\sigma\bigr) \ge 1 - \frac{1}{b^2},
$$

which is the same interval as above. So we are at least $100\bigl(1 - \tfrac{1}{b^2}\bigr)\%$ confident that the interval $(x - b\sigma,\ x + b\sigma)$ contains $\mu$.

## Confidence interval for the mean of $n$ i.i.d. observations (known $\sigma$)

Let $\bar X$ be the sample mean of $X_1, \dots, X_n$ with common mean $\mu$ and variance $\sigma^2$. Since $\mathrm{Var}(\bar X) = \sigma^2 / n$, Chebyshev gives, for any $b > 0$,

$$
\Pr\!\Bigl(|\bar X - \mu| \le b\,\frac{\sigma}{\sqrt{n}}\Bigr) \ge 1 - \frac{1}{b^2}.
$$

Equivalently, with probability at least $1 - \frac{1}{b^2}$, the interval

$$
\left(\bar X - b\,\frac{\sigma}{\sqrt{n}},\ \bar X + b\,\frac{\sigma}{\sqrt{n}}\right)
$$

contains $\mu$.

### Example (from the notes)

Taking $b = 2$ (and $\sigma$ known), a Chebyshev interval

$$
\left(\bar X - 2\,\frac{\sigma}{\sqrt{n}},\ \bar X + 2\,\frac{\sigma}{\sqrt{n}}\right)
$$

has **at least**

$$
1 - \frac{1}{2^2} = \frac{3}{4} = 75\%
$$

confidence.

> **Note:** Chebyshev bounds are distribution-free and can be very conservative.
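To see that conservativeness concretely, here is a minimal Monte Carlo sketch. The data model (i.i.d. Exponential(1) observations, so $\mu = \sigma = 1$, with $\sigma$ treated as known) is an assumption made purely for illustration; the empirical coverage of the $b = 2$ interval comes out near 95%, comfortably above the guaranteed 75%.

```{r}
#| message: false
# Monte Carlo coverage of the b = 2 Chebyshev interval.
# Assumed data model for illustration: i.i.d. Exponential(1),
# so mu = 1 and sigma = 1 (treated as known).
set.seed(42)
n     <- 25
b     <- 2
mu    <- 1
sigma <- 1
reps  <- 10000
covered <- replicate(reps, {
  xbar <- mean(rexp(n, rate = 1))
  abs(xbar - mu) < b * sigma / sqrt(n)  # does the interval contain mu?
})
mean(covered)  # empirical coverage; typically near 0.95, above the 0.75 bound
```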