Example: Polar Coordinates from Independent Normals
Let \(X\) and \(Y\) be independent \(\mathcal{N}(0,1)\) random variables. Find the joint density of the polar coordinates \(R\) and \(\Theta\). Recall polar coordinates:
We can illustrate the mapping of a random Euclidean point \((X,Y)\) to polar coordinates \((R,\Theta)\), with \(\Theta\) measured counter-clockwise from the positive \(X\)-axis.
Code
set.seed(123)# Generate a single random point from N(0,1) x N(0,1)x <-rnorm(1)y <-rnorm(1)# Polar conversionr <-sqrt(x^2+ y^2)theta <-atan2(y, x) # in (-pi, pi]theta_ccw <-ifelse(theta <0, theta +2*pi, theta) # map to [0, 2*pi)# Plot baseplot(0, 0, type ="n", xlim =c(-3,3), ylim =c(-3,3),xlab ="X", ylab ="Y", main ="Euclidean to Polar Coordinates (CCW angle)")abline(h =0, v =0, lty =3) # dashed axes for reference# Point and radius vectorpoints(x, y, col ="red", pch =19, cex =1.5)arrows(0, 0, x, y, col ="blue", lwd =2)# Dashed circle at radius Rsymbols(0, 0, circles = r, inches =FALSE, add =TRUE, lty =2)# Dashed line through (0,0) and (x,y) using angle (avoids slope issues when x ~ 0)L <-3lines(c(-L, L) *cos(theta), c(-L, L) *sin(theta), lty =3)# Annotate Rtext(x, y, labels ="(X,Y)", pos =4)text(x/2, y/2, labels =expression(R), pos =2)# Counter-clockwise arc from 0 to theta_ccwarc_r <-max(0.8, min(1.2, 0.8* r)) # keep the arc visiblearc_t <-seq(0, theta_ccw, length.out =200)lines(arc_r *cos(arc_t), arc_r *sin(arc_t), col ="darkgreen", lwd =2)# Angle label at midpoint of the arcmid_t <- theta_ccw /2text(arc_r *cos(mid_t), arc_r *sin(mid_t), labels =expression(theta),col ="darkgreen", pos =3)
Euclidean point and its polar representation
This plot shows:
\((X,Y)\) as the red dot in Euclidean space.
\(R\) as the radius of the dashed circle.
\(\Theta\) measured counter-clockwise from the positive \(X\)-axis, indicated by the green arc.
Dashed lines through the origin along the axes and through \((X,Y)\).
This plot shows:
\((X,Y)\) as the red dot in Euclidean space.
\(R\) as the radius of the dashed circle.
\(\Theta\) measured counter-clockwise from the positive \(X\)-axis, indicated by the green arc.
Dashed lines through the origin along the \(X\)-axis and through \((X,Y)\).
Box–Muller Method
Generate standard normal random values \((X,Y)\) from independent uniforms \((U_1,U_2)\).
\(-2\,\log U_1 \sim \operatorname{Exp}(\tfrac{1}{2})\) and \(2\pi U_2 \sim \operatorname{Unif}(0,2\pi)\).
Set \(X = \sqrt{-2\,\log U_1}\,\cos(2\pi U_2)\) and \(Y = \sqrt{-2\,\log U_1}\,\sin(2\pi U_2)\), which yields \(X\sim\mathcal{N}(0,1)\) and \(Y\sim\mathcal{N}(0,1)\).
R Implementation
Code
set.seed(42)# Box–Muller generator: returns n pairs (X, Y)box_muller <-function(n) { u1 <-runif(n) u2 <-runif(n) r <-sqrt(-2*log(u1)) theta <-2* pi * u2 x <- r *cos(theta) y <- r *sin(theta)data.frame(x = x, y = y)}# Example: draw 10,000 samples and check momentssamples <-box_muller(10000)c(mean_x =mean(samples$x), var_x =var(samples$x),mean_y =mean(samples$y), var_y =var(samples$y))
# Quick visual checkspar(mfrow =c(1, 3))hist(samples$x, breaks =40, main ="X ~ N(0,1)")hist(samples$y, breaks =40, main ="Y ~ N(0,1)")plot(samples$x, samples$y, pch =16, cex =0.4,main ="Scatter of (X,Y)")
Diagnostics for Box–Muller samples
Code
par(mfrow =c(1, 1))
Notes: The derivations above follow the change-of-variables formula with the polar transformation, showing that \(R\) is Rayleigh and \(\Theta\) is Uniform, which leads directly to the Box–Muller algorithm.
Derivation of the Rayleigh Distribution
1. From Uniform to Exponential via the Inverse CDF Method
Let \(U \sim \operatorname{Unif}(0,1)\). To generate an exponential random variable \(T\) with rate \(\lambda = \tfrac{1}{2}\) (mean \(2\)), we use the Inverse CDF Method.
The CDF of \(\operatorname{Exp}(\tfrac{1}{2})\) is
\(F_T(t) = 1 - e^{-t/2}, \quad t \ge 0.\)
Set \(F_T(t) = U\) and solve for \(t\):
\(U = 1 - e^{-t/2}\)
\(e^{-t/2} = 1 - U\)
\(-\tfrac{t}{2} = \ln(1-U)\)
\(t = -2\,\ln(1-U).\)
Since \(1-U \sim \operatorname{Unif}(0,1)\) as well, we can equivalently write
\(t = -2\,\ln U.\)
Thus
\(T = -2\ln U \sim \operatorname{Exp}(1/2).\)
2. From Exponential to Rayleigh
We want \(R\) such that \(R^2 \sim \operatorname{Exp}(\tfrac{1}{2})\).
That is, if \(T \sim \operatorname{Exp}(1/2)\) then \(R = \sqrt{T}\) is Rayleigh distributed.
hist(samples_r, breaks =40, probability =TRUE,main ="Rayleigh(σ=1)", xlab ="r")curve(x *exp(-x^2/2), from =0, to =5, add =TRUE, col ="red", lwd =2)
Plot generated by R code
Thus, the Rayleigh distribution naturally arises as the distribution of \(R = \sqrt{-2\ln U}\), completing the connection between uniforms, exponentials, and normals via the Box–Muller construction.
Deriving the Rayleigh via the Inverse CDF Method
1) From Uniform to \(\operatorname{Exp}\!(\tfrac{1}{2})\)
Let \(U \sim \operatorname{Unif}(0,1)\). The target CDF for \(T \sim \operatorname{Exp}(\tfrac{1}{2})\) is \(F_T(t) = 1 - e^{-t/2},\; t \ge 0.\)
Set \(U = F_T(T)\) and solve for \(T\): \(U = 1 - e^{-T/2}\)\(e^{-T/2} = 1 - U\)\(-\dfrac{T}{2} = \log(1-U)\)\(T = -2\,\log(1-U).\)
Since \(1-U \overset{d}{=} U\) for \(U\sim\operatorname{Unif}(0,1)\), an equivalent generator is \(T = -2\,\log U,\) which yields \(T \sim \operatorname{Exp}(\tfrac{1}{2}).\)
2) From \(\operatorname{Exp}\!(\tfrac{1}{2})\) to Rayleigh
Thus the CDF of \(R\) is \(F_R(r) = 1 - e^{-r^2/2},\; r \ge 0,\) and the PDF is \(f_R(r) = \dfrac{d}{dr} F_R(r) = r\,e^{-r^2/2},\; r \ge 0,\) which is the Rayleigh distribution with parameter \(\sigma = 1.\)
3) Verifying via the Inverse CDF for Rayleigh
Start from \(U \sim \operatorname{Unif}(0,1)\) and set \(U = F_R(R)\) with \(F_R(r) = 1 - e^{-r^2/2},\; r \ge 0.\)
Again using \(1-U \overset{d}{=} U\), an equivalent generator is \(R = \sqrt{-2\,\log U}.\)
Combining parts (1) and (2), if \(T = -2\,\log U \sim \operatorname{Exp}(\tfrac{1}{2})\) and \(R = \sqrt{T}\), then \(F_R(r) = 1 - e^{-r^2/2}\) and \(f_R(r) = r\,e^{-r^2/2},\) confirming that the transformation produces a Rayleigh random variable.
The Bivariate Normal Distribution
We introduce the general bivariate normal distribution with different means, variances, and correlation.
Notation: \(BVN(\mu_X, \mu_Y, \sigma_X^2, \sigma_Y^2, \rho)\).
1) Adding Means and Variances
Let the random vector be \[
\mathbf{Z} =
\begin{bmatrix}
X \\ Y
\end{bmatrix}.
\]
We specify its mean vector as \[
\boldsymbol{\mu} =
\begin{bmatrix}
\mu_X \\ \mu_Y
\end{bmatrix},
\]
and its covariance matrix without correlation as \[
\Sigma_0 =
\begin{bmatrix}
\sigma_X^2 & 0 \\
0 & \sigma_Y^2
\end{bmatrix}.
\]
Thus if \(X \sim N(\mu_X, \sigma_X^2)\) and \(Y \sim N(\mu_Y, \sigma_Y^2)\) independently, then \[
\mathbf{Z} \sim BVN(\mu_X, \mu_Y, \sigma_X^2, \sigma_Y^2, 0).
\]
2) Adding Correlation \(\rho\)
To introduce correlation, the covariance matrix becomes \[
\Sigma =
\begin{bmatrix}
\sigma_X^2 & \rho\,\sigma_X\sigma_Y \\
\rho\,\sigma_X\sigma_Y & \sigma_Y^2
\end{bmatrix}.
\]
Thus the full distribution is \[
\mathbf{Z} \sim BVN(\mu_X, \mu_Y, \sigma_X^2, \sigma_Y^2, \rho).
\]
Its density is \[
f_{X,Y}(x,y)
= \frac{1}{2\pi\,\sigma_X\sigma_Y\sqrt{1-\rho^2}}
\exp\!\left(
-\frac{1}{2(1-\rho^2)}
\Bigg[
\frac{(x-\mu_X)^2}{\sigma_X^2}
- 2\rho\frac{(x-\mu_X)(y-\mu_Y)}{\sigma_X\sigma_Y}
+ \frac{(y-\mu_Y)^2}{\sigma_Y^2}
\Bigg]
\right).
\]
In compact matrix notation, with \[
\Sigma^{-1} =
\begin{bmatrix}
1/\sigma_X^2 & -\rho/(\sigma_X\sigma_Y) \\
-\rho/(\sigma_X\sigma_Y) & 1/\sigma_Y^2
\end{bmatrix}
\cdot \frac{1}{1-\rho^2},
\]
the density can be written as \[
f_{X,Y}(\mathbf{z})
= \frac{1}{2\pi\,|\Sigma|^{1/2}}
\exp\!\left(-\tfrac{1}{2}(\mathbf{z}-\boldsymbol{\mu})^\top
\Sigma^{-1}(\mathbf{z}-\boldsymbol{\mu})\right).
\]
This general form compactly encodes the means, variances, and correlation structure of the bivariate normal distribution.
Bivariate Normal \(\operatorname{BVN}(\mu_X, \mu_Y, \sigma_X^2, \sigma_Y^2, \rho)\)
We derive the joint density in two steps: first with arbitrary means and variances (no correlation), then add correlation \(\rho\) using matrix notation.
1) Add means and variances (no correlation)
Let the random vector be \(\mathbf{Z} = \begin{bmatrix} X \\ Y \end{bmatrix}\) with mean \(\boldsymbol{\mu} = \begin{bmatrix} \mu_X \\ \mu_Y \end{bmatrix}\) and covariance matrix \(\boldsymbol{\Sigma}_0 = \begin{bmatrix} \sigma_X^2 & 0 \\ 0 & \sigma_Y^2 \end{bmatrix}.\)
Equivalently, \(X \sim \mathcal{N}(\mu_X,\sigma_X^2)\) and \(Y \sim \mathcal{N}(\mu_Y,\sigma_Y^2)\) independently. The joint PDF factors: \(f_{X,Y}(x,y) = \dfrac{1}{2\pi\,\sigma_X\,\sigma_Y}\exp\!\left\{-\dfrac{1}{2}\left[\dfrac{(x-\mu_X)^2}{\sigma_X^2} + \dfrac{(y-\mu_Y)^2}{\sigma_Y^2}\right]\right\}.\)
In matrix notation, with \(\mathbf{z} = \begin{bmatrix} x \\ y \end{bmatrix}\), \(f_{X,Y}(\mathbf{z}) = \dfrac{1}{(2\pi)^{1}}\,\dfrac{1}{\sqrt{\lvert \boldsymbol{\Sigma}_0 \rvert}}\,\exp\!\left\{-\dfrac{1}{2}(\mathbf{z}-\boldsymbol{\mu})^\top \boldsymbol{\Sigma}_0^{-1} (\mathbf{z}-\boldsymbol{\mu})\right\},\) where \(\lvert \boldsymbol{\Sigma}_0 \rvert = \sigma_X^2\,\sigma_Y^2\) and \(\boldsymbol{\Sigma}_0^{-1} = \begin{bmatrix} 1/\sigma_X^2 & 0 \\ 0 & 1/\sigma_Y^2 \end{bmatrix}.\)
Notes: 1) When \(\rho = 0\), the off-diagonal terms vanish and the density reduces to the independent case above.
2) Existence requires \(\sigma_X>0\), \(\sigma_Y>0\), and \(\lvert\rho\rvert<1\) so that \(\boldsymbol{\Sigma}\) is positive definite.
3) One can obtain \(\mathbf{Z}\) by an affine transform of a standard normal \(\mathbf{N} \sim \mathcal{N}(\mathbf{0}, \mathbf{I}_2)\) using a matrix square root (e.g., Cholesky): \(\mathbf{Z} = \boldsymbol{\mu} + \mathbf{L}\,\mathbf{N}\) with \(\mathbf{L}\,\mathbf{L}^\top = \boldsymbol{\Sigma}.\)
Deriving the Bivariate Normal \(BVN(\mu_X, \mu_Y, \sigma_X^2, \sigma_Y^2, \rho)\) via PDF Transformations
We use matrix notation. Let \(\mathbf{z} = (z_1, z_2)^\top\) with \(z_1, z_2 \overset{\text{iid}}\sim \mathcal{N}(0,1)\) and joint PDF \(f_{\mathbf{Z}}(\mathbf{z}) = \dfrac{1}{2\pi} \exp\!\left\{-\dfrac{1}{2}\,\mathbf{z}^\top\mathbf{z}\right\}.\)
Define \(\boldsymbol{\mu} = (\mu_X, \mu_Y)^\top\) and \(\mathbf{D} = \operatorname{diag}(\sigma_X, \sigma_Y)\).
1) Add Means and Variances (Independent Case)
Consider the affine map \(\mathbf{x}_0 = \boldsymbol{\mu} + \mathbf{D}\,\mathbf{z}.\) Then \(\mathbf{z} = \mathbf{D}^{-1}(\mathbf{x}_0 - \boldsymbol{\mu})\) and the Jacobian is \(\left|\det \dfrac{\partial \mathbf{z}}{\partial \mathbf{x}_0} \right| = \left|\det \mathbf{D}^{-1}\right| = \dfrac{1}{\sigma_X\sigma_Y}.\)
By the PDF method, \(f_{\mathbf{X}_0}(\mathbf{x}_0) = f_{\mathbf{Z}}\big(\mathbf{D}^{-1}(\mathbf{x}_0 - \boldsymbol{\mu})\big)\,\left|\det \mathbf{D}^{-1}\right|.\)
which is the independent bivariate normal with means \(\mu_X,\mu_Y\) and variances \(\sigma_X^2,\sigma_Y^2\).
2) Add Correlation \(\rho\) (Keep Means and Variances)
Work with centered variables \(\tilde{\mathbf{x}}_0 = \mathbf{x}_0 - \boldsymbol{\mu}\) and construct \(\tilde{\mathbf{x}} = \mathbf{C}\,\tilde{\mathbf{x}}_0,\) where \(\mathbf{C} = \begin{bmatrix}
1 & 0 \\
\rho\,\dfrac{\sigma_Y}{\sigma_X} & \sqrt{1-\rho^2}
\end{bmatrix}.\)
Set the final variables as \(\mathbf{x} = \boldsymbol{\mu} + \tilde{\mathbf{x}} = \boldsymbol{\mu} + \mathbf{C}(\mathbf{x}_0 - \boldsymbol{\mu}).\)
The Jacobian of the map \(\mathbf{x}_0 \mapsto \mathbf{x}\) equals \(\left|\det \dfrac{\partial \mathbf{x}_0}{\partial \mathbf{x}}\right| = \left|\det \mathbf{C}^{-1}\right| = \dfrac{1}{\sqrt{1-\rho^2}}.\)
Using the PDF method again with translation invariance of the Jacobian, we obtain \(f_{\mathbf{X}}(\mathbf{x}) = f_{\mathbf{X}_0}\big(\mathbf{C}^{-1}(\mathbf{x}-\boldsymbol{\mu}) + \boldsymbol{\mu}\big)\,\left|\det \mathbf{C}^{-1}\right|.\)
After algebra (or by composing the two linear maps on \(\mathbf{z}\)), the covariance of \(\mathbf{x}\) is \(\boldsymbol{\Sigma} = \begin{bmatrix}
\sigma_X^2 & \rho\,\sigma_X\sigma_Y \\
\rho\,\sigma_X\sigma_Y & \sigma_Y^2
\end{bmatrix},\) and the joint PDF simplifies to the standard \(BVN\) form \(f_{X,Y}(x,y) = \dfrac{1}{2\pi\,\sigma_X\sigma_Y\,\sqrt{1-\rho^2}}\,\exp\!\left\{-\dfrac{1}{2(1-\rho^2)}\left[\left(\dfrac{x-\mu_X}{\sigma_X}\right)^2 - 2\rho\,\left(\dfrac{x-\mu_X}{\sigma_X}\right)\left(\dfrac{y-\mu_Y}{\sigma_Y}\right) + \left(\dfrac{y-\mu_Y}{\sigma_Y}\right)^2\right]\right\}.\)
Matrix Notation Summary
Let \(\boldsymbol{\mu} = (\mu_X,\mu_Y)^\top\) and \(\boldsymbol{\Sigma} = \begin{bmatrix}
\sigma_X^2 & \rho\,\sigma_X\sigma_Y \\
\rho\,\sigma_X\sigma_Y & \sigma_Y^2
\end{bmatrix}.\) Then the density can be written compactly as \(f_{\mathbf{X}}(\mathbf{x}) = \dfrac{1}{2\pi\sqrt{\det \boldsymbol{\Sigma}}}\,\exp\!\left\{-\dfrac{1}{2}(\mathbf{x}-\boldsymbol{\mu})^\top\boldsymbol{\Sigma}^{-1}(\mathbf{x}-\boldsymbol{\mu})\right\},\) where \(\det\boldsymbol{\Sigma} = \sigma_X^2\sigma_Y^2(1-\rho^2)\) and \(\boldsymbol{\Sigma}^{-1} = \dfrac{1}{(1-\rho^2)\,\sigma_X^2\sigma_Y^2}
\begin{bmatrix}
\sigma_Y^2 & -\rho\,\sigma_X\sigma_Y \\
-\rho\,\sigma_X\sigma_Y & \sigma_X^2
\end{bmatrix}.\)
The two-step construction explicitly shows how translation and diagonal scaling introduce means and variances (with Jacobian \(|\det\mathbf{D}^{-1}|\)), and how a subsequent shear/rotation \(\mathbf{C}\) introduces correlation (with Jacobian \(|\det\mathbf{C}^{-1}|\)), yielding the full \(BVN(\mu_X,\mu_Y,\sigma_X^2,\sigma_Y^2,\rho)\) density.