
Joint and Conditional Distributions - Applications

Multivariate techniques enable sophisticated analysis of dependent random variables in statistics and machine learning.

Multivariate Normal Distribution

Definition

The random vector $\mathbf{X} = (X_1, \ldots, X_n)^T$ has a multivariate normal distribution $\mathcal{N}(\boldsymbol{\mu}, \boldsymbol{\Sigma})$ with PDF:

$$f(\mathbf{x}) = \frac{1}{(2\pi)^{n/2}|\boldsymbol{\Sigma}|^{1/2}} \exp\left(-\frac{1}{2}(\mathbf{x}-\boldsymbol{\mu})^T\boldsymbol{\Sigma}^{-1}(\mathbf{x}-\boldsymbol{\mu})\right)$$

where $\boldsymbol{\mu}$ is the mean vector and $\boldsymbol{\Sigma}$ is the covariance matrix.
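As a quick check of the formula, here is a minimal NumPy sketch (the helper name `mvn_pdf` is ours) that evaluates the density directly from the definition:

```python
import numpy as np

def mvn_pdf(x, mu, sigma):
    """Density of N(mu, sigma) at x, computed directly from the formula."""
    n = len(mu)
    diff = x - mu
    norm_const = 1.0 / np.sqrt((2 * np.pi) ** n * np.linalg.det(sigma))
    quad = diff @ np.linalg.solve(sigma, diff)  # (x - mu)^T Sigma^{-1} (x - mu)
    return norm_const * np.exp(-0.5 * quad)

# Sanity check: at x = mu with identity covariance (n = 2),
# the exponential is 1 and the density reduces to 1/(2*pi)
print(mvn_pdf(np.zeros(2), np.zeros(2), np.eye(2)))
```

Using `np.linalg.solve` instead of explicitly inverting $\boldsymbol{\Sigma}$ is the standard numerically stable way to compute the quadratic form.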

Properties:

  • Linear combinations are normal: if $\mathbf{Y} = \mathbf{A}\mathbf{X} + \mathbf{b}$, then $\mathbf{Y} \sim \mathcal{N}(\mathbf{A}\boldsymbol{\mu} + \mathbf{b}, \mathbf{A}\boldsymbol{\Sigma}\mathbf{A}^T)$
  • Marginals are normal
  • Conditionals are normal
  • Uncorrelated components are independent (this equivalence is special to the multivariate normal; it fails for general distributions)
Example

Portfolio Analysis: Three stocks have returns $\mathbf{R} \sim \mathcal{N}(\boldsymbol{\mu}, \boldsymbol{\Sigma})$ where:

$$\boldsymbol{\mu} = \begin{pmatrix}0.08\\0.12\\0.10\end{pmatrix}, \quad \boldsymbol{\Sigma} = \begin{pmatrix}0.04 & 0.01 & 0.02\\0.01 & 0.09 & 0.015\\0.02 & 0.015 & 0.06\end{pmatrix}$$

A portfolio with weights $\mathbf{w} = (0.5, 0.3, 0.2)^T$ has return:

$$R_p = \mathbf{w}^T\mathbf{R} \sim \mathcal{N}(\mathbf{w}^T\boldsymbol{\mu}, \mathbf{w}^T\boldsymbol{\Sigma}\mathbf{w})$$
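The portfolio mean and variance above reduce to two matrix products, sketched here in NumPy (variable names are ours):

```python
import numpy as np

# Stock return parameters from the example above
mu = np.array([0.08, 0.12, 0.10])
Sigma = np.array([[0.04,  0.01,  0.02],
                  [0.01,  0.09,  0.015],
                  [0.02,  0.015, 0.06]])
w = np.array([0.5, 0.3, 0.2])

mean_p = w @ mu        # expected portfolio return  w^T mu  = 0.096
var_p = w @ Sigma @ w  # portfolio variance         w^T Sigma w = 0.0293
print(mean_p, var_p, np.sqrt(var_p))
```

So the portfolio earns 9.6% in expectation with a standard deviation of about 17.1%, lower than any single stock's volatility thanks to diversification.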

Conditional Expectation

Theorem
  1. Tower Property: $E[E[X|Y]] = E[X]$
  2. Taking Out What Is Known: $E[g(Y)X|Y] = g(Y)E[X|Y]$
  3. Independence: if $X \perp Y$, then $E[X|Y] = E[X]$
  4. Linearity: $E[aX + bZ|Y] = aE[X|Y] + bE[Z|Y]$
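The tower property is easy to check by simulation. The toy model below is an illustrative assumption: $X = 2Y + \varepsilon$ with $\varepsilon$ independent of $Y$, so $E[X|Y] = 2Y$ in closed form.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Toy model: X = 2Y + noise, with noise independent of Y
Y = rng.standard_normal(n)
X = 2 * Y + rng.standard_normal(n)

cond_exp = 2 * Y  # E[X|Y], known in closed form for this model

# Tower property: the average of E[X|Y] matches the average of X
print(X.mean(), cond_exp.mean())  # both near E[X] = 0
```

Up to Monte Carlo error, averaging the conditional expectation over $Y$ reproduces the unconditional mean of $X$.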
Example

Prediction: Minimize $E[(X - g(Y))^2]$ over all functions $g$.

Solution: $g^*(Y) = E[X|Y]$ (the conditional expectation is the optimal predictor in mean squared error)

For the bivariate normal: $E[X|Y=y] = \mu_X + \rho\frac{\sigma_X}{\sigma_Y}(y - \mu_Y)$

This is linear regression!
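This can be verified empirically: simulating a bivariate normal (the parameter values below are illustrative) and fitting a least-squares line of $X$ on $Y$ recovers the theoretical slope $\rho\,\sigma_X/\sigma_Y$.

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative bivariate normal parameters
mu_x, mu_y = 1.0, -2.0
sd_x, sd_y, rho = 2.0, 3.0, 0.6
cov = np.array([[sd_x**2,           rho * sd_x * sd_y],
                [rho * sd_x * sd_y, sd_y**2          ]])
X, Y = rng.multivariate_normal([mu_x, mu_y], cov, size=500_000).T

# Theoretical regression slope from E[X|Y=y] = mu_x + rho*(sd_x/sd_y)*(y - mu_y)
slope_theory = rho * sd_x / sd_y  # = 0.4 here

# Least-squares fit of X on Y should recover the same slope
slope_ls = np.polyfit(Y, X, 1)[0]
print(slope_theory, slope_ls)
```

The fitted slope agrees with the conditional-expectation formula up to sampling error, which is exactly the sense in which the bivariate normal makes the best predictor linear.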

Principal Component Analysis

PCA finds orthogonal directions of maximum variance in multivariate data.

Given a covariance matrix $\boldsymbol{\Sigma}$, find eigenvalues $\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_n$ and eigenvectors $\mathbf{v}_1, \ldots, \mathbf{v}_n$.

Principal components: $Z_i = \mathbf{v}_i^T \mathbf{X}$ with $\text{Var}(Z_i) = \lambda_i$.

First component captures direction of greatest variability.
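A minimal NumPy sketch of this recipe on simulated data (the covariance used to generate the data is illustrative): estimate the covariance, eigendecompose it, and check that the component variances equal the eigenvalues.

```python
import numpy as np

rng = np.random.default_rng(2)

# Simulated 2-D data with an anisotropic (illustrative) covariance
data = rng.multivariate_normal([0, 0], [[3.0, 1.0], [1.0, 1.0]], size=100_000)

Sigma_hat = np.cov(data, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(Sigma_hat)  # eigh returns ascending order
order = np.argsort(eigvals)[::-1]             # sort so lambda_1 >= lambda_2 >= ...
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

Z = data @ eigvecs  # principal components Z_i = v_i^T X

print(eigvals)            # lambda_i
print(np.var(Z, axis=0))  # matches the eigenvalues (up to ddof convention)
```

`np.linalg.eigh` is the right choice here because $\boldsymbol{\Sigma}$ is symmetric: it is faster and more stable than the general `eig`, and it guarantees real eigenvalues and orthonormal eigenvectors.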

Remark

Multivariate analysis is central to modern data science. From finance (portfolio optimization) to machine learning (dimensionality reduction), understanding joint distributions and their properties enables sophisticated modeling of complex, high-dimensional phenomena.