Joint and Conditional Distributions - Key Properties

Understanding independence and covariance structure reveals how random variables relate to each other.

Independence

Definition

Random variables $X$ and $Y$ are independent if:

$$p_{X,Y}(x,y) = p_X(x)\,p_Y(y) \quad \text{(discrete)}$$

$$f_{X,Y}(x,y) = f_X(x)\,f_Y(y) \quad \text{(continuous)}$$

Equivalently: $P(X \in A, Y \in B) = P(X \in A)\,P(Y \in B)$ for all sets $A, B$.

Test for Independence: The joint factors as a product of marginals.

Example

If $f_{X,Y}(x,y) = x + y$ for $0 < x < 1$, $0 < y < 1$:

Marginals: $f_X(x) = \int_0^1 (x + y)\,dy = x + \frac{1}{2}$, $f_Y(y) = \int_0^1 (x + y)\,dx = y + \frac{1}{2}$

Product: $f_X(x) \cdot f_Y(y) = \left(x + \frac{1}{2}\right)\left(y + \frac{1}{2}\right) \neq x + y$

Not independent! (By contrast, $f_{X,Y}(x,y) = 4xy$ on the same square factors as $(2x)(2y)$, so it would make $X$ and $Y$ independent.)
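The factorization test can be checked numerically. Below is a minimal sketch (grid size and test points are arbitrary choices) using midpoint Riemann sums, with $f(x,y) = x + y$ on the unit square as the test density:

```python
# Numerical version of the factorization test for independence:
# approximate each marginal by a midpoint Riemann sum, then check whether
# the product of marginals reproduces the joint density.
# Test density: f(x, y) = x + y on (0,1)^2 (a valid density, not a product).

N = 400                        # grid resolution (arbitrary choice)
h = 1.0 / N
grid = [(i + 0.5) * h for i in range(N)]

def f(x, y):                   # joint density on the unit square
    return x + y

def marginal_x(x):             # f_X(x) = integral of f(x, y) dy  ->  x + 1/2
    return sum(f(x, y) for y in grid) * h

def marginal_y(y):             # f_Y(y) = integral of f(x, y) dx  ->  y + 1/2
    return sum(f(x, y) for x in grid) * h

# Independence would require f(x, y) == f_X(x) * f_Y(y) everywhere.
gap = max(abs(f(x, y) - marginal_x(x) * marginal_y(y))
          for x in (0.2, 0.5, 0.8) for y in (0.2, 0.5, 0.8))
print("max gap:", gap)         # clearly nonzero -> not independent
```

If the joint really were a product of its marginals, `gap` would shrink toward zero as the grid is refined; here it stays visibly positive.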

Covariance and Correlation

Definition

The covariance is:

$$\text{Cov}(X,Y) = E[(X-\mu_X)(Y-\mu_Y)] = E[XY] - E[X]E[Y]$$

The correlation is:

$$\rho(X,Y) = \frac{\text{Cov}(X,Y)}{\sigma_X \sigma_Y}$$

Properties:

  • If $X, Y$ independent → $\text{Cov}(X,Y) = 0$ (converse not true!)
  • $-1 \leq \rho \leq 1$
  • $|\rho| = 1$ iff $Y = aX + b$ for some $a \neq 0$ (perfect linear relationship)
  • $\text{Var}(X + Y) = \text{Var}(X) + \text{Var}(Y) + 2\,\text{Cov}(X,Y)$
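The variance-addition identity in the last property holds exactly for empirical moments as well, which makes it easy to sanity-check by simulation. A sketch using only the standard library (sample size and noise scale are arbitrary choices):

```python
# Check of Var(X + Y) = Var(X) + Var(Y) + 2 Cov(X, Y) on simulated data.
# The identity holds exactly for empirical moments, so the two sides
# agree to floating-point precision, not just approximately.
import random

random.seed(0)
n = 100_000
xs = [random.gauss(0, 1) for _ in range(n)]
ys = [x + random.gauss(0, 0.5) for x in xs]   # Y deliberately correlated with X

def mean(v):
    return sum(v) / len(v)

def var(v):
    m = mean(v)
    return sum((t - m) ** 2 for t in v) / len(v)

def cov(u, v):
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / len(u)

lhs = var([a + b for a, b in zip(xs, ys)])
rhs = var(xs) + var(ys) + 2 * cov(xs, ys)
print(lhs, rhs)   # identical up to rounding error
```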

Example

Let $X \sim \mathcal{N}(0,1)$ and $Y = X^2$. Then:

  • $E[XY] = E[X^3] = 0$ ($x^3$ is odd and the density is symmetric)
  • $E[X] = 0$, $E[Y] = E[X^2] = 1$
  • $\text{Cov}(X,Y) = 0 - 0 \cdot 1 = 0$

Yet $X$ and $Y$ are clearly dependent ($Y$ is completely determined by $X$)! Zero covariance doesn't imply independence.
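A quick simulation of this example, using only the standard library (the sample size is an arbitrary choice):

```python
# Simulation of the example: X ~ N(0, 1), Y = X^2.
# The sample covariance is near zero even though Y is a function of X.
import random

random.seed(1)
n = 500_000
xs = [random.gauss(0, 1) for _ in range(n)]
ys = [x * x for x in xs]

mx = sum(xs) / n
my = sum(ys) / n                        # estimates E[Y] = E[X^2] = 1
cov_xy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
print(cov_xy)                           # close to 0 despite total dependence
```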

Bivariate Normal Distribution

Definition

$(X,Y)$ has a bivariate normal distribution if the joint PDF is:

$$f(x,y) = \frac{1}{2\pi\sigma_X\sigma_Y\sqrt{1-\rho^2}} \exp\left(-\frac{Q}{2(1-\rho^2)}\right)$$

where:

$$Q = \left(\frac{x-\mu_X}{\sigma_X}\right)^2 - 2\rho\,\frac{(x-\mu_X)(y-\mu_Y)}{\sigma_X\sigma_Y} + \left(\frac{y-\mu_Y}{\sigma_Y}\right)^2$$

Parameters: $\mu_X, \mu_Y, \sigma_X, \sigma_Y, \rho$

Key Property: For the bivariate normal, uncorrelated ($\rho = 0$) implies independent!

Conditional Distribution:

$$X \mid Y = y \;\sim\; \mathcal{N}\!\left(\mu_X + \rho\frac{\sigma_X}{\sigma_Y}(y - \mu_Y),\; \sigma_X^2(1 - \rho^2)\right)$$

This is the foundation of linear regression.
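One way to see the conditional-mean formula at work is to simulate standardized bivariate normal pairs ($\mu_X = \mu_Y = 0$, $\sigma_X = \sigma_Y = 1$, so $E[X \mid Y = y] = \rho y$) and average $X$ over a thin band of $Y$ values. The construction $X = \rho Z_1 + \sqrt{1-\rho^2}\,Z_2$, $Y = Z_1$ from independent standard normals is a standard trick; $\rho = 0.6$ and the band width are arbitrary choices:

```python
# Conditional-distribution check for the standardized bivariate normal
# (mu = 0, sigma = 1): E[X | Y = y] should equal rho * y.
import random

random.seed(2)
rho, n = 0.6, 400_000
root = (1 - rho ** 2) ** 0.5

pairs = []
for _ in range(n):
    z1, z2 = random.gauss(0, 1), random.gauss(0, 1)
    y = z1                       # Y ~ N(0, 1)
    x = rho * z1 + root * z2     # makes Corr(X, Y) = rho
    pairs.append((x, y))

# Empirical E[X | Y near 1.0]; theory predicts rho * 1.0 = 0.6
band = [x for x, y in pairs if 0.9 < y < 1.1]
cond_mean = sum(band) / len(band)
print(cond_mean)                 # approximately 0.6
```

Averaging $X$ over a narrow band of $Y$ is exactly what a regression line estimates, which is why the conditional mean here is the population regression of $X$ on $Y$.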

Remark

Independence is stronger than zero covariance. While independent random variables always have zero covariance, the converse requires special structure (e.g., bivariate normality). Understanding this distinction is crucial for proper statistical modeling.