
Common Distributions - Core Definitions

Probability distributions form families characterized by parameters that control their shape, location, and scale. Understanding these distribution families is essential for statistical modeling.

The Normal (Gaussian) Distribution

Definition

$X \sim \mathcal{N}(\mu, \sigma^2)$ has PDF:

$$f(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right), \quad x \in \mathbb{R}$$

Parameters: $\mu$ (mean), $\sigma^2 > 0$ (variance)

The normal distribution is the most important continuous distribution, arising naturally from the Central Limit Theorem. The standard normal $Z \sim \mathcal{N}(0,1)$ has CDF denoted $\Phi(z)$.

Properties:

  • Symmetric about $\mu$
  • Bell-shaped curve
  • 68-95-99.7 rule: approximately 68%, 95%, and 99.7% of the mass lies within 1, 2, and 3 standard deviations of the mean
  • MGF: $M_X(t) = e^{\mu t + \sigma^2 t^2/2}$
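The 68-95-99.7 rule can be checked numerically from the standard normal CDF alone, using the identity $\Phi(z) = \tfrac{1}{2}\left(1 + \operatorname{erf}(z/\sqrt{2})\right)$. A minimal sketch with only the Python standard library:

```python
# Evaluate the standard normal CDF Phi(z) via the error function,
# then verify the 68-95-99.7 rule: Phi(k) - Phi(-k) for k = 1, 2, 3.
from math import erf, sqrt

def phi(z: float) -> float:
    """Standard normal CDF, Phi(z) = (1 + erf(z / sqrt(2))) / 2."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

for k in (1, 2, 3):
    mass = phi(k) - phi(-k)          # mass within k standard deviations
    print(f"within {k} sd: {mass:.4f}")
# → within 1 sd: 0.6827
# → within 2 sd: 0.9545
# → within 3 sd: 0.9973
```

By symmetry $\Phi(-k) = 1 - \Phi(k)$, so each line could equally be computed as $2\Phi(k) - 1$.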

The Binomial Distribution

Definition

$X \sim \text{Binomial}(n,p)$ counts the number of successes in $n$ independent Bernoulli$(p)$ trials:

$$P(X = k) = \binom{n}{k} p^k (1-p)^{n-k}, \quad k = 0, 1, \ldots, n$$

Parameters: $n \in \{1,2,3,\ldots\}$ (trials), $p \in [0,1]$ (success probability)

Properties:

  • $E[X] = np$, $\text{Var}(X) = np(1-p)$
  • MGF: $M_X(t) = (pe^t + 1-p)^n$
  • Sum of independent Binomial$(n_i, p)$ is Binomial$(\sum n_i, p)$ (same $p$)
  • Normal approximation: for large $n$, $X \approx \mathcal{N}(np, np(1-p))$
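The PMF and the moment formulas above are easy to confirm directly with `math.comb`; a minimal sketch (the choice $n = 10$, $p = 0.3$ is arbitrary, for illustration only):

```python
# Binomial(n, p) PMF from math.comb; check that it sums to 1 and that
# the empirical mean and variance match E[X] = np and Var(X) = np(1-p).
from math import comb

def binom_pmf(k: int, n: int, p: float) -> float:
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 10, 0.3
pmf = [binom_pmf(k, n, p) for k in range(n + 1)]

total = sum(pmf)                                        # should be 1
mean = sum(k * q for k, q in enumerate(pmf))            # np = 3.0
var = sum((k - mean) ** 2 * q for k, q in enumerate(pmf))  # np(1-p) = 2.1
print(total, mean, var)
```

The same loop over $k$ is how one would tabulate the PMF before comparing it against the normal approximation $\mathcal{N}(np, np(1-p))$.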

The Poisson Distribution

Definition

$X \sim \text{Poisson}(\lambda)$ models rare events occurring at rate $\lambda > 0$:

$$P(X = k) = \frac{e^{-\lambda} \lambda^k}{k!}, \quad k = 0, 1, 2, \ldots$$

Parameter: $\lambda > 0$ (rate)

Properties:

  • $E[X] = \text{Var}(X) = \lambda$
  • MGF: $M_X(t) = e^{\lambda(e^t - 1)}$
  • Sum of independent Poisson$(\lambda_i)$ is Poisson$(\sum \lambda_i)$
  • Approximates Binomial$(n,p)$ when $n$ is large, $p$ is small, and $np = \lambda$
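The Poisson limit of the binomial can be seen numerically by fixing $np = \lambda$ and letting $n$ grow; the pointwise gap between the two PMFs shrinks. A minimal sketch ($\lambda = 2$ and $k = 3$ are arbitrary choices for illustration):

```python
# Poisson(lam) PMF, and the Poisson approximation to Binomial(n, p)
# with np = lam held fixed as n grows.
from math import comb, exp, factorial

def poisson_pmf(k: int, lam: float) -> float:
    return exp(-lam) * lam**k / factorial(k)

def binom_pmf(k: int, n: int, p: float) -> float:
    return comb(n, k) * p**k * (1 - p)**(n - k)

lam = 2.0
for n in (10, 100, 1000):
    p = lam / n
    # Compare P(X = 3) under both models; the gap shrinks as n grows.
    print(n, abs(binom_pmf(3, n, p) - poisson_pmf(3, lam)))
```

The total-variation distance between Binomial$(n, \lambda/n)$ and Poisson$(\lambda)$ is of order $\lambda^2/n$, which is what the shrinking gaps reflect.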

The Exponential Distribution

Definition

$X \sim \text{Exponential}(\lambda)$ models waiting times with rate $\lambda > 0$:

$$f(x) = \lambda e^{-\lambda x}, \quad x \geq 0$$

Parameter: $\lambda > 0$ (rate)

Properties:

  • $E[X] = 1/\lambda$, $\text{Var}(X) = 1/\lambda^2$
  • CDF: $F(x) = 1 - e^{-\lambda x}$ for $x \geq 0$
  • Memoryless: $P(X > s+t \mid X > s) = P(X > t)$
  • MGF: $M_X(t) = \frac{\lambda}{\lambda - t}$ for $t < \lambda$
  • Minimum of independent Exponential$(\lambda_i)$ is Exponential$(\sum \lambda_i)$
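The memoryless property follows from the survival function $P(X > x) = e^{-\lambda x}$, since $e^{-\lambda(s+t)}/e^{-\lambda s} = e^{-\lambda t}$. A minimal numerical sketch (the values of $\lambda$, $s$, $t$ are arbitrary):

```python
# Check the memoryless property of the exponential distribution:
# P(X > s + t | X > s) = P(X > t), using the survival function.
from math import exp, isclose

def surv(x: float, lam: float) -> float:
    """Survival function P(X > x) = exp(-lam * x)."""
    return exp(-lam * x)

lam, s, t = 0.5, 2.0, 3.0
cond = surv(s + t, lam) / surv(s, lam)   # P(X > s+t | X > s)
print(isclose(cond, surv(t, lam)))       # → True
```

No other continuous distribution has this property; it is what makes the exponential the canonical model for waiting times between Poisson events.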

The Gamma Distribution

Definition

$X \sim \text{Gamma}(\alpha, \lambda)$ generalizes the exponential:

$$f(x) = \frac{\lambda^{\alpha}}{\Gamma(\alpha)} x^{\alpha-1} e^{-\lambda x}, \quad x \geq 0$$

Parameters: $\alpha > 0$ (shape), $\lambda > 0$ (rate)

where $\Gamma(\alpha) = \int_0^{\infty} t^{\alpha-1} e^{-t} \, dt$ is the gamma function.

Properties:

  • $E[X] = \alpha/\lambda$, $\text{Var}(X) = \alpha/\lambda^2$
  • Exponential$(\lambda) = \text{Gamma}(1, \lambda)$
  • Sum of $n$ independent Exponential$(\lambda)$ is Gamma$(n, \lambda)$
  • Chi-squared$(\nu) = \text{Gamma}(\nu/2, 1/2)$
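The special case Exponential$(\lambda) = \text{Gamma}(1, \lambda)$ can be verified pointwise, since $\Gamma(1) = 1$ and $x^{0} = 1$ make the two densities identical. A minimal sketch using `math.gamma` (the grid of $x$ values and $\lambda = 1.5$ are arbitrary):

```python
# Gamma(alpha, lam) density via math.gamma; check that alpha = 1
# recovers the Exponential(lam) density exactly.
from math import gamma, exp

def gamma_pdf(x: float, alpha: float, lam: float) -> float:
    return lam**alpha / gamma(alpha) * x**(alpha - 1) * exp(-lam * x)

def exp_pdf(x: float, lam: float) -> float:
    return lam * exp(-lam * x)

lam = 1.5
for x in (0.1, 1.0, 4.0):
    print(abs(gamma_pdf(x, 1.0, lam) - exp_pdf(x, lam)))  # → ~0 each time
```

The same `gamma_pdf` with $\alpha = \nu/2$, $\lambda = 1/2$ gives the chi-squared$(\nu)$ density listed above.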

Remark

These distributions are not arbitrary; each arises naturally from a physical process: the normal from sums (CLT), the Poisson from counts of rare events, the exponential from memoryless waiting times, and the gamma from sums of exponentials. Understanding their origins helps in choosing appropriate models.