Common Distributions - Examples and Constructions
Real-world applications demonstrate how to select appropriate distributions based on the underlying data-generating process.
Modeling Count Data
Quality Control: A factory inspects batches of 100 items. The number of defective items follows $\text{Binomial}(100, p)$, where $p$ is the defect rate.
If :
For rare defects ($p$ small), the Poisson approximation applies: $\text{Binomial}(100, p) \approx \text{Poisson}(100p)$
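To make the quality of the approximation concrete, here is a minimal sketch comparing the two probability mass functions for a batch of 100 items. The defect rate used here (p = 0.02) is a hypothetical value chosen only for illustration, and the helper names are not from the text.

```python
from math import comb, exp, factorial

n = 100    # batch size, as in the example above
p = 0.02   # hypothetical defect rate, chosen only for illustration


def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return exp(-lam) * lam**k / factorial(k)


lam = n * p  # the approximating Poisson rate equals the binomial mean

# Print the two probabilities side by side for small defect counts.
for k in range(6):
    print(k, round(binomial_pmf(k, n, p), 4), round(poisson_pmf(k, lam), 4))
```

The two columns should agree closely when p is small; the gap widens as p grows, because the Poisson variance (np) overstates the binomial variance (np(1 - p)).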
Radioactive Decay: Particles arrive at a detector following a Poisson process with rate $\lambda = 5$ per minute. The number of arrivals in one minute follows $\text{Poisson}(5)$.
Probability of exactly 7 arrivals: $P(X = 7) = \frac{e^{-5}\, 5^7}{7!} \approx 0.104$
Time until first arrival follows $\text{Exponential}(5)$ with mean $1/5 = 0.2$ minutes = 12 seconds.
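The arithmetic for this example can be checked with a short sketch. It assumes a rate of 5 arrivals per minute, which is the value implied by the stated mean waiting time of 0.2 minutes; the function and variable names are illustrative.

```python
from math import exp, factorial

rate = 5.0  # arrivals per minute, implied by the 0.2-minute mean wait


def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return exp(-lam) * lam**k / factorial(k)


# Probability of exactly 7 arrivals in one minute.
p_seven = poisson_pmf(7, rate)

# Mean of an Exponential(rate) waiting time, converted to seconds.
mean_wait_seconds = 60.0 / rate

print(f"P(X = 7) = {p_seven:.4f}")
print(f"mean wait = {mean_wait_seconds:.1f} s")
```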
Modeling Continuous Measurements
Measurement Errors: Scientific measurements often follow normal distributions due to the aggregation of many small independent errors (CLT).
Heights in a population: $X \sim \mathcal{N}(170, 10^2)$ cm (mean 170 cm, std dev 10 cm).
Probability someone is taller than 185 cm: $P(X > 185) = P(Z > 1.5) = 1 - \Phi(1.5) \approx 0.0668$
About 6.7% are taller than 185cm.
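The tail probability can be reproduced with Python's standard library. This is a small sketch using the mean (170 cm) and standard deviation (10 cm) from the example; the variable names are illustrative.

```python
from statistics import NormalDist

# Height model from the example: mean 170 cm, standard deviation 10 cm.
heights = NormalDist(mu=170, sigma=10)

# Standardize the threshold, then take the upper tail.
z = (185 - heights.mean) / heights.stdev
p_taller = 1 - heights.cdf(185)

print(f"z = {z:.2f}, P(X > 185) = {p_taller:.4f}")
```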
Reliability and Survival Analysis
Component Lifetimes: Electronic components often have exponentially distributed lifetimes if failures occur at a constant rate.
For a component with mean lifetime 5 years ($\text{Exponential}(\lambda)$ with $\lambda = 1/5 = 0.2$ per year):
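As a sketch of what this model implies, the survival function of an exponential lifetime with rate $\lambda$ is $P(T > t) = e^{-\lambda t}$. The code below evaluates it for a mean lifetime of 5 years; the time points are hypothetical and chosen only for illustration.

```python
from math import exp

mean_lifetime = 5.0          # years, from the example
rate = 1.0 / mean_lifetime   # constant failure rate per year


def survival(t):
    """P(T > t) for T ~ Exponential(rate)."""
    return exp(-rate * t)


# Illustrative time points (not from the text).
for t in (1, 5, 10):
    print(f"P(T > {t} years) = {survival(t):.4f}")
```

Note that the probability of surviving past the mean lifetime is $e^{-1} \approx 0.37$, not 0.5, because the exponential distribution is right-skewed.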
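System Reliability: For a system with $n$ independent components, each with lifetime $\text{Exponential}(\lambda)$:
- Series (all must work): $T = \min(T_1, \dots, T_n) \sim \text{Exponential}(n\lambda)$
- Parallel (one must work): $T = \max(T_1, \dots, T_n)$ has a more complex distribution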
Two redundant systems, each lasting Exponential years:
Redundancy improves reliability!
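The series and parallel cases can be compared numerically. The sketch below assumes independent, identically distributed exponential lifetimes; the rate (0.2 per year) and the evaluation time (5 years) are hypothetical values used only for illustration.

```python
from math import exp


def single_survival(t, rate):
    """P(T > t) for one Exponential(rate) component."""
    return exp(-rate * t)


def series_survival(t, rate, n):
    """All n components must survive: the minimum is Exponential(n * rate)."""
    return exp(-n * rate * t)


def parallel_survival(t, rate, n):
    """At least one of n components survives: 1 - P(all have failed)."""
    return 1 - (1 - exp(-rate * t)) ** n


def parallel_mean_lifetime(rate, n):
    """E[max of n i.i.d. exponentials] = (1/rate) * (1 + 1/2 + ... + 1/n)."""
    return sum(1 / (k * rate) for k in range(1, n + 1))


rate, t = 0.2, 5.0  # hypothetical: mean lifetime 5 years, evaluated at 5 years
print("single  :", round(single_survival(t, rate), 4))
print("series  :", round(series_survival(t, rate, 2), 4))
print("parallel:", round(parallel_survival(t, rate, 2), 4))
print("mean lifetime, 2 in parallel:", parallel_mean_lifetime(rate, 2))
```

With two components in parallel the mean lifetime is 1.5 times that of a single component, while putting them in series halves it.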
Financial Applications
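Stock Returns: Daily log-returns are often modeled as normal. For a stock with annual return 8% and volatility 20%:
Daily return $\sim \mathcal{N}(0.08/252,\ 0.20^2/252)$, i.e. mean $\approx 0.032\%$ and volatility $0.20/\sqrt{252} \approx 1.26\%$ (252 trading days/year)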
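The conversion from annual to daily parameters can be sketched as follows. It assumes daily log-returns are independent and identically distributed, so that means and variances add across the 252 trading days; the variable names are illustrative.

```python
from math import sqrt

annual_mean = 0.08   # annual return from the example
annual_vol = 0.20    # annual volatility from the example
trading_days = 252

# Under the i.i.d. assumption, the mean scales with time
# and the standard deviation scales with the square root of time.
daily_mean = annual_mean / trading_days
daily_vol = annual_vol / sqrt(trading_days)

print(f"daily mean = {daily_mean:.6f}")
print(f"daily vol  = {daily_vol:.6f}")
```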
Value at Risk (VaR): For a portfolio with value $V \sim \mathcal{N}(\$1\text{M}, (\$100\text{K})^2)$:
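95% VaR (loss exceeded 5% of time): $\text{VaR}_{0.95} = 1.645 \times \$100\text{K} \approx \$164.5\text{K}$
With 95% confidence, the loss relative to the mean value will not exceed about $165K.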
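The VaR figure can be reproduced from the normal quantile function. This sketch measures the loss relative to the mean portfolio value of 1M, which is an interpretation consistent with the 165K figure quoted above; the variable names are illustrative.

```python
from statistics import NormalDist

# Portfolio value model from the example: mean 1M, standard deviation 100K.
portfolio = NormalDist(mu=1_000_000, sigma=100_000)

# The value that the portfolio falls below only 5% of the time.
fifth_percentile = portfolio.inv_cdf(0.05)

# 95% VaR, measured as the shortfall relative to the mean value.
var_95 = portfolio.mean - fifth_percentile

print(f"5th percentile of value: {fifth_percentile:,.0f}")
print(f"95% VaR: {var_95:,.0f}")
```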
Queueing Theory
Bank Teller: Customers arrive at rate $\lambda = 10$/hour (Poisson process). Service time per customer is $\text{Exponential}(\mu)$ with $\mu = 12$/hour (mean 5 minutes).
Number of arrivals in one hour: $\text{Poisson}(10)$
Time until first arrival: $\text{Exponential}(10)$ with mean $1/10$ hour = 6 minutes
For stability, arrival rate must be less than service rate: $\lambda < \mu$ (here $10 < 12$ ✓)
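The stability check can be written out as a short sketch. The rates are derived from the stated means (6 minutes between arrivals, 5 minutes per service); the utilization ratio is a standard summary of how close the queue is to the stability boundary, and the variable names are illustrative.

```python
# Rates per hour, derived from the mean times stated in the example.
mean_interarrival_min = 6.0
mean_service_min = 5.0

arrival_rate = 60.0 / mean_interarrival_min   # customers arriving per hour
service_rate = 60.0 / mean_service_min        # customers served per hour

# Utilization: the fraction of time the teller is busy in the long run.
utilization = arrival_rate / service_rate
is_stable = arrival_rate < service_rate

print(f"arrival rate = {arrival_rate:.0f}/hour, service rate = {service_rate:.0f}/hour")
print(f"utilization = {utilization:.3f}, stable = {is_stable}")
```

A utilization close to 1 means the queue is stable but will still experience long waits, since there is little spare capacity to absorb random bursts of arrivals.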
Distribution selection is an art informed by:
- Nature of the variable (discrete vs. continuous, bounded vs. unbounded)
- Physical process (arrivals → Poisson, waiting times → Exponential, sums → Normal)
- Empirical data (histogram, Q-Q plots, goodness-of-fit tests)
- Mathematical tractability (approximate distributions are sometimes chosen for convenience)