
Expectation and Variance - Core Definitions

Expectation and variance are the two most important numerical summaries of a distribution, characterizing its center and spread respectively.

Expectation (Expected Value)

Definition

The expectation (or expected value or mean) of a random variable X is:

Discrete case: E[X] = \sum_x x \cdot p_X(x)

Continuous case: E[X] = \int_{-\infty}^{\infty} x \cdot f_X(x) \, dx

provided the sum or integral converges absolutely.

The expectation represents the "average" value of X over many independent repetitions. It is also denoted \mu or \mu_X.

Example

Fair Die: X \in \{1,2,3,4,5,6\} with p_X(k) = 1/6 for each k:

E[X] = \sum_{k=1}^{6} k \cdot \frac{1}{6} = \frac{1+2+3+4+5+6}{6} = \frac{21}{6} = 3.5
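The die calculation can be checked by direct enumeration; a minimal sketch using Python's fractions module for exact arithmetic:

```python
from fractions import Fraction

# Fair die: outcomes 1..6, each with probability 1/6
p = Fraction(1, 6)

# E[X] = sum over all outcomes of x * p_X(x)
expectation = sum(k * p for k in range(1, 7))
print(expectation)  # 7/2, i.e. 3.5
```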

Example

Exponential Distribution: X \sim \text{Exponential}(\lambda) with f_X(x) = \lambda e^{-\lambda x} for x \geq 0:

E[X] = \int_0^{\infty} x \cdot \lambda e^{-\lambda x} \, dx = \frac{1}{\lambda}
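The result E[X] = 1/\lambda can be sanity-checked by Monte Carlo simulation; a sketch using Python's standard library, where the choice \lambda = 2 is arbitrary:

```python
import random

random.seed(0)  # fixed seed for reproducibility
lam = 2.0       # rate parameter lambda (arbitrary choice for the demo)

# The average of many Exponential(lambda) draws approximates E[X] = 1/lambda
n = 100_000
sample_mean = sum(random.expovariate(lam) for _ in range(n)) / n
print(sample_mean)  # close to 1/lam = 0.5
```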

Properties of Expectation

Linearity: For any constants a, b and random variables X, Y:

E[aX + b] = aE[X] + b

E[X + Y] = E[X] + E[Y]

Linearity holds even if X and Y are dependent; no independence assumption is needed.
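Linearity under dependence can be verified by exact enumeration; a sketch where X is a fair die and Y = X^2 is completely determined by X (so they are certainly not independent):

```python
from fractions import Fraction

# X is a fair die; Y = X**2 is a function of X, hence dependent on it.
p = Fraction(1, 6)
E_X = sum(k * p for k in range(1, 7))             # E[X]   = 7/2
E_Y = sum(k**2 * p for k in range(1, 7))          # E[Y]   = 91/6
E_sum = sum((k + k**2) * p for k in range(1, 7))  # E[X + Y]

# E[X + Y] = E[X] + E[Y] despite the dependence
assert E_sum == E_X + E_Y

# E[aX + b] = a E[X] + b, here with a = 3, b = 2 (arbitrary constants)
assert sum((3*k + 2) * p for k in range(1, 7)) == 3*E_X + 2
```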

Non-negativity: If X \geq 0 almost surely, then E[X] \geq 0.

Monotonicity: If X \leq Y almost surely, then E[X] \leq E[Y].

Variance

Definition

The variance of X measures the spread of the distribution:

\text{Var}(X) = E[(X - \mu)^2] = E[X^2] - (E[X])^2

The standard deviation is \sigma = \sqrt{\text{Var}(X)}.

The variance is always non-negative: \text{Var}(X) \geq 0, with equality if and only if X is constant almost surely.

Example

For a fair die:

E[X^2] = \sum_{k=1}^{6} k^2 \cdot \frac{1}{6} = \frac{1+4+9+16+25+36}{6} = \frac{91}{6}

\text{Var}(X) = \frac{91}{6} - (3.5)^2 = \frac{91}{6} - \frac{49}{4} = \frac{182 - 147}{12} = \frac{35}{12} \approx 2.917
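Both forms of the variance formula can be checked against each other with exact rational arithmetic; a minimal sketch:

```python
from fractions import Fraction

p = Fraction(1, 6)
mu = sum(k * p for k in range(1, 7))  # E[X] = 7/2

# Definition: Var(X) = E[(X - mu)^2]
var_def = sum((k - mu)**2 * p for k in range(1, 7))

# Shortcut: Var(X) = E[X^2] - (E[X])^2
var_short = sum(k**2 * p for k in range(1, 7)) - mu**2

assert var_def == var_short == Fraction(35, 12)
print(float(var_def))  # approximately 2.917
```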

Properties of Variance

Shift invariance: \text{Var}(X + b) = \text{Var}(X) (adding a constant does not change the spread)

Scaling: \text{Var}(aX) = a^2 \text{Var}(X)

Independence: If X and Y are independent, then \text{Var}(X + Y) = \text{Var}(X) + \text{Var}(Y).
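All three variance properties can be verified by exact enumeration on a fair die; a sketch where the independent sum is checked by listing all 36 equally likely outcomes of two dice:

```python
from fractions import Fraction
from itertools import product

p = Fraction(1, 6)

def var(values_probs):
    """Var = E[X^2] - (E[X])^2 from a list of (value, probability) pairs."""
    e = sum(v * q for v, q in values_probs)
    e2 = sum(v**2 * q for v, q in values_probs)
    return e2 - e**2

die = [(k, p) for k in range(1, 7)]
v = var(die)  # 35/12

# Shift invariance: Var(X + 10) = Var(X)
assert var([(k + 10, q) for k, q in die]) == v

# Scaling: Var(3X) = 3^2 Var(X)
assert var([(3*k, q) for k, q in die]) == 9 * v

# Independence: Var(X + Y) = Var(X) + Var(Y) for two independent dice,
# enumerating all 36 outcomes of the pair, each with probability 1/36
pair = [(a + b, p * p) for (a, _), (b, _) in product(die, die)]
assert var(pair) == 2 * v
```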

Remark

Expectation is the first moment of a distribution, and variance is the second central moment. Higher-order summaries (skewness, kurtosis) provide additional shape information, but mean and variance suffice for many applications; in particular, a normal distribution is completely determined by these two parameters.