Basic Definitions, the $\pi-\lambda$ theorem, and Carathéodory extension.

In probability theory, there are various collections of sets we are generally interested in.

Let $\Omega$ be some non-empty set. We call $\mathcal{F}\subseteq 2^\Omega$ a (or an):

algebra if $\emptyset\in \mathcal{F}$ and $\mathcal{F}$ is closed under complements and finite unions.
$\boldsymbol{\sigma}$-algebra if it is an algebra closed under countable unions.
$\pi$-system if it is closed under finite intersections.
$\lambda$-system if $\emptyset\in \mathcal{F}$ and it is closed under complements and countable disjoint unions.

We let $\sigma(\mathcal{F})$, $\pi(\mathcal{F})$, and $\lambda(\mathcal{F})$ denote the smallest $\sigma$-algebra, $\pi$-system, and $\lambda$-system containing $\mathcal{F}$, respectively.

Elements of algebras and $\sigma$-algebras are called events. “Smallest” in the above definition is taken in the sense of the partial ordering of set inclusion $\subseteq$. As an aside, such minimal systems exist since (1) $\mathcal{F}\subseteq 2^{\Omega}$ and the powerset is a $\sigma$-algebra, a $\pi$-system, and a $\lambda$-system, and (2) an arbitrary intersection of $\sigma$-algebras ($\pi$-systems, $\lambda$-systems) containing $\mathcal{F}$ remains a $\sigma$-algebra ($\pi$-system, $\lambda$-system) containing $\mathcal{F}$.
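To make the definitions concrete, here is a minimal Python sketch that checks them by brute force on a finite $\Omega$. The helper names (`powerset`, `is_pi_system`, `is_lambda_system`, `is_algebra`) and the example family `even` are illustrative assumptions, not notation from the text. On a finite set, a countable disjoint union has only finitely many non-empty members, so checking pairwise disjoint unions is enough.

```python
from itertools import combinations


def powerset(omega):
    """Return all subsets of omega as frozensets."""
    items = sorted(omega)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def is_pi_system(fam):
    """Closed under pairwise (hence finite) intersections."""
    return all((a & b) in fam for a in fam for b in fam)


def is_lambda_system(fam, omega):
    """Contains the empty set, closed under complements and disjoint unions."""
    has_empty = frozenset() in fam
    complements = all((omega - a) in fam for a in fam)
    disjoint_unions = all((a | b) in fam
                          for a in fam for b in fam if not (a & b))
    return has_empty and complements and disjoint_unions


def is_algebra(fam, omega):
    """Contains the empty set, closed under complements and finite unions."""
    has_empty = frozenset() in fam
    complements = all((omega - a) in fam for a in fam)
    unions = all((a | b) in fam for a in fam for b in fam)
    return has_empty and complements and unions


# Hypothetical example: subsets of {1,2,3,4} with an even number of elements.
omega = frozenset({1, 2, 3, 4})
even = {s for s in powerset(omega) if len(s) % 2 == 0}

even_is_lambda = is_lambda_system(even, omega)   # True
even_is_pi = is_pi_system(even)                  # False: {1,2} & {1,3} = {1}
even_is_algebra = is_algebra(even, omega)        # False: {1,2} | {1,3} = {1,2,3}
```

The family `even` is a $\lambda$-system that is neither a $\pi$-system nor an algebra, which shows that the notions are genuinely different.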

The relation between $\pi$-systems, $\lambda$-systems, and $\sigma$-algebras.

The goal of this section is to prove the $\pi-\lambda$ theorem. To do so, we prove various intermediate results. First, closure under finite disjoint unions and closure under proper differences are equivalent (given closure under complementation).

Let $\mathcal{F}\subseteq 2^\Omega$ be closed under complementation. Then the following are equivalent:

(1) $A,B\in \mathcal{F}$ disjoint implies $A\sqcup B\in \mathcal{F}$.
(2) $A,B\in\mathcal{F}$ with $B\subseteq A$ implies $A-B\in\mathcal{F}$.

Proof.

($1\Rightarrow 2$) Fix $A,B\in\mathcal{F}$ with $B\subseteq A$. Then, $A-B=(A^c\cup B)^c$ and $A^c,B$ are disjoint. By closure under complementation and disjoint union, $A^c\in \mathcal{F}$ and $A^c\cup B\in\mathcal{F}$, so that $A-B\in \mathcal{F}$.

($2\Rightarrow 1$) Fix $A,B\in\mathcal{F}$ with $A,B$ disjoint. Then, $B\subseteq A^c$ with $A^c\in\mathcal{F}$. Thus, $A^c-B=(A\cup B)^c\in\mathcal{F}$. Taking the complement once more, $A\cup B\in\mathcal{F}$.

$\blacksquare$

Next, we note the following important result:

If $\mathcal{F}\subseteq 2^\Omega$ is a $\pi$-system and a $\lambda$-system, then it is a $\sigma$-algebra.

Proof.

It suffices to show that $\{A_n\}_n\subseteq \mathcal{F}$ implies $\bigcup_{n=1}^\infty A_n\in \mathcal{F}$. Define $B_n=\bigcup_{k=1}^n A_k$. Since by assumption $\mathcal{F}$ is closed under finite intersection and complement, so too is it closed under finite union. Thus, $B_n\in \mathcal{F}$ for all $n$. Now define $C_n=B_n-B_{n-1}=B_n\cap B_{n-1}^c$, where $B_0:=\emptyset$. The $C_n$ are pairwise disjoint and belong to $\mathcal{F}$, so by closure under countable disjoint unions: $$\bigcup_{n=1}^\infty A_n=\bigcup_{n=1}^\infty B_n=\bigsqcup_{n=1}^\infty C_n\in \mathcal{F}$$

$\blacksquare$
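The disjointification step $C_n=B_n-B_{n-1}$ used in this proof can be sketched in a few lines of Python. This is only an illustration on a hypothetical finite sequence of sets; the function name `disjointify` and the sample sets are assumptions made for the example.

```python
def disjointify(sets):
    """Turn A_1, A_2, ... into pairwise disjoint C_1, C_2, ... with the same union.

    Uses B_n = A_1 | ... | A_n and C_n = B_n - B_{n-1}, with B_0 the empty set.
    """
    pieces = []
    previous = frozenset()          # B_{n-1}, starting from B_0 = empty set
    for a in sets:
        current = previous | a      # B_n
        pieces.append(current - previous)  # C_n
        previous = current
    return pieces


# Hypothetical overlapping sets A_1, A_2, A_3.
A = [frozenset({1, 2}), frozenset({2, 3}), frozenset({1, 3, 4})]
C = disjointify(A)

same_union = frozenset().union(*A) == frozenset().union(*C)
pairwise_disjoint = all(not (C[i] & C[j])
                        for i in range(len(C)) for j in range(i + 1, len(C)))
```

Here `C` is `[{1, 2}, {3}, {4}]`: the pieces are pairwise disjoint and their union equals the union of the original sets.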

We also have that the $\lambda$-system generated by a $\pi$-system remains a $\pi$-system.

Let $\mathcal{F}$ be a $\pi$-system. Then, $\lambda(\mathcal{F})$ remains a $\pi$-system.

Proof.

We must show that $A\cap B\in \lambda(\mathcal{F})$ for any $A,B\in\lambda(\mathcal{F})$. To show this, consider the following collection: $$\mathcal{L}_A=\{B\in\lambda(\mathcal{F}):A\cap B\in\lambda(\mathcal{F})\}.$$ If we show (1) that $\mathcal{L}_A$ is a $\lambda$-system and (2) that $\mathcal{F}\subseteq \mathcal{L}_A$ for all $A\in\lambda(\mathcal{F})$, it will follow that $\mathcal{L}_A=\lambda(\mathcal{F})$ for all $A\in\lambda(\mathcal{F})$, which would prove the result.

Start with (1) and fix $A\in\lambda(\mathcal{F})$. Clearly, $\emptyset\in\mathcal{L}_A$. If $B\in\mathcal{L}_A$, note that $A\cap B\in\lambda(\mathcal{F})$ with $A\cap B\subseteq A$, so that $A\cap B^c=A-(A\cap B)\in\lambda(\mathcal{F})$ by #[lem:proper_diff_union] . Therefore, $B^c\in\mathcal{L}_{A}$. Also if $\{B_n\}_n\subseteq \mathcal{L}_A$ is a pairwise disjoint collection, then the sets $A\cap B_n$ are pairwise disjoint elements of $\lambda(\mathcal{F})$, so $A\cap\left(\bigsqcup_{n=1}^\infty B_n\right)=\bigsqcup_{n=1}^\infty(A\cap B_n)\in \lambda(\mathcal{F})$ and hence $\bigsqcup_{n=1}^\infty B_n\in\mathcal{L}_A$.

(2) Start with $A\in\mathcal{F}$. Then for any $B\in\mathcal{F}$, $A\cap B\in\mathcal{F}$ since $\mathcal{F}$ is a $\pi$-system. Therefore, $\mathcal{F}\subseteq \mathcal{L}_A$. Thus, for such $A$, $\lambda(\mathcal{F})=\mathcal{L}_A$ by (1) and the minimality of $\lambda(\mathcal{F})$. Now, fix arbitrary $A\in\lambda(\mathcal{F})$. Then for any $B\in\mathcal{F}$, by the previous part $A\in\mathcal{L}_B$ so that $A\cap B\in\lambda(\mathcal{F})$. This implies $B\in\mathcal{L}_A$. Hence, $\mathcal{F}\subseteq \mathcal{L}_A$, which concludes the proof.

$\blacksquare$

Then, we can conclude with the main result, the $\pi-\lambda$ theorem.

Suppose $\mathcal{F}$ is a $\pi$-system and $\Lambda$ is a $\lambda$-system. If $\mathcal{F}\subseteq \Lambda$, then $\sigma(\mathcal{F})\subseteq \Lambda$.

Proof.

Since $\Lambda$ is a $\lambda$-system containing $\mathcal{F}$, minimality gives $\lambda(\mathcal{F})\subseteq \Lambda$. By #[prop:remaining_pi_system] , $\lambda(\mathcal{F})$ is a $\pi$-system. Then invoking #[prop:pi_lambda_sigma] , $\lambda(\mathcal{F})$ is a $\sigma$-algebra containing $\mathcal{F}$. We conclude by noticing: $$\sigma(\mathcal{F})\subseteq \lambda(\mathcal{F})\subseteq \Lambda.$$

$\blacksquare$
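On a finite $\Omega$ the theorem can be checked by brute force. The sketch below computes generated systems by iterating the closure operations until nothing new appears; the function `close`, the $\pi$-system `F`, and the non-$\pi$-system `G` are hypothetical choices made for illustration. It relies on the fact that on a finite set, countable (disjoint) unions reduce to repeated pairwise (disjoint) unions.

```python
def close(fam, omega, disjoint_only):
    """Smallest family containing fam and the empty set that is closed under
    complements and pairwise unions (only disjoint ones if disjoint_only)."""
    current = set(fam) | {frozenset()}
    while True:
        new = set(current)
        new |= {omega - a for a in current}
        new |= {a | b for a in current for b in current
                if not disjoint_only or not (a & b)}
        if new == current:
            return current
        current = new


omega = frozenset({1, 2, 3, 4})

# F is a pi-system: {1} & {1,2} = {1} is again in F.
F = {frozenset({1}), frozenset({1, 2})}
lam_F = close(F, omega, disjoint_only=True)    # lambda(F)
sig_F = close(F, omega, disjoint_only=False)   # sigma(F)

# G is not a pi-system: {1,2} & {1,3} = {1} is missing.
G = {frozenset({1, 2}), frozenset({1, 3})}
lam_G = close(G, omega, disjoint_only=True)
sig_G = close(G, omega, disjoint_only=False)
```

For the $\pi$-system `F`, the two closures coincide (8 sets each, generated by the blocks $\{1\},\{2\},\{3,4\}$). For `G`, the generated $\lambda$-system has 6 sets while the generated $\sigma$-algebra is the full powerset with 16 sets, so the $\pi$-system hypothesis cannot be dropped.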

Extending probability measures from algebras to $\sigma$-algebras.

Consider the tuple $(\Omega,\mathcal{F})$ where $\mathcal{F}$ is either an algebra or $\sigma$-algebra. We call $\mathbb{P}:\mathcal{F}\to [0,1]$ a probability measure over $(\Omega,\mathcal{F})$ if $\mathbb{P}(\emptyset)=0$, $\mathbb{P}(\Omega)=1$, and whenever $A_1,A_2,\dots$ are disjoint $\mathcal{F}$-sets with $\bigcup_{k=1}^\infty A_k\in \mathcal{F}$, we have: $$\mathbb{P}\left(\bigcup_{k=1}^\infty A_k\right)=\sum_{k=1}^\infty \mathbb{P}(A_k)$$

We can uniquely extend a probability measure on a field (that is, an algebra) $\mathcal{F}$ to one on $\sigma(\mathcal{F})$:

(Carathéodory Extension Theorem).

Let $\mathbb{P}$ be a probability measure on a field $\mathcal{F}$. Then there is a unique extension of $\mathbb{P}$ to a probability measure on $\sigma(\mathcal{F})$.

The proof of this is constructive. Define the following outer probability measure $\mathbb{P}^{\ast}:2^\Omega\to \mathbb{R}_{\geq 0}$:

$$ \begin{align} \mathbb{P}^{\ast}(A):=\inf_{A\subseteq \bigcup_n A_n,A_n\in\mathcal{F}}\sum_n \mathbb{P}(A_n)\label{eq:outer_measure} \end{align} $$

$\mathbb{P}^{\ast}$ defined in $\eqref{eq:outer_measure}$ has the following properties:

(1) $\mathbb{P}^{\ast}(\emptyset)=0$.
(2) $\mathbb{P}^{\ast}$ is monotone; $A\subseteq B$ implies $\mathbb{P}^{\ast}(A)\leq \mathbb{P}^{\ast}(B)$.
(3) $\mathbb{P}^{\ast}$ is countably subadditive; $\mathbb{P}^{\ast}(\bigcup_n A_n)\leq \sum_{n} \mathbb{P}^{\ast}(A_n)$.

Proof.

(1) $\emptyset\subseteq \emptyset\in \mathcal{F}$, so $\mathbb{P}^{\ast}(\emptyset)\leq \mathbb{P}(\emptyset)=0$. Since $\mathbb{P}^{\ast}\geq 0$, equality holds.

(2) Every countable $\mathcal{F}$-cover of $B$ is a cover for $A$, so the statement follows.

(3) Fix $\varepsilon>0$. For each $n$, choose $\{B_{n,k}\}\subseteq \mathcal{F}$ a cover for $A_n$ such that $\sum_k \mathbb{P}(B_{n,k})\leq \mathbb{P}^{\ast}(A_n)+\varepsilon/2^n$. Then: $$ \mathbb{P}^{\ast}(\bigcup_n A_n)\leq \sum_{n,k}\mathbb{P}(B_{n,k})\leq \sum_n \mathbb{P}^{\ast}(A_n)+\varepsilon. $$ Sending $\varepsilon\to 0^+$ concludes the result.

$\blacksquare$

We define the following class of sets: $$ \mathcal{M}:=\{A\in 2^\Omega:\forall E\in 2^\Omega,\, \mathbb{P}^{\ast}(A\cap E)+\mathbb{P}^{\ast}(A^c\cap E)=\mathbb{P}^{\ast}(E)\}. $$ If $A\in\mathcal{M}$, we call $A$ $\mathbb{P}^{\ast}$-measurable. Remark that by sub-additivity, $A\in\mathcal{M}$ iff $\forall E\in 2^\Omega,\, \mathbb{P}^{\ast}(A\cap E)+\mathbb{P}^{\ast}(A^c\cap E)\leq \mathbb{P}^{\ast}(E)$.
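The outer measure and the measurability condition above can be explored by brute force on a small example. The following is a sketch under stated assumptions: the four-point $\Omega$, the algebra generated by the blocks $\{1,2\}$ and $\{3,4\}$, the values of `P`, and the function names are all hypothetical. On a finite algebra, a countable cover has only finitely many distinct members and repeats only add cost, so it suffices to search over sub-families of the algebra.

```python
from fractions import Fraction
from itertools import combinations

omega = frozenset({1, 2, 3, 4})

# Hypothetical probability measure on the algebra generated by {1,2} and {3,4}.
P = {
    frozenset(): Fraction(0),
    frozenset({1, 2}): Fraction(3, 10),
    frozenset({3, 4}): Fraction(7, 10),
    omega: Fraction(1),
}


def subsets(s):
    """All subsets of s as frozensets."""
    items = sorted(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def outer(a):
    """P*(a): the cheapest cover of a by sets from the algebra."""
    family = list(P)
    best = None
    for r in range(1, len(family) + 1):
        for cover in combinations(family, r):
            if a <= frozenset().union(*cover):
                cost = sum(P[c] for c in cover)
                if best is None or cost < best:
                    best = cost
    return best


def is_measurable(a):
    """Caratheodory condition: P*(a & e) + P*(e - a) == P*(e) for every e."""
    return all(outer(a & e) + outer(e - a) == outer(e)
               for e in subsets(omega))


measurable = {a for a in subsets(omega) if is_measurable(a)}
```

In this example `outer(frozenset({1}))` equals $3/10$, and the set $\{1\}$ fails the condition with $E=\{1,2\}$ because $\mathbb{P}^\ast(\{1\})+\mathbb{P}^\ast(\{2\})=6/10\neq 3/10$. The measurable sets turn out to be exactly the four sets of the algebra, which here is already a $\sigma$-algebra.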

$\mathcal{M}$ is a field.

Proof.

Clearly, $\Omega\in \mathcal{M}$ and $\mathcal{M}$ is closed under complementation. It suffices to show $A,B\in\mathcal{M}$ implies $A\cap B\in\mathcal{M}$: $$ \begin{align*} \mathbb{P}^{\ast}(E)&=\mathbb{P}^{\ast}(B\cap E)+\mathbb{P}^{\ast}(B^c\cap E)\\ &=\mathbb{P}^{\ast}(A\cap B\cap E)+\mathbb{P}^{\ast}(A^c\cap B\cap E)+\mathbb{P}^{\ast}(A\cap B^c\cap E)+\mathbb{P}^{\ast}(A^c\cap B^c\cap E)\\ &\geq \mathbb{P}^{\ast}(A\cap B\cap E)+\mathbb{P}^{\ast}(((A^c\cap B)\cup (A\cap B^c)\cup (A^c\cap B^c))\cap E)\\ &=\mathbb{P}^{\ast}((A\cap B)\cap E)+\mathbb{P}^{\ast}((A\cap B)^c\cap E) \end{align*} $$

$\blacksquare$

$\mathcal{M}$ is a $\lambda$-system.

Proof.

Since $\mathcal{M}$ is a field, it suffices to show closure under countable disjoint unions. Let $\{A_n\}_n\subseteq \mathcal{M}$ be a countably infinite collection of disjoint, measurable sets. Define $B_n=\bigsqcup_{k=1}^n A_k$. From the previous lemma, $B_n\in\mathcal{M}$. Since $A_n\in\mathcal{M}$, applying the definition of measurability to the test set $E\cap B_n$, and using $B_n\cap A_n=A_n$ and $B_n\cap A_n^c=B_{n-1}$, gives: $$ \begin{align*} \mathbb{P}^\ast(E\cap B_n)&=\mathbb{P}^\ast(E\cap B_n\cap A_n)+\mathbb{P}^\ast(E\cap B_n\cap A_n^c)\\ &=\mathbb{P}^\ast(E\cap A_n)+\mathbb{P}^\ast(E\cap B_{n-1}). \end{align*}$$

Using $B_0:=\emptyset$, we have by induction $$ \begin{align} \mathbb{P}^\ast(E\cap B_n)=\sum_{k=1}^n \mathbb{P}^\ast(E\cap A_k).\label{eq:finite_additivity} \end{align} $$

Let $B:=\bigsqcup_{k=1}^\infty A_k=\bigcup_{k=1}^\infty B_k$. Trivially $B^c\subseteq B_n^c$ for every $n\geq 1$. Therefore: $$ \begin{align} \mathbb{P}^\ast(E)=\mathbb{P}^\ast(E\cap B_n)+\mathbb{P}^\ast(E\cap B_n^c)\geq \sum_{k=1}^n \mathbb{P}^\ast(E\cap A_k)+\mathbb{P}^\ast(E\cap B^c).\label{eq:countable_additivity} \end{align} $$

Let $n\to\infty$ in the above inequality; then by countable subadditivity:

$$ \begin{align*} \mathbb{P}^\ast(E)&\geq \sum_{k=1}^\infty \mathbb{P}^\ast(E\cap A_k)+\mathbb{P}^\ast(E\cap B^c)\\ &\geq \mathbb{P}^\ast(E\cap B)+\mathbb{P}^\ast(E\cap B^c). \end{align*} $$

Combined with the reverse inequality from subadditivity, we see that $\mathbb{P}^\ast(E)=\mathbb{P}^\ast(E\cap B)+\mathbb{P}^\ast(E\cap B^c)$ for every $E$, that is, $B\in\mathcal{M}$.

$\blacksquare$

$\mathcal{F}\subseteq \mathcal{M}$

Proof.

Let $\varepsilon>0$ be arbitrary. Let $A\in \mathcal{F}$ and $E$ be any set. Choose $\{A_n\}\subseteq \mathcal{F}$ covering $E$ such that $\sum_n\mathbb{P}(A_n)\leq \mathbb{P}^{\ast}(E)+\varepsilon$. Define $B_n=A_n\cap A$ and $C_n=A_n\cap A^c$, so that $\{B_n\}$ and $\{C_n\}$ are $\mathcal{F}$-covers of $E\cap A$ and $E\cap A^c$, respectively. Then, by the definition of $\mathbb{P}^{\ast}$ and finite additivity of $\mathbb{P}$: $$ \mathbb{P}^{\ast}(E\cap A)+\mathbb{P}^{\ast}(E\cap A^c)\leq \sum_n (\mathbb{P}(B_n)+\mathbb{P}(C_n))=\sum_n \mathbb{P}(A_n)\leq \mathbb{P}^{\ast}(E)+\varepsilon. $$ Sending $\varepsilon\to 0^+$ shows one inequality, with the other from general subadditivity.

$\blacksquare$

Proof of #[thm:caratheodory] .

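By the $\pi-\lambda$ theorem ( #[thm:pi_lambda] ) using that $\mathcal{F}$ is a $\pi$-system contained in $\mathcal{M}$ ( #[lem:F_in_M] ) and that $\mathcal{M}$ is a $\lambda$-system ( #[lem:M_lambda] ), $\sigma(\mathcal{F})\subseteq \mathcal{M}$. Therefore $\mathbb{P}^{\ast}|_{\sigma(\mathcal{F})}$ is the desired extension provided it has countable disjoint additivity and $\mathbb{P}^{\ast}|_\mathcal{F}=\mathbb{P}$. For the first claim, finite disjoint additivity follows from $\eqref{eq:finite_additivity}$ in the proof of #[lem:M_lambda] by taking $E$ to be $\Omega$. Now, let $\{A_n\}_n\subseteq \mathcal{M}$ be disjoint with $B=\bigcup_{k=1}^\infty A_k$. Taking $E=B$ in $\eqref{eq:countable_additivity}$ and letting $n\to\infty$ gives $\mathbb{P}^{\ast}(B)\geq \sum_{k=1}^\infty \mathbb{P}^{\ast}(A_k)$, and the reverse inequality is countable subadditivity. For the second claim, fix $A\in \mathcal{F}$. Since $\{A\}$ is itself an $\mathcal{F}$-cover of $A$, $\mathbb{P}^{\ast}(A)\leq \mathbb{P}(A)$. Conversely, if $\{A_n\}\subseteq\mathcal{F}$ covers $A$, then countable additivity of $\mathbb{P}$ on $\mathcal{F}$ (after disjointifying the sets $A\cap A_n$) and monotonicity give $\mathbb{P}(A)\leq \sum_n \mathbb{P}(A\cap A_n)\leq \sum_n \mathbb{P}(A_n)$, so $\mathbb{P}(A)\leq \mathbb{P}^{\ast}(A)$.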

For uniqueness, let $\mathbb{Q},\mathbb{Q}'$ be two extensions. Note that $\{A\in\sigma(\mathcal{F}):\mathbb{Q}(A)=\mathbb{Q}'(A)\}$ is a $\lambda$-system containing the $\pi$-system $\mathcal{F}$ (by the definition of extension), so that $\mathbb{Q}$ and $\mathbb{Q}'$ agree over all sets in $\sigma(\mathcal{F})$ by the $\pi-\lambda$ theorem, i.e. $\mathbb{Q}=\mathbb{Q}'$.

$\blacksquare$

Sometimes, we will have $\mathbb{P}$ defined over $\mathcal{F}$, but will need to check countable additivity. There is a useful criterion to do so:

Let $\mathcal{F}$ be an algebra and let $\mathbb{P}:\mathcal{F}\to [0,1]$ be finitely additive with $\mathbb{P}(\emptyset)=0$ and $\mathbb{P}(\Omega)=1$. Then, $\mathbb{P}$ is countably additive (and thus a probability measure) if and only if for any decreasing sequence of sets $B_1\supseteq B_2\supseteq\cdots$, with $B_i\in\mathcal{F}$, $\lim_{n\to\infty}\mathbb{P}(B_n)>0$ implies $\bigcap_n B_n\neq \emptyset$.

Proof.

$(\Rightarrow)$ We prove the contrapositive, which is continuity from above at $\emptyset$. Fix the decreasing sequence of events $B_i$ and suppose $\bigcap_n B_n=\emptyset$. Under this assumption, $B_N=\bigsqcup_{n=N}^\infty (B_n-B_{n+1})$ for every $N\geq 1$. So by countable additivity: $$ \mathbb{P}(B_1)=\sum_{n=1}^\infty \mathbb{P}(B_n-B_{n+1})\leq 1<\infty. $$ Therefore, since the series converges, its tails $\mathbb{P}(B_N)=\sum_{n=N}^\infty \mathbb{P}(B_n-B_{n+1})$ tend to $0$ as $N\to\infty$.

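$(\Leftarrow)$ Let $E_1,E_2,\dots$ be disjoint events with $\bigcup_{k=1}^\infty E_k\in\mathcal{F}$. Since they are disjoint, there is no $\omega$ that appears infinitely often in these $E_i$. Therefore, letting $B_n=\bigcup_{k=n}^\infty E_k=\left(\bigcup_{k=1}^\infty E_k\right)-\bigcup_{k=1}^{n-1}E_k\in\mathcal{F}$, $B_n$ is decreasing with $\bigcap_{n=1}^\infty B_n=\emptyset$. The limit of $\mathbb{P}(B_n)$ exists by monotonicity, and by hypothesis it cannot be positive. Thus, $\mathbb{P}(B_n)\to 0$ as $n\to\infty$.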

We can write: $$ \left|\mathbb{P}\left(\bigcup_{n=1}^\infty E_n\right)-\sum_{i=1}^N\mathbb{P}(E_i)\right|=\mathbb{P}(B_{N+1}), $$ by finite additivity. Since $\mathbb{P}(B_{N+1})\to 0$, for every $\varepsilon>0$ there is $M_\varepsilon$ such that the left-hand side is less than $\varepsilon$ for all $N\geq M_\varepsilon$. So, take $N\to\infty$ and then $\varepsilon\to 0$ to conclude.

$\blacksquare$