Basic Definitions, the $\pi-\lambda$ theorem, and Carathéodory extension.
In probability theory, there are various collections of sets we are generally interested in.
Let $\Omega$ be some non-empty set. We call $\mathcal{F}\subseteq 2^\Omega$ a (or an):
- algebra if $\emptyset\in \mathcal{F}$ and $\mathcal{F}$ is closed under complements and finite unions.
- $\boldsymbol{\sigma}$-algebra if it is an algebra closed under countable unions.
- $\pi$-system if it is closed under finite intersections.
- $\lambda$-system if $\emptyset\in \mathcal{F}$ and it is closed under complements and countable disjoint unions.
We let $\sigma(\mathcal{F})$, $\pi(\mathcal{F})$, $\lambda(\mathcal{F})$ denote the smallest $\sigma$-algebra, $\pi$-system, and $\lambda$-system containing $\mathcal{F}$, respectively.
Elements of algebras and $\sigma$-algebras are called events. “Smallest” in the above definition is taken in the sense of the partial ordering of set inclusion $\subseteq$. As an aside, such minimal systems exist since (1) the power set $2^{\Omega}$ contains $\mathcal{F}$ and is simultaneously a $\sigma$-algebra, a $\pi$-system, and a $\lambda$-system, and (2) an arbitrary intersection of $\sigma$-algebras (respectively $\pi$-systems, $\lambda$-systems) containing $\mathcal{F}$ is again a $\sigma$-algebra (respectively $\pi$-system, $\lambda$-system) containing $\mathcal{F}$.
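The definitions above can be checked by brute force on a small finite $\Omega$. The following is a minimal Python sketch, not part of the original development: the helper names (`powerset`, `is_pi_system`, `is_lambda_system`, `is_algebra`) are made up for illustration, and it relies on the fact that on a finite $\Omega$ closure under pairwise (disjoint) unions already gives closure under countable (disjoint) unions. It shows the standard example of a $\lambda$-system that is not a $\pi$-system: the even-sized subsets of a four-point set.

```python
from itertools import combinations

def powerset(omega):
    """All subsets of omega, as frozensets."""
    items = list(omega)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def is_pi_system(family):
    """Closed under pairwise (hence finite) intersections."""
    return all((a & b) in family for a in family for b in family)

def is_lambda_system(family, omega):
    """Contains the empty set; closed under complements and disjoint unions.

    Assumption: omega is finite, so pairwise disjoint unions suffice.
    """
    if frozenset() not in family:
        return False
    if any((omega - a) not in family for a in family):
        return False
    return all((a | b) in family
               for a in family for b in family if not (a & b))

def is_algebra(family, omega):
    """Contains the empty set; closed under complements and pairwise unions."""
    if frozenset() not in family:
        return False
    if any((omega - a) not in family for a in family):
        return False
    return all((a | b) in family for a in family for b in family)

omega = frozenset({1, 2, 3, 4})
even_sets = {s for s in powerset(omega) if len(s) % 2 == 0}

print(is_lambda_system(even_sets, omega))  # lambda-system
print(is_pi_system(even_sets))             # fails: {1,2} & {2,3} = {2}
print(is_algebra(even_sets, omega))        # fails: {1,2} | {2,3} = {1,2,3}
```

On a finite $\Omega$ every algebra is automatically a $\sigma$-algebra, so `is_algebra` doubles as a $\sigma$-algebra check in this setting only.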
The relation between $\pi$-systems, $\lambda$-systems, and $\sigma$-algebras.
The goal of this section is to prove the $\pi-\lambda$ theorem. To do so, we prove various intermediate results. First, closure under finite disjoint unions and closure under proper differences are equivalent (given closure under complementation).
Let $\mathcal{F}\subseteq 2^\Omega$ be closed under complementation. Then the following are equivalent:
- $A,B\in \mathcal{F}$ disjoint implies $A\sqcup B\in \mathcal{F}$
- $A,B\in\mathcal{F}$ with $B\subseteq A$ implies $A-B\in\mathcal{F}$
Proof.
($1\Rightarrow 2$) Fix $A,B\in\mathcal{F}$ with $B\subseteq A$. Then, $A-B=(A^c\cup B)^c$ and $A^c,B$ are disjoint. By closure under complementation and disjoint union, $A^c\in \mathcal{F}$ and $A^c\cup B\in\mathcal{F}$, so that $A-B\in \mathcal{F}$.
($2\Rightarrow 1$) Fix $A,B\in\mathcal{F}$ with $A,B$ disjoint. Then, $B\subseteq A^c$ with $A^c\in\mathcal{F}$. Thus, $A^c-B=(A\cup B)^c\in\mathcal{F}$. Taking the complement once more, $A\cup B\in\mathcal{F}$.
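The two set identities used in this proof, $A-B=(A^c\cup B)^c$ for $B\subseteq A$ and $A^c-B=(A\cup B)^c$ for disjoint $A,B$, can be confirmed exhaustively on a small universe. This is only a sanity check under the assumption $\Omega=\{0,1,2,3\}$; the names `comp`, `nested_pairs`, and `disjoint_pairs` are illustrative.

```python
from itertools import combinations

omega = frozenset(range(4))
subsets = [frozenset(c) for r in range(len(omega) + 1)
           for c in combinations(omega, r)]

def comp(a):
    """Complement relative to omega."""
    return omega - a

nested_pairs = 0    # pairs with B a subset of A
disjoint_pairs = 0  # pairs with A and B disjoint

for a in subsets:
    for b in subsets:
        if b <= a:
            # (1 => 2): A^c and B are disjoint, and A - B = (A^c u B)^c
            assert not (comp(a) & b)
            assert a - b == comp(comp(a) | b)
            nested_pairs += 1
        if not (a & b):
            # (2 => 1): B is inside A^c, and A^c - B = (A u B)^c
            assert b <= comp(a)
            assert comp(a) - b == comp(a | b)
            disjoint_pairs += 1

print(nested_pairs, disjoint_pairs)
```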
Proof.
We also have that the $\lambda$-system generated by a $\pi$-system remains a $\pi$-system.
Proof.
Let $\mathcal{F}$ be a $\pi$-system. We must show that $A\cap B\in \lambda(\mathcal{F})$ for any $A,B\in\lambda(\mathcal{F})$. To show this, consider the following collection: $$\mathcal{L}_A=\{B\in\lambda(\mathcal{F}):A\cap B\in\lambda(\mathcal{F})\}.$$ If we show (1) that $\mathcal{L}_A$ is a $\lambda$-system and (2) that $\mathcal{F}\subseteq \mathcal{L}_A$ for all $A\in\lambda(\mathcal{F})$, it will follow that $\mathcal{L}_A=\lambda(\mathcal{F})$ for all $A\in\lambda(\mathcal{F})$, which would prove the result.
Start with (1) and fix $A\in\lambda(\mathcal{F})$. Clearly, $\emptyset\in\mathcal{L}_A$. If $B\in\mathcal{L}_A$, note that $A\cap B\in\lambda(\mathcal{F})$ and $A\cap B\subseteq A$, so that $A\cap B^c=A-(A\cap B)\in\lambda(\mathcal{F})$ by #[lem:proper_diff_union]. Therefore, $B^c\in\mathcal{L}_{A}$. Also, if $\{B_n\}_n\subseteq \mathcal{L}_A$ is a pairwise disjoint collection, then the sets $A\cap B_n\in\lambda(\mathcal{F})$ are pairwise disjoint, so $A\cap\left(\bigsqcup_{n=1}^\infty B_n\right)=\bigsqcup_{n=1}^\infty(A\cap B_n)\in \lambda(\mathcal{F})$ and hence $\bigsqcup_{n=1}^\infty B_n\in\mathcal{L}_A$.
(2) Start with $A\in\mathcal{F}$. Then for any $B\in\mathcal{F}$, $A\cap B\in\mathcal{F}$ since $\mathcal{F}$ is a $\pi$-system. Therefore, $\mathcal{F}\subseteq \mathcal{L}_A$. Thus, for such $A$, $\lambda(\mathcal{F})=\mathcal{L}_A$ by (1) and minimality of $\lambda(\mathcal{F})$. Now, fix arbitrary $A\in\lambda(\mathcal{F})$. Then for any $B\in\mathcal{F}$, by the previous part $A\in\mathcal{L}_B$ so that $A\cap B\in\lambda(\mathcal{F})$. This implies $B\in\mathcal{L}_A$. Hence, $\mathcal{F}\subseteq \mathcal{L}_A$, which concludes the result.
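To see this lemma in action, one can generate $\lambda(\mathcal{F})$ and $\sigma(\mathcal{F})$ by brute force on a finite $\Omega$ and compare them. The sketch below is an illustration under stated assumptions, not a proof: `generate` is a hypothetical helper that closes a seed family under complements and pairwise unions (only disjoint ones when building a $\lambda$-system), which is enough on a finite $\Omega$. The two seed families are made-up examples, one a $\pi$-system and one not.

```python
def generate(seed, omega, disjoint_only):
    """Smallest family containing seed and the empty set that is closed under
    complements and pairwise unions (disjoint unions only if disjoint_only)."""
    family = set(seed) | {frozenset()}
    while True:
        new = {omega - a for a in family}
        new |= {a | b for a in family for b in family
                if not disjoint_only or not (a & b)}
        if new <= family:  # nothing new was produced: closure reached
            return family
        family |= new

def is_pi_system(family):
    """Closed under pairwise intersections."""
    return all((a & b) in family for a in family for b in family)

omega = frozenset({1, 2, 3, 4})

# A pi-system: a chain of nested sets.
pi_seed = {frozenset({1}), frozenset({1, 2}), frozenset({1, 2, 3})}
lam_pi = generate(pi_seed, omega, disjoint_only=True)
sig_pi = generate(pi_seed, omega, disjoint_only=False)

# Not a pi-system: {1,2} & {2,3} = {2} is missing.
bad_seed = {frozenset({1, 2}), frozenset({2, 3})}
lam_bad = generate(bad_seed, omega, disjoint_only=True)
sig_bad = generate(bad_seed, omega, disjoint_only=False)

print(is_pi_system(lam_pi), lam_pi == sig_pi)     # both hold for the pi-system seed
print(is_pi_system(lam_bad), lam_bad == sig_bad)  # both fail for the other seed
```

For the nested seed the generated $\lambda$-system is closed under intersections and coincides with the generated $\sigma$-algebra; for the second seed it stops at six sets, so the $\pi$-system hypothesis cannot be dropped.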
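Then, we can conclude with the main result, the $\pi-\lambda$ theorem: if $\mathcal{F}$ is a $\pi$-system and $\mathcal{L}$ is a $\lambda$-system with $\mathcal{F}\subseteq\mathcal{L}$, then $\sigma(\mathcal{F})\subseteq\mathcal{L}$.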
Proof.
Extending probability measures from algebras to $\sigma$-algebras.
A probability measure on a field (that is, an algebra) $\mathcal{F}$ extends uniquely to a probability measure on $\sigma(\mathcal{F})$; this is the Carathéodory extension theorem.
The proof of this is constructive. Define the following outer probability measure $\mathbb{P}^{\ast}:2^\Omega\to \mathbb{R}_{\geq 0}$:
$$ \begin{align} \mathbb{P}^{\ast}(A):=\inf_{A\subseteq \bigcup_n A_n,A_n\in\mathcal{F}}\sum_n \mathbb{P}(A_n)\label{eq:outer_measure} \end{align} $$
$\mathbb{P}^{\ast}$ defined in $\eqref{eq:outer_measure}$ has the following properties:
- $\mathbb{P}^{\ast}(\emptyset)=0$
- $\mathbb{P}^{\ast}$ is monotone; $A\subseteq B$ implies $\mathbb{P}^{\ast}(A)\leq \mathbb{P}^{\ast}(B)$
- $\mathbb{P}^{\ast}$ is countably subadditive; $\mathbb{P}^{\ast}(\bigcup_n A_n)\leq \sum_{n} \mathbb{P}^{\ast}(A_n)$
Proof.
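(1) Since $\emptyset\in \mathcal{F}$ covers itself, $\mathbb{P}^{\ast}(\emptyset)\leq \mathbb{P}(\emptyset)=0$; nonnegativity gives equality.
(2) Every countable $\mathcal{F}$-cover of $B$ is also a cover of $A$, so the infimum defining $\mathbb{P}^{\ast}(A)$ is taken over a larger collection of covers, and the statement follows.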
(3) Fix $\varepsilon>0$. For each $n$, choose $\{B_{n,k}\}\subseteq \mathcal{F}$ a cover for $A_n$ such that $\sum_k \mathbb{P}(B_{n,k})\leq \mathbb{P}^{\ast}(A_n)+\varepsilon/2^n$. Then: $$ \mathbb{P}^{\ast}(\bigcup_n A_n)\leq \sum_{n,k}\mathbb{P}(B_{n,k})\leq \sum_n \mathbb{P}^{\ast}(A_n)+\varepsilon. $$ Sending $\varepsilon\to 0^+$ concludes the result.
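The outer measure and its three properties can be computed explicitly in a toy case. The sketch below assumes a hypothetical four-point $\Omega$ with the algebra generated by the atoms $\{1,2\}$ and $\{3,4\}$ and made-up probabilities $3/10$ and $7/10$. Because this algebra is finite, repeating a set in a cover never lowers the sum, so the search over countable covers reduces to a search over finite sub-families.

```python
from fractions import Fraction
from itertools import combinations

omega = frozenset({1, 2, 3, 4})

# Hypothetical algebra with atoms {1,2} and {3,4}, and a probability on it.
P = {
    frozenset(): Fraction(0),
    frozenset({1, 2}): Fraction(3, 10),
    frozenset({3, 4}): Fraction(7, 10),
    omega: Fraction(1),
}

def outer(a):
    """Minimum of sum P(A_n) over covers of a by members of the algebra."""
    members = list(P)
    best = None
    for r in range(1, len(members) + 1):
        for cover in combinations(members, r):
            if a <= frozenset().union(*cover):
                cost = sum(P[c] for c in cover)
                if best is None or cost < best:
                    best = cost
    return best  # never None: omega alone always covers a

print(outer(frozenset()))        # 0
print(outer(frozenset({1})))     # 3/10, the cheapest cover is {1,2}
print(outer(frozenset({1, 3})))  # 1, both atoms are needed
```

Note that `outer` agrees with `P` on the algebra itself, while sets that split an atom, such as $\{1\}$, inherit the full mass of that atom.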
We define the following class of sets: $$ \mathcal{M}:=\{A\in 2^\Omega:\forall E\in 2^\Omega,\, \mathbb{P}^{\ast}(A\cap E)+\mathbb{P}^{\ast}(A^c\cap E)=\mathbb{P}^{\ast}(E)\}. $$ If $A\in\mathcal{M}$, we call $A$ $\mathbb{P}^{\ast}$-measurable. Remark that by sub-additivity, $A\in\mathcal{M}$ iff $\forall E\in 2^\Omega,\, \mathbb{P}^{\ast}(A\cap E)+\mathbb{P}^{\ast}(A^c\cap E)\leq \mathbb{P}^{\ast}(E)$.
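The measurability condition can likewise be tested exhaustively on a finite example. The following sketch assumes the same kind of hypothetical setup (a four-point $\Omega$, atoms $\{1,2\}$ and $\{3,4\}$ with made-up masses $3/10$ and $7/10$) and checks the splitting identity against every test set $E$. The names `outer` and `is_measurable` are illustrative.

```python
from fractions import Fraction
from itertools import combinations

omega = frozenset({1, 2, 3, 4})

# Hypothetical algebra with atoms {1,2} and {3,4}, and a probability on it.
P = {
    frozenset(): Fraction(0),
    frozenset({1, 2}): Fraction(3, 10),
    frozenset({3, 4}): Fraction(7, 10),
    omega: Fraction(1),
}

def outer(a):
    """Outer measure of a: cheapest cover by members of the (finite) algebra."""
    members = list(P)
    return min(sum(P[c] for c in cover)
               for r in range(1, len(members) + 1)
               for cover in combinations(members, r)
               if a <= frozenset().union(*cover))

subsets = [frozenset(c) for r in range(len(omega) + 1)
           for c in combinations(omega, r)]

def is_measurable(a):
    """Caratheodory condition: a splits every test set e additively."""
    return all(outer(a & e) + outer(e - a) == outer(e) for e in subsets)

measurable = {a for a in subsets if is_measurable(a)}
print(measurable == set(P))
```

In this example the measurable sets are exactly the four sets of the algebra. A set such as $\{1\}$ fails the test with $E=\{1,2\}$, since both pieces of $E$ receive outer measure $3/10$. If an atom had probability zero, its subsets would also pass, so equality with the algebra is a feature of this particular choice of masses.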
Proof.
Proof.
Since $\mathcal{M}$ is a field, it suffices to show closure under countable disjoint unions. Let $\{A_n\}_n\subseteq \mathcal{M}$ be a countably infinite collection of disjoint, measurable sets, and fix $E\in 2^\Omega$. Define $B_n=\bigsqcup_{k=1}^n A_k$. From the previous lemma, $B_n\in\mathcal{M}$. Applying the measurability of $A_n$ to the test set $E\cap B_n$, and using $B_n\cap A_n=A_n$ and $B_n\cap A_n^c=B_{n-1}$: $$ \begin{align*} \mathbb{P}^\ast(E\cap B_n)&=\mathbb{P}^\ast(E\cap B_n\cap A_n)+\mathbb{P}^\ast(E\cap B_n\cap A_n^c)\\ &=\mathbb{P}^\ast(E\cap A_n)+\mathbb{P}^\ast(E\cap B_{n-1}). \end{align*}$$
Using $B_0:=\emptyset$, we have by induction $$ \begin{align} \mathbb{P}^\ast(E\cap B_n)=\sum_{k=1}^n \mathbb{P}^\ast(E\cap A_k).\label{eq:finite_additivity} \end{align} $$
Let $B:=\bigsqcup_{k=1}^\infty A_k=\bigcup_{k=1}^\infty B_k$. Trivially $B^c\subseteq B_n^c$ for every $n\geq 1$. Therefore: $$ \begin{align} \mathbb{P}^\ast(E)=\mathbb{P}^\ast(E\cap B_n)+\mathbb{P}^\ast(E\cap B_n^c)\geq \sum_{k=1}^n \mathbb{P}^\ast(E\cap A_k)+\mathbb{P}^\ast(E\cap B^c).\label{eq:countable_additivity} \end{align} $$
Letting $n\to\infty$ in the above inequality and then applying countable subadditivity:
$$ \begin{align*} \mathbb{P}^\ast(E)&\geq \sum_{k=1}^\infty \mathbb{P}^\ast(E\cap A_k)+\mathbb{P}^\ast(E\cap B^c)\\ &\geq \mathbb{P}^\ast(E\cap B)+\mathbb{P}^\ast(E\cap B^c). \end{align*} $$
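Since the reverse inequality always holds by subadditivity, we see that $\mathbb{P}^\ast(E)=\mathbb{P}^\ast(E\cap B)+\mathbb{P}^\ast(E\cap B^c)$. As $E$ was arbitrary, $B\in\mathcal{M}$.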
Proof.
Proof of #[thm:caratheodory] .
By the $\pi-\lambda$ theorem ( #[thm:pi_lambda] ), using that $\mathcal{F}$ is a $\pi$-system contained in $\mathcal{M}$ ( #[lem:F_in_M] ) and that $\mathcal{M}$ is a $\lambda$-system ( #[lem:M_lambda] ), we get $\sigma(\mathcal{F})\subseteq \mathcal{M}$. Therefore $\mathbb{P}^{\ast}|_{\sigma(\mathcal{F})}$ is the desired extension provided it is countably additive on disjoint sets and $\mathbb{P}^{\ast}|_\mathcal{F}=\mathbb{P}$. For the first claim, finite disjoint additivity follows from $\eqref{eq:finite_additivity}$ in the proof of #[lem:M_lambda] by taking $E$ to be $\Omega$. Now, let $\{A_n\}_n\subseteq \mathcal{M}$ be disjoint with $B=\bigsqcup_{k=1}^\infty A_k$. Taking $E=B$ in $\eqref{eq:countable_additivity}$ gives $\mathbb{P}^{\ast}(B)\geq \sum_{k=1}^n \mathbb{P}^{\ast}(A_k)$ for every $n$, hence $\mathbb{P}^{\ast}(B)\geq \sum_{k=1}^\infty \mathbb{P}^{\ast}(A_k)$; the reverse inequality is countable subadditivity. For the second claim, fix $A\in \mathcal{F}$. Since $A$ covers itself, $\mathbb{P}^{\ast}(A)\leq \mathbb{P}(A)$. Conversely, if $\{A_n\}_n\subseteq\mathcal{F}$ covers $A$, then countable additivity of $\mathbb{P}$ on $\mathcal{F}$ (which yields monotonicity and countable subadditivity there) gives $\mathbb{P}(A)\leq \sum_n \mathbb{P}(A\cap A_n)\leq \sum_n \mathbb{P}(A_n)$; taking the infimum over covers, $\mathbb{P}(A)\leq \mathbb{P}^{\ast}(A)$.
For uniqueness, let $\mathbb{Q},\mathbb{Q}'$ be two extensions. Remark that $\{A\in\sigma(\mathcal{F}):\mathbb{Q}(A)=\mathbb{Q}'(A)\}$ is a $\lambda$-system containing the $\pi$-system $\mathcal{F}$ (by the definition of extension), so that $\mathbb{Q}$ and $\mathbb{Q}'$ agree over all sets in $\sigma(\mathcal{F})$ by the $\pi-\lambda$ theorem, i.e. $\mathbb{Q}=\mathbb{Q}'$.
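Sometimes, we will have a finitely additive $\mathbb{P}$ defined over $\mathcal{F}$, but will need to check countable additivity. There is a useful criterion to do so: $\mathbb{P}$ is countably additive on $\mathcal{F}$ if and only if $\mathbb{P}(B_n)\to 0$ for every decreasing sequence of events $B_n\in\mathcal{F}$ with $\bigcap_n B_n=\emptyset$.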
Proof.
$(\Rightarrow)$ This direction is continuity from above at $\emptyset$. Fix a decreasing sequence of events $B_n$ and suppose $\bigcap_n B_n=\emptyset$. Under this assumption, $B_1=\bigsqcup_{n=1}^\infty (B_n-B_{n+1})$. So by countable additivity: $$ \mathbb{P}(B_1)=\sum_{n=1}^\infty \mathbb{P}(B_n-B_{n+1})\leq 1<\infty. $$ In the same way, $\mathbb{P}(B_N)=\sum_{n=N}^\infty \mathbb{P}(B_n-B_{n+1})$ is the tail of a convergent series, so $\mathbb{P}(B_N)\to 0$ as $N\to\infty$.
$(\Leftarrow)$ Let $E_1,E_2,\dots$ be disjoint events whose union is again an event. Since they are disjoint, there is no $\omega$ that appears infinitely often in these $E_i$. Therefore, letting $B_n=\bigcup_{k=n}^\infty E_k$, $B_n$ is decreasing with $\bigcap_{n=1}^\infty B_n=\emptyset$. Thus, $\mathbb{P}(B_n)\to 0$ as $n\to\infty$.
We can write: $$ \left|\mathbb{P}\left(\bigcup_{n=1}^\infty E_n\right)-\sum_{i=1}^N\mathbb{P}(E_i)\right|=\mathbb{P}(B_{N+1}), $$ by finite additivity. Given $\varepsilon>0$, the right-hand side is less than $\varepsilon$ for all $N\geq M_\varepsilon$. So, take $N\to\infty$ and then $\varepsilon\to 0$ to conclude that $\mathbb{P}(\bigcup_{n=1}^\infty E_n)=\sum_{i=1}^\infty\mathbb{P}(E_i)$.
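The identity between the partial-sum gap and the tail probability $\mathbb{P}(B_{N+1})$ can be illustrated with exact arithmetic. The sketch below assumes a hypothetical geometric law on the positive integers, with $E_k=\{k\}$ and $\mathbb{P}(E_k)=2^{-k}$, and takes the closed form $\mathbb{P}(B_n)=2^{-(n-1)}$ for the tails as given rather than summing an infinite series. The function names are made up for illustration.

```python
from fractions import Fraction

def p_event(k):
    """P(E_k) = 2^{-k} for the singleton E_k = {k}, k >= 1 (assumed law)."""
    return Fraction(1, 2 ** k)

def p_tail(n):
    """P(B_n) for B_n = E_n u E_{n+1} u ..., using the geometric closed form."""
    return Fraction(1, 2 ** (n - 1))

total = p_tail(1)  # P(union of all E_k) = 1

def gap(N):
    """|P(union E_n) - sum_{i<=N} P(E_i)|, computed exactly."""
    return abs(total - sum(p_event(i) for i in range(1, N + 1)))

for N in (1, 5, 20):
    # The gap equals the tail P(B_{N+1}) and shrinks to 0 as N grows.
    print(N, gap(N), gap(N) == p_tail(N + 1))
```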