1.11:5

First, it is worth observing that any sigma-algebra is closed under pairwise intersections, since \(A\cap B = (A^c\cup B^c)^c\): if \(A\) and \(B\) are both in \(\mathcal{A}\), then so is \(A\cap B\). Hence, all the sets \(A\cap B\) at which the truncated measure is evaluated are actually measurable.

Now, for \(\nu\) to be a measure, we need to verify the conditions in the definition:

  1. \(\nu: \mathcal{A}\to [0,\infty]\). This holds because \(\nu(B) = \mu(A\cap B)\), and \(\mu\) takes values in \([0,\infty]\).
  2. Countable additivity: disjoint unions map to sums. This holds because if \(B_1, B_2, \dots\) are disjoint sets in \(\mathcal{A}\), then so are \(A\cap B_1, A\cap B_2, \dots\), and \(\mu\) is countably additive. So \[ \nu\left(\bigcup_{i=1}^\infty B_i\right) = \mu\left(A\cap\bigcup_{i=1}^\infty B_i\right) = \mu\left(\bigcup_{i=1}^\infty (A\cap B_i)\right) = \sum_{i=1}^\infty\mu\left(A\cap B_i\right) = \sum_{i=1}^\infty\nu\left(B_i\right) \]
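
Not part of the proof, but as a quick sanity check: on a finite sample space with a weighted counting measure, the additivity of the truncated measure can be verified directly. The weights and the sets `A`, `B1`, `B2`, `B3` below are arbitrary illustration values.

```python
# Finite sample space with a weighted counting measure (illustration only).
weight = {1: 0.5, 2: 1.0, 3: 0.0, 4: 2.5, 5: 0.25}

def mu(B):
    """A measure on the power set of {1,...,5}: sum of point weights."""
    return sum(weight[x] for x in B)

A = {1, 3, 5}                      # the fixed measurable set defining nu

def nu(B):
    """The truncated measure nu(B) = mu(A intersect B)."""
    return mu(A & B)

B1, B2, B3 = {1, 2}, {3, 4}, {5}   # pairwise disjoint sets
assert abs(nu(B1 | B2 | B3) - (nu(B1) + nu(B2) + nu(B3))) < 1e-12
```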

1.11:10

We need to check the axioms for a measure.

  1. Since \(\mu\) and \(\nu\) are measures, \(\mu(B)\in[0,\infty]\) and \(\nu(B)\in[0,\infty]\). Hence \(\eta(B)\) is the sum of two values in \([0,\infty]\), and thus also lies in \([0,\infty]\).
  2. Suppose \(B_1, B_2, \dots\) are disjoint sets in \(\mathcal{B}\). \[\begin{multline*} \eta\left(\bigcup_{i=1}^\infty B_i\right) = \mu\left(\bigcup_{i=1}^\infty B_i\right) + \nu\left(\bigcup_{i=1}^\infty B_i\right) = \\ \sum_{i=1}^\infty \mu(B_i) + \sum_{i=1}^\infty \nu(B_i) = \sum_{i=1}^\infty \left(\mu(B_i) + \nu(B_i)\right) = \sum_{i=1}^\infty \eta(B_i) \end{multline*}\]

Now we prove that \(\int f d\eta = \int f d\mu + \int f d\nu\).

Lemma: For an indicator function, \(\int 1_A d\eta = \int 1_A d\mu + \int 1_A d\nu\).

Proof: By integral property 1: \(\int 1_A d\eta = \eta(A) = \mu(A) + \nu(A) = \int 1_A d\mu + \int 1_A d\nu\).

Lemma: For a nonnegative simple \(f\), the claim holds.

Proof: By integral property 2, \[\begin{multline*} \int f d\eta = \int\sum_i a_i 1_{A_i} d\eta = \sum_i a_i\int 1_{A_i} d\eta = \sum_i a_i\eta(A_i) = \\ \sum_i a_i\left(\mu(A_i) + \nu(A_i)\right) = \sum_i a_i\mu(A_i) + \sum_i a_i\nu(A_i) = \\ \sum_i a_i\int 1_{A_i} d\mu + \sum_i a_i\int 1_{A_i} d\nu = \int \sum_i a_i 1_{A_i} d\mu + \int \sum_i a_i 1_{A_i} d\nu = \int f d\mu + \int f d\nu \end{multline*}\]

Theorem: For non-negative measurable \(f\), \(\int f d\eta = \int f d\mu + \int f d\nu\).

Proof: Take a non-decreasing sequence \(f_n\) of nonnegative simple functions converging pointwise to \(f\), so that for each of the three measures the integral of \(f\) is the limit of the integrals of the \(f_n\). By the lemma and the linearity of limits, \[ \int f d\eta = \lim_{n\to\infty} \int f_n d\eta = \lim_{n\to\infty} \left(\int f_n d\mu + \int f_n d\nu\right) = \lim_{n\to\infty} \int f_n d\mu + \lim_{n\to\infty} \int f_n d\nu = \int f d\mu + \int f d\nu \]
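
As an informal check (separate from the proof), the identity is easy to verify numerically for discrete measures on a finite set, where integrals reduce to weighted sums; the weights and the function \(f\) below are arbitrary illustration values.

```python
# Illustration only: for discrete measures on a finite set, integrals are
# weighted sums, so the identity can be checked directly.
points = [0, 1, 2, 3]
mu_w = {0: 0.2, 1: 0.8, 2: 1.5, 3: 0.0}
nu_w = {0: 1.0, 1: 0.1, 2: 0.4, 3: 2.0}
eta_w = {x: mu_w[x] + nu_w[x] for x in points}   # eta = mu + nu

def integral(f, w):
    """Integral of f against the discrete measure with point weights w."""
    return sum(f(x) * w[x] for x in w)

f = lambda x: x ** 2                             # any non-negative function
assert abs(integral(f, eta_w) - (integral(f, mu_w) + integral(f, nu_w))) < 1e-12
```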

1.11:17

The probability that \(X\) is even is the sum of the masses of the even outcomes:

\[ \mathbb{P}(\text{even}) = \sum_{i=0}^\infty \mathbb{P}(X=2i) = \sum_{i=0}^\infty \theta(1-\theta)^{2i} = \theta\sum_{i=0}^\infty (1-\theta)^{2i} \]

Setting \(z=(1-\theta)^2\), so that \(0\le z<1\) for \(\theta\in(0,1]\) and the geometric series converges, this rewrites to

\[ \mathbb{P}(\text{even}) = \theta\sum_{i=0}^\infty z^i = \frac{\theta}{1-z} = \frac{\theta}{1-(1-\theta)^2} = \frac{\theta}{1-(1-2\theta+\theta^2)} = \frac{\theta}{2\theta-\theta^2} = \frac{1}{2-\theta} \]
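
A quick numerical check of the closed form (the value of \(\theta\) is an arbitrary test choice):

```python
# Check P(X even) = 1/(2 - theta) for the pmf P(X = k) = theta * (1 - theta)**k.
theta = 0.3
partial_sum = sum(theta * (1 - theta) ** (2 * i) for i in range(1000))
assert abs(partial_sum - 1 / (2 - theta)) < 1e-12
```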

1.11:20

The probability that \(X\) is even is the sum of the masses of the even outcomes:

\[ \mathbb{P}(\text{even}) = \sum_{i=0}^\infty \mathbb{P}(X=2i) = \sum_{i=0}^\infty \frac{\lambda^{2i} e^{-\lambda}}{(2i)!} = e^{-\lambda}\sum_{i=0}^\infty\frac{\lambda^{2i}}{(2i)!} \]

Recall the Taylor expansion of \(e^x\): \[ e^x = \sum_{i=0}^\infty \frac{x^i}{i!} \]

We can notice that \[ e^x + e^{-x} = \sum_i \frac{x^i + (-x)^i}{i!} = \sum_j \frac{2x^{2j}}{(2j)!} \] since the odd terms cancel out.

From this it follows that \[ \mathbb{P}(\text{even}) = e^{-\lambda} \sum_i \frac{\lambda^{2i}}{(2i)!} = \frac{e^{-\lambda}}{2}\left(e^{\lambda}+e^{-\lambda}\right) = \frac{1+e^{-2\lambda}}{2} \]
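
A quick numerical check of the closed form (the value of \(\lambda\) is an arbitrary test choice; 50 terms are plenty for convergence):

```python
from math import exp, factorial

# Check P(X even) = (1 + exp(-2*lam)) / 2 for X ~ Poisson(lam).
lam = 1.7
partial_sum = sum(lam ** (2 * i) * exp(-lam) / factorial(2 * i) for i in range(50))
assert abs(partial_sum - (1 + exp(-2 * lam)) / 2) < 1e-12
```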

1.11:27

First off, the mean: \[ \mathbb{E}\begin{pmatrix}X \\ X^2\end{pmatrix} = \begin{pmatrix}\mathbb{E}X \\ \mathbb{E}X^2\end{pmatrix} \]

The mean of \(X\) is \[ \mathbb{E}X = \int_0^1 x p_X(x) dx = \int_0^1 x dx = 1/2 \]

And the mean of \(X^2\) is \[ \mathbb{E}X^2 = \int_0^1 x^2 p_X(x) dx = \int_0^1 x^2 dx = 1/3 \]

So the mean vector is \[ \mathbb{E}\begin{pmatrix}X \\ X^2\end{pmatrix} = \begin{pmatrix}1/2 \\ 1/3\end{pmatrix} \]

Now, writing \(V = (X, X^2)^T\), the covariance is \[ \mathbb{E}\left( (V-\mathbb{E}V)(V-\mathbb{E}V)^T \right) = \mathbb{E} \begin{pmatrix} (X-1/2)^2 & (X-1/2)(X^2-1/3) \\ (X-1/2)(X^2-1/3) & (X^2-1/3)^2 \end{pmatrix} \]

We’ll need to calculate three integrals (the matrix is symmetric, so the off-diagonal entries coincide):

\[\begin{align*} \int_0^1 (x-1/2)^2 p_X(x) dx &= \int_0^1 (x-1/2)^2 dx \\ &= \int_{-1/2}^{1/2} y^2 dy \\ &= \frac{1}{3\cdot2^3} - \frac{-1}{3\cdot2^3} \\ &= \frac{2}{3\cdot2^3} = \frac{1}{12} \\ \int_0^1 (x-1/2)(x^2-1/3) dx &= \int_0^1 x^3 - x^2/2 - x/3 + 1/6 dx \\ &= 1/4 - 1/6 - 1/6 + 1/6 \\ &= 1/12 \\ \int_0^1 (x^2-1/3)^2 dx &= \int_0^1 x^4 - 2/3 x^2 + 1/9 dx \\ &= 1/5 - 2/9 + 1/9 \\ &= 4/45 \end{align*}\]

So the covariance matrix is \[ \begin{pmatrix} 1/12 & 1/12 \\ 1/12 & 4/45 \end{pmatrix} \]
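
A Monte Carlo sanity check of the mean vector and covariance matrix (approximate agreement only; the seed and sample size are arbitrary):

```python
import numpy as np

# X ~ Uniform(0, 1); stack X and X^2 as rows and compare sample moments
# with (1/2, 1/3) and [[1/12, 1/12], [1/12, 4/45]].
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=1_000_000)
v = np.stack([x, x ** 2])          # row 0: X, row 1: X^2

print(v.mean(axis=1))              # approx [0.5, 0.3333]
print(np.cov(v))                   # approx [[0.0833, 0.0833], [0.0833, 0.0889]]
```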

1.11:45

By definition, \[ \mathbb{E}(f(X)Y|X) = \int f(x)y dQ_x(y) \] where \(Q_x\) is the conditional distribution of \(Y\) given \(X=x\).

Since the integral is over \(y\) alone, with \(x\) held fixed, \(f(x)\) is a constant and by linearity of the integral can be extracted: \[ \int f(x)y dQ_x(y) = f(x)\int ydQ_x(y) = f(x)\mathbb{E}(Y|X=x) \] Evaluating at \(x=X\) gives \(\mathbb{E}(f(X)Y|X) = f(X)\mathbb{E}(Y|X)\).
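
A discrete sanity check of the factorization (the function \(f\) and the conditional pmfs below are arbitrary toy choices):

```python
# With x fixed, f(x) is constant under the conditional pmf Q_x of Y, so
# E(f(X) Y | X = x) = f(x) E(Y | X = x).
f = lambda x: 3 * x + 1

# Q[x][y] = P(Y = y | X = x), one toy pmf for each value of x.
Q = {0: {1: 0.5, 2: 0.5}, 1: {1: 0.2, 2: 0.8}}

for x, pmf in Q.items():
    lhs = sum(f(x) * y * p for y, p in pmf.items())   # E(f(X) Y | X = x)
    rhs = f(x) * sum(y * p for y, p in pmf.items())   # f(x) E(Y | X = x)
    assert abs(lhs - rhs) < 1e-12
```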