4.7:26

The joint density is \[ \prod_i \frac{1}{\sqrt{2\pi m_i\sigma^2}} \exp\left[-\frac{(x_i-m_i\mu)^2}{2m_i\sigma^2}\right] = \frac{1}{(2\pi\sigma^2)^{n/2}\sqrt{\prod_i m_i}} \exp\left[-\sum\frac {x_i^2-2m_i\mu x_i+m_i^2\mu^2} {2m_i\sigma^2}\right] = \\ \color{orange}{h(\sigma^2)} \exp\left[ -\left( \color{green}{\frac{1}{2\sigma^2}} \color{blue}{\sum\frac{x_i^2}{m_i}} \color{green}{-\frac{\mu}{\sigma^2}} \color{blue}{\sum x_i} + \color{orange}{\frac{\mu^2}{2\sigma^2}\sum m_i} \right) \right] \]

This is a two-dimensional full-rank exponential family with \(T(x) = \left(\sum x_i^2/m_i,\ \sum x_i\right)^T\), so \(T\) is a complete sufficient statistic.
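
To spell out the full-rank claim, the natural parameters read off the exponent above are
\[
\eta_1 = -\frac{1}{2\sigma^2}, \qquad \eta_2 = \frac{\mu}{\sigma^2},
\]
and as \((\mu,\sigma^2)\) ranges over \(\mathbb{R}\times(0,\infty)\) the pair \((\eta_1,\eta_2)\) ranges over the open set \((-\infty,0)\times\mathbb{R}\), which contains a two-dimensional open rectangle.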

a)

Since \(X_i\sim\mathcal{N}(m_i\mu,m_i\sigma^2)\), we have \(\mathbb{E}\sum X_i=\mu\sum m_i\), \(\mathbb{E}\sum X_i^2/m_i=n\sigma^2+\mu^2\sum m_i\), and \(\mathbb{E}\left(\sum X_i\right)^2=\sigma^2\sum m_i+\mu^2\left(\sum m_i\right)^2\). So the estimators \[ \hat\mu = \frac{\sum X_i}{\sum m_i} \qquad S^2 = \frac{1}{n-1}\left[\sum\frac{X_i^2}{m_i}-\frac{\left(\sum X_i\right)^2}{\sum m_i}\right] \] are unbiased estimators of \(\mu\) and \(\sigma^2\) respectively: \(\mathbb{E}\hat\mu=\mu\) directly, and \(\mathbb{E}S^2=\frac{1}{n-1}\left[n\sigma^2+\mu^2\sum m_i-\sigma^2-\mu^2\sum m_i\right]=\sigma^2\). Since they are functions exclusively of the complete sufficient statistic \(T\), their expected values conditioned on \(T\) are the estimators themselves, so these are UMVU.
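
As a quick numerical sanity check (not part of the argument), here is a short simulation sketch assuming numpy; the weights and parameter values are arbitrary choices for illustration.

```python
import numpy as np

# Monte Carlo sanity check: simulate X_i ~ N(m_i*mu, m_i*sigma^2) and verify
# that mu_hat and S^2 above are unbiased for mu and sigma^2.
rng = np.random.default_rng(0)
mu, sigma2 = 1.5, 2.0                      # arbitrary parameter values
m = np.array([1.0, 2.0, 3.0, 5.0, 8.0])    # arbitrary known weights m_i
n = len(m)

reps = 200_000
X = rng.normal(loc=m * mu, scale=np.sqrt(m * sigma2), size=(reps, n))

mu_hat = X.sum(axis=1) / m.sum()
S2 = (np.sum(X**2 / m, axis=1) - X.sum(axis=1)**2 / m.sum()) / (n - 1)

print(mu_hat.mean(), S2.mean())            # should be close to mu and sigma^2
```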

b)

We follow the structure of example 3.22.

Suppose \(\sigma^2\) is known. Then the family is a one-parameter full-rank exponential family in \(\mu\) with natural statistic \(\sum X_i\), so \(\sum X_i\) (and hence \(\hat\mu\)) is complete sufficient for \(\mu\). The estimator \(S^2\) is ancillary: it is invariant under the shift \(X_i \to X_i + m_i c\), which carries the model with mean parameter \(\mu\) into the one with mean parameter \(\mu+c\), so its distribution does not depend on \(\mu\); see the computation below.
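
Explicitly, under the shift \(X_i \to X_i + m_i c\),
\[
\sum\frac{(X_i+m_ic)^2}{m_i} - \frac{\left(\sum (X_i+m_ic)\right)^2}{\sum m_i}
= \left[\sum\frac{X_i^2}{m_i} + 2c\sum X_i + c^2\sum m_i\right]
- \left[\frac{\left(\sum X_i\right)^2}{\sum m_i} + 2c\sum X_i + c^2\sum m_i\right],
\]
so the shift terms cancel and \(S^2\) is unchanged.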

Since \(S^2\) is ancillary and \(\hat\mu\) is complete sufficient, by Basu's theorem they are independent. This holds for every fixed value of \(\sigma^2\), so \(\hat\mu\) and \(S^2\) are independent under every \((\mu,\sigma^2)\).

Suppose instead \(\mu\) is known.

4.7:30

a)

With \(\theta = (\alpha, \beta)^T\), the Fisher information matrix has entries

\[ \mathcal{I}(\theta)_{i,j} = \mathbb{E}\left[ \frac{\partial\log p(X|\theta)}{\partial\theta_i} \frac{\partial\log p(X|\theta)}{\partial\theta_j} \right] \]

The joint log density is \[ \log p(X|\alpha,\beta) = -n\log\sqrt{2\pi} - \frac{1}{2}\sum X_i^2 +\sum(\alpha+\beta t_i) X_i - \frac{1}{2}\sum(\alpha+\beta t_i)^2 \]

So the partial derivatives are \[ \frac{\partial\log p(X|\theta)}{\partial\alpha} = \sum X_i - \sum(\alpha+\beta t_i) \\ \frac{\partial\log p(X|\theta)}{\partial\beta} = \sum t_i X_i - \sum t_i(\alpha+\beta t_i) \] and the full information matrix is \[ \mathcal{I}(\alpha,\beta) = \begin{pmatrix} n & \sum t_i \\ \sum t_i & \sum t_i^2 \end{pmatrix} \]
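
The entries come from independence and unit variance of the \(X_i\): writing \(\varepsilon_i = X_i-\alpha-\beta t_i\), so that \(\mathbb{E}[\varepsilon_i\varepsilon_j]=\delta_{ij}\),
\[
\mathbb{E}\left[\left(\sum \varepsilon_i\right)^2\right] = n, \qquad
\mathbb{E}\left[\left(\sum \varepsilon_i\right)\left(\sum t_j\varepsilon_j\right)\right] = \sum t_i, \qquad
\mathbb{E}\left[\left(\sum t_i\varepsilon_i\right)^2\right] = \sum t_i^2.
\]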

b)

Estimating \(\alpha\) amounts to estimating \(g(\alpha, \beta)=\alpha\), which has gradient \(\nabla g=(1, 0)^T\). The multivariate variance bound is \[ \nabla g^T\mathcal{I}^{-1}\nabla g = \begin{pmatrix} 1 & 0 \end{pmatrix} \begin{pmatrix} n & \sum t_i \\ \sum t_i & \sum t_i^2 \end{pmatrix}^{-1} \begin{pmatrix} 1 \\ 0 \end{pmatrix} \] which simply picks out the upper-left entry of the inverse matrix. This comes out to \[ \sum t_i^2\Big/\left[n\sum t_i^2-\left(\sum t_i\right)^2\right] = \frac{1/n}{1-\frac{(\sum t_i/n)^2}{\sum t_i^2/n}} \]

c)

The Fisher information for known \(\beta\) is simply \[ \mathcal{I}(\alpha) = -\mathbb{E}\left[\frac{\partial^2\log p(X|\alpha)}{\partial\alpha^2}\right] = -\frac{\partial^2}{\partial\alpha^2}\left[ -n\log\sqrt{2\pi} - \frac{1}{2}\sum X_i^2 +\sum(\alpha+\beta t_i) X_i - \frac{1}{2}\sum(\alpha+\beta t_i)^2 \right] = n \]

So our variance bound is \(\mathbb{V}\hat\alpha\geq1/n\).

d)

The two bounds agree whenever \(\sum t_i=0\). When the \(t_i\) do not sum to zero, knowing \(\beta\) gives a strictly smaller lower bound.
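
As a concrete illustration with arbitrarily chosen design points, take \(t=(1,2,3)\), so \(n=3\), \(\sum t_i=6\), \(\sum t_i^2=14\):
\[
\frac{\sum t_i^2}{n\sum t_i^2-\left(\sum t_i\right)^2} = \frac{14}{42-36} = \frac{7}{3}
\qquad\text{versus}\qquad \frac{1}{n}=\frac{1}{3},
\]
so not knowing \(\beta\) inflates the bound by a factor of 7 in this case.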

e)

When \(g(\alpha,\beta)=\alpha\beta\), the gradient is \((\beta,\alpha)^T\) and the bound comes out to \[ \frac{1}{n\sum t_i^2-\left(\sum t_i\right)^2}\cdot \begin{pmatrix} \beta & \alpha \end{pmatrix} \begin{pmatrix} \sum t_i^2 & -\sum t_i \\ -\sum t_i & n \end{pmatrix} \begin{pmatrix} \beta \\ \alpha \end{pmatrix} = \frac{\beta^2\sum t_i^2-2\alpha\beta\sum t_i + n\alpha^2} {n\sum t_i^2-\left(\sum t_i\right)^2} \]
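
As a final sanity check, here is a short numerical sketch (assuming numpy; the values of \(\alpha\), \(\beta\), and the \(t_i\) are arbitrary) comparing the quadratic form \(\nabla g^T\mathcal{I}^{-1}\nabla g\) against the closed-form expression above.

```python
import numpy as np

# Numerical check of the part (e) bound: grad^T I^{-1} grad vs. the closed form.
alpha, beta = 0.7, -1.3                       # arbitrary parameter values
t = np.array([0.5, 1.0, 2.0, 4.0])            # arbitrary design points t_i
n = len(t)

fisher = np.array([[n, t.sum()],
                   [t.sum(), (t**2).sum()]])  # Fisher information matrix
grad = np.array([beta, alpha])                # gradient of g(alpha, beta) = alpha*beta

bound_quadform = grad @ np.linalg.solve(fisher, grad)
bound_closed = (beta**2 * (t**2).sum() - 2*alpha*beta*t.sum() + n*alpha**2) \
               / (n * (t**2).sum() - t.sum()**2)

print(bound_quadform, bound_closed)           # the two expressions should agree
```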