Review for the midterm

The midterm will be October 18th. It will be in-person. I still need to get the room number to you!

The test will cover material in chapters 3, 4, 5, 6, and 7.

What the semester has been about so far.

The semester has been building up to just 3 sections; other efforts were to ensure the probabilistic language of those 3 sections was understood.

Section 7.2

This is the critical section so far. It is where we finally get down to doing statistics, though specialized to the case that our sample, $Y_1, Y_2, \dots, Y_n$, is independent, identically distributed, and drawn from a normal distribution.

The latter allows many computations to be done. The first two are what we are now calling i.i.d.

So, just to understand the three things above, we needed the material of the earlier chapters.


Starting with the $Y$s we then considered $\bar{Y} = (1/n) \sum_{i=1}^n Y_i$.

With that we have our first key results in statistics:

$\bar{Y}_n$ has a normal distribution (with mean $\mu$ and variance $\sigma^2/n$), which can be used to find confidence intervals.

That is, we can find an interval around $\bar{Y}$ that contains $\mu$ with a specified probability; for a given observed value $\bar{y}$, this interval is called a confidence interval.
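The coverage claim can be checked by simulation. Here is a minimal sketch using only the standard library; the particular values of $\mu$, $\sigma$, $n$, and the seed are illustrative choices, not from the text:

```python
import random
import statistics
from statistics import NormalDist

random.seed(1)
mu, sigma, n = 5.0, 2.0, 25            # illustrative population and sample size
z = NormalDist().inv_cdf(0.975)        # two-sided 95% critical value
half = z * sigma / n ** 0.5            # half-width of the interval

covered = 0
trials = 2000
for _ in range(trials):
    ybar = statistics.fmean(random.gauss(mu, sigma) for _ in range(n))
    if ybar - half <= mu <= ybar + half:
        covered += 1

print(covered / trials)                # should land close to 0.95
```

Roughly 95% of the simulated intervals $\bar{y} \pm z_{0.975}\,\sigma/\sqrt{n}$ do contain $\mu$, which is exactly what "95% confidence" means.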


Next, rather than $(\sum Y_i)/n$, the next distribution to consider was that of $\sum_{i=1}^n Z_i^2$, where each $Z_i$ is a standard normal. This distribution is chi-squared with $n$ degrees of freedom (a parameter).

The key computation involves two things:


With that distribution under your belt, you can show that $(n-1)S^2/\sigma^2$ has a chi-squared distribution with $n-1$ degrees of freedom.
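A quick simulation makes the claim concrete: a chi-squared with $k$ degrees of freedom has mean $k$ and variance $2k$, so for $n = 10$ the statistic $(n-1)S^2/\sigma^2$ should average about 9 with variance about 18. This is a stdlib-only sketch; the parameter values and seed are illustrative:

```python
import random
import statistics

random.seed(2)
mu, sigma, n = 0.0, 3.0, 10
vals = []
for _ in range(20000):
    ys = [random.gauss(mu, sigma) for _ in range(n)]
    s2 = statistics.variance(ys)       # sample variance S^2 (divides by n-1)
    vals.append((n - 1) * s2 / sigma**2)

# Chi-squared with n-1 = 9 degrees of freedom: mean 9, variance 18.
print(statistics.fmean(vals), statistics.variance(vals))
```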

Key to that is a proof we avoided, but in the simple case of just 2 random variables (p358) comes down to the fact that $U_1 = Y_1+Y_2$ and $U_2 = Y_1 - Y_2$ are independent. (And more generally, for a normal i.i.d. sample $\bar{Y}$ and $S$ are independent.)
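Since $\mathrm{Cov}(Y_1+Y_2, Y_1-Y_2) = \mathrm{Var}(Y_1) - \mathrm{Var}(Y_2) = 0$ for an i.i.d. pair, and zero covariance implies independence for jointly normal variables, a simulation should show essentially no covariance between $U_1$ and $U_2$. A small sketch (seed and sample count are arbitrary):

```python
import random
from statistics import fmean

random.seed(3)
pairs = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(50000)]
u1 = [y1 + y2 for y1, y2 in pairs]
u2 = [y1 - y2 for y1, y2 in pairs]

# Sample covariance of U1 = Y1 + Y2 and U2 = Y1 - Y2; should be near 0.
m1, m2 = fmean(u1), fmean(u2)
cov = fmean((a - m1) * (b - m2) for a, b in zip(u1, u2))
print(cov)
```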

To see that they are independent, we needed (6.6) giving us a technique to compute the joint pdf of $U_1$ and $U_2$.


The normalized statistic $\sqrt{n}(\bar{Y}_n - \mu)/\sigma$ can be computed assuming both $\mu$ and $\sigma$ are known. We will see soon, in Chapter 8, that it is more useful to have only one unknown parameter. Substituting $S$ for $\sigma$ leads to $T=\sqrt{n}(\bar{Y} - \mu)/S$. It can be shown (could you do that on a test? it appears on p360) that this statistic is of the form $Z/\sqrt{W/\nu}$, where $Z$, as usual, is a standard normal and $W$ is chi-squared with $\nu$ degrees of freedom. The distribution of $Z/\sqrt{W/\nu}$ is called the $t$-distribution with $\nu$ degrees of freedom. (The chi-squared in $T$ has $n-1$ degrees of freedom, so $T$ has a $t$-distribution with $n-1$ degrees of freedom.)
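You can see the heavier tails of $T$ by simulation: a $t$ with $\nu$ degrees of freedom has mean $0$ and variance $\nu/(\nu - 2)$, so for $n = 6$ (so $\nu = 5$) the variance should be about $5/3$, noticeably bigger than the standard normal's 1. A stdlib-only sketch with arbitrary illustrative parameters:

```python
import random
import statistics

random.seed(4)
mu, sigma, n = 10.0, 2.0, 6
ts = []
for _ in range(40000):
    ys = [random.gauss(mu, sigma) for _ in range(n)]
    ybar = statistics.fmean(ys)
    s = statistics.stdev(ys)                  # sample standard deviation S
    ts.append(n ** 0.5 * (ybar - mu) / s)     # T = sqrt(n)(Ybar - mu)/S

# t with nu = n-1 = 5 degrees of freedom: mean 0, variance nu/(nu-2) = 5/3.
print(statistics.fmean(ts), statistics.variance(ts))
```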


Finally, the $F$ distribution was introduced: the distribution of $(W_1/\nu_1)/(W_2/\nu_2)$ for independent chi-squared variables $W_1$ and $W_2$. This is something you need to know, but we didn't do any computations with it, mostly because they don't simplify as nicely as the $T$.

7.3 The CLT

The central limit theorem (CLT) is the major theorem of statistics. It says that the limit of the c.d.f. of $\sqrt{n}(\bar{Y}_n - \mu)/\sigma$ is the c.d.f. of the standard normal, under the simple assumption that the $Y_i$ are an i.i.d. sample from some distribution with mean $\mu$ and variance $\sigma^2$.

This is exactly true when the population is normal; the big deal is that it applies to any population (the common distribution of each $Y_i$), assuming only that the mean and variance are finite.

From a statistical point of view, the CLT describes, in probabilistic language, how far $\bar{Y}$ is likely to be from $\mu$. This allows discussion of confidence intervals.
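To see the "any population" part in action, here is a stdlib-only sketch using a decidedly non-normal (skewed exponential) population; the rate, sample size, and seed are illustrative choices:

```python
import random
import statistics
from statistics import NormalDist

random.seed(5)
lam, n = 1.0, 40                  # exponential population: mean 1, variance 1
mu, sigma = 1 / lam, 1 / lam

zs = []
for _ in range(20000):
    ybar = statistics.fmean(random.expovariate(lam) for _ in range(n))
    zs.append(n ** 0.5 * (ybar - mu) / sigma)

# For a standard normal, P(Z <= 1) is about 0.8413; the simulated
# proportion should be close despite the skewed population.
prop = sum(z <= 1 for z in zs) / len(zs)
print(prop, NormalDist().cdf(1))
```

Even at the modest sample size $n = 40$, the standardized mean already behaves nearly like a standard normal.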

7.5 The normal approximation to the binomial

The binomial distribution (3.4) is generated by a sum of i.i.d. Bernoulli random variables. Bernoulli random variables are also used to define the geometric distribution (3.5) and the negative binomial distribution (3.6). Binomial random variables assume independence. Related random variables, not assuming independence, are characterized by the hypergeometric distribution (3.7).

Section 7.5 shows one case of a limit of binomial random variables; the Poisson in (3.8) was another. Applying the CLT to the binomial yields the normal approximation to the binomial, a useful tool for calculating probabilities.
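Here is one way to see how good the approximation is, comparing an exact binomial probability against a normal c.d.f. with continuity correction; the values $n = 50$, $p = 0.4$ are an illustrative choice:

```python
from math import comb
from statistics import NormalDist

n, p = 50, 0.4
mu = n * p                           # binomial mean np
sigma = (n * p * (1 - p)) ** 0.5     # binomial standard deviation

# Exact P(Y <= 25) versus the normal approximation with continuity correction.
exact = sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(26))
approx = NormalDist(mu, sigma).cdf(25.5)
print(exact, approx)
```

The two numbers agree to about two decimal places, which is typical when $np$ and $n(1-p)$ are both reasonably large.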

Sample problems.

Probably the best source of potential test problems is to review the HW problems, but here are a few for you to mull over if you want.

Discrete

Continuous

Transformations

Order statistics

Let $U=X_{(1)}$ and $V=X_{(n)}$ be the smallest and largest values of an iid sequence of Uniform(0,1) random variables.

In 6.10, we learned their joint density is given by

$$~ f(u,v) = n \cdot (n-1) \cdot (v-u)^{n-2}, 0 \leq u \leq v \leq 1 ~$$

Now let $D = V - U$. The density of $D$ can be found using the techniques of chapter 6. It will be

$$~ n \cdot (n-1) \cdot (1-d) d^{n-2}, 0 \leq d \leq 1 ~$$

What is the mean and variance of $D$? (Think before you integrate.)
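Once you have an answer, a simulation is a quick way to check it. This stdlib-only sketch takes $n = 5$ (an arbitrary choice) and estimates the mean and variance of the range $D$ so you can compare them with your integrals:

```python
import random

random.seed(6)
n = 5
ds = []
for _ in range(30000):
    xs = sorted(random.random() for _ in range(n))
    ds.append(xs[-1] - xs[0])        # range D = V - U of a Uniform(0,1) sample

# Simulated mean and variance of D; compare with your computed answers.
mean_d = sum(ds) / len(ds)
var_d = sum((d - mean_d) ** 2 for d in ds) / (len(ds) - 1)
print(mean_d, var_d)
```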

Sampling

Suppose a researcher performs a survey where they call at random 100 people from a list of 10,000 phone numbers, never re-dialing the same number. Suppose, miraculously, that all 100 people answer and respond. Let $\hat{p}$ be the proportion who said "yes." It is not true that the distribution of $\hat{p}$ is normal. However, specify all the steps one needs to argue that it is approximately normal. (I.e., how to apply the CLT to this question.)

Compute

Let $X_1, \dots, X_{9}$ be an iid sample from a normal distribution with mean $\mu$ and standard deviation $\sigma$. Find the value $b$ for which $P(\bar{X} - \mu < b) = 0.95$.
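If you want to check a symbolic answer numerically, here is one way, taking $\sigma = 1$ for concreteness (that value is an assumption for illustration, not part of the problem). Since $\bar{X} - \mu$ is normal with mean $0$ and standard deviation $\sigma/\sqrt{n}$, $b$ is the 0.95 quantile of that distribution:

```python
from statistics import NormalDist

sigma, n = 1.0, 9
# Xbar - mu is normal with mean 0 and standard deviation sigma / sqrt(n),
# so b solves NormalDist(0, sigma / sqrt(n)).cdf(b) = 0.95.
b = NormalDist(0, sigma / n ** 0.5).inv_cdf(0.95)
print(b)   # sigma/3 times the 0.95 standard normal quantile
```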

Gosset

An internet source has this to say about the work of Gosset (aka Student) that drove the investigation of the $T$ statistic:

Young chemists at Guinness Brewery, then the world's largest, designed many field and lab experiments to determine the best barley, best hops, best temperatures for brewing, etc.

They began to accumulate data and, at once, they ran into difficulties because their measurements varied. The effects they were looking for were not usually clearcut or consistent, as they had expected, and they had no way of judging whether the differences they found were effects of treatment or accident.

Two difficulties were confounded: the variation was high and the observations were few. The young research brewers worked well together; some were very close friends. Each seemed to fit into his own role in brewery affairs. And to them it seemed natural to take their numerical problems to Gosset. He had done some mathematics at Oxford and seemed less scared of mathematics than they were. (Gosset was around 23.)

The term "variation was high" speaks to what parameter? The term "observations were few" speaks to what symbol?