As the semester was a bit herky-jerky in how we covered the text, here is a review of the important new material. (Note: the probability review in chapters 1-6 will not be directly included on the exam.)
Organizing and Describing Data
Plots: we learned contingency tables, histograms, stem-and-leaf diagrams, box-plots, and q-q plots.
Statistics: we learned the order statistics, the quantiles, the percentiles, the mean, the variance, the median, and the inter-quartile range.
Samples, Statistics and sampling distributions. (skip 8.5, 8.9, 8.11, 8.12, 8.13)
We learned what we meant by random sampling: i.i.d. random variables.
We defined the likelihood function.
We defined sufficient statistics – those which capture all the information in the data in one statistic, as far as the likelihood is concerned.
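For example (a standard illustration, not necessarily the text's), for an i.i.d. Bernoulli($p$) sample the likelihood is
$$L(p) = \prod_{i=1}^{n} p^{x_i}(1-p)^{1-x_i} = p^{\sum x_i}(1-p)^{n - \sum x_i},$$
which depends on the data only through $\sum x_i$, so $\sum_i X_i$ is a sufficient statistic for $p$.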
We discussed sampling distributions for various statistics
We talked about order statistics. Important for us are the distributions of the largest and the smallest; the others are found with a more complicated formula.
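In particular, for an i.i.d. sample with cdf $F$, the two standard results are
$$P(X_{(n)} \leq x) = F(x)^n, \qquad P(X_{(1)} \leq x) = 1 - (1 - F(x))^n.$$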
We found the mean and variance of the $\bar{X}$ statistic.
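For i.i.d. $X_i$ with mean $\mu$ and variance $\sigma^2$, these are
$$E(\bar{X}) = \mu, \qquad var(\bar{X}) = \frac{\sigma^2}{n}.$$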
We discussed the important central limit theorem.
In 8.10 we found out about the $\chi^2$ distribution and saw that $\displaystyle \frac{(n-1)S^2}{\sigma^2}$ has this distribution (with $n-1$ degrees of freedom).
Estimation (skip 9.11)
We discussed various measures of error: $T-\theta$, $|T-\theta|$, $(T-\theta)^2$, and the one we used a lot, the mean squared error $E((T-\theta)^2) = var(T) + (E(T-\theta))^2$.
We learned that the size of a sample determines the accuracy of the statistic. This is basically due to the central limit theorem: the error in $\bar{X}$ scales like $\sigma/\sqrt{n}$. A rule of thumb: to cut the error in half, take 4 times as many data points.
We defined consistency. Roughly speaking, it says your statistic converges to the value it estimates as $n$ gets large.
We discussed confidence intervals. First, those based on large sample sizes, where the statistic
$$\frac{\bar{X} - \mu}{S/\sqrt{n}}$$
is approximately normal.
Next, when $n$ is small, we used the $t$ distribution as appropriate.
We generalized to the notion of pivotal quantities. The procedure is clear:
Find a pivotal quantity $Y$ involving the parameter you are trying to estimate.
Use the distribution of $Y$ to solve $P(a < Y < b) = \alpha$ for some specified $\alpha$.
Algebraically solve for an interval containing your unknown parameter.
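As a rough computational sketch of this procedure (not from the text; the data are made up and `scipy` is assumed to be available), here is a 95% $t$-based interval built from the pivotal quantity $\frac{\bar{X}-\mu}{S/\sqrt{n}}$:

```python
import numpy as np
from scipy import stats

# hypothetical sample (illustration only)
x = np.array([4.1, 5.3, 4.8, 5.9, 4.6, 5.1, 4.9, 5.4])
n = len(x)
xbar, s = x.mean(), x.std(ddof=1)               # sample mean and sample standard deviation

alpha = 0.05
t_star = stats.t.ppf(1 - alpha / 2, df=n - 1)   # P(-t_star < T < t_star) = 1 - alpha for T ~ t(n-1)

# algebraically solve |(xbar - mu)/(s/sqrt(n))| < t_star for mu
half_width = t_star * s / np.sqrt(n)
print(f"95% interval for mu: ({xbar - half_width:.3f}, {xbar + half_width:.3f})")
```

Here the last algebra step is just solving $|\bar{X}-\mu| < t^{*} S/\sqrt{n}$ for $\mu$.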
We turned our attention in 9.8 to differences between samples.
In 9.9 we estimated $\sigma$ using a known pivotal quantity (the quantity $(n-1)S^2/\sigma^2$ from 8.10).
In 9.10 we discussed two ways to derive estimators: method of moments vs. the maximum likelihood estimate.
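A standard example of the contrast (not necessarily the one worked in the text): for an i.i.d. Uniform$(0, \theta)$ sample, $E(X) = \theta/2$, so the method of moments gives $\hat{\theta} = 2\bar{X}$, while the likelihood $L(\theta) = \theta^{-n}$ for $\theta \geq \max_i x_i$ is maximized by the largest order statistic, so the MLE is $\hat{\theta} = X_{(n)}$.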
In 9.12 we talked briefly about efficiency of an estimator. This basically compares estimators by how large their sampling variances are.
Significance testing. (skip 10.5, 10.6)
We learned how to identify the hypotheses $H_0$ and $H_A$.
We learned how to calculate a $p$-value. (pg. 423)
We learned how to do a significance test. (pg. 424) (How different is this from a decision test?)
We learned to compute $p$-values for
one sample $z$ tests
one sample $t$ tests
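As a computational sketch (made-up data, `scipy` assumed available), the two-sided $p$-values for these tests can be computed as:

```python
import numpy as np
from scipy import stats

# hypothetical data and null value (illustration only)
x = np.array([10.2, 9.8, 10.5, 10.1, 9.9, 10.4, 10.0, 10.3])
mu0 = 10.0
n, xbar, s = len(x), x.mean(), x.std(ddof=1)

test_stat = (xbar - mu0) / (s / np.sqrt(n))

# one sample z test: compare to a standard normal (appropriate for large n)
p_z = 2 * stats.norm.sf(abs(test_stat))

# one sample t test: compare to a t distribution with n - 1 degrees of freedom (small n)
p_t = 2 * stats.t.sf(abs(test_stat), df=n - 1)

print(p_z, p_t)
```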
Tests as decision rules. (skip 11.2, 11.6, 11.7, 11.9)
We defined a rejection region or critical region.
We defined type-I errors and type-II errors, and we defined the significance level of a test,
$$P(\text{reject } H_0 \mid H_0 \text{ is true}).$$
We skipped 11.3 and 11.4 for the most part.
In 11.5 we talked about “powerful” tests. In the simplest case, a most powerful test is one for which, for a fixed type-I error, the type-II error is as small as possible.
We talked briefly about likelihood ratio tests. The statistic
$$\Lambda = \frac{\sup_{H_0}L(\theta)}{\sup_{H_0 \cup H_A}L(\theta)}$$
is called the likelihood ratio for obvious reasons, as it is a ratio of likelihoods. Looking at it, you can see that small values are evidence against the null hypothesis. Rejection regions then take the form $\Lambda < K$.
The value of this statistic is not that it is the most powerful (although it is in the simplest cases) but that it provides a uniform means to assess problems.
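For instance (a standard special case, not necessarily worked in the text): with normal data, known $\sigma$, and $H_0: \mu = \mu_0$ against a two-sided alternative, the ratio works out to
$$\Lambda = \exp\!\left(-\frac{n(\bar{x} - \mu_0)^2}{2\sigma^2}\right),$$
so rejecting for small $\Lambda$ is the same as rejecting for large $|\bar{x} - \mu_0|$, i.e. the usual $z$ test.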
Two sample tests (skip 12.5-12.8)
The notion of treatment effects was discussed. This was summarized by me in the model
$$X_{ij} = \mu + \tau_i + \epsilon_{ij}, \quad 1 \leq j \leq n_i$$
The $\tau_i$ are the average effects due to treatment, and the $\epsilon_{ij}$ are the mean-0 errors. The null hypothesis we tested was that each $\tau_i = 0$, i.e., that the groups all have the same mean.
The two sample z-test is the topic of section 12.2. For large $n$ we can assume the central limit theorem applies, and the statistic employed has an approximately Normal(0, 1) distribution.
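In the usual setup (the notation here may differ slightly from the text's), that statistic has the form
$$Z = \frac{\bar{X} - \bar{Y}}{\sqrt{S_X^2/n_X + S_Y^2/n_Y}},$$
which is approximately Normal(0, 1) under the null hypothesis that the two means are equal.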
In 12.3 population proportions are compared. This is similar to 12.2, only the formulas simplify under the null hypothesis. (In fact, you don't need to use an $S$ for the standard deviation, but $\sqrt{p(1-p)}$.)
In 12.4 we did two sample $t$ tests.
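For a quick computational check (made-up data, `scipy` assumed available), the two sample $t$ test can be run as:

```python
from scipy import stats

# hypothetical samples (illustration only)
x = [12.1, 11.8, 12.5, 12.0, 11.9]
y = [11.2, 11.5, 11.0, 11.4, 11.6]

t_stat, p_value = stats.ttest_ind(x, y)   # pooled-variance two sample t test by default
print(t_stat, p_value)
```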
Goodness of fit
Know how to do a chi-squared test as illustrated in sections 13.1, 13.2 and 13.5. In each case the formula used is
$$\chi^2 = \sum \frac{(\text{Observed} - \text{Expected})^2}{\text{Expected}}$$
Under the correct assumptions (large $n$), this statistic has a chi-squared distribution with $k - 1 - r$ degrees of freedom. Know how to figure out what $k$ and $r$ are.
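A small computational sketch (made-up die-rolling counts, `scipy` assumed available; here $k = 6$ cells and $r = 0$ parameters are estimated from the data):

```python
import numpy as np
from scipy import stats

# hypothetical observed counts for the six faces of a die (illustration only)
observed = np.array([18, 22, 16, 25, 20, 19])
expected = np.full(6, observed.sum() / 6)    # expected counts under a fair-die null hypothesis

chi2 = ((observed - expected) ** 2 / expected).sum()
k, r = 6, 0                                  # 6 cells, no parameters estimated from the data
p_value = stats.chi2.sf(chi2, df=k - 1 - r)
print(chi2, p_value)
```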