Test-1 For Statistics 410
Here is the exam. If you have any questions, please e-mail. If you can’t get this exam to print properly, a PostScript copy of this exam is available from
http://www.math.csi.cuny.edu/ verzani/410/test-1.ps
(except for the figure.) A PDF copy of this exam is available from
http://www.math.csi.cuny.edu/ verzani/410/test1.pdf
(Thanks to Armin for this format)
There are 11 questions. Please do any 10. It is due on Tuesday the 16th.
Answer the questions below based on this stem and leaf diagram of test scores from a Bio 123 class.
stem | leaves
  10 |
   9 | 1 4 4 7
   8 | 1 1 3 4 5 5 5 7 8
   7 | 0 0 2 3 8 8 9
   6 | 0 4 8
   5 | 2 3 3
   4 | 7
   3 |
   2 | 8 8 9
Find the 5 number summary for the data.
Make a box plot.
Make a histogram.
Which students deserve a failing grade?
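One way to check a hand computation of the five number summary is sketched below. The scores are transcribed from the stem and leaf diagram (stem times 10 plus leaf). The quartile rule used here, the median of each half of the sorted data, is an assumption; the convention from class may differ slightly, so treat the quartiles as a cross-check rather than the required answer.

```python
from statistics import median

# Stem -> leaves, transcribed from the diagram in this question.
STEM_LEAF = {
    10: [],
    9: [1, 4, 4, 7],
    8: [1, 1, 3, 4, 5, 5, 5, 7, 8],
    7: [0, 0, 2, 3, 8, 8, 9],
    6: [0, 4, 8],
    5: [2, 3, 3],
    4: [7],
    3: [],
    2: [8, 8, 9],
}


def scores_from_stem_leaf(table):
    """Rebuild the sorted raw scores: score = 10 * stem + leaf."""
    return sorted(10 * stem + leaf
                  for stem, leaves in table.items()
                  for leaf in leaves)


def five_number_summary(data):
    """Return (min, Q1, median, Q3, max).

    Assumed convention: Q1 and Q3 are the medians of the lower and upper
    halves, leaving out the middle value when the sample size is odd.
    """
    data = sorted(data)
    n = len(data)
    half = n // 2
    lower = data[:half]
    upper = data[half:] if n % 2 == 0 else data[half + 1:]
    return (data[0], median(lower), median(data), median(upper), data[-1])


scores = scores_from_stem_leaf(STEM_LEAF)
print(len(scores), five_number_summary(scores))
```

The same `scores` list can be passed to a plotting library to draw the box plot and histogram.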
Let $X$ and $Y$ be two random variables with joint density
$$f(x,y) = c x^2 y^2, 0 \leq x,y \leq 1.$$
Find $c$.
Find the marginal density of $X$, $f_X(x)$.
Are $X$ and $Y$ exchangeable? Why?
Are $X$ and $Y$ independent? Why?
Find $E(X)$, $var(X)$, $E(XY)$.
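Because the density is a monomial on the unit square, every integral in this question reduces to $\int_0^1 x^a\,dx = 1/(a+1)$. The sketch below uses that fact with exact fractions so hand calculations can be checked. The helper names are hypothetical and not part of the course material.

```python
from fractions import Fraction


def monomial_integral(a, b):
    """Exact integral of x**a * y**b over the unit square [0,1] x [0,1]."""
    return Fraction(1, a + 1) * Fraction(1, b + 1)


# The density c * x^2 * y^2 must integrate to 1, which fixes c.
c = 1 / monomial_integral(2, 2)


def expectation(a, b):
    """E(X**a * Y**b) under the density c * x^2 * y^2."""
    return c * monomial_integral(a + 2, b + 2)


mean_x = expectation(1, 0)
var_x = expectation(2, 0) - mean_x ** 2
mean_xy = expectation(1, 1)
print(c, mean_x, var_x, mean_xy)
```

Comparing `expectation(1, 1)` with `expectation(1, 0) * expectation(0, 1)` gives a numerical hint for the independence question, although the argument asked for is that the density factors.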
In a certain area 25% of all cars emit excessive pollutants. At the yearly testing suppose the following is true. The probability that a car emitting too many pollutants will fail is 99% and the probability a good car will falsely fail is 15%.
What is the probability that a car that fails the exam actually emits too much pollution?
Suppose it is known that 65% of cars 10 years old or older emit excessive pollutants. The other probabilities remain the same. Now if a 12 year old car fails the test, what is the probability it emits too much pollution?
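Both parts of this question apply Bayes' rule with a different prior. A minimal sketch for checking the arithmetic follows. The function name and argument names are hypothetical, and the numbers are the ones stated in the question.

```python
def posterior_polluter(prior, p_fail_given_bad, p_fail_given_good):
    """P(car pollutes | car fails the test) by Bayes' rule.

    prior             -- P(car emits excessive pollutants)
    p_fail_given_bad  -- P(fail | car pollutes)
    p_fail_given_good -- P(fail | car is clean), the false-fail rate
    """
    fail_and_bad = prior * p_fail_given_bad
    fail_and_good = (1 - prior) * p_fail_given_good
    return fail_and_bad / (fail_and_bad + fail_and_good)


# All cars: 25% pollute. Cars 10 years old or older: 65% pollute.
print(posterior_polluter(0.25, 0.99, 0.15))
print(posterior_polluter(0.65, 0.99, 0.15))
```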
Pick a point $(X,Y)$ uniformly from the triangle formed by the three points $(-1,0), (1,0), (1,1)$.
Find the cumulative distribution function
$$F(x,y) = P(X \leq x, Y \leq y)$$
Find the joint density $f(x,y)$.
Find the marginal density of $X$.
Show that the conditional density of $X$ given $Y = y$ is uniform.
For the picture below, if $T$ is assumed to be uniform on $[-\pi/2,\pi/2]$ show that $X$ has a Cauchy distribution. It is enough to find $F(x) = P(X \leq x)$.
Verify the formula
$$var(X_1 + \dots + X_n) = var(X_1) + \dots + var(X_n) + \sum_{i} \sum_{j \ne i} cov(X_i,X_j)$$
in the case $n=2$.
In the case $n=2$, what happens if $X_1$ and $X_2$ are independent? Why?
Compute $var(X_1 - X_2)$ when $X_1$ and $X_2$ are independent.
Suppose $X_i$, $1 \leq i \leq n$, are normal with mean $\mu_1$ and standard deviation $\sigma_1$, and $Y_j$, $1 \leq j \leq m$, are normal with mean $\mu_2$ and standard deviation $\sigma_2$. Furthermore, all $n+m$ random variables are assumed to be independent. Set
$$Xbar = \bar X = \frac{X_1 + \dots + X_n}{n},\quad Ybar = \bar Y = \frac{Y_1 + \dots + Y_m}{m}.$$
Compute $E(Xbar)$ and $var(Xbar)$.
Compute $E(Xbar - Ybar)$ and $var(Xbar - Ybar)$.
In section 4.8 the multinomial distribution is discussed which generalizes the binomial distribution. The key formula is on page 150. Use this to answer the following.
Let $X_1, X_2, \dots X_{n}$ be i.i.d. random variables which are $U(0,1)$ (uniform on $[0,1]$). What is the probability the $k$th largest is in the interval $[y,y+dy)$, $k-1$ are in the interval $[0,y)$ and $n-k$ are in the interval $[y+dy,1]$?
Suppose $n$ people from Staten Island were called at random via sampling with replacement and asked if they supported keeping the dump open beyond the current deadline. Only 22% said yes.
In the notation from class, what is $\hat p$ (phat), and what does $p$ represent?
If $n$ was 1000, what is an interval around $\hat p$ in which we are 95% certain $p$ lies?
If we want to know 95% of the time that $p$ is in the interval $(\hat p -.01, \hat p + .01)$, how large should $n$ be?
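The interval and sample-size parts can be checked numerically with the usual normal-approximation interval $\hat p \pm z\sqrt{\hat p(1-\hat p)/n}$. The sketch below assumes $z = 1.96$ for 95% (the class may have used 2 instead) and, for the sample size, defaults to the conservative choice $p = 1/2$. Both choices are assumptions, as are the function names.

```python
import math

Z_95 = 1.96  # assumed 95% normal quantile; substitute the value used in class


def normal_interval(p_hat, n, z=Z_95):
    """Approximate confidence interval p_hat +/- z * sqrt(p_hat(1-p_hat)/n)."""
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return (p_hat - half_width, p_hat + half_width)


def required_n(margin, p=0.5, z=Z_95):
    """Sample size (as a real number) giving half-width `margin`.

    p = 0.5 is the worst case; pass p = p_hat to plug in the observed value.
    Round the result up to the next integer.
    """
    return z ** 2 * p * (1 - p) / margin ** 2


print(normal_interval(0.22, 1000))
print(required_n(0.01), required_n(0.01, p=0.22))
```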
Suppose 25% of the population is a certain minority. However, in $1000$ arrests by the police 33% were this minority type. How likely is this under an assumption that each arrest is independent, and each person has an equal chance of being arrested? Use the Normal approximation to give a numeric answer.
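Under the stated assumptions, the number of minority arrests is binomial with $n = 1000$ and $p = 0.25$, so the observed proportion can be standardized and compared with a standard normal. The sketch below computes the $z$-score and the upper tail probability using only the standard library. It uses a one-sided tail without a continuity correction, which is an assumption about what the question intends.

```python
import math


def z_score(p_hat, p0, n):
    """Standardize an observed proportion under the null proportion p0."""
    return (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)


def normal_upper_tail(z):
    """P(Z > z) for a standard normal Z, via the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2))


z = z_score(0.33, 0.25, 1000)
print(z, normal_upper_tail(z))
```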
In trying to estimate population sizes a method of recapture is used. Let $N$ be the size of a population. (For example spotted owls in a region of Oregon.) Researchers capture $n_1$ of the population and band them. These are released to the wild. After some time $n_2$ of the population are captured. Suppose $X$ of the $n_2$ are banded. Suppose $n_1 = 3$ and $n_2 = 4$.
What is the likelihood function for $X$ given $N$, $L(N) = f(x | N)$?
If $X = 1$ what value of $N$ maximizes the likelihood function?
(This type of question is from section 8.2 which was in the lecture on March 4.)
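If the second capture is modeled as sampling without replacement, $X$ is hypergeometric and the likelihood can be tabulated over candidate values of $N$. The sketch below does this with exact fractions so that ties are detected exactly. The hypergeometric model and the search bound `N_max` are assumptions made for illustration, and the function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def likelihood(N, n1, n2, x):
    """Hypergeometric P(X = x | N): n1 banded, n2 recaptured, x banded among them."""
    if N < max(n1, n2):
        return Fraction(0)  # population too small to allow these captures
    return Fraction(comb(n1, x) * comb(N - n1, n2 - x), comb(N, n2))


def maximizers(n1, n2, x, N_max):
    """Return (list of N maximizing the likelihood, maximal value) for N <= N_max."""
    values = {N: likelihood(N, n1, n2, x)
              for N in range(max(n1, n2), N_max + 1)}
    best = max(values.values())
    return [N for N, v in values.items() if v == best], best


print(maximizers(n1=3, n2=4, x=1, N_max=200))
```

The maximizer need not be unique, which is why the function returns a list.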