The first few are one-sample tests from Chapter 8 that we didn’t cover last week.

  1. An exit poll by a news station of 900 people in the state of Florida found 440 voting for Bush and 460 voting for Gore. Does the data support the hypothesis that Bush received $p=50$% of the state’s vote?

  2. Download the data set Cancer. Look only at column 1. These are survival times for cancer patients taking a large dosage of Vitamin C. Test the null hypothesis that the median is 100 days. Should you also use a $t$-test? Why or why not?

  3. Download the dataset Chipavg. This is a measurement of thickness of the oxide layer and should be 1000. (See problem 8.54 for details.) Does the data support the hypothesis that the mean is 1000, or support an alternate value of the mean?

  4. We wish to check whether the hypothesis test for the equality of $\mu_1$ and $\mu_2$ “works”. To do so:

    1. Generate two columns of 50 rows, each being normal with mean 10 and variance 5. Do a 2-sample $t$-test of the data testing the hypotheses that the means are equal vs. they are not.

    2. Repeat, only now let one be normal with mean 10.5 and variance 5. Does it get detected now?

    3. What about mean 11, variance 5? 20 and 5?

  5. Download Fish. This data contains the lengths of fish caught for two types of nets (35mm vs. 87mm).

    1. Formulate a test of hypothesis to decide if one of the nets is better at catching larger fish.

    2. Is this a large sample size? Why is this important?

    3. Based on your analysis, does the net size make a difference?

  6. Look at the data set Censored. For two types of treatment, the survival times are recorded. (These are treatments for small cell lung cancer.)

    1. Does the 2-sample $t$-test seem appropriate for this data?

    2. Is it better to compare means or medians? Why?

    3. Suppose you decide to compare means (regardless of your last two answers) and test whether the two treatments differ. What does the 2-sample $t$-test yield? Would you confidently advertise the results?

  7. Download Remedial. These are test scores for incoming students (similar to our CMAT). Test the hypothesis that men are less prepared than women upon entering college. What kind of test does the data suggest?

  8. Download Dopamine. These are measurements of spinal fluid for two different groups of people. Does the data suggest a difference between the two groups? Use a test of hypotheses, spelling out clearly $H_0$ and $H_A$ and the test you use.

  9. Download Twin. This is data for 9 pairs of identical twins. One twin was chosen at random and given a drug, the other presumably a placebo. Both took the same intelligence test.

    1. Is this a matched sample?

    2. Do we need the data to be approximately normal? Is it?

    3. What does a test of the hypothesis that the drug has no effect, against the alternative that it has an effect, yield?

  10. Look at the data set Darwin. These are data collected by Darwin himself who was studying the effects of cross-fertilization vs. self-fertilization. Two such plants were placed in the same pot and their heights measured.

    1. Are the data symmetric or skewed?

    2. Are these matched samples?

    3. Which is more appropriate for the differences: the $t$-test or the Wilcoxon signed rank test? Do both and discuss.
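
For problem 4, a single pair of columns can reject or fail to reject just by chance, so it may help to repeat the experiment many times and record how often the test rejects. The sketch below is one possible way to do this in Python with NumPy and SciPy; it is not the software used in class. The function name `rejection_rate`, the number of repetitions, the seed, the 5% significance level, and the pooled-variance form of the test are all assumptions made for illustration.

```python
import numpy as np
from scipy import stats


def rejection_rate(mean2, reps=2000, n=50, variance=5.0, alpha=0.05, seed=0):
    """Fraction of simulated data sets in which the 2-sample t-test rejects.

    Column 1 is always normal with mean 10; column 2 has mean `mean2`.
    Both columns have the same variance (hypothetical helper for problem 4).
    """
    rng = np.random.default_rng(seed)
    sd = np.sqrt(variance)  # NumPy expects a standard deviation, not a variance

    # Each row is one simulated experiment with n observations per column.
    x = rng.normal(loc=10.0, scale=sd, size=(reps, n))
    y = rng.normal(loc=mean2, scale=sd, size=(reps, n))

    # Two-sided pooled-variance t-test, applied row by row.
    p_values = stats.ttest_ind(x, y, axis=1, equal_var=True).pvalue

    return np.mean(p_values < alpha)


if __name__ == "__main__":
    for mean2 in (10.0, 10.5, 11.0, 20.0):
        rate = rejection_rate(mean2)
        print(f"second mean = {mean2:5.1f}: rejected in {rate:.1%} of runs")
```

When both means are 10, the rejection rate should sit near the chosen significance level; as the second mean moves away from 10, the rate estimates the power of the test for samples of size 50.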