For this lab, submit your .Rmd and .docx files on Blackboard by the end of the lab hour.

Course Evaluations

Before you start today's lab, complete the course evaluation.

Two-way tables

Task Load the dataset HairEyeColor using the command data(HairEyeColor).

This data set is a 3-way contingency table of hair color, eye color and gender (the variable is named Sex in R) for 592 students.

Task Using margin.table, create three 2-way tables: one for hair color and eye color, one for hair color and gender, and one for eye color and gender.
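As a starting point, here is a minimal sketch of the first of the three tables. It assumes the dimensions of HairEyeColor are ordered Hair, Eye, Sex, which the first command lets you verify; the name hair_eye is only an illustrative choice.

```r
data(HairEyeColor)

# Check the order of the dimensions before choosing a margin.
names(dimnames(HairEyeColor))

# Keep dimensions 1 and 2 (Hair and Eye) and sum over the remaining one.
hair_eye <- margin.table(HairEyeColor, c(1, 2))
hair_eye
```

The other two tables follow the same pattern with a different pair of dimension indices.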

Task Using chisq.test, test whether hair- and eye-color are independent.

Task Using chisq.test, test whether hair-color and gender are independent.

Task Using chisq.test, test whether eye-color and gender are independent.
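The three tests above all follow the same pattern. The sketch below runs it on the hair and eye color table and shows where the pieces of the result live; the object names hair_eye and test are illustrative, not prescribed by the lab.

```r
data(HairEyeColor)
hair_eye <- margin.table(HairEyeColor, c(1, 2))

# Pearson chi-square test of independence on a 2-way table.
test <- chisq.test(hair_eye)

test$statistic  # chi-square statistic
test$parameter  # degrees of freedom: (rows - 1) * (columns - 1)
test$p.value    # p-value of the test
```

Printing test itself gives the usual summary; the individual components are handy when you want to quote numbers in your report.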

Task For each of the three tests, use the library DescTools and the function CramerV to calculate an effect size. For the tests that came out significant, describe the effect size.
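The task asks for CramerV from DescTools. If you want to cross-check its output, the textbook formula V = sqrt(chi-square / (n * (k - 1))), with k the smaller of the number of rows and columns, can be computed in base R. This is a sketch of that formula only, under the assumption that CramerV uses the same uncorrected definition by default; it is not a replacement for the DescTools call.

```r
data(HairEyeColor)
hair_eye <- margin.table(HairEyeColor, c(1, 2))

# correct = FALSE switches off the continuity correction,
# which R would otherwise apply to 2x2 tables.
chi2 <- unname(chisq.test(hair_eye, correct = FALSE)$statistic)
n <- sum(hair_eye)        # total number of observations
k <- min(dim(hair_eye))   # smaller of number of rows and columns

cramer_v <- sqrt(chi2 / (n * (k - 1)))
cramer_v
```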

Task Was there anything surprising in the results you saw? Describe it and speculate about possible reasons. It might be worth looking at the differences test$observed - test$expected, where test is the object returned by chisq.test, to see whether they have anything interesting to say.
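One way to look at these differences is sketched below for the hair and eye color table. The Pearson residuals, which chisq.test also returns, divide each difference by the square root of the expected count and are often easier to compare across cells.

```r
data(HairEyeColor)
hair_eye <- margin.table(HairEyeColor, c(1, 2))
test <- chisq.test(hair_eye)

# Positive cells occur more often than independence predicts,
# negative cells less often.
round(test$observed - test$expected, 1)

# Pearson residuals: (observed - expected) / sqrt(expected).
round(test$residuals, 2)
```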

Goodness of Fit

According to the Mars / Wrigley company, Skittles have an even distribution of colors in the bags, while M&Ms follow a specific pattern.

As of 2017, M&M color distributions were different between factories. In the US, these are Cleveland (CLV) and Hackettstown (HKP). These cities can be found on the packaging. Distributions are:

factory  red   orange  yellow  green  blue  brown
CLV      0.13  0.20    0.14    0.20   0.21  0.12
HKP      0.12  0.25    0.12    0.12   0.25  0.12
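For use with chisq.test later on, the two rows can be stored as named vectors. The sketch below copies the values from the table and checks their sums; the names p_clv and p_hkp are illustrative. Note that the HKP values as listed add up to 0.98 rather than 1, presumably because of rounding, and chisq.test stops with an error when p does not sum to 1 unless you pass rescale.p = TRUE.

```r
colors <- c("red", "orange", "yellow", "green", "blue", "brown")

# Proportions copied from the table above.
p_clv <- setNames(c(0.13, 0.20, 0.14, 0.20, 0.21, 0.12), colors)
p_hkp <- setNames(c(0.12, 0.25, 0.12, 0.12, 0.25, 0.12), colors)

# A probability vector should sum to 1; check both rows.
sum(p_clv)
sum(p_hkp)
```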

Skittles

Task Fetch one Skittles bag from Prof.

Task For a Goodness of Fit chi square test to be valid, each expected count must be at least 5. How many Skittles will you need to perform a Goodness of Fit test against the stated even distribution?
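The reasoning can be sketched as follows: the expected count for a color is n times its probability, so the rarest color decides the minimum n. The sketch assumes five colors with equal probability; check this against your own bag.

```r
# Assumption: five colors, each with stated probability 1/5.
p_skittles <- rep(1 / 5, 5)

# The smallest expected count is n * min(p); require it to be at least 5.
n_min <- ceiling(5 / min(p_skittles))
n_min
```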

Task Count how many times each color of Skittles occurs in your bag. If necessary, pool your bag with your bench neighbors' bags until you reach the minimum \(n\) necessary for a Goodness of Fit test.

Task Use the chisq.test function to test the following:

  • Null hypothesis: The proportions given for the Skittles are accurate.
  • Alternative hypothesis: The color proportions of the Skittles differ from the ones given.
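A minimal sketch of such a test is shown below. The counts are made up purely for illustration, and the five color names are an assumption; replace both with what you find in your bag.

```r
# Hypothetical counts, for illustration only; use your own data.
observed <- c(red = 12, orange = 9, yellow = 14, green = 8, purple = 11)

# Even distribution under the null hypothesis.
p_null <- rep(1 / length(observed), length(observed))

# Goodness of Fit test: a vector of counts plus the hypothesized proportions.
gof <- chisq.test(observed, p = p_null)

gof$expected  # check that every expected count is at least 5
gof$p.value
```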

Task Was your test significant? What does significance mean here?

Task Using the library DescTools and the function CramerV, calculate an effect size for your chi square test. How large was the effect?
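If you want to sanity-check the value you get from CramerV, one common definition of Cramér's V for a Goodness of Fit test is sqrt(chi-square / (n * (k - 1))), with k the number of categories. The sketch below computes it in base R on made-up counts. Whether CramerV uses exactly this definition for a one-dimensional table is an assumption to verify in the DescTools documentation.

```r
# Hypothetical counts, for illustration only; use your own data.
observed <- c(red = 12, orange = 9, yellow = 14, green = 8, purple = 11)
gof <- chisq.test(observed, p = rep(1 / 5, 5))

chi2 <- unname(gof$statistic)
n <- sum(observed)     # total number of candies
k <- length(observed)  # number of color categories

v <- sqrt(chi2 / (n * (k - 1)))
v
```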

M&M

Task Fetch one M&M bag from Prof. Check on the bag at the front which factory today's candy comes from. Write the factory into your report.

Task For a Goodness of Fit chi square test to be valid, each expected count must be at least 5. How many M&Ms will you need to perform a Goodness of Fit test against this factory's distribution?
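With unequal proportions, the color with the smallest probability determines the minimum n. The sketch below uses the CLV row as an example; which factory applies to your bag is something you have to check, so treat the choice of vector as an assumption.

```r
# CLV proportions from the table above (swap in HKP if that is your factory).
p_clv <- c(red = 0.13, orange = 0.20, yellow = 0.14,
           green = 0.20, blue = 0.21, brown = 0.12)

# Smallest n for which the rarest color has an expected count of at least 5.
n_min <- ceiling(5 / min(p_clv))
n_min

# Expected counts at that sample size.
n_min * p_clv
```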

Task Count how many times each color of M&Ms occurs in your bag. If necessary, pool your bag with your bench neighbors' bags until you reach the minimum \(n\) necessary for a Goodness of Fit test.

Task Use the chisq.test function to test the following:

  • Null hypothesis: The proportions given for the factory are accurate.
  • Alternative hypothesis: The color proportions of the M&Ms differ from the ones given for the factory.
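A sketch of this test with unequal proportions follows, using the HKP row and made-up counts. Because the listed HKP proportions sum to 0.98, the call passes rescale.p = TRUE so that chisq.test rescales them to sum to 1; with proportions that already sum to 1 this argument is not needed. Indexing observed by names(p_hkp) guards against the counts and proportions being in a different color order.

```r
# HKP proportions from the table above.
p_hkp <- c(red = 0.12, orange = 0.25, yellow = 0.12,
           green = 0.12, blue = 0.25, brown = 0.12)

# Hypothetical counts, for illustration only; use your own data.
observed <- c(red = 7, orange = 14, yellow = 6,
              green = 8, blue = 13, brown = 7)

# Match the colors by name, then test against the factory proportions.
gof <- chisq.test(observed[names(p_hkp)], p = p_hkp, rescale.p = TRUE)
gof

gof$expected  # check that every expected count is at least 5
```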

Task Was your test significant? What does significance mean here?

Task Using the library DescTools and the function CramerV, calculate an effect size for your chi square test. How large was the effect?