Second report: inferential statistics
Your second report will test the hypotheses you have registered in Blackboard. The report you write will be eligible for the Undergraduate Class Project Competition, and I encourage you to enter. I will happily give you feedback on adapting your report for submission.
You will test your hypotheses on the half of the data you did not use to generate them. Select this subset with the following commands:
set.seed(Last4DigitsFromYourStudentIDNumber)
data = subset(data, !(1:nrow(data) %in% sample.int(nrow(data), nrow(data)/2)))
Important: This is NOT the same subsetting as you did for the first report. There is an additional ! that negates the selection condition.
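To see why the negation matters, here is a minimal sketch on a toy data frame. It assumes the first report used the same seed and the same call without the !; the names toy, seed, explore and confirm are made up for illustration, and 1234 is only a placeholder for your own four digits.

```r
# Toy data frame standing in for the course data set (hypothetical).
toy <- data.frame(id = 1:10, value = seq(10, 100, by = 10))

seed <- 1234  # placeholder for the last four digits of a student ID

# Assumed first-report half: rows whose index was drawn by sample.int().
set.seed(seed)
explore <- subset(toy, 1:nrow(toy) %in% sample.int(nrow(toy), nrow(toy) / 2))

# Second-report half: same seed, same draw, but negated with `!`.
set.seed(seed)
confirm <- subset(toy, !(1:nrow(toy) %in% sample.int(nrow(toy), nrow(toy) / 2)))

# The two halves share no rows and together cover the whole data set.
shared_rows <- length(intersect(explore$id, confirm$id))
covers_all  <- setequal(c(explore$id, confirm$id), toy$id)
print(shared_rows)
print(covers_all)
```

Because the seed is reset before each call, sample.int() draws the same indices both times, so the negated selection is exactly the complement of the first one.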
For any criterion stated, if it does not make sense (for instance, requesting confidence intervals for a method where no confidence intervals exist), you cover that requirement by pointing out that the requested information does not exist.
Criteria for F
Not handing a report in on time. Omitting one of the instructed tasks completely. Handing in a report where any knitting errors prove difficult or time-consuming to correct. Handing in a report where four or more of the criteria for C have minor errors.
Criteria for D
Minor errors in the criteria for C. For example: the report file does not knit, but errors are relatively easy to fix; report has grammatical or spelling errors; etc.
Criteria for C
Your report will, for each of your hypotheses…
- … use the correct subsetting code snippet.
- … be written in readable correct English, spell-checked.
- … be written in RMarkdown that knits without error.
- … be written by yourself: any substantial outside help has to be clearly marked as such and any citations should follow standard academic citation styles.
- … state both the null and the alternative hypothesis in full detail for each case.
- … state a confidence level up front.
- … test every hypothesis.
- … report test statistics and either p-values or confidence intervals for every test performed, and include model details (coefficients) where appropriate.
- … state a judgement on each test: significant or not?
- … interpret the judgement: what does it mean?
- … verify suitability for the tests you chose. This includes checking for normality of data or of residuals, which can be done using a QQ-plot.
- … fully document any additional data manipulation you decided to perform: if your data is very skewed, taking the logarithm could make tests appropriate that previously were not.
- … name each test type you are using as you use it.
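As a rough illustration of how several of these criteria fit together for a single hypothesis, the sketch below uses R's built-in mtcars data as a stand-in for your own data. The choice of a Welch two-sample t-test, the variables mpg and am, and the value alpha = 0.05 are assumptions made for the example, not requirements.

```r
# Confidence level stated up front (illustrative choice: 95%).
alpha <- 0.05

# H0: mean mpg is the same for automatic (am = 0) and manual (am = 1) cars.
# H1: mean mpg differs between the two groups.

# Test type named as it is used: Welch two-sample t-test.
result <- t.test(mpg ~ am, data = mtcars, conf.level = 1 - alpha)

# Report the test statistic, the p-value and the confidence interval.
print(result$statistic)
print(result$p.value)
print(result$conf.int)

# Judgement: significant or not at the stated level?
significant <- result$p.value < alpha
print(significant)

# Suitability check: QQ-plot of mpg within each group.
for (group in split(mtcars$mpg, mtcars$am)) {
  qqnorm(group)
  qqline(group)
}
```

In the report itself, each printed number would be followed by a sentence interpreting it in terms of the original question.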
Criteria for B
For a B, your report fulfills all the criteria for a C, and has minor errors in the criteria for A.
Criteria for A
To achieve the grade of A, your report will also …
- … critically compare the tests you chose against possible alternatives, and motivate your choices: what other tests could you have used, and why did you use this one?
- … report both p-values and confidence intervals wherever possible.
- … critically evaluate the weaknesses of your chosen tests: Were there tests where the conditions were not completely fulfilled? What arguments are there in favor of or against trusting their results? Are the tests known to be robust against the kinds of failures you observed?
- … report effect sizes whenever possible.
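One possible way to address the comparison and effect-size criteria is sketched below, again on the built-in mtcars data as a stand-in. Cohen's d with a pooled standard deviation and the Wilcoxon rank-sum test are example choices only; other effect sizes and alternative tests may suit your hypotheses better.

```r
# Split the response by group (am = 1 is manual, am = 0 is automatic).
manual    <- mtcars$mpg[mtcars$am == 1]
automatic <- mtcars$mpg[mtcars$am == 0]

# Effect size: Cohen's d using the pooled standard deviation.
n1 <- length(manual)
n2 <- length(automatic)
pooled_sd <- sqrt(((n1 - 1) * var(manual) + (n2 - 1) * var(automatic)) /
                  (n1 + n2 - 2))
cohens_d <- (mean(manual) - mean(automatic)) / pooled_sd
print(cohens_d)

# Possible alternative to the t-test: Wilcoxon rank-sum test, which does
# not assume normality. exact = FALSE requests the normal approximation,
# since the data contain ties.
wilcox <- wilcox.test(mpg ~ am, data = mtcars, conf.int = TRUE, exact = FALSE)
print(wilcox$p.value)
print(wilcox$conf.int)
```

If the two tests disagree, that disagreement is itself worth discussing when you evaluate how far to trust the result.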
You can look at an example of an A-worthy report from F16