# Detailed class plan

For this class, most of the material will reach you through reading the textbook. You can expect a workload of approximately 30 pages each week (mean: 29; standard deviation: 6.7; five-number summary: 20 / 24.25 / 27.5 / 35 / 40), as well as approximately 10 assigned exercises.
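The summary figures quoted above (mean, standard deviation, five-number summary) can be computed along the following lines. This is a minimal sketch in Python, using only the standard library; the list `weekly_pages` is a hypothetical placeholder, not the actual reading schedule, so its results will differ from the numbers quoted above.

```python
from statistics import mean, stdev, quantiles

# Hypothetical weekly page counts (illustration only, NOT the real schedule).
weekly_pages = [20, 24, 25, 27, 28, 35, 40]

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using linearly interpolated quartiles."""
    ordered = sorted(values)
    # method="inclusive" treats the data as containing its own min and max.
    q1, median, q3 = quantiles(ordered, n=4, method="inclusive")
    return (ordered[0], q1, median, q3, ordered[-1])

page_mean = mean(weekly_pages)
page_sd = stdev(weekly_pages)  # sample standard deviation (n - 1 denominator)
page_summary = five_number_summary(weekly_pages)

print(f"mean: {page_mean:.1f}; std.dev.: {page_sd:.1f}")
print("5-number summary:", " / ".join(str(x) for x in page_summary))
```

Different software packages use slightly different quartile conventions, so the quartiles reported for the same data can vary a little between tools.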

The overall structure of our class meetings will be:

1. Mondays:
   - I clarify any issues that showed up in your self-reflections.
   - We discuss the points that the text raises for reflection and discussion.
   - We peer-evaluate the homework exercises.
2. Wednesdays, classroom:
   - I go through the concepts and computer programming issues relevant to the lab of the day.
3. Wednesdays, computer lab:
   - You work through a lab sheet that introduces and teaches specific techniques in RStudio.

You may optionally work in Python/PyLab. I am happy to help, since this is the environment I work in the most, but I will not be able to cover the Python approach in the Wednesday classroom part.

# Lab content

1. Get to know RStudio. Using ggplot2 for graphs. Histograms, scatterplots, jitter, alpha-channel. Reproducibility & random seeding. Dataset loading.
2. Compute summary statistics. Explore failures of Mean/Variance measures. Plotting PDFs. Computing PDFs, CDFs, correlation coefficients. Dataset summaries.
3. Experiment design, sampling design: data provenance. N/A representation. N/A handling. -999 as null value. Mrs. Null.
4. Inference, ethics, causation: Anscombe's quartet.
5. Randomness, models, random variables
6. Least squares: computing and examining LSQ fits and summaries.
7. Error types: base rate bias.
8. Sampling distributions: polling data?
9. Confidence intervals, significance: data fishing, torturing a data set. Polling data.
10. Proportional inference: exit polls/results.
11. Means inference
12. Two-way tables; chi-squared
13. Linear regression
14. ANOVA
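
To give a flavor of the lab 3 topic "-999 as null value", the following is a minimal sketch of why a numeric placeholder for missing data is risky. It is written in Python (the optional environment for this class) with the standard library only; the readings and the helper names `clean` and `mean_ignoring_missing` are hypothetical and are not taken from the lab sheets.

```python
from statistics import mean

SENTINEL = -999  # numeric placeholder standing in for "missing"

# Hypothetical measurements; two entries were never recorded.
readings = [21.5, 19.0, SENTINEL, 22.5, SENTINEL, 20.0]

def clean(values, sentinel=SENTINEL):
    """Replace sentinel entries with None so they cannot pass as real data."""
    return [None if v == sentinel else v for v in values]

def mean_ignoring_missing(values):
    """Average only the entries that are actually present."""
    present = [v for v in values if v is not None]
    return mean(present)

# Treating the sentinel as data silently corrupts the summary.
naive_mean = mean(readings)

# Marking the sentinel as missing first gives a sensible answer.
clean_mean = mean_ignoring_missing(clean(readings))

print("naive mean:", naive_mean)
print("clean mean:", clean_mean)
```

The naive mean comes out negative even though every real reading is positive, which is the kind of failure an explicit missing-value marker is meant to prevent.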