Lecture 11

MVJ

12 April, 2018

Experimental units, subjects, treatments

An experiment is a study in which we do something (a treatment) to some things (the experimental units) to observe the response.

An experimental unit is the smallest entity to which a treatment is applied. Human experimental units are often called subjects.

The explanatory variables in an experiment are often called factors.

A specific condition applied to the individuals is called a treatment. Equivalently, a treatment is a combination of specific values of these variables.

Let’s try it out:

A researcher is measuring the boiling temperature of water at different pressures and at different salinity.

They take all combinations of three different atmospheric pressures (98, 100, 120 kPa) and three different salinity levels (0, 10, 100 g/l). For each combination they boil 1 liter of water from fridge temperature in the same pot, on the same burner, and measure the temperature at a rolling boil.
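Since a treatment is a combination of specific factor values, the treatments in this example can be listed by crossing the two factors. A minimal sketch in base R, using the levels stated above; the column names `pressure_kPa` and `salinity_g_per_l` are made up for illustration:

```r
# Factor levels taken from the example above
pressure_kPa <- c(98, 100, 120)
salinity_g_per_l <- c(0, 10, 100)

# Each row of the grid is one treatment:
# a specific combination of factor levels
treatments <- expand.grid(pressure_kPa = pressure_kPa,
                          salinity_g_per_l = salinity_g_per_l)

nrow(treatments)  # 3 * 3 = 9 treatments
print(treatments)
```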

Comparative experiments

Experiments are the preferred method for examining the effect of one variable on another. By imposing the specific treatment of interest and controlling other influences we can pin down cause and effect. Good designs are as essential for effective experiments as they are for sampling.

Example: A study to reduce test anxiety had students write an essay about their feelings concerning an upcoming exam. Exam scores for a first midterm (before the essay) and a second midterm (after the essay) were compared. Mean score on the second midterm was higher than on the first midterm.

Some issues with this example include:

In a comparative experiment, subjects are assigned to two or more groups. A common split is into a control group and a treatment group.

The placebo effect is the tendency of people to improve simply from receiving attention or from believing they are being treated, even when no active treatment is given.

A study is biased if it systematically favors certain outcomes.

Question How could we improve the study design?

Need for randomization

Confounding variables are a problem: they can make us measure something different from what we were trying to measure.

Remedy: perform comparative experiments in which the groups receiving different treatments are otherwise similar.

Comparison alone is not enough. If groups differ, this will produce bias.

Remedy: random assignment

Experimental units are assigned to treatments using some random process.

It may be necessary to keep the treatment assignment secret not just from the subjects, but also from the experiment leader. This is called a double-blind design.

Completely randomized design

  1. Subjects are split at random into two (or more) groups.
  2. Each group receives a different treatment.
  3. Results are compared.

In R:

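library(dplyr)  # provides %>% and mutate()
# n is the number of treatment groups: build balanced labels 1..n, then shuffle them
data <- data %>%
  mutate(group = rep(1:n, length.out = nrow(data)) %>% sample())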
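To see what this assignment does, here is a small self-contained sketch on an invented roster. The data frame `subjects`, the group count `k`, and the seed are assumptions for illustration only. `rep()` builds group labels that are as balanced as possible, and `sample()` shuffles them across the rows:

```r
library(dplyr)

set.seed(11)  # only to make the sketch reproducible

# Hypothetical roster of 10 subjects
subjects <- data.frame(id = 1:10)
k <- 2  # number of treatment groups (hypothetical)

assigned <- subjects %>%
  mutate(group = sample(rep(1:k, length.out = n())))

# Group sizes differ by at most one
assigned %>% count(group)
```

Shuffling a balanced label vector, rather than tossing a coin per subject, keeps the group sizes equal (or within one of each other).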

Principles of Experimental Design

Carefully designed experiments can give evidence for causation. To increase the weight of this evidence, follow these principles:

  1. Control for lurking variables that might affect the response: compare two or more treatments, and get the same distribution of the lurking variables in each group.
  2. Randomize the assignment of treatment to experimental unit.
  3. Replicate each treatment repeatedly. This way, chance variation will (hopefully) average out in the result.
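The effect of replication can be illustrated with a small simulation. This is a sketch under invented assumptions: a response with true mean 50 and unit-to-unit standard deviation 10, neither of which comes from the lecture. The spread of the group mean across repeated experiments should shrink roughly like 10 / sqrt(replicates):

```r
set.seed(1)

# Hypothetical response: true mean 50, unit-to-unit noise with sd 10
one_experiment <- function(replicates) {
  mean(rnorm(replicates, mean = 50, sd = 10))
}

# Repeat the whole experiment many times for each number of replicates
# and look at how much the group mean varies from run to run
for (r in c(1, 10, 100)) {
  means <- replicate(2000, one_experiment(r))
  cat("replicates:", r, " sd of group mean:", round(sd(means), 2), "\n")
}
```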

Even with this, many factors can weaken the design. The critical problems lie in failures to treat everyone the same way and in failures to realistically replicate the conditions under study.

The first problem can be mitigated using double blind designs. The second problem is more difficult to handle.

Controlling for confounders: block designs

To achieve balanced distributions of lurking variables, block designs are often used: the experimental units are split into blocks of similar units, and within each block treatments are assigned randomly.

The smallest example is the matched pairs design: each block is a matched pair of similar experimental units.

The pair may also be a single unit that receives both treatments, with the order of the treatments varying between units.
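A matched pairs assignment can be sketched as randomization within each block of size two. The data frame `units` and the treatment labels "A" and "B" are hypothetical:

```r
library(dplyr)

set.seed(2)

# Hypothetical: 6 matched pairs, two units per pair
units <- data.frame(pair = rep(1:6, each = 2),
                    unit = 1:12)

# Within each pair, shuffle the two treatment labels,
# so each pair contains exactly one "A" and one "B"
units <- units %>%
  group_by(pair) %>%
  mutate(treatment = sample(c("A", "B"))) %>%
  ungroup()

print(units)
```

For the single-unit variant, the same shuffle can be read as the order in which that unit receives the two treatments.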

Principles of blocked designs

Return of the exam anxiety essay

In groups of 4: Create a new design of the experiment of whether essay writing before an exam is effective.

Consider in your experiment design

  1. What can you control?
  2. What can you block?
  3. What should you randomize?

Exam anxiety essay design suggestion

Here’s one I came up with.

Split the student population into 2 groups: Experiment and Control. Block these groups on gender, study year, and section. Assign the essay to the Experiment group, and then run the (one) exam. Compare exam results between Experiment and Control.

This way, gendered aspects of study anxiety as well as study experience (proxied by study year) and quality of instruction are blocked. Progression through the course is controlled.
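One way to read this suggestion is to form blocks from gender, study year and section first, and then randomize to Experiment or Control within each block. A sketch on an invented roster; all values, including the column `student` and the arm labels, are assumptions for illustration:

```r
library(dplyr)

set.seed(3)

# Hypothetical roster: 2 genders x 2 years x 2 sections,
# with 4 students in each block (32 students in total)
students <- expand.grid(student = 1:4,
                        gender = c("F", "M"),
                        year = 1:2,
                        section = c("S1", "S2"))

# Randomize within each block so every block is split evenly
design <- students %>%
  group_by(gender, year, section) %>%
  mutate(arm = sample(rep(c("Experiment", "Control"), length.out = n()))) %>%
  ungroup()

# Check the balance: each block contributes 2 students to each arm
design %>% count(gender, year, section, arm)
```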

Experiment vs Sample

An experiment is a study in which we do something (a treatment) to some things (the experimental units) to observe the response.

A sample is a study in which we observe a smaller group drawn from a larger population.

The population is the thing we want to know more about.

The sample is the part of the population we actually study, measure, observe.

Data is collected from a (representative) sample, then studied and used to make inferences about the population.

How to sample badly

A design is biased if it systematically favors certain outcomes.

Example Opinion polling is often performed by calling landlines during early weekday evenings. What would a possible source of bias be?

Choosing individuals because they are easy to reach produces a convenience sample.

Example Asking passers-by at the mall.

Choosing individuals who voluntarily respond to a general appeal produces a voluntary response sample. Voluntary response samples tend to gather replies mostly from people with strong (especially negative) opinions.

Example Online reviews.
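The bias of a voluntary response sample can be illustrated by simulation. Everything here is an invented assumption: the opinion scale from -2 to 2, the population size, and the reply probabilities, which are chosen so that stronger opinions reply more often:

```r
set.seed(4)

# Hypothetical population: opinions from -2 (very negative) to 2 (very positive)
opinion <- sample(-2:2, size = 10000, replace = TRUE)

# Assumption for illustration: the stronger the opinion, the likelier a reply
# (reply probability 0.02, 0.10, 0.30 for strength 0, 1, 2)
p_reply <- c(0.02, 0.10, 0.30)[abs(opinion) + 1]
replied <- runif(length(opinion)) < p_reply

mean(abs(opinion) == 2)           # share of strong opinions in the population
mean(abs(opinion[replied]) == 2)  # share of strong opinions among the replies
```

Under these assumptions the strong opinions are clearly over-represented among the replies, even though nothing about the population itself has changed.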

Simple Random Samples

A simple random sample of size \(n\) is \(n\) individuals from the population chosen so that every possible set of \(n\) individuals has an equal chance of being selected.
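For a tiny population the definition can be checked by listing every possible sample. A sketch in base R with a hypothetical population of five individuals and \(n = 2\):

```r
population <- c("A", "B", "C", "D", "E")  # hypothetical population of 5
n <- 2

all_samples <- combn(population, n)  # each column is one possible sample
ncol(all_samples)      # choose(5, 2) = 10 possible samples
1 / ncol(all_samples)  # chance of each one under simple random sampling

sample(population, n)  # draws one of them, without replacement
```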

This used to be done using tables of random digits.

Better: use computers. In R:

data %>% sample_n(n)     # n rows chosen at random, without replacement
data %>% sample_frac(p)  # a fraction p of the rows, chosen at random
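A self-contained usage sketch of these two calls; the data frame `population` and the sizes are made up for illustration:

```r
library(dplyr)

set.seed(7)

# Hypothetical population of 100 individuals
population <- data.frame(id = 1:100)

srs_fixed <- population %>% sample_n(10)       # exactly 10 rows
srs_frac  <- population %>% sample_frac(0.25)  # 25% of the rows, 25 here

nrow(srs_fixed)
nrow(srs_frac)
```

Both calls sample without replacement by default, so no individual appears twice in the sample.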

Blocked methods: stratified sampling

The same way that blocked designs can be used to handle possible confounders, a similar approach works for sampling:

A stratified random sample is selected by first splitting the population into groups of similar individuals called strata, and then combining simple random samples from each stratum into a full sample.
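With dplyr, a stratified random sample can be sketched by grouping on the stratum before sampling, since `sample_n()` on a grouped data frame samples within each group. The population, the stratum labels, and the per-stratum sample size of 10 are hypothetical:

```r
library(dplyr)

set.seed(5)

# Hypothetical population with a stratum label per individual
population <- data.frame(
  id = 1:300,
  stratum = rep(c("first-year", "second-year", "third-year"),
                times = c(150, 100, 50))
)

strat_sample <- population %>%
  group_by(stratum) %>%
  sample_n(10) %>%   # simple random sample within each stratum
  ungroup()

strat_sample %>% count(stratum)
```

Taking the same number from each stratum is only one possible choice; sampling in proportion to stratum size is another.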

Blocked methods: multistage random sampling

Sample through a sequence of stages, each stage refining the selection from the previous ones.
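A three-stage version can be sketched in base R. The sampling frame (schools, classes, students) and the number drawn at each stage are invented for illustration:

```r
set.seed(6)

# Hypothetical frame: 20 schools, 5 classes each, 25 students per class
frame <- expand.grid(student = 1:25, class = 1:5, school = 1:20)

# Stage 1: sample 4 schools
schools <- sample(unique(frame$school), 4)
stage1 <- frame[frame$school %in% schools, ]

# Stage 2: within each sampled school, sample 2 classes
stage2 <- do.call(rbind, lapply(split(stage1, stage1$school), function(s) {
  s[s$class %in% sample(unique(s$class), 2), ]
}))

# Stage 3: within each sampled class, sample 5 students
classes <- split(stage2, list(stage2$school, stage2$class), drop = TRUE)
stage3 <- do.call(rbind, lapply(classes, function(cl) {
  cl[sample(nrow(cl), 5), ]
}))

nrow(stage3)  # 4 schools * 2 classes * 5 students = 40
```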

Sample survey cautions

Some common issues with sample surveys include:

To trust poll results, first insist on knowing

  1. Questions asked
  2. Rate of non-response
  3. Date and method of the survey

Design

Groups of 4.

  1. Do students’ grades in math requirements influence their degree GPA?
  2. What amount of sugar makes for the best consistency in bread?
  3. Will higher octane raise mpg for a car?
  4. Does the introduction of Uber increase or decrease the overall taxi market?

Pair the groups up and explain your design and critique theirs. Do you see a source of systematic bias? Is the list of predictors and confounders missing something?