3/26/2020

Example (13.11)

Data Given

n x.i std.err
14 0.93 0.04
14 1.21 0.03
14 0.92 0.04
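
The slides do not show how the summary table was entered into R. A minimal sketch, assuming a data frame named ex.df whose column names match the code that follows:

```r
# Assumed reconstruction: one row per group, with sample size,
# group mean, and standard error of the mean as given in the table.
ex.df = data.frame(
  n       = c(14, 14, 14),
  x.i     = c(0.93, 1.21, 0.92),
  std.err = c(0.04, 0.03, 0.04)
)
print(ex.df)
```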

Recall that \(\text{std.err} = \text{std.dev} / \sqrt{n}\). We can use this to calculate the standard deviation and variance for each case:

ex.df$S = ex.df$std.err * sqrt(ex.df$n)
ex.df$S2 = ex.df$S^2
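
As a hand check of the first row (\(n = 14\), std.err \(= 0.04\)):

\[ S_1 = 0.04\sqrt{14} \approx 0.1497, \qquad S_1^2 = 0.04^2 \cdot 14 = 0.0224. \]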

Data Given

n x.i std.err S S2
14 0.93 0.04 0.1496663 0.0224
14 1.21 0.03 0.1122497 0.0126
14 0.92 0.04 0.1496663 0.0224

Now, \(SSE = \sum_i\sum_j(Y_{ij}-\overline{Y}_{i*})^2 = \sum_i(n_i-1)S_i^2\). We can put each contribution in a column:

ex.df$SSE = (ex.df$n-1)*ex.df$S2

Data Given

n x.i std.err S S2 SSE
14 0.93 0.04 0.1496663 0.0224 0.2912
14 1.21 0.03 0.1122497 0.0126 0.1638
14 0.92 0.04 0.1496663 0.0224 0.2912

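Next, the treatment sum of squares is \(SST = \sum_i\sum_j(\overline{Y}_{i*}-\overline{Y})^2 = \sum_i n_i(\overline{Y}_{i*}-\overline{Y})^2\), where \(\overline{Y} = \frac{1}{n}\sum_i n_i\overline{Y}_{i*}\) is the grand mean and \(n = \sum_i n_i\) is the total sample size.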

ex.n = sum(ex.df$n)
ex.x = (1/ex.n) * sum(ex.df$n * ex.df$x.i)
ex.df$SST = ex.df$n * (ex.df$x.i - ex.x)^2
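
Because every group here has \(n_i = 14\), the grand mean reduces to the plain average of the three group means. As a hand check of the grand mean and of the first group's contribution:

\[ \overline{Y} = \frac{14(0.93) + 14(1.21) + 14(0.92)}{42} = \frac{42.84}{42} = 1.02, \qquad n_1(\overline{Y}_{1*}-\overline{Y})^2 = 14(0.93-1.02)^2 = 0.1134. \]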

Data Given

n x.i std.err S S2 SSE SST
14 0.93 0.04 0.1496663 0.0224 0.2912 0.1134
14 1.21 0.03 0.1122497 0.0126 0.1638 0.5054
14 0.92 0.04 0.1496663 0.0224 0.2912 0.1400

Now we have all the ingredients for our ANOVA table!

ANOVA Table

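# Degrees of freedom: k - 1 for treatments, n - k for error
ex.dfT = nrow(ex.df) - 1
ex.dfE = ex.n - nrow(ex.df)
# Sums of squares: total the per-group contributions
ex.SST = sum(ex.df$SST)
ex.SSE = sum(ex.df$SSE)
# Mean squares, F statistic, and upper-tail p-value
ex.MST = ex.SST / ex.dfT
ex.MSE = ex.SSE / ex.dfE
ex.F = ex.MST / ex.MSE
ex.p = pf(ex.F, ex.dfT, ex.dfE, lower.tail=FALSE)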

ANOVA Table

Source DoF SS MS F p
Treatments 2 0.7588 0.3794000 19.82927 1.1e-06
Error 39 0.7462 0.0191333 NA NA
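
The slides do not show how the table above was put together. One possible way to assemble it directly from the summary statistics is sketched below; the variable names (n, xbar, se, anova.tab, and so on) are illustrative and not taken from the original code.

```r
# Summary statistics from the example: sample size, group mean, standard error.
n    = c(14, 14, 14)
xbar = c(0.93, 1.21, 0.92)
se   = c(0.04, 0.03, 0.04)

# SSE from the group variances, S_i^2 = (std.err * sqrt(n_i))^2.
SSE = sum((n - 1) * (se * sqrt(n))^2)

# SST from the group means and the weighted grand mean.
grand = sum(n * xbar) / sum(n)
SST   = sum(n * (xbar - grand)^2)

# Degrees of freedom and F statistic.
dfT    = length(n) - 1
dfE    = sum(n) - length(n)
F.stat = (SST / dfT) / (SSE / dfE)

# Collect everything in one data frame; NA marks cells that do not apply.
anova.tab = data.frame(
  Source = c("Treatments", "Error"),
  DoF    = c(dfT, dfE),
  SS     = c(SST, SSE),
  MS     = c(SST / dfT, SSE / dfE),
  F      = c(F.stat, NA),
  p      = c(pf(F.stat, dfT, dfE, lower.tail = FALSE), NA)
)
print(anova.tab)
```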

Example (Exercise 13.85)

Data Given

A   B   C    D    E
.8  .7  1.2  1.0  .6
.6  .8  1.0  .9   .4
.6  .5  .9   .9   .4
.5  .5  1.2  1.1  .7
    .6  1.3  .7   .3
    .9  .8
    .7

Data Entry

ex.df = data.frame(
  group = c(rep("A", 4), rep("B",7), rep("C", 6), rep("D", 5), rep("E", 5)),
  data = c(.8,.6,.6,.5, 
           .7,.8,.5,.5,.6,.9,.7, 
           1.2,1.0,.9,1.2,1.3,.8, 
           1.0,.9,.9,1.1,.7,
           .6,.4,.4,.7,.3
           )
)

Data Shape

group data
A 0.8
A 0.6
A 0.6
A 0.5
B 0.7
B 0.8

ANOVA

aov(data ~ group, ex.df)
## Call:
##    aov(formula = data ~ group, data = ex.df)
## 
## Terms:
##                    group Residuals
## Sum of Squares  1.211844  0.571119
## Deg. of Freedom        4        22
## 
## Residual standard error: 0.1611209
## Estimated effects may be unbalanced

In R, ANOVA is fit as a special case of linear regression (aov fits the model through lm). Using aov rather than lm ensures that the printed summaries take the form of an ANOVA table for comparing several group means.
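
The link to linear regression can be checked directly. The sketch below re-enters the data from the Data Entry slide so that it runs on its own, fits the same model with both aov and lm, and compares the F statistics; the names fit.aov, fit.lm, F.aov and F.lm are illustrative.

```r
# Data from Exercise 13.85, re-entered so this block is self-contained.
ex.df = data.frame(
  group = c(rep("A", 4), rep("B", 7), rep("C", 6), rep("D", 5), rep("E", 5)),
  data = c(.8, .6, .6, .5,
           .7, .8, .5, .5, .6, .9, .7,
           1.2, 1.0, .9, 1.2, 1.3, .8,
           1.0, .9, .9, 1.1, .7,
           .6, .4, .4, .7, .3)
)

# Same model formula, two fitting functions.
fit.aov = aov(data ~ group, ex.df)
fit.lm  = lm(data ~ group, ex.df)

# anova() accepts either fit and returns the same ANOVA table.
F.aov = anova(fit.aov)[["F value"]][1]
F.lm  = anova(fit.lm)[["F value"]][1]
print(all.equal(F.aov, F.lm))
```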

ANOVA

anova( aov(data ~ group, ex.df) )
Df Sum Sq Mean Sq F value Pr(>F)
group 4 1.211844 0.302961 11.67032 3.09e-05
Residuals 22 0.571119 0.025960 NA NA

ANOVA

summary( aov(data ~ group, ex.df) )
##             Df Sum Sq Mean Sq F value   Pr(>F)    
## group        4 1.2118 0.30296   11.67 3.09e-05 ***
## Residuals   22 0.5711 0.02596                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
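
As a cross-check, the sums of squares reported by aov can be reproduced with the formulas from Example 13.11, \(SSE = \sum_i(n_i-1)S_i^2\) and \(SST = \sum_i n_i(\overline{Y}_{i*}-\overline{Y})^2\). A minimal sketch, with the data re-entered so the block runs on its own and with illustrative variable names (grp.n, grp.mean, grp.var, and so on):

```r
# Data from Exercise 13.85, re-entered so this block is self-contained.
ex.df = data.frame(
  group = c(rep("A", 4), rep("B", 7), rep("C", 6), rep("D", 5), rep("E", 5)),
  data = c(.8, .6, .6, .5,
           .7, .8, .5, .5, .6, .9, .7,
           1.2, 1.0, .9, 1.2, 1.3, .8,
           1.0, .9, .9, 1.1, .7,
           .6, .4, .4, .7, .3)
)

# Per-group sample size, mean, and variance.
grp.n    = tapply(ex.df$data, ex.df$group, length)
grp.mean = tapply(ex.df$data, ex.df$group, mean)
grp.var  = tapply(ex.df$data, ex.df$group, var)

# Grand mean weighted by group size (the groups are unbalanced here).
grand.mean = sum(grp.n * grp.mean) / sum(grp.n)

# Sums of squares, degrees of freedom, and F statistic.
SST = sum(grp.n * (grp.mean - grand.mean)^2)
SSE = sum((grp.n - 1) * grp.var)
dfT = length(grp.n) - 1
dfE = sum(grp.n) - length(grp.n)
F.stat = (SST / dfT) / (SSE / dfE)
p.val  = pf(F.stat, dfT, dfE, lower.tail = FALSE)

print(c(SST = SST, SSE = SSE, F = F.stat, p = p.val))
```

The values should agree with the group and Residuals rows printed by aov above.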