For this lab, submit your .Rmd and .html files on Blackboard at the end of the lab hour.

Linear Regression

The dataset cars can be loaded using data(cars). It contains 50 observations of speed (in mph) and stopping distance (in ft), recorded in the 1920s.
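
Before plotting, it can help to confirm what was loaded. The following is a minimal sketch using only base R functions; it inspects the data and does not do any of the tasks below.

```r
# Load the built-in dataset into the workspace
data(cars)

# Structure: number of observations and the column names and types
str(cars)

# First few rows, and a numeric summary of each column
head(cars)
summary(cars)
```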

Task Make a scatter plot.

We will try to fit and evaluate a linear model for this data.

Task Make a linear regression. Which do you believe to be the response variable?

Task Evaluate the validity of your linear regression. Remember to check both the residual plot and the QQ-plot.

Task Interpret your results. Does your linear model accurately describe the behavior of your data?

The function predict can be used to make predictions from a model. You use it like this:

y.pred = predict(model, data.frame(x=c(x1, x2, ...)))

where the second argument to predict is a data frame containing a variable with exactly the same name as the predictor in the model, holding the values at which to make predictions.
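
As an illustration of the naming requirement, here is a small sketch on a different built-in dataset, women (height in inches, weight in lbs), so that it does not answer the task below. The names model.w, new.heights and w.pred are made up for this example.

```r
# Illustration on the built-in dataset women, not the lab data
data(women)

# Fit weight as the response and height as the predictor
model.w = lm(weight ~ height, data = women)

# The column must be called height, matching the predictor in the formula
new.heights = data.frame(height = c(60, 65, 70))

# One predicted weight per row of new.heights
w.pred = predict(model.w, new.heights)
w.pred
```

If the column in the new data frame had a different name, predict would not be able to find the predictor and would fail or silently fall back on the original data, so it is worth checking the name carefully.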

Task What stopping distances does your model predict at 10 mph, 15 mph, and 20 mph?

predict can also compute the standard error of each predicted value. We can use this to plot a confidence band around the prediction line. To get the standard errors, give the argument se.fit = TRUE to predict. The result has y.pred$fit for the predicted values and y.pred$se.fit for the standard errors.

We might do something like this for a 95% confidence band:

x = data.frame(x=c(x1, x2, ...))
y.pred = predict(model, x, se.fit=TRUE)
x$y.pred = y.pred$fit
# qnorm(0.975) leaves 2.5% in each tail, giving a two-sided 95% interval
x$y.lo = y.pred$fit - qnorm(0.975) * y.pred$se.fit
x$y.hi = y.pred$fit + qnorm(0.975) * y.pred$se.fit
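
The interval computation can be tried on a concrete dataset. The sketch below again uses the built-in women data rather than the lab data, and the names grid, model.w, w.pred, t.crit and ci are made up for the example. It computes a two-sided 95% band with the normal quantile qnorm(0.975), then with the t quantile on the residual degrees of freedom, and compares the latter with what predict returns when asked for interval = "confidence".

```r
# Illustration on the built-in dataset women, not the lab data
data(women)
model.w = lm(weight ~ height, data = women)

# Values of the predictor at which to evaluate the band
grid = data.frame(height = seq(58, 72, by = 1))

# Fitted values and their standard errors
w.pred = predict(model.w, grid, se.fit = TRUE)

# Built-in confidence interval for the mean response (uses t quantiles)
ci = predict(model.w, grid, interval = "confidence", level = 0.95)

# Normal approximation: 2.5% in each tail for a two-sided 95% band
grid$fit  = w.pred$fit
grid$lo.z = w.pred$fit - qnorm(0.975) * w.pred$se.fit
grid$hi.z = w.pred$fit + qnorm(0.975) * w.pred$se.fit

# t-based limits, using the residual degrees of freedom stored in w.pred$df
t.crit = qt(0.975, df = w.pred$df)
grid$lo.t = w.pred$fit - t.crit * w.pred$se.fit
grid$hi.t = w.pred$fit + t.crit * w.pred$se.fit

# The t-based limits should agree with predict's own lwr and upr columns
all.equal(as.numeric(ci[, "lwr"]), as.numeric(grid$lo.t))
all.equal(as.numeric(ci[, "upr"]), as.numeric(grid$hi.t))
```

With only 15 observations the t-based band is noticeably wider than the normal one; for larger samples the two are nearly identical.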

Once we have a data frame with the bounds for the interval extracted, the band can be plotted using the function gf_ribbon:

x %>% gf_line(y.pred ~ x) %>% gf_ribbon(y.lo + y.hi ~ x, alpha=0.5)

To include the original data in the plot, add a gf_point layer and pass the original data through its data= argument:

x %>% 
  gf_line(y.pred ~ x) %>% 
  gf_ribbon(y.lo + y.hi ~ x, alpha=0.5) %>% 
  gf_point(dist ~ speed, data=cars)

Task Make a plot like this.

Work session

Once you are done with this, work on your report drafts.