We use the following packages in this practical:

library(dplyr)
library(magrittr)
library(ggplot2)

In this practical you will perform regression analysis and create plots with ggplot2. I give you some examples and ask you to apply the techniques I demonstrate. For some exercises I give you the solution (e.g. the resulting graph) and the interpretation. Your task is then to provide the code that generates the solution, and to give the interpretation for the exercises where it is omitted.

Feel free to ask me if you have questions.

All the best,

Gerko


Models and output


  1. Fit the following linear models on the anscombe data:
  • y1 predicted by x1 - stored in object fit1
  • y2 predicted by x2 - stored in object fit2
  • y3 predicted by x3 - stored in object fit3
  • y4 predicted by x4 - stored in object fit4

I give you the code for the first regression model. You need to fit the other three models yourself.

fit1 <- anscombe %$%
  lm(y1 ~ x1)
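
The %$% operator used above is the exposition pipe from magrittr: it makes the columns of the data frame on its left available by name in the expression on its right. As a minimal sketch (the object names fit1_pipe, fit1_data and fit1_with are made up for this illustration), the three calls below should give the same estimates:

```r
library(magrittr)

# Exposition pipe: x1 and y1 are looked up inside anscombe
fit1_pipe <- anscombe %$%
  lm(y1 ~ x1)

# Same model, passing the data frame through lm()'s data argument
fit1_data <- lm(y1 ~ x1, data = anscombe)

# Same model again, using base R's with()
fit1_with <- with(anscombe, lm(y1 ~ x1))

# Compare the coefficients of the three versions
all.equal(coef(fit1_pipe), coef(fit1_data))
all.equal(coef(fit1_data), coef(fit1_with))
```

Which form you use is a matter of taste; the pipe version reads naturally when the data have already passed through other steps.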

  2. Display a data frame with the coefficients of the 4 fitted objects from Exercise 1.

Use the following code to arrange your output in a tidy format:

output <- data.frame(fit1 = coef(fit1),
                     fit2 = coef(fit2),
                     fit3 = coef(fit3),
                     fit4 = coef(fit4))
row.names(output) <- names(coef(fit1))

  3. Inspect the estimates for the four models in the output object. What do you conclude?

Plotting the relation


  4. Plot the pair (x1, y1) such that y1 is on the Y-axis and make the color of the points blue.

This is quite simple to do with ggplot2:

anscombe %>%
  ggplot(aes(x = x1, y = y1)) + 
  geom_point(color = "blue")

In the above code we put the aesthetics aes(x = x1, y = y1) in the ggplot() function. This way, the aesthetics hold for the whole graph (i.e. all geoms we specify), unless otherwise specified. Alternatively, we could specify aesthetics for individual geoms, as in:

anscombe %>%
  ggplot() + 
  geom_point(aes(x = x1, y = y1), color = "blue")

We can also override the aes(x = x1, y = y1) specified in ggplot() by specifying a different aes(x = x2, y = y2) in geom_point():

anscombe %>%
  ggplot(aes(x = x1, y = y1)) + 
  geom_point(aes(x = x2, y = y2), color = "blue")
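
To see inheritance and overriding side by side, here is a small sketch on the built-in mtcars data. The data set is my choice for illustration, so that the example does not give away the exercises below, and the object name p is also just for this example. The second layer specifies only y, so it should keep x = wt from ggplot() and replace y with qsec:

```r
library(ggplot2)

p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  # No aes() here: this layer inherits x = wt and y = mpg
  geom_point(color = "blue") +
  # Only y is given: x = wt is inherited, y is overridden by qsec
  geom_point(aes(y = qsec), color = "orange")

p
```

A layer therefore only needs to state the aesthetics in which it differs from the plot-level defaults.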


  5. Plot the four pairs of variables from Exercise 1 in a single plotting window. Make the points in the plots blue, gray, orange and purple, respectively.

In other words, create the following plot:


  6. Add a regression line to the plot for only the pairs (x3, y3) and (x4, y4), where the line inherits the color from the respective points. Hint: use geom_smooth().

  7. Now add a loess line to the plot from Exercise 5 for all pairs except (x4, y4), where the line inherits the color from the respective points.
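
As a rough sketch of how geom_smooth() can be used for the last two exercises, the code below again uses the built-in mtcars data rather than anscombe (an assumption made for illustration; the object names p_lm and p_loess are hypothetical). Setting method = "lm" draws a least-squares line and method = "loess" draws a local regression curve. One way to make the line match the points is to pass the same fixed color to both geoms. Dropping the confidence band with se = FALSE is my own choice here, not a requirement of the exercise.

```r
library(ggplot2)

# Straight regression line in the same color as the points
p_lm <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point(color = "orange") +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE, color = "orange")

# Loess curve in the same color as the points
p_loess <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point(color = "purple") +
  geom_smooth(method = "loess", formula = y ~ x, se = FALSE, color = "purple")

p_lm
p_loess
```

For the anscombe exercises you would apply the same idea per pair of variables, giving each geom_smooth() layer the aes() of the pair it belongs to.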


End of practical.