In this practical I detail multiple skills and show you a workflow for (predictive) analytics.
All the best,
Gerko
The following packages are required for this practical:
library(dplyr)
library(magrittr)
library(mice)
library(ggplot2)
library(DAAG)
library(MASS)
The data sets elastic1 and elastic2 from the package DAAG were obtained using the same apparatus, including the same rubber band, as the data frame elasticband.
Plot elastic1 and elastic2 on the same graph. Do the two sets of results appear consistent?
For elastic1 and elastic2, determine the regression of distance on stretch (i.e. model the outcome distance on the predictor stretch). In each case determine:
Compare the two sets of results. What is the key difference between the two sets of data?
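One possible way to approach the plot and the two regressions is sketched below. It assumes the DAAG package is installed. The symbols, colours, and object names (fit1, fit2, pred1, pred2, r2) are illustrative choices, and the quantities extracted here (fitted values with their standard errors, and R-squared) are an assumption about what is worth comparing, not a list taken from the exercise text.

```r
library(DAAG)  # provides elastic1 and elastic2

# Shared axis limits so that both data sets fit on one graph
xlim <- range(elastic1$stretch, elastic2$stretch)
ylim <- range(elastic1$distance, elastic2$distance)

# Plot elastic1 first, then overlay elastic2 with a different symbol and colour
plot(distance ~ stretch, data = elastic1,
     pch = 16, col = "blue", xlim = xlim, ylim = ylim)
points(distance ~ stretch, data = elastic2, pch = 17, col = "red")
legend("topleft", legend = c("elastic1", "elastic2"),
       pch = c(16, 17), col = c("blue", "red"))

# Regress distance on stretch separately in each data set
fit1 <- lm(distance ~ stretch, data = elastic1)
fit2 <- lm(distance ~ stretch, data = elastic2)

# Fitted values together with their standard errors
pred1 <- predict(fit1, se.fit = TRUE)
pred2 <- predict(fit2, se.fit = TRUE)

# Proportion of variance explained by each model
r2 <- c(elastic1 = summary(fit1)$r.squared,
        elastic2 = summary(fit2)$r.squared)
r2
```

Comparing pred1$se.fit with pred2$se.fit, alongside the range of stretch in each data set, is one way to start looking for the difference between the two sets of data.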
Use plot() on the fitted object.
Because a single value influences the estimation and is somewhat different from the other values, a robust form of regression may be advisable to obtain more stable estimates. When robust methods are used, we refrain from omitting a suspected outlier from our analysis. In general, with robust analysis, influential cases that do not conform to the other cases receive less weight in the estimation procedure than under non-robust analysis.
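The down-weighting described above can be inspected directly. The sketch below assumes the DAAG and MASS packages are installed and uses rlm() with its default settings. The object names fit_lm and fit_rlm are illustrative. The w component of an rlm fit holds the weights used in the iteratively reweighted least squares procedure.

```r
library(DAAG)  # provides elastic1
library(MASS)  # provides rlm()

# Ordinary least squares and robust fit on the same data
fit_lm  <- lm(distance ~ stretch, data = elastic1)
fit_rlm <- rlm(distance ~ stretch, data = elastic1)

# Coefficients side by side
rbind(lm = coef(fit_lm), rlm = coef(fit_rlm))

# Weight given to each case in the robust fit:
# cases that fit the bulk of the data keep a weight of 1,
# cases with large residuals receive a smaller weight
data.frame(elastic1, weight = round(fit_rlm$w, 2))
```

Cases with a weight below 1 are the ones the robust fit treats as not conforming to the rest. They stay in the analysis but contribute less to the estimates.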
Use rlm() from the MASS package to fit lines to the data in elastic1 and elastic2. Compare the results with those from use of lm().
Use the elastic2 variable stretch to obtain predictions on the model fitted on elastic1.
... elastic2
The mammalsleep dataset is part of mice. It contains the Allison and Cicchetti (1976) data for mammalian species. To learn more about this data, type
?mammalsleep
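Besides the help page, a quick look at the data and a first simple model can be obtained as in the sketch below. It assumes the mice package is installed. The object name fit_brw is illustrative, and the model shown is only one starting point, not a prescribed solution.

```r
library(mice)  # provides mammalsleep

# Structure of the data: variable names and types
str(mammalsleep)

# Simple linear model with brw as outcome and bw as predictor
fit_brw <- lm(brw ~ bw, data = mammalsleep)
summary(fit_brw)

# Standard diagnostic plots for the fitted model, four panels on one page
op <- par(mfrow = c(2, 2))
plot(fit_brw)
par(op)
```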
Fit a model where brw is modeled from bw.
Fit a model where brw is predicted from both bw and species.
... brw?
... exercise 16. What issues can you detect?
End of Practical.