mice: Combining inferences
Multiple Imputation in Practice
This is the fifth vignette in a series of ten. In this practical we will walk you through different ways of combining inferences based on multiply imputed data sets.
All the best,
1. Open R and load the following packages and fix the random seed.
library(mice)     # Data imputation
library(dplyr)    # Data manipulation
library(magrittr) # Flexible piping in R
library(purrr)    # Flexible functional programming
set.seed(123)
We choose seed value 123. This is an arbitrary value; any value would be an equally good seed value. Fixing the random seed enables you (and others) to exactly replicate anything that involves random number generators. If you set the seed in your R instance to 123, you will get the exact same results and plots as we present in this document, provided you follow the order and code of the exercises precisely.
2. Impute the boys data properly with passive imputation for bmi with the following parameters:
- m = 10 for 10 imputed datasets.
- maxit = 6 to give the algorithm 6 iterations to obtain a stable solution.
- print = FALSE to omit printing of the iteration and imputation history.
We will use this data to go through the workflow of data analysis with mids (multiply imputed data set) objects.
We start by creating the method vector and specifying the passive imputation of bmi:
meth <- make.method(boys)
meth["bmi"] <- "~ I(wgt / (hgt / 100)^2)"
Then we remove bmi as a predictor for hgt and wgt to avoid circularity (bmi feeding back into hgt and wgt):
pred <- make.predictorMatrix(boys)
pred[c("hgt", "wgt"), "bmi"] <- 0
pred
age hgt wgt bmi hc gen phb tv reg
age 0 1 1 1 1 1 1 1 1
hgt 1 0 1 0 1 1 1 1 1
wgt 1 1 0 0 1 1 1 1 1
bmi 1 1 1 0 1 1 1 1 1
hc 1 1 1 1 0 1 1 1 1
gen 1 1 1 1 1 0 1 1 1
phb 1 1 1 1 1 1 0 1 1
tv 1 1 1 1 1 1 1 0 1
reg 1 1 1 1 1 1 1 1 0
and we run the mice algorithm again with the new predictor matrix (we still ‘borrow’ the imputation methods object meth from before):
imp <- mice(boys,
            meth = meth,
            pred = pred,
            print = FALSE,
            m = 10,
            maxit = 6)
We use the multiply imputed data set imp from now on.
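Because we asked for only six iterations, it is good practice to check whether the algorithm has converged before analyzing the imputations. A minimal check (not part of the original exercise) is to inspect the trace plots of the sampled chains:
# Trace plots of the chain means and standard deviations per imputed variable;
# the streams should mix well and show no trend.
plot(imp)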
Correlations
3. Calculate a correlation between all continuous variables for the imputed boys data
There are two ways in which we can calculate the correlation on the imputed data:
- The wrong way: calculate an estimate over the average imputed dataset.
Quite often people suggest that using the average imputed dataset - taking the average over the imputed data sets so that each cell contains the average of the corresponding imputed values - would be efficient and conform to Rubin’s rules. This is not true. Doing this will yield false inference.
To demonstrate this, let’s create the averaged data set and exclude the non-numerical columns:
ave <- imp %>%
  mice::complete("long") %>%
  group_by(.id) %>%
  summarise_all(.funs = mean) %>%
  select(-.id, -.imp, -phb, -gen, -reg)
head(ave)
# A tibble: 6 × 6
age hgt wgt bmi hc tv
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 0.035 50.1 3.65 14.5 33.7 2.5
2 0.038 53.5 3.37 11.8 35 3.5
3 0.057 50 3.14 12.6 35.2 2.8
4 0.06 54.5 4.27 14.4 36.7 2.5
5 0.062 57.5 5.03 15.2 37.3 2
6 0.068 55.5 4.66 15.1 37 2.1
If we now calculate Pearson’s correlation, rounded to two digits:
cor.wrong <- ave %>%
  cor() %>%
  round(digits = 2)
we obtain:
cor.wrong
age hgt wgt bmi hc tv
age 1.00 0.98 0.95 0.63 0.86 0.86
hgt 0.98 1.00 0.94 0.60 0.91 0.81
wgt 0.95 0.94 1.00 0.79 0.84 0.87
bmi 0.63 0.60 0.79 1.00 0.60 0.64
hc 0.86 0.91 0.84 0.60 1.00 0.67
tv 0.86 0.81 0.87 0.64 0.67 1.00
- The correct way: calculate an estimate for each imputed dataset and average over the estimates
It is best to do a Fisher transformation before pooling the correlation estimates - and a backtransformation afterwards. Therefore we define the following two functions that allow us to transform and backtransform any value:
fisher.trans <- function(x) 1/2 * log((1 + x) / (1 - x))
fisher.backtrans <- function(x) (exp(2 * x) - 1) / (exp(2 * x) + 1)
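For reference, these two functions implement the Fisher \(z\)-transformation and its inverse:
\[z = \frac{1}{2}\log\left(\frac{1 + r}{1 - r}\right), \qquad r = \frac{e^{2z} - 1}{e^{2z} + 1}.\]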
Now, to calculate the correlation on the imputed data:
cor <- imp %>%
  mice::complete("all") %>%
  map(select, -phb, -gen, -reg) %>%
  map(stats::cor) %>%
  map(fisher.trans)
cor
$`1`
age hgt wgt bmi hc tv
age Inf 2.1971683 1.837486 0.7306936 1.2875134 1.1764013
hgt 2.1971683 Inf 1.771810 0.6796283 1.5304395 1.0150829
wgt 1.8374860 1.7718099 Inf 1.0661101 1.2173486 1.1632394
bmi 0.7306936 0.6796283 1.066110 Inf 0.6755864 0.6965303
hc 1.2875134 1.5304395 1.217349 0.6755864 Inf 0.7662706
tv 1.1764013 1.0150829 1.163239 0.6965303 0.7662706 Inf
$`2`
age hgt wgt bmi hc tv
age Inf 2.1928290 1.839154 0.7396715 1.2805717 1.1540150
hgt 2.1928290 Inf 1.772569 0.6866849 1.5335605 1.0247420
wgt 1.8391536 1.7725691 Inf 1.0777861 1.2168596 1.1924681
bmi 0.7396715 0.6866849 1.077786 Inf 0.6757144 0.7175002
hc 1.2805717 1.5335605 1.216860 0.6757144 Inf 0.7716682
tv 1.1540150 1.0247420 1.192468 0.7175002 0.7716682 Inf
$`3`
age hgt wgt bmi hc tv
age Inf 2.1980513 1.839317 0.7384530 1.2750962 1.1623767
hgt 2.198051 Inf 1.772417 0.6877455 1.5248987 1.0039437
wgt 1.839317 1.7724172 Inf 1.0774889 1.2124082 1.1611924
bmi 0.738453 0.6877455 1.077489 Inf 0.6812475 0.7128481
hc 1.275096 1.5248987 1.212408 0.6812475 Inf 0.7624198
tv 1.162377 1.0039437 1.161192 0.7128481 0.7624198 Inf
$`4`
age hgt wgt bmi hc tv
age Inf 2.1961017 1.838918 0.7408736 1.2897920 1.1832366
hgt 2.1961017 Inf 1.774843 0.6907368 1.5444453 1.0383982
wgt 1.8389176 1.7748429 Inf 1.0800728 1.2256015 1.2125146
bmi 0.7408736 0.6907368 1.080073 Inf 0.6827374 0.7359459
hc 1.2897920 1.5444453 1.225601 0.6827374 Inf 0.7837830
tv 1.1832366 1.0383982 1.212515 0.7359459 0.7837830 Inf
$`5`
age hgt wgt bmi hc tv
age Inf 2.195998 1.839240 0.7359324 1.2753551 1.1843788
hgt 2.1959978 Inf 1.772784 0.6840920 1.5270459 1.0261064
wgt 1.8392403 1.772784 Inf 1.0709307 1.2095539 1.1910395
bmi 0.7359324 0.684092 1.070931 Inf 0.6742064 0.7114433
hc 1.2753551 1.527046 1.209554 0.6742064 Inf 0.7587066
tv 1.1843788 1.026106 1.191039 0.7114433 0.7587066 Inf
$`6`
age hgt wgt bmi hc tv
age Inf 2.1960817 1.839802 0.7380297 1.2683845 1.1098175
hgt 2.1960817 Inf 1.773711 0.6872310 1.5185628 0.9710611
wgt 1.8398023 1.7737114 Inf 1.0753483 1.2056839 1.1312176
bmi 0.7380297 0.6872310 1.075348 Inf 0.6778270 0.6868663
hc 1.2683845 1.5185628 1.205684 0.6778270 Inf 0.7000833
tv 1.1098175 0.9710611 1.131218 0.6868663 0.7000833 Inf
$`7`
age hgt wgt bmi hc tv
age Inf 2.1962046 1.835994 0.7315759 1.2772518 1.1987818
hgt 2.1962046 Inf 1.772959 0.6825245 1.5256606 1.0441973
wgt 1.8359941 1.7729588 Inf 1.0685706 1.2150211 1.2020363
bmi 0.7315759 0.6825245 1.068571 Inf 0.6805006 0.7139035
hc 1.2772518 1.5256606 1.215021 0.6805006 Inf 0.7789061
tv 1.1987818 1.0441973 1.202036 0.7139035 0.7789061 Inf
$`8`
age hgt wgt bmi hc tv
age Inf 2.1929580 1.840578 0.7439899 1.2888151 1.2043697
hgt 2.1929580 Inf 1.773407 0.6923113 1.5404146 1.0468795
wgt 1.8405779 1.7734068 Inf 1.0811815 1.2216658 1.2005007
bmi 0.7439899 0.6923113 1.081182 Inf 0.6844591 0.7123828
hc 1.2888151 1.5404146 1.221666 0.6844591 Inf 0.7906234
tv 1.2043697 1.0468795 1.200501 0.7123828 0.7906234 Inf
$`9`
age hgt wgt bmi hc tv
age Inf 2.1955320 1.838406 0.7384463 1.2842531 1.1620413
hgt 2.1955320 Inf 1.773995 0.6878719 1.5270096 1.0147341
wgt 1.8384062 1.7739947 Inf 1.0758542 1.2183190 1.1719916
bmi 0.7384463 0.6878719 1.075854 Inf 0.6857832 0.7065790
hc 1.2842531 1.5270096 1.218319 0.6857832 Inf 0.7578655
tv 1.1620413 1.0147341 1.171992 0.7065790 0.7578655 Inf
$`10`
age hgt wgt bmi hc tv
age Inf 2.194992 1.840380 0.7412752 1.2769159 1.1855404
hgt 2.1949924 Inf 1.773964 0.6897810 1.5248482 1.0468461
wgt 1.8403795 1.773964 Inf 1.0783437 1.2142874 1.2146691
bmi 0.7412752 0.689781 1.078344 Inf 0.6849574 0.7124170
hc 1.2769159 1.524848 1.214287 0.6849574 Inf 0.7733416
tv 1.1855404 1.046846 1.214669 0.7124170 0.7733416 Inf
The object cor is a list over the \(m\) imputations where each list element is a correlation matrix. To calculate the average over the correlation matrices, we can add the \(m\) list elements and divide the result by \(m\):
cor.rect <- Reduce("+", cor) / length(cor) # m is equal to the length of the list
cor.rect <- fisher.backtrans(cor.rect)
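Equivalently, because purrr and magrittr are already loaded, the same averaging can be written as a pipe. This is only a stylistic alternative (a sketch), not part of the original exercise:
cor.rect <- cor %>%
  purrr::reduce(`+`) %>%               # element-wise sum of the m matrices
  magrittr::divide_by(length(cor)) %>% # divide by m
  fisher.backtrans()                   # back-transform to the correlation scale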
If we compare the wrong estimates in cor.wrong
cor.wrong
age hgt wgt bmi hc tv
age 1.00 0.98 0.95 0.63 0.86 0.86
hgt 0.98 1.00 0.94 0.60 0.91 0.81
wgt 0.95 0.94 1.00 0.79 0.84 0.87
bmi 0.63 0.60 0.79 1.00 0.60 0.64
hc 0.86 0.91 0.84 0.60 1.00 0.67
tv 0.86 0.81 0.87 0.64 0.67 1.00
with the correct estimates in cor.rect
round(cor.rect, digits = 2)
age hgt wgt bmi hc tv
age NaN 0.98 0.95 0.63 0.86 0.82
hgt 0.98 NaN 0.94 0.60 0.91 0.77
wgt 0.95 0.94 NaN 0.79 0.84 0.83
bmi 0.63 0.60 0.79 NaN 0.59 0.61
hc 0.86 0.91 0.84 0.59 NaN 0.64
tv 0.82 0.77 0.83 0.61 0.64 NaN
We see that the wrong estimates in cor.wrong tend to overestimate the correlation coefficients relative to the estimates that are correctly combined following Rubin’s rules.
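To see this directly, a quick check (not part of the original exercise) is the element-wise difference between the two matrices; at this point the diagonal of cor.rect is still NaN, but the off-diagonal differences are non-negative here:
# naive (averaged-data) estimates minus the correctly pooled estimates
round(cor.wrong - cor.rect, digits = 2)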
The correct estimates have a diagonal of NaN’s, because the transformation of a correlation of 1 yields Inf and the backtransformation of Inf has no representation in real number space. We know the diagonal is supposed to be 1, so we can simply correct this:
diag(cor.rect) <- 1
cor.rect
age hgt wgt bmi hc tv
age 1.0000000 0.9755309 0.9506921 0.6278711 0.8565901 0.8249429
hgt 0.9755309 1.0000000 0.9439641 0.5959615 0.9103713 0.7711664
wgt 0.9506921 0.9439641 1.0000000 0.7914006 0.8383737 0.8287360
bmi 0.6278711 0.5959615 0.7914006 1.0000000 0.5917157 0.6110790
hc 0.8565901 0.9103713 0.8383737 0.5917157 1.0000000 0.6436419
tv 0.8249429 0.7711664 0.8287360 0.6110790 0.6436419 1.0000000
Why does the average data set not serve as a good basis for analysis?
In FIMD v2, paragraph 5.1.2, Stef mentions the following:
The average workflow is faster and easier than the correct methods, since there is no need to replicate the analyses \(m\) times. In the words of Dempster and Rubin (1983), this workflow is
seductive because it can lull the user into the pleasurable state of believing that the data are complete after all.
The ensuing statistical analysis does not know which data are observed and which are missing, and treats all data values as real, which will underestimate the uncertainty of the parameters. The reported standard errors and p-values after data-averaging are generally too low. The correlations between the variables of the averaged data will be too high. For example, the correlation matrix of the averaged data is more extreme than the average of the \(m\) correlation matrices, which is an example of ecological fallacy. As researchers tend to like low p-values and high correlations, there is a cynical reward for the analysis of the average data. However, analysis of the average data cannot give a fair representation of the uncertainties associated with the underlying data, and hence is not recommended.
So, please stay away from averaging the imputed data sets. Instead, use the correct workflow of analyzing the imputed sets separately and combining the inference afterwards.
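If you want to see the effect on the standard errors for yourself, a minimal sketch (not part of the original exercise) is to fit the linear model from the next exercise on the averaged data ave; its standard errors will generally be smaller than the correctly pooled ones obtained with pool() below:
# standard errors from the averaged data ignore the between-imputation variance
summary(lm(age ~ wgt + hgt, data = ave))$coefficients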
Linear models
4. Fit the following linear model on the imputed data:
lm(age ~ wgt + hgt)
fit1.lm <- imp %>%
  with(lm(age ~ wgt + hgt))
est1.lm <- pool(fit1.lm)
est1.lm
Class: mipo m = 10
term m estimate ubar b t dfcom
1 (Intercept) 10 -7.47606562 5.886021e-02 7.345696e-05 5.894102e-02 745
2 wgt 10 0.07250458 3.471186e-05 1.548528e-07 3.488220e-05 745
3 hgt 10 0.10631248 1.088103e-05 3.591635e-08 1.092054e-05 745
df riv lambda fmi
1 741.8745 0.001372789 0.001370907 0.004052242
2 737.9341 0.004907203 0.004883240 0.007569354
3 739.5238 0.003630905 0.003617769 0.006301541
summary(est1.lm)
term estimate std.error statistic df p.value
1 (Intercept) -7.47606562 0.242777710 -30.79387 741.8745 8.864898e-135
2 wgt 0.07250458 0.005906115 12.27619 737.9341 1.179672e-31
3 hgt 0.10631248 0.003304623 32.17083 739.5238 1.071290e-142
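If you also want confidence intervals for the pooled estimates, summary() on a mipo object accepts a conf.int argument in recent versions of mice (a sketch; check the documentation of your mice version):
summary(est1.lm, conf.int = TRUE)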
5. Now expand the linear model from (4) with a squared term for hgt:
lm(age ~ wgt + hgt + I(hgt^2))
fit2.lm <- imp %>%
  with(lm(age ~ wgt + hgt + I(hgt^2)))
est2.lm <- pool(fit2.lm)
est2.lm
Class: mipo m = 10
term m estimate ubar b t dfcom
1 (Intercept) 10 -4.0568412458 2.479221e-01 4.443015e-04 2.484108e-01 744
2 wgt 10 0.0317693024 5.959819e-05 1.056828e-07 5.971444e-05 744
3 hgt 10 0.0389756596 8.508940e-05 2.479060e-07 8.536210e-05 744
4 I(hgt^2) 10 0.0003577151 2.116907e-09 4.546843e-12 2.121909e-09 744
df riv lambda fmi
1 740.3124 0.001971311 0.001967433 0.004652798
2 740.3326 0.001950580 0.001946783 0.004632131
3 739.0178 0.003204824 0.003194586 0.005881329
4 739.9209 0.002362658 0.002357089 0.005042820
summary(est2.lm)
term estimate std.error statistic df p.value
1 (Intercept) -4.0568412458 4.984083e-01 -8.139594 740.3124 1.677234e-15
2 wgt 0.0317693024 7.727512e-03 4.111194 740.3326 4.376700e-05
3 hgt 0.0389756596 9.239161e-03 4.218528 739.0178 2.764546e-05
4 I(hgt^2) 0.0003577151 4.606418e-05 7.765580 739.9209 2.710945e-14
Model comparisons
6. Compare the models from (4) and (5) to see which model would yield the best fit:
We have three choices for evaluation:
D1(fit2.lm, fit1.lm) # multivariate Wald test
test statistic df1 df2 dfcom p.value riv
1 ~~ 2 60.30423 1 740.9791 744 2.70652e-14 0.002362658
The \(D_1\) statistic requires a variance-covariance matrix for the estimates.
D2(fit2.lm, fit1.lm) # combining test statistics
test statistic df1 df2 dfcom p.value riv
1 ~~ 2 60.24409 1 800930.6 NA 8.389226e-15 0.003363428
The \(D_2\) statistic requires only the test statistics (e.g. p-values or \(X^2\) values) and hence is more flexible to apply than the \(D_1\) statistic. But this comes at a cost, as the resulting inference is less informed by the data than under the \(D_1\) statistic.
D3(fit2.lm, fit1.lm) # likelihood ratio test
test statistic df1 df2 dfcom p.value riv
1 ~~ 2 58.2858 1 494231.8 744 2.270773e-14 0.002481764
For large samples, \(D_3\) is equivalent to \(D_1\); however, \(D_3\) does not require the covariances of the complete-data estimates. It is the preferred method for testing random effects, and connects to global fit statistics in structural equation models. The likelihood ratio test does not require normality. For large riv (i.e. values > 10), the \(D_3\) statistic must be taken with a grain of salt. In general, given what we know today, the \(D_1\) statistic may be slightly more efficient than \(D_3\) for small samples (i.e. \(< 200\) cases); for larger samples (i.e. \(\geq 200\) cases) \(D_1\) and \(D_3\) appear equally good and a choice between them is mostly a matter of convenience. See also paragraph 5.3.4 in FIMD v2 for a comparison on when to use \(D_1\), \(D_2\) and/or \(D_3\).
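As an aside (not part of the original exercise), recent versions of mice also expose these pooled model comparisons through an anova() method for mira objects; the exact interface and default test may differ per version, so check the mice documentation:
# sketch: pooled comparison of the two fitted mira objects via the anova() method
anova(fit2.lm, fit1.lm)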
Stepwise modeling
7. Fit a stepwise linear model to predict hgt separately on each of the imputed data sets.
We can use the step() function in R to select formula-based models. We start by specifying the scope of the evaluations:
scope <- list(upper = ~ age + wgt + hc + gen + phb + tv + reg,
              lower = ~ 1)
The scope specifies the upper bound of the model (all variables to run the selection on) and the lower bound of the model, where a 1 indicates an intercept-only model.
We can then specify the expressions to be evaluated:
expr <- expression(f1 <- lm(hgt ~ 1),
                   f2 <- step(f1,
                              scope = scope,
                              direction = "forward",
                              trace = 0))
where f1 is the linear model to be evaluated and f2 the step() call that performs the forward selection starting from f1. Finally, we apply the with() function to evaluate the expression expr on object imp:
fit <- with(imp, expr)
The fit object contains the model evaluations. To count how many times each variable was selected, we can run:
formulas <- lapply(fit$analyses, formula)
terms <- lapply(formulas, terms)
votes <- unlist(lapply(terms, labels))
table(votes)
votes
age gen hc phb reg tv wgt
10 2 10 10 1 7 10
We see that reg is only used in 1 model based on the 10 imputed datasets. tv is used in 7 of the models and gen is used in 2 of the 10 completed-data models. wgt, hc, age and phb are used in all models.
To determine if gen should be a part of the final model, we may run a multivariate Wald test:
fit.gen <- with(imp, lm(hgt ~ age + hc + phb + wgt + gen))
fit.nogen <- with(imp, lm(hgt ~ age + hc + phb + wgt))
D1(fit.gen, fit.nogen)
test statistic df1 df2 dfcom p.value riv
1 ~~ 2 0.9319137 4 221.83 735 0.4461966 0.4369768
With a p-value of .446 we could conclude that gen is not strictly needed in this model. We might also investigate the BIC on the separate imputed sets and compare those across (not within) models. We draw the same conclusion from this evaluation - the BIC is lower for the model without gen - although not by much. But then again, the p-value indicated the same trend.
BIC.gen <- fit.gen$analyses %>%
  sapply(BIC)
BIC.nogen <- fit.nogen$analyses %>%
  sapply(BIC)
To count the model evaluations in favor of BIC.nogen (a better fit means a lower BIC):
BIC.gen
[1] 5086.363 5078.323 5078.325 5100.998 5094.769 5085.871 5084.575 5099.043
[9] 5137.931 5125.774
BIC.nogen
[1] 5065.019 5063.648 5058.033 5082.689 5079.410 5070.772 5065.247 5080.096
[9] 5113.299 5101.303
sum(BIC.gen > BIC.nogen)
[1] 10
Please note that we can compare the BIC only over the models, not over the imputed data sets. The realized imputations differ for each set, but for each imputed set, the model comparison is based on the same realization. The call sum(BIC.gen > BIC.nogen) compares each BIC only against its counterpart for the same imputed data set. The resulting outcome can be considered as a majority vote: in our case 10 out of 10 model comparisons are in favor of the model without gen as a predictor.
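To get a feel for how large these differences are within each imputed set, we can also look at the per-set BIC differences (a quick sketch, not part of the original exercise); here all ten differences are positive:
# positive values favor the model without gen
round(BIC.gen - BIC.nogen, digits = 1)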
Conditional means
8. Calculate the average mean for bmi for every region reg over the imputed data.
To study the means for bmi conditionally on reg and get a picture of what the differences are:
imp %>%
  mice::complete("long") %>%
  select(reg, bmi) %>%
  group_by(reg) %>%
  summarise_all(.funs = mean)
# A tibble: 5 × 2
reg bmi
<fct> <dbl>
1 north 19.3
2 east 17.8
3 west 17.9
4 south 17.7
5 city 18.2
To also obtain information about the standard error of the mean, we could extend the summarise_all() evaluation with a custom standard error function:
se <- function(x){
  sd(x) / sqrt(length(x))
}
and then calculate the summary again
imp %>%
  mice::complete("long") %>%
  select(reg, bmi) %>%
  group_by(reg) %>%
  summarise_all(list(~ mean(.), ~ se(.)))
# A tibble: 5 × 3
reg mean se
<fct> <dbl> <dbl>
1 north 19.3 0.120
2 east 17.8 0.0742
3 west 17.9 0.0601
4 south 17.7 0.0697
5 city 18.2 0.0965
Mean differences: ANOVA
9. Test whether the means differ.
This is best done with an ANOVA. To do this correctly, we can apply the following workflow. First, we fit an intercept-only model:
fit.empty <- imp %>%
  mice::complete("all") %>%
  map(lm, formula = bmi ~ 1)
and then we fit the model with reg as a predictor:
fit.reg <- imp %>%
  mice::complete("all") %>%
  map(lm, formula = bmi ~ 1 + reg)
We can calculate the separate ANOVAs from each fitted model:
aov.empty <- lapply(with(imp, lm(age ~ 1))$analyses, aov)
aov.reg <- lapply(with(imp, lm(age ~ 1 + reg))$analyses, aov)
And look at the summaries:
lapply(aov.empty, summary)
[[1]]
Df Sum Sq Mean Sq F value Pr(>F)
Residuals 747 35503 47.53
[[2]]
Df Sum Sq Mean Sq F value Pr(>F)
Residuals 747 35503 47.53
[[3]]
Df Sum Sq Mean Sq F value Pr(>F)
Residuals 747 35503 47.53
[[4]]
Df Sum Sq Mean Sq F value Pr(>F)
Residuals 747 35503 47.53
[[5]]
Df Sum Sq Mean Sq F value Pr(>F)
Residuals 747 35503 47.53
[[6]]
Df Sum Sq Mean Sq F value Pr(>F)
Residuals 747 35503 47.53
[[7]]
Df Sum Sq Mean Sq F value Pr(>F)
Residuals 747 35503 47.53
[[8]]
Df Sum Sq Mean Sq F value Pr(>F)
Residuals 747 35503 47.53
[[9]]
Df Sum Sq Mean Sq F value Pr(>F)
Residuals 747 35503 47.53
[[10]]
Df Sum Sq Mean Sq F value Pr(>F)
Residuals 747 35503 47.53
The summary for the aov.empty object has no p-values. This is expected as we are fitting an empty model (without any predictors) and hence have no model Mean Squares (MS) to calculate and test the ratio
\[F = \frac{\text{Model MS}}{\text{Residual MS}}\]
We do have those components for the model with reg included as predictor:
lapply(aov.reg, summary)
[[1]]
Df Sum Sq Mean Sq F value Pr(>F)
reg 4 757 189.23 4.046 0.00298 **
Residuals 743 34746 46.77
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
[[2]]
Df Sum Sq Mean Sq F value Pr(>F)
reg 4 745 186.34 3.983 0.00332 **
Residuals 743 34758 46.78
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
[[3]]
Df Sum Sq Mean Sq F value Pr(>F)
reg 4 739 184.67 3.947 0.00354 **
Residuals 743 34765 46.79
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
[[4]]
Df Sum Sq Mean Sq F value Pr(>F)
reg 4 756 188.97 4.041 0.003 **
Residuals 743 34747 46.77
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
[[5]]
Df Sum Sq Mean Sq F value Pr(>F)
reg 4 743 185.86 3.973 0.00338 **
Residuals 743 34760 46.78
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
[[6]]
Df Sum Sq Mean Sq F value Pr(>F)
reg 4 757 189.23 4.046 0.00298 **
Residuals 743 34746 46.77
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
[[7]]
Df Sum Sq Mean Sq F value Pr(>F)
reg 4 719 179.82 3.841 0.00425 **
Residuals 743 34784 46.82
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
[[8]]
Df Sum Sq Mean Sq F value Pr(>F)
reg 4 739 184.68 3.947 0.00354 **
Residuals 743 34765 46.79
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
[[9]]
Df Sum Sq Mean Sq F value Pr(>F)
reg 4 742 185.61 3.967 0.00341 **
Residuals 743 34761 46.78
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
[[10]]
Df Sum Sq Mean Sq F value Pr(>F)
reg 4 755 188.86 4.038 0.00302 **
Residuals 743 34748 46.77
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
We find that each of the separate ANOVAs indicates significance, meaning that there is an overall effect for reg on bmi in each imputed data set.
To obtain an overall estimate for the ANOVA we can simply compare the empty model to the model with reg included:
D1(fit.reg, fit.empty)
test statistic df1 df2 dfcom p.value riv
1 ~~ 2 4.604686 4 739.1867 743 0.00111894 0.009690917
And find that indeed the overall (i.e. pooled) effect over the imputations is also significant, which is not surprising as the F-values for the separate tests show little variation.
- End of practical