mice: Combining inferences
Multiple Imputation in Practice
This is the fifth vignette in a series of ten. In this practical we will walk you through different ways of combining inferences based on multiply imputed data sets.
All the best,
1. Open R and load the following packages and fix the random seed.
library(mice)     # Data imputation
library(dplyr)    # Data manipulation
library(magrittr) # Flexible piping in R
library(purrr)    # Flexible functional programming
set.seed(123)
We choose seed value 123. This is an arbitrary value; any value would be an equally good seed value. Fixing the random seed enables you (and others) to exactly replicate anything that involves random number generators. If you set the seed in your R instance to 123, you will get the exact same results and plots as we present in this document, provided you follow the order and code of the exercises precisely.
2. Impute the boys data properly with passive imputation for bmi with the following parameters:
- m = 10 for 10 imputed datasets.
- maxit = 6 to give the algorithm 6 iterations to obtain a stable solution.
- print = FALSE to omit printing of the iteration and imputation history.
We will use this data to go through the workflow of data analysis with mids (multiply imputed data set) objects.
We start by creating the method vector and specifying the passive imputation of bmi:
meth <- make.method(boys)
meth["bmi"] <- "~ I(wgt / (hgt / 100)^2)"
Then we remove bmi as a predictor for hgt and wgt to avoid circularity (bmi feeding back into hgt and wgt):
pred <- make.predictorMatrix(boys)
pred[c("hgt", "wgt"), "bmi"] <- 0
pred
age hgt wgt bmi hc gen phb tv reg
age 0 1 1 1 1 1 1 1 1
hgt 1 0 1 0 1 1 1 1 1
wgt 1 1 0 0 1 1 1 1 1
bmi 1 1 1 0 1 1 1 1 1
hc 1 1 1 1 0 1 1 1 1
gen 1 1 1 1 1 0 1 1 1
phb 1 1 1 1 1 1 0 1 1
tv 1 1 1 1 1 1 1 0 1
reg 1 1 1 1 1 1 1 1 0
and we run the mice algorithm again with the new predictor matrix (we still ‘borrow’ the imputation methods object meth from before):
imp <- mice(boys,
            meth = meth,
            pred = pred,
            print = FALSE,
            m = 10,
            maxit = 6)
We use the multiply imputed data set imp from now on.
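Because we asked for only six iterations, it is good practice to check whether the algorithm has converged before analyzing the imputations. A minimal check (not part of the original exercise) is to inspect the trace plots of the sampled chains:
# Trace plots of the chain means and standard deviations per imputed variable;
# the streams should mix well and show no trend.
plot(imp)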
Correlations
3. Calculate a correlation between all continuous variables for the imputed boys data
There are two ways in which we can calculate the correlation on the imputed data:
- The wrong way: calculate an estimate over the average imputed dataset.
Quite often people suggest that using the average imputed dataset - taking the average over the imputed data sets so that each cell contains the average of the corresponding imputed values - would be efficient and conform to Rubin’s rules. This is not true. Doing this will yield false inference.
To demonstrate this, let’s create the averaged data set and exclude the non-numerical columns:
ave <- imp %>%
  mice::complete("long") %>%
  group_by(.id) %>%
  summarise_all(.funs = mean) %>%
  select(-.id, -.imp, -phb, -gen, -reg)
head(ave)
# A tibble: 6 × 6
age hgt wgt bmi hc tv
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 0.035 50.1 3.65 14.5 33.7 2.5
2 0.038 53.5 3.37 11.8 35 3.5
3 0.057 50 3.14 12.6 35.2 2.8
4 0.06 54.5 4.27 14.4 36.7 2.5
5 0.062 57.5 5.03 15.2 37.3 2
6 0.068 55.5 4.66 15.1 37 2.1
If we now calculate Pearson’s correlation, rounded to two digits:
cor.wrong <- ave %>%
  cor() %>%
  round(digits = 2)
we obtain:
cor.wrong
age hgt wgt bmi hc tv
age 1.00 0.98 0.95 0.63 0.86 0.86
hgt 0.98 1.00 0.94 0.60 0.91 0.81
wgt 0.95 0.94 1.00 0.79 0.84 0.87
bmi 0.63 0.60 0.79 1.00 0.60 0.64
hc 0.86 0.91 0.84 0.60 1.00 0.67
tv 0.86 0.81 0.87 0.64 0.67 1.00
- The correct way: calculate an estimate for each imputed dataset and average over the estimates
It is best to do a Fisher transformation before pooling the correlation estimates - and a backtransformation afterwards. Therefore we define the following two functions that allow us to transform and backtransform any value:
fisher.trans <- function(x) 1/2 * log((1 + x) / (1 - x))
fisher.backtrans <- function(x) (exp(2 * x) - 1) / (exp(2 * x) + 1)
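For reference, these two functions implement the Fisher \(z\)-transformation and its inverse:
\[z = \frac{1}{2}\log\left(\frac{1 + r}{1 - r}\right), \qquad r = \frac{e^{2z} - 1}{e^{2z} + 1}.\]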
Now, to calculate the correlation on the imputed data:
cor <- imp %>%
  mice::complete("all") %>%
  map(select, -phb, -gen, -reg) %>%
  map(stats::cor) %>%
  map(fisher.trans)
cor
$`1`
age hgt wgt bmi hc tv
age Inf 2.1971683 1.837486 0.7306936 1.2875134 1.1764013
hgt 2.1971683 Inf 1.771810 0.6796283 1.5304395 1.0150829
wgt 1.8374860 1.7718099 Inf 1.0661101 1.2173486 1.1632394
bmi 0.7306936 0.6796283 1.066110 Inf 0.6755864 0.6965303
hc 1.2875134 1.5304395 1.217349 0.6755864 Inf 0.7662706
tv 1.1764013 1.0150829 1.163239 0.6965303 0.7662706 Inf
$`2`
age hgt wgt bmi hc tv
age Inf 2.1928290 1.839154 0.7396715 1.2805717 1.1540150
hgt 2.1928290 Inf 1.772569 0.6866849 1.5335605 1.0247420
wgt 1.8391536 1.7725691 Inf 1.0777861 1.2168596 1.1924681
bmi 0.7396715 0.6866849 1.077786 Inf 0.6757144 0.7175002
hc 1.2805717 1.5335605 1.216860 0.6757144 Inf 0.7716682
tv 1.1540150 1.0247420 1.192468 0.7175002 0.7716682 Inf
$`3`
age hgt wgt bmi hc tv
age Inf 2.1980513 1.839317 0.7384530 1.2750962 1.1623767
hgt 2.198051 Inf 1.772417 0.6877455 1.5248987 1.0039437
wgt 1.839317 1.7724172 Inf 1.0774889 1.2124082 1.1611924
bmi 0.738453 0.6877455 1.077489 Inf 0.6812475 0.7128481
hc 1.275096 1.5248987 1.212408 0.6812475 Inf 0.7624198
tv 1.162377 1.0039437 1.161192 0.7128481 0.7624198 Inf
$`4`
age hgt wgt bmi hc tv
age Inf 2.1961017 1.838918 0.7408736 1.2897920 1.1832366
hgt 2.1961017 Inf 1.774843 0.6907368 1.5444453 1.0383982
wgt 1.8389176 1.7748429 Inf 1.0800728 1.2256015 1.2125146
bmi 0.7408736 0.6907368 1.080073 Inf 0.6827374 0.7359459
hc 1.2897920 1.5444453 1.225601 0.6827374 Inf 0.7837830
tv 1.1832366 1.0383982 1.212515 0.7359459 0.7837830 Inf
$`5`
age hgt wgt bmi hc tv
age Inf 2.195998 1.839240 0.7359324 1.2753551 1.1843788
hgt 2.1959978 Inf 1.772784 0.6840920 1.5270459 1.0261064
wgt 1.8392403 1.772784 Inf 1.0709307 1.2095539 1.1910395
bmi 0.7359324 0.684092 1.070931 Inf 0.6742064 0.7114433
hc 1.2753551 1.527046 1.209554 0.6742064 Inf 0.7587066
tv 1.1843788 1.026106 1.191039 0.7114433 0.7587066 Inf
$`6`
age hgt wgt bmi hc tv
age Inf 2.1960817 1.839802 0.7380297 1.2683845 1.1098175
hgt 2.1960817 Inf 1.773711 0.6872310 1.5185628 0.9710611
wgt 1.8398023 1.7737114 Inf 1.0753483 1.2056839 1.1312176
bmi 0.7380297 0.6872310 1.075348 Inf 0.6778270 0.6868663
hc 1.2683845 1.5185628 1.205684 0.6778270 Inf 0.7000833
tv 1.1098175 0.9710611 1.131218 0.6868663 0.7000833 Inf
$`7`
age hgt wgt bmi hc tv
age Inf 2.1962046 1.835994 0.7315759 1.2772518 1.1987818
hgt 2.1962046 Inf 1.772959 0.6825245 1.5256606 1.0441973
wgt 1.8359941 1.7729588 Inf 1.0685706 1.2150211 1.2020363
bmi 0.7315759 0.6825245 1.068571 Inf 0.6805006 0.7139035
hc 1.2772518 1.5256606 1.215021 0.6805006 Inf 0.7789061
tv 1.1987818 1.0441973 1.202036 0.7139035 0.7789061 Inf
$`8`
age hgt wgt bmi hc tv
age Inf 2.1929580 1.840578 0.7439899 1.2888151 1.2043697
hgt 2.1929580 Inf 1.773407 0.6923113 1.5404146 1.0468795
wgt 1.8405779 1.7734068 Inf 1.0811815 1.2216658 1.2005007
bmi 0.7439899 0.6923113 1.081182 Inf 0.6844591 0.7123828
hc 1.2888151 1.5404146 1.221666 0.6844591 Inf 0.7906234
tv 1.2043697 1.0468795 1.200501 0.7123828 0.7906234 Inf
$`9`
age hgt wgt bmi hc tv
age Inf 2.1955320 1.838406 0.7384463 1.2842531 1.1620413
hgt 2.1955320 Inf 1.773995 0.6878719 1.5270096 1.0147341
wgt 1.8384062 1.7739947 Inf 1.0758542 1.2183190 1.1719916
bmi 0.7384463 0.6878719 1.075854 Inf 0.6857832 0.7065790
hc 1.2842531 1.5270096 1.218319 0.6857832 Inf 0.7578655
tv 1.1620413 1.0147341 1.171992 0.7065790 0.7578655 Inf
$`10`
age hgt wgt bmi hc tv
age Inf 2.194992 1.840380 0.7412752 1.2769159 1.1855404
hgt 2.1949924 Inf 1.773964 0.6897810 1.5248482 1.0468461
wgt 1.8403795 1.773964 Inf 1.0783437 1.2142874 1.2146691
bmi 0.7412752 0.689781 1.078344 Inf 0.6849574 0.7124170
hc 1.2769159 1.524848 1.214287 0.6849574 Inf 0.7733416
tv 1.1855404 1.046846 1.214669 0.7124170 0.7733416 Inf
The object cor is a list over the \(m\) imputations where each list element is a correlation matrix. To calculate the average over the correlation matrices, we can add the \(m\) list elements and divide the result by \(m\):
cor.rect <- Reduce("+", cor) / length(cor) # m is equal to the length of the list
cor.rect <- fisher.backtrans(cor.rect)
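Equivalently, because purrr and magrittr are already loaded, the same averaging can be written as a pipe. This is only a stylistic alternative (a sketch), not part of the original exercise:
cor.rect <- cor %>%
  purrr::reduce(`+`) %>%               # element-wise sum of the m matrices
  magrittr::divide_by(length(cor)) %>% # divide by m
  fisher.backtrans()                   # back-transform to the correlation scale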
If we compare the wrong estimates in cor.wrong
cor.wrong
age hgt wgt bmi hc tv
age 1.00 0.98 0.95 0.63 0.86 0.86
hgt 0.98 1.00 0.94 0.60 0.91 0.81
wgt 0.95 0.94 1.00 0.79 0.84 0.87
bmi 0.63 0.60 0.79 1.00 0.60 0.64
hc 0.86 0.91 0.84 0.60 1.00 0.67
tv 0.86 0.81 0.87 0.64 0.67 1.00
with the correct estimates in cor.rect
round(cor.rect, digits = 2)
age hgt wgt bmi hc tv
age NaN 0.98 0.95 0.63 0.86 0.82
hgt 0.98 NaN 0.94 0.60 0.91 0.77
wgt 0.95 0.94 NaN 0.79 0.84 0.83
bmi 0.63 0.60 0.79 NaN 0.59 0.61
hc 0.86 0.91 0.84 0.59 NaN 0.64
tv 0.82 0.77 0.83 0.61 0.64 NaN
We see that the wrong estimates in cor.wrong tend to overestimate the correlation coefficients relative to the estimates that are correctly combined following Rubin’s rules.
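To see this directly, a quick check (not part of the original exercise) is the element-wise difference between the two matrices; at this point the diagonal of cor.rect is still NaN, but the off-diagonal differences are non-negative here:
# naive (averaged-data) estimates minus the correctly pooled estimates
round(cor.wrong - cor.rect, digits = 2)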
The correct estimates have a diagonal of NaN’s, because the transformation of a correlation of 1 yields Inf and the backtransformation of Inf has no representation in real number space. We know the diagonal is supposed to be 1, so we can simply correct this:
diag(cor.rect) <- 1
cor.rect
age hgt wgt bmi hc tv
age 1.0000000 0.9755309 0.9506921 0.6278711 0.8565901 0.8249429
hgt 0.9755309 1.0000000 0.9439641 0.5959615 0.9103713 0.7711664
wgt 0.9506921 0.9439641 1.0000000 0.7914006 0.8383737 0.8287360
bmi 0.6278711 0.5959615 0.7914006 1.0000000 0.5917157 0.6110790
hc 0.8565901 0.9103713 0.8383737 0.5917157 1.0000000 0.6436419
tv 0.8249429 0.7711664 0.8287360 0.6110790 0.6436419 1.0000000
Why does the average data set not serve as a good basis for analysis?
In FIMD v2, paragraph 5.1.2, Stef mentions the following:
The average workflow is faster and easier than the correct methods, since there is no need to replicate the analyses \(m\) times. In the words of Dempster and Rubin (1983), this workflow is
seductive because it can lull the user into the pleasurable state of believing that the data are complete after all.
The ensuing statistical analysis does not know which data are observed and which are missing, and treats all data values as real, which will underestimate the uncertainty of the parameters. The reported standard errors and p-values after data-averaging are generally too low. The correlations between the variables of the averaged data will be too high. For example, the correlation matrix of the averaged data is more extreme than the average of the \(m\) correlation matrices, which is an example of ecological fallacy. As researchers tend to like low p-values and high correlations, there is a cynical reward for the analysis of the average data. However, analysis of the average data cannot give a fair representation of the uncertainties associated with the underlying data, and hence is not recommended.
So, please stay away from averaging the imputed data sets. Instead, use the correct workflow of analyzing the imputed sets separately and combining the inference afterwards.
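If you want to see the effect on the standard errors for yourself, a minimal sketch (not part of the original exercise) is to fit the linear model from the next exercise on the averaged data ave; its standard errors will generally be smaller than the correctly pooled ones obtained with pool() below:
# standard errors from the averaged data ignore the between-imputation variance
summary(lm(age ~ wgt + hgt, data = ave))$coefficients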
Linear models
4. Fit the following linear model on the imputed data:
lm(age ~ wgt + hgt)
fit1.lm <- imp %>%
  with(lm(age ~ wgt + hgt))
est1.lm <- pool(fit1.lm)
est1.lm
Class: mipo m = 10
term m estimate ubar b t dfcom
1 (Intercept) 10 -7.47606562 5.886021e-02 7.345696e-05 5.894102e-02 745
2 wgt 10 0.07250458 3.471186e-05 1.548528e-07 3.488220e-05 745
3 hgt 10 0.10631248 1.088103e-05 3.591635e-08 1.092054e-05 745
df riv lambda fmi
1 741.8745 0.001372789 0.001370907 0.004052242
2 737.9341 0.004907203 0.004883240 0.007569354
3 739.5238 0.003630905 0.003617769 0.006301541
summary(est1.lm)
term estimate std.error statistic df p.value
1 (Intercept) -7.47606562 0.242777710 -30.79387 741.8745 8.864898e-135
2 wgt 0.07250458 0.005906115 12.27619 737.9341 1.179672e-31
3 hgt 0.10631248 0.003304623 32.17083 739.5238 1.071290e-142
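If you also want confidence intervals for the pooled estimates, summary() on a mipo object accepts a conf.int argument in recent versions of mice (a sketch; check the documentation of your mice version):
summary(est1.lm, conf.int = TRUE)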
5. Now expand the linear model from (4) with a squared term for hgt:
lm(age ~ wgt + hgt + I(hgt^2))
fit2.lm <- imp %>%
  with(lm(age ~ wgt + hgt + I(hgt^2)))
est2.lm <- pool(fit2.lm)
est2.lm
Class: mipo m = 10
term m estimate ubar b t dfcom
1 (Intercept) 10 -4.0568412458 2.479221e-01 4.443015e-04 2.484108e-01 744
2 wgt 10 0.0317693024 5.959819e-05 1.056828e-07 5.971444e-05 744
3 hgt 10 0.0389756596 8.508940e-05 2.479060e-07 8.536210e-05 744
4 I(hgt^2) 10 0.0003577151 2.116907e-09 4.546843e-12 2.121909e-09 744
df riv lambda fmi
1 740.3124 0.001971311 0.001967433 0.004652798
2 740.3326 0.001950580 0.001946783 0.004632131
3 739.0178 0.003204824 0.003194586 0.005881329
4 739.9209 0.002362658 0.002357089 0.005042820
summary(est2.lm)
term estimate std.error statistic df p.value
1 (Intercept) -4.0568412458 4.984083e-01 -8.139594 740.3124 1.677234e-15
2 wgt 0.0317693024 7.727512e-03 4.111194 740.3326 4.376700e-05
3 hgt 0.0389756596 9.239161e-03 4.218528 739.0178 2.764546e-05
4 I(hgt^2) 0.0003577151 4.606418e-05 7.765580 739.9209 2.710945e-14
Model comparisons
6. Compare the models from (4) and (5) to see which model would yield the best fit:
We have three choices for evaluation:
D1(fit2.lm, fit1.lm) # multivariate Wald test
test statistic df1 df2 dfcom p.value riv
1 ~~ 2 60.30423 1 740.9791 744 2.70652e-14 0.002362658
The \(D_1\) statistic requires a variance-covariance matrix for the estimates.
D2(fit2.lm, fit1.lm) # combining test statistics
test statistic df1 df2 dfcom p.value riv
1 ~~ 2 60.24409 1 800930.6 NA 8.389226e-15 0.003363428
The \(D_2\) statistic requires only the test statistics (e.g. p-values or \(X^2\) values) and hence is more flexible to apply than the \(D_1\) statistic. But this comes at a cost, as the resulting inference is less informed by the data than under the \(D_1\) statistic.
D3(fit2.lm, fit1.lm) # likelihood ratio test
test statistic df1 df2 dfcom p.value riv
1 ~~ 2 58.2858 1 494231.8 744 2.270773e-14 0.002481764
For large samples, \(D_3\) is equivalent to \(D_1\); however, \(D_3\) does not require the covariances of the complete-data estimates. It is the preferred method for testing random effects, and connects to global fit statistics in structural equation models. The likelihood ratio test does not require normality. For large riv (i.e. values > 10), the \(D_3\) statistic must be taken with a grain of salt. In general, given what we know today, the \(D_1\) statistic may be slightly more efficient than \(D_3\) for small samples (i.e. \(< 200\) cases); for larger samples (i.e. \(\geq 200\) cases) \(D_1\) and \(D_3\) appear equally good and a choice between them is mostly a matter of convenience. See also paragraph 5.3.4 in FIMD v2 for a comparison on when to use \(D_1\), \(D_2\) and/or \(D_3\).
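As an aside (not part of the original exercise), recent versions of mice also expose these pooled model comparisons through an anova() method for mira objects; the exact interface and default test may differ per version, so check the mice documentation:
# sketch: pooled comparison of the two fitted mira objects via the anova() method
anova(fit2.lm, fit1.lm)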
Stepwise modeling
7. Fit a stepwise linear model to predict hgt separately on each of the imputed data sets.
We can use the step() function in R to select formula-based models. We start by specifying the scope of the evaluations:
scope <- list(upper = ~ age + wgt + hc + gen + phb + tv + reg,
              lower = ~ 1)
The scope specifies the upper bound of the model (all variables to run the selection on) and the lower bound of the model, where a 1 indicates an intercept-only model.
We can then specify the expressions to be evaluated:
expr <- expression(f1 <- lm(hgt ~ 1),
                   f2 <- step(f1,
                              scope = scope,
                              direction = "forward",
                              trace = 0))
where f1 is the linear model to be evaluated and f2 the step() call that performs the forward selection starting from f1. Finally, we apply the with() function to evaluate the expression expr on object imp:
fit <- with(imp, expr)
The fit object contains the model evaluations. To count how many times each variable was selected, we can run:
formulas <- lapply(fit$analyses, formula)
terms <- lapply(formulas, terms)
votes <- unlist(lapply(terms, labels))
table(votes)
votes
age gen hc phb reg tv wgt
10 2 10 10 1 7 10
We see that reg is only used in 1 model based on the 10 imputed datasets. tv is used in 7 of the models and gen is used in 2 of the 10 completed-data models. wgt, hc, age and phb are used in all models.
To determine if gen should be a part of the final model, we may run a multivariate Wald test:
fit.gen <- with(imp, lm(hgt ~ age + hc + phb + wgt + gen))
fit.nogen <- with(imp, lm(hgt ~ age + hc + phb + wgt))
D1(fit.gen, fit.nogen)
test statistic df1 df2 dfcom p.value riv
1 ~~ 2 0.9319137 4 221.83 735 0.4461966 0.4369768
With a p-value of .446 we could conclude that gen is not strictly needed in this model. We might also investigate the BIC on the separate imputed sets and compare those across (not within) models. We draw the same conclusion from this evaluation - the BIC is lower for the model without gen - although not by much. But then again, the p-value indicated the same trend.
BIC.gen <- fit.gen$analyses %>%
  sapply(BIC)
BIC.nogen <- fit.nogen$analyses %>%
  sapply(BIC)
To count the model evaluations in favor of BIC.nogen (a better fit means a lower BIC):
BIC.gen
[1] 5086.363 5078.323 5078.325 5100.998 5094.769 5085.871 5084.575 5099.043
[9] 5137.931 5125.774
BIC.nogen
[1] 5065.019 5063.648 5058.033 5082.689 5079.410 5070.772 5065.247 5080.096
[9] 5113.299 5101.303
sum(BIC.gen > BIC.nogen)
[1] 10
Please note that we can compare the BIC only over the models, not over the imputed data sets. The realized imputations differ for each set, but for each imputed set, the model comparison is based on the same realization. The call sum(BIC.gen > BIC.nogen) compares each BIC only against its counterpart for the same imputed data set. The resulting outcome can be considered as a majority vote: in our case 10 out of 10 model comparisons are in favor of the model without gen as a predictor.
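To get a feel for how large these differences are within each imputed set, we can also look at the per-set BIC differences (a quick sketch, not part of the original exercise); here all ten differences are positive:
# positive values favor the model without gen
round(BIC.gen - BIC.nogen, digits = 1)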
Conditional means
8. Calculate the average mean for bmi for every region reg over the imputed data.
To study the means for bmi conditionally on reg and get a picture of what the differences are:
imp %>%
  mice::complete("long") %>%
  select(reg, bmi) %>%
  group_by(reg) %>%
  summarise_all(.funs = mean)
# A tibble: 5 × 2
reg bmi
<fct> <dbl>
1 north 19.3
2 east 17.8
3 west 17.9
4 south 17.7
5 city 18.2
To also obtain information about the standard error of the mean, we could extend the summarise_all() evaluation with a custom standard error function:
se <- function(x){
  sd(x) / sqrt(length(x))
}
and then calculate the summary again
imp %>%
  mice::complete("long") %>%
  select(reg, bmi) %>%
  group_by(reg) %>%
  summarise_all(list(~ mean(.), ~ se(.)))
# A tibble: 5 × 3
reg mean se
<fct> <dbl> <dbl>
1 north 19.3 0.120
2 east 17.8 0.0742
3 west 17.9 0.0601
4 south 17.7 0.0697
5 city 18.2 0.0965
Mean differences: ANOVA
9. Test whether the means differ.
This is best done with an ANOVA. To do this correctly, we can apply the following workflow. First, we fit an intercept-only model:
fit.empty <- imp %>%
  mice::complete("all") %>%
  map(lm, formula = bmi ~ 1)
and then we fit the model with reg as a predictor:
fit.reg <- imp %>%
  mice::complete("all") %>%
  map(lm, formula = bmi ~ 1 + reg)
We can calculate the separate ANOVAs from each fitted model:
aov.empty <- lapply(with(imp, lm(age ~ 1))$analyses, aov)
aov.reg <- lapply(with(imp, lm(age ~ 1 + reg))$analyses, aov)
And look at the summaries:
lapply(aov.empty, summary)
[[1]]
Df Sum Sq Mean Sq F value Pr(>F)
Residuals 747 35503 47.53
[[2]]
Df Sum Sq Mean Sq F value Pr(>F)
Residuals 747 35503 47.53
[[3]]
Df Sum Sq Mean Sq F value Pr(>F)
Residuals 747 35503 47.53
[[4]]
Df Sum Sq Mean Sq F value Pr(>F)
Residuals 747 35503 47.53
[[5]]
Df Sum Sq Mean Sq F value Pr(>F)
Residuals 747 35503 47.53
[[6]]
Df Sum Sq Mean Sq F value Pr(>F)
Residuals 747 35503 47.53
[[7]]
Df Sum Sq Mean Sq F value Pr(>F)
Residuals 747 35503 47.53
[[8]]
Df Sum Sq Mean Sq F value Pr(>F)
Residuals 747 35503 47.53
[[9]]
Df Sum Sq Mean Sq F value Pr(>F)
Residuals 747 35503 47.53
[[10]]
Df Sum Sq Mean Sq F value Pr(>F)
Residuals 747 35503 47.53
The summary for the aov.empty object has no p-values. This is expected as we are fitting an empty model (without any predictors) and hence have no model Mean Squares (MS) to calculate and test the ratio
\[F = \frac{\text{Model MS}}{\text{Residual MS}}\]
We do have those components for the model with reg included as predictor:
lapply(aov.reg, summary)
[[1]]
Df Sum Sq Mean Sq F value Pr(>F)
reg 4 757 189.23 4.046 0.00298 **
Residuals 743 34746 46.77
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
[[2]]
Df Sum Sq Mean Sq F value Pr(>F)
reg 4 745 186.34 3.983 0.00332 **
Residuals 743 34758 46.78
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
[[3]]
Df Sum Sq Mean Sq F value Pr(>F)
reg 4 739 184.67 3.947 0.00354 **
Residuals 743 34765 46.79
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
[[4]]
Df Sum Sq Mean Sq F value Pr(>F)
reg 4 756 188.97 4.041 0.003 **
Residuals 743 34747 46.77
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
[[5]]
Df Sum Sq Mean Sq F value Pr(>F)
reg 4 743 185.86 3.973 0.00338 **
Residuals 743 34760 46.78
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
[[6]]
Df Sum Sq Mean Sq F value Pr(>F)
reg 4 757 189.23 4.046 0.00298 **
Residuals 743 34746 46.77
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
[[7]]
Df Sum Sq Mean Sq F value Pr(>F)
reg 4 719 179.82 3.841 0.00425 **
Residuals 743 34784 46.82
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
[[8]]
Df Sum Sq Mean Sq F value Pr(>F)
reg 4 739 184.68 3.947 0.00354 **
Residuals 743 34765 46.79
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
[[9]]
Df Sum Sq Mean Sq F value Pr(>F)
reg 4 742 185.61 3.967 0.00341 **
Residuals 743 34761 46.78
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
[[10]]
Df Sum Sq Mean Sq F value Pr(>F)
reg 4 755 188.86 4.038 0.00302 **
Residuals 743 34748 46.77
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
We find that each of the separate ANOVAs indicates significance, meaning that there is an overall effect for reg on bmi in each imputed data set.
To obtain an overall estimate for the ANOVA we can simply compare the empty model to the model with reg included:
D1(fit.reg, fit.empty)
test statistic df1 df2 dfcom p.value riv
1 ~~ 2 4.604686 4 739.1867 743 0.00111894 0.009690917
And find that indeed the overall (i.e. pooled) effect over the imputations is also significant, which is not surprising as the F-values for the separate tests show little variation.
- End of practical