Error: Column `estimate` not found in `.data` for pool() from mgcv::gam #218

mikeguggis · 2020-02-06T23:13:06Z

Hello,

I am getting the error "Error: Column estimate not found in .data" when trying to run pool() on a mira object generated from with() on a gam object (from mgcv::gam()). The appropriate tidy and glance functions exist.

Reproducible example

# Simulate data
library(MASS)
set.seed(1775)
N = 100
Xcomplete = mvrnorm(N, c(1,2), matrix(c(1,.5,.5,1),2,2))
W = cbind(1,Xcomplete[,1],Xcomplete[,1]^2,Xcomplete[,2])
eps = rnorm(N)
beta = c(1,1,2,3)

# Drop the matrix dimension so cbind(y = y, ...) below names the column "y"
y = as.numeric(W %*% beta + eps)

missingid1 = as.logical(rbinom(N,1,.1))
missingid2 = as.logical(rbinom(N,1,.1))
X = Xcomplete
X[missingid1,1] = NA
X[missingid2,2] = NA

datcomplete = as.data.frame(cbind(y=y,x1 = Xcomplete[,1], x2 = Xcomplete[,2]))
dat = as.data.frame(cbind(y=y,x1 = X[,1], x2 = X[,2]))

# Estimate fully observed GAM to get knots
library(mgcv)
out1 = gam(y ~ s(x1, bs = "bs") + x2, data = datcomplete)
knots = list(x1 = out1$smooth[[1]]$knots) #get knots

# MICE the data
library(mice)
mice1 = mice(dat)

# Estimate GAMs on MICEed data with user defined knots
micegam = with(data = mice1, gam(y ~ s(x1, bs = "bs") + x2, knots = knots))
poolgam = pool(micegam)
?broom::glance.gam
?broom::tidy.gam

Thank you for your time,

Mike

stefvanbuuren · 2020-02-07T09:11:51Z

OK, thanks a lot.

I wasn't aware that broom::tidy.gam() doesn't produce the estimates and standard errors by default. mice 3.7.1 adds the parametric = TRUE parameter to the call to tidy.gam(), so now your example should run.

I do not know exactly what parametric = TRUE does, but I think it comes down to a simplification of the model. Thus, it might have an effect on interpretation. As of now, how non-parametric smooths should be pooled is still an open issue.

Hope this helps, nevertheless.
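For readers wondering what the parametric argument changes, here is a minimal sketch on simulated data. The data frame d and the object names are made up for illustration. As far as I can tell, broom::tidy() on a gam summarises the smooth terms by default and only returns an estimate column when parametric = TRUE is set, which would explain the error from pool().

```r
library(mgcv)
library(broom)

# Toy data, unrelated to the example in the issue
set.seed(1)
d <- data.frame(x1 = rnorm(100), x2 = rnorm(100))
d$y <- sin(d$x1) + 2 * d$x2 + rnorm(100)

fit <- gam(y ~ s(x1) + x2, data = d)

# Default call: one row per smooth term
smooth_terms <- tidy(fit)

# parametric = TRUE: one row per parametric coefficient
param_terms <- tidy(fit, parametric = TRUE)

# Compare the column names of the two summaries
print(names(smooth_terms))
print(names(param_terms))
```

Note that the parametric summary covers only the intercept and x2 here, so the smooth s(x1) is not pooled this way.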

mikeguggis · 2020-02-07T17:57:13Z

Thanks Stef. I believe pooling the non-parametric smooth terms and parametric terms is a little bit of work but fairly straightforward. Assuming you are using cubic splines, the user must first manually define the knots (so each imputed data set uses the same knots), then pool mgcv::gam()$coefficients to get the mean and between variance. The covariance matrices (if they exist), mgcv::gam()$Ve, mgcv::gam()$Vp, and mgcv::gam()$Vc, provide the within variance.

However, if the user-defined knot locations are estimated prior to the model estimation with the MI datasets, then I am not sure what the appropriate covariance matrix is (unconditional on estimated knot locations).
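The pooling described above can be sketched with Rubin's rules applied to the full coefficient vector. This is only an illustration under stated assumptions: pool_rubin() is a hypothetical helper, not part of mice or mgcv, and it treats the knots as fixed, so it does not address the question about estimated knot locations.

```r
# Hypothetical helper: Rubin's rules for a vector of coefficients.
# coef_list: list of m coefficient vectors (same length and order)
# vcov_list: list of m covariance matrices (the within variances)
pool_rubin <- function(coef_list, vcov_list) {
  m <- length(coef_list)
  stopifnot(m > 1, length(vcov_list) == m)
  Q <- do.call(rbind, coef_list)         # m x p matrix of estimates
  qbar <- colMeans(Q)                    # pooled estimate
  ubar <- Reduce(`+`, vcov_list) / m     # average within-imputation variance
  b <- stats::cov(Q)                     # between-imputation variance (divisor m - 1)
  total <- ubar + (1 + 1 / m) * b        # total variance
  list(estimate = qbar, within = ubar, between = b, total = total)
}

# Toy check with two "imputations" and two coefficients
coefs <- list(c(a = 1, b = 2), c(a = 3, b = 2))
vcovs <- list(diag(2), diag(2))
pooled <- pool_rubin(coefs, vcovs)
print(pooled$estimate)
print(pooled$total)

# With the objects from the issue, assuming every fit uses the same knots:
# fits <- micegam$analyses
# pooled <- pool_rubin(lapply(fits, coef), lapply(fits, vcov))
```

In the commented usage, vcov() on a gam fit returns the Bayesian covariance matrix (Vp) by default; Ve or Vc could be extracted from each fit instead if one of those is the preferred within variance.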

stefvanbuuren added a commit that referenced this issue Feb 7, 2020

Add parametric = TRUE to accomodate tidy.gam() (#218)

14ad2e1

stefvanbuuren closed this as completed Feb 7, 2020
