I am getting the error "Error: Column estimate not found in .data" when trying to run pool() on a mira object generated from with() on a gam object (from mgcv::gam()). The appropriate tidy and glance functions exist.
Reproducible example
# Simulate data
library(MASS)
set.seed(1775)
N = 100
Xcomplete = mvrnorm(N, c(1,2), matrix(c(1,.5,.5,1),2,2))
W = cbind(1,Xcomplete[,1],Xcomplete[,1]^2,Xcomplete[,2])
eps = rnorm(N)
beta = c(1,1,2,3)
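# drop() turns the N x 1 matrix into a plain vector, so cbind(y = y, ...) below really names the column "y"
y = drop(W %*% beta + eps)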
missingid1 = as.logical(rbinom(N,1,.1))
missingid2 = as.logical(rbinom(N,1,.1))
X = Xcomplete
X[missingid1,1] = NA
X[missingid2,2] = NA
datcomplete = as.data.frame(cbind(y=y,x1 = Xcomplete[,1], x2 = Xcomplete[,2]))
dat = as.data.frame(cbind(y=y,x1 = X[,1], x2 = X[,2]))
# Estimate fully observed GAM to get knots
library(mgcv)
out1 = gam(y ~ s(x1, bs = "bs") + x2, data = datcomplete)
knots = list(x1 = out1$smooth[[1]]$knots) #get knots
# MICE the data
library(mice)
mice1 = mice(dat)
# Estimate GAMs on MICEed data with user defined knots
micegam = with(data = mice1, gam(y ~ s(x1, bs = "bs") + x2, knots = knots))
poolgam = pool(micegam)
?broom::glance.gam
?broom::tidy.gam
Thank you for your time,
Mike
I wasn't aware that broom::tidy.gam() doesn't produce the estimates and standard errors by default. mice 3.7.1 adds the parametric = TRUE parameter to the call to tidy.gam(), so now your example should run.
I do not know exactly what parametric = TRUE does, but I think it comes down to a simplification of the model, so it might affect interpretation. How non-parametric smooths should be pooled is still an open issue.
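For readers who want to see the difference, here is a minimal sketch on made-up data (the data frame d and the object names are hypothetical, and it assumes mgcv and broom are installed). By default tidy() on a gam fit summarises the smooth terms, which have no estimate column; with parametric = TRUE it returns the parametric coefficients instead.

```r
library(mgcv)
library(broom)

set.seed(1)
# Hypothetical toy data, for illustration only
d <- data.frame(x1 = rnorm(200), x2 = rnorm(200))
d$y <- sin(d$x1) + 2 * d$x2 + rnorm(200)

fit <- gam(y ~ s(x1) + x2, data = d)

# Default: one row per smooth term (edf, ref.df, ...), no `estimate` column
smooth_tab <- tidy(fit)
names(smooth_tab)

# parametric = TRUE: parametric coefficients with `estimate` and `std.error`
param_tab <- tidy(fit, parametric = TRUE)
names(param_tab)
```

Note that the parametric table covers only the intercept and x2 here; the s(x1) basis coefficients are not part of it.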
Thanks Stef. I believe pooling the non-parametric smooth terms and parametric terms is a little bit of work but fairly straightforward. Assuming you are using cubic splines, the user must first manually define the knots (so that each imputed data set uses the same knots), then pool mgcv::gam()$coefficients to get the mean and the between variance. The covariance matrices mgcv::gam()$Ve, mgcv::gam()$Vp, and mgcv::gam()$Vc (where they exist) provide the within variance.
However, if the user-defined knot locations are themselves estimated before the model is fit to the MI datasets, then I am not sure what the appropriate covariance matrix is (that is, one that is unconditional on the estimated knot locations).
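The pooling described here amounts to Rubin's rules applied to the whole coefficient vector. As a sketch, write \hat{\beta}_j for gam()$coefficients from imputed data set j = 1, ..., m and U_j for whichever covariance matrix is chosen (for example Vp). These symbols are notation introduced for this note, not names from mgcv or mice:

```latex
\bar{\beta} = \frac{1}{m}\sum_{j=1}^{m}\hat{\beta}_j, \qquad
\bar{U} = \frac{1}{m}\sum_{j=1}^{m} U_j, \qquad
B = \frac{1}{m-1}\sum_{j=1}^{m}
    (\hat{\beta}_j-\bar{\beta})(\hat{\beta}_j-\bar{\beta})^{\top}, \qquad
T = \bar{U} + \left(1+\frac{1}{m}\right)B
```

Here \bar{U} is the within variance, B the between variance, and T the total variance of the pooled estimate \bar{\beta}. This treats the knots as fixed and shared across imputations, so it does not address the uncertainty from estimating the knot locations.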