This issue was moved to a discussion.

quickpred for unordered factors #268

Closed
LukasWallrich opened this issue Sep 15, 2020 · 2 comments

Comments

@LukasWallrich
Contributor

I am trying to specify an efficient predictorMatrix for a model that includes categorical variables, and ran into some trouble with quickpred().

Currently, quickpred() silently converts factors to their internal integer codes and then calculates correlations on those codes, which leads to arbitrary output for unordered factors. The documentation and vignettes do not specify how factors are handled; I only found the answer by looking at the code. Should there be a warning? Also, does it make sense to use the second mincor criterion (the correlation between the value of the predictor and the missingness of the target) when deciding which variables to use to predict factors? If so, it would be great if that could be specified as an option to quickpred().
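
To illustrate the point about internal codes, here is a minimal base R sketch on made-up data (the variables `colour` and `y` are hypothetical). It does not call quickpred() itself; it only applies base::data.matrix(), which turns factors into their integer codes, and then correlates those codes with a numeric variable under two different level orderings:

```r
# Made-up data: 'green' has a different mean of y than the other two levels.
colour <- factor(c("blue", "green", "red", "blue", "green", "red"))
y <- c(1, 5, 1, 1, 5, 1)

# Default (alphabetical) level order: blue = 1, green = 2, red = 3.
codes_a <- data.matrix(data.frame(colour = colour))[, "colour"]
cor_a <- cor(codes_a, y)

# Same data, levels reordered: green = 1, blue = 2, red = 3.
colour_b <- factor(colour, levels = c("green", "blue", "red"))
codes_b <- data.matrix(data.frame(colour = colour_b))[, "colour"]
cor_b <- cor(codes_b, y)

# The two correlations differ although the underlying data are identical.
c(alphabetical = cor_a, reordered = cor_b)
```

With a threshold such as mincor = 0.1, the first coding would drop colour as a predictor of y while the second would keep it, even though nothing about the data has changed.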

@stefvanbuuren
Member

Thanks for your suggestions.

  • Will add to the documentation a pointer to base::data.matrix(), the function that does the actual conversion, and a short comment that the results may be nonsensical for unordered factors.
  • A good imputation model should also include factors that are related to the missingness, which is why quickpred() also looks at that correlation. Of course, we could have a separate criterion for that correlation, but that would make the task more demanding for the user, so for all practical purposes I've chosen mincor as the sole criterion for both correlations. After all, it's just a quick predictor matrix setup.
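
As a rough illustration of the two correlations sharing one threshold, the following base R sketch builds a 0/1 predictor matrix on a made-up data frame `dat`. It is a simplified approximation for discussion, not the actual quickpred() source, and it leaves out the function's other arguments:

```r
# Made-up data with missing values in bmi and chl.
dat <- data.frame(
  age = c(20, 30, 40, 50, 60, 70, 80, 90),
  bmi = c(22, NA, 25, NA, 27, 28, NA, 31),
  chl = c(180, 190, NA, 210, 220, NA, 240, 250)
)

x <- data.matrix(dat)   # factors, if any, would become integer codes here
r <- 1 * !is.na(x)      # response indicator: 1 = observed, 0 = missing

# (1) |correlation| between the values of target (row) and predictor (column).
v <- abs(cor(x, use = "pairwise.complete.obs"))

# (2) |correlation| between the response indicator of the target (row)
#     and the values of the predictor (column). Fully observed targets have
#     a constant indicator, so cor() warns and returns NA for those rows.
u <- suppressWarnings(abs(cor(x = r, y = x, use = "pairwise.complete.obs")))

v[is.na(v)] <- 0
u[is.na(u)] <- 0

# One threshold for both: keep a predictor if either correlation exceeds it.
mincor <- 0.1
pred <- (pmax(u, v) > mincor) * 1
diag(pred) <- 0         # a variable does not predict itself
pred
```

A separate threshold for the missingness correlation would amount to comparing `u` and `v` against different cut-offs before combining them, which is the extra user burden referred to above.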

@LukasWallrich
Contributor Author

Thanks! I have now written a version of the function that deals with unordered factors, initially just for my personal use: https://github.com/LukasWallrich/rNuggets/blob/master/R/mice_quickpred_extension.R - basically, it dummy-codes those factors and tests whether the correlation for any of the dummies exceeds mincor.

If you think it would make sense to include that in mice, I'd be happy to create a PR - just let me know.
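
For readers who want to see the dummy-coding idea in isolation, here is a minimal base R sketch. The helper `max_dummy_cor()` is a hypothetical name invented for this illustration; it is not taken from the linked file or from mice, and it assumes the factor has at least two observed levels:

```r
# Hypothetical helper, not part of mice: largest absolute correlation
# between any dummy (indicator) column of factor f and a numeric target.
max_dummy_cor <- function(f, target) {
  ok <- !is.na(f) & !is.na(target)   # keep rows observed on both variables
  f <- droplevels(f[ok])             # drop levels with no observations left
  dummies <- model.matrix(~ f - 1)   # one 0/1 column per remaining level
  max(abs(cor(dummies, target[ok])))
}

# Made-up data: only 'green' differs from the other levels on y.
colour <- factor(c("blue", "green", "red", "blue", "green", "red"))
y <- c(1, 5, 1, 1, 5, 1)
mincor <- 0.1

# TRUE means colour would be kept as a predictor of y.
max_dummy_cor(colour, y) > mincor
```

Unlike a correlation on the integer codes, this result does not depend on the order of the factor levels, because each level gets its own indicator column.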

@amices amices locked and limited conversation to collaborators Apr 1, 2021
