Summary
Hi, I didn't see this enhancement request on the #2302 feature-request page. We need a lock-free, single-row predict for the fastest parallel inference with the same booster. Could the FastConfig object be enhanced to hold all storage required for inference? Then we could pass thread-specific FastConfig objects for parallel inference, and no mutex would be required. Let callers specify whether FastConfig should use a mutex (e.g., defaulting to true for backward compatibility).
Motivation
FastConfig was a step in the right direction (thank you), but sharing one FastConfig across threads still requires a mutex and suffers from lock contention. Please consider supporting lock-free inference when thread-specific FastConfig handles are passed in, or some other API for lock-free single-row predict.
Description
(see above)
References
Similar capability in xgboost:
https://xgboost.readthedocs.io/en/latest/python/python_api.html#xgboost.Booster.inplace_predict
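To make the request concrete, here is a minimal sketch of the intended usage pattern, assuming the LGBM_BoosterPredictForMatSingleRowFastInit, LGBM_BoosterPredictForMatSingleRowFast, and LGBM_FastConfigFree entry points from recent versions of LightGBM's c_api.h (error-code checks omitted for brevity). Whether per-thread FastConfig handles actually avoid the shared mutex is precisely what this request asks for.

```cpp
// Sketch only: per-thread FastConfig handles for single-row predict.
// Assumes the FastInit/Fast/FastConfigFree signatures from recent
// LightGBM c_api.h; return codes should be checked in real code.
#include <LightGBM/c_api.h>

#include <cstdint>
#include <thread>

// Each worker initializes its OWN FastConfig, then runs single-row
// predictions with it. Under this proposal, no mutex would be needed
// because no mutable state is shared between the per-thread handles.
void worker(BoosterHandle booster, const double* rows, int64_t num_rows,
            int32_t ncol, double* out) {
  FastConfigHandle cfg = nullptr;
  LGBM_BoosterPredictForMatSingleRowFastInit(
      booster, C_API_PREDICT_NORMAL, /*start_iteration=*/0,
      /*num_iteration=*/-1, C_API_DTYPE_FLOAT64, ncol,
      /*parameter=*/"", &cfg);
  int64_t out_len = 0;
  for (int64_t i = 0; i < num_rows; ++i) {
    LGBM_BoosterPredictForMatSingleRowFast(cfg, rows + i * ncol, &out_len,
                                           &out[i]);
  }
  LGBM_FastConfigFree(cfg);
}

// Shard rows across two threads that share one (read-only) booster.
void parallel_predict(BoosterHandle booster, const double* rows,
                      int64_t num_rows, int32_t ncol, double* out) {
  const int64_t half = num_rows / 2;
  std::thread t1(worker, booster, rows, half, ncol, out);
  std::thread t2(worker, booster, rows + half * ncol, num_rows - half, ncol,
                 out + half);
  t1.join();
  t2.join();
}
```

The key design point is that each thread would own its configuration and scratch buffers, leaving the booster itself as the only shared, read-only state.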
I'm all for that; the catch is that it will require some rewriting of the predictors, since currently all the "shared" data is stored inside the predictor. I also believe the "non-fast" single-prediction methods should not be dropped from the API, because they are simpler to use. The difficulty is that the non-fast versions assume all "shared" data is handled by the predictors, so this will take some work.
Anyway, if someone is willing to do the work, I'd say it's definitely a nice improvement to the "fast" methods ;)
Closed in favor of tracking this in #2302, since we decided to keep all feature requests in one place.
Contributions implementing this feature are welcome! Please re-open this issue (or post a comment if you are not the topic starter) if you are actively working on it.