Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Why is _CVBooster object hidden class? #2105

Closed
matsuken92 opened this issue Apr 15, 2019 · 6 comments
Closed

Why is _CVBooster object hidden class? #2105

matsuken92 opened this issue Apr 15, 2019 · 6 comments

Comments

@matsuken92
Copy link
Contributor

Some of Kaggler use out-of-fold prediction (oof) which is calculated with validation dataset for stacking, blending, evaluation etc. In order to calculate oof with lgb.cv(), we need to return Boosters however it is not returned from the function. If I customize to return _CVBooster object, it can be calculated, but _CVBooster is hidden class, so I think it is not appropriate to return officially.

Is it possible to remove prefix underscore from _CVBooster? Then I can customize to return the object on lgb.cv().

I would appreciate it if you could advise to it.

_CVBooster

https://github.com/Microsoft/LightGBM/blob/master/python-package/lightgbm/engine.py#L248-L267

Example

https://www.kaggle.com/kenmatsu4/using-trained-booster-from-lightgbm-cv-w-callback

@guolinke
Copy link
Collaborator

ping @StrikerRUS

@StrikerRUS
Copy link
Collaborator

@guolinke When I joined LightGBM _CVBooster had been already implemented as protected class: #166.
I do not mind make this class public. But is it really needed? This class is pretty small and I suppose it's possible to achieve the goal without this class.

@matsuken92 Isn't ordinary Python list of Boosters not enough?
We already have a discussion about enhancement of cv return value here and here. Can you please look through these issue and tell you opinion? Maybe you have an idea of complete cv function redesign which will help to implement new features? Otherwise, please think about simple list or any other appropriate data structure. Because I think that end users do not need a full API of _CVBooster, but just a small part of it.

Anyway, PR from you in any form is very welcome!

@matsuken92
Copy link
Contributor Author

matsuken92 commented Apr 20, 2019

Thank you for your comment! Minimum requirements of my thought are:

  • Returning list of Boosters
  • Returning best_iteration to identify which iteration is best for predict when using early stopping.

In addition to above, current _CVBooster.__getattr__ function is useful for batch operation of boosters on each fold. The following link is the part of usage of the _CVBooster.__getattr__ calculating prediction result for test data.

https://github.com/matsuken92/LightGBM_test/blob/master/src/cv_booster_test.py#L117

I will send my idea as PR, so I would appreciate it if you could review it then!

@StrikerRUS
Copy link
Collaborator

@matsuken92 I see! Thanks for the clarification!

Of course, I'll be happy to review it.

Can you please also think about backward compatibility in terms of cv function return type? Maybe a new key in results dict or something else, idk...

@StrikerRUS
Copy link
Collaborator

Closing in favor of being in #2302. We decided to keep all feature requests in one place.

Welcome to contribute this feature! Please re-open (or post a comment if you are not a topic starter) this issue if you are actively working on implementing this feature.

@StrikerRUS StrikerRUS mentioned this issue Mar 5, 2020
@lock lock bot locked as resolved and limited conversation to collaborators Mar 11, 2020
@microsoft microsoft unlocked this conversation Jun 10, 2020
StrikerRUS added a commit that referenced this issue Aug 2, 2020
…283,#2105,#1445) (#3204)

* [python] add return_cvbooster flag to cv function and rename _CVBooster to make public (#283,#2105)

* [python] Reduce expected metric of unit testing

* [docs] add the CVBooster to the documentation

* [python] reflect the review comments

- Add some clarifications to the documentation
- Rename CVBooster.append to make private
- Decrease iteration rounds of testing to save CI time
- Use CVBooster as root member of lgb

* [python] add more checks in testing for cv

Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* [python] add docstring for instance attributes of CVBooster

Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* [python] fix docstring

Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
@StrikerRUS
Copy link
Collaborator

Closed via #3204.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants