Decouple boosting types #3128

Closed · candalfigomoro opened this issue May 29, 2020 · 6 comments · Fixed by #4827

@candalfigomoro

With xgboost, you can build (a sort of) random forest by setting num_parallel_tree>1 and nrounds=1.

In LightGBM, we can build (a sort of) random forest by setting boosting='rf'.

Since num_parallel_tree and nrounds are decoupled parameters in xgboost, you can set num_parallel_tree>1 and nrounds>1 to build a "boosted random forest". As far as I know, this is not possible in LightGBM.
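
For concreteness, here is a minimal sketch of both setups (illustrative data and hyperparameter values; num_parallel_tree and num_boost_round are the Python-API spellings of the parameters above):

```python
import lightgbm as lgb
import xgboost as xgb
from sklearn.datasets import make_regression

X, y = make_regression(n_samples=1000, n_features=20, random_state=42)

# xgboost: the forest size (num_parallel_tree) and the number of boosting
# rounds are independent, so >1 of each gives a "boosted random forest".
boosted_rf = xgb.train(
    {
        "num_parallel_tree": 10,   # trees per boosting round (a small forest)
        "subsample": 0.8,          # row subsampling, RF-style
        "colsample_bynode": 0.8,   # feature subsampling per split
        "learning_rate": 0.1,
    },
    xgb.DMatrix(X, label=y),
    num_boost_round=20,            # >1: each round boosts on a new forest
)

# LightGBM (as of this issue): 'rf' is a boosting *type*, mutually exclusive
# with gbdt/dart/goss, so each round grows a single bagged tree instead.
rf = lgb.train(
    {
        "boosting": "rf",
        "bagging_freq": 1,         # rf mode requires bagging to be enabled
        "bagging_fraction": 0.8,   # must be < 1.0 in rf mode
        "feature_fraction": 0.8,
        "verbose": -1,
    },
    lgb.Dataset(X, label=y),
    num_boost_round=20,
)
```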

The fact that boosting types are mutually exclusive also makes it impossible to create other combinations, such as dart+goss (see #2991).

The "rf" and "goss" modes should be decoupled from the boosting type. Maybe I want to use dart+rf or gbrt+rf (like in xgboost) to build boosted random forests. Maybe I want to use dart+goss.

@guolinke Maybe this can be considered for LightGBM 3 (#3071)?

@guolinke (Collaborator) commented Aug 6, 2020

I think it is a useful feature, but it is not trivial to implement:
we would need to refactor the whole boosting part, and I don't have much time at the moment.
@shiyu1994 is helping me with the LightGBM project, but he is also busy this month.
We can start the refactoring in the next couple of months.

@candalfigomoro (Author)

I think this would be a huge win in the long term because the code would be much more modular, but I understand the required effort is significant.

@StrikerRUS (Collaborator)

Closed in favor of #2302. We decided to keep all feature requests in one place.

Contributions for this feature are welcome! Please re-open this issue (or post a comment if you are not the topic starter) if you are actively working on implementing it.

@yiwiz-sai

This feature is very useful; I hope the LightGBM team can support it.

num_parallel_tree is important for avoiding overfitting; I think that's an advantage of xgboost.

@StrikerRUS (Collaborator)

Active work happens in #4827.

@StrikerRUS reopened this Jun 12, 2022
shiyu1994 added a commit that referenced this issue Dec 28, 2022
* add parameter data_sample_strategy

* abstract GOSS as a sample strategy (GOSS1), together with original GOSS (normal Bagging has not been abstracted, so do NOT use it now)

* abstract Bagging as a subclass (BAGGING), but original Bagging members in GBDT are still kept

* fix some variables

* remove GOSS (as boosting) and Bagging logic in GBDT

* rename GOSS1 to GOSS (as sample strategy)

* add warning about using GOSS as boosting_type

* fix a small ';' bug

* remove CHECK when "gradients != nullptr"

* rename DataSampleStrategy to avoid confusion

* remove and add some comments, following convention

* fix bug about GBDT::ResetConfig (ObjectiveFunction inconsistency bet…

* add std::ignore to avoid compiler warnings (and potential failures)

* update Makevars and vcxproj

* handle constant hessian

move resize of gradient vectors out of sample strategy

* mark override for IsHessianChange

* fix lint errors

* rerun parameter_generator.py

* update config_auto.cpp

* delete redundant blank line

* update num_data_ when train_data_ is updated

set gradients and hessians when using GOSS

* check bagging_freq is not zero

* reset config_ value

merge ResetBaggingConfig and ResetGOSS

* remove useless check

* add tests in test_engine.py

* remove whitespace in blank line

* remove arguments verbose_eval and evals_result

* Update tests/python_package_test/test_engine.py

reduce num_boost_round

Co-authored-by: James Lamb <jaylamb20@gmail.com>

* Update tests/python_package_test/test_engine.py

reduce num_boost_round

Co-authored-by: James Lamb <jaylamb20@gmail.com>

* Update tests/python_package_test/test_engine.py

reduce num_boost_round

Co-authored-by: James Lamb <jaylamb20@gmail.com>

* Update tests/python_package_test/test_engine.py

reduce num_boost_round

Co-authored-by: James Lamb <jaylamb20@gmail.com>

* Update tests/python_package_test/test_engine.py

reduce num_boost_round

Co-authored-by: James Lamb <jaylamb20@gmail.com>

* Update tests/python_package_test/test_engine.py

reduce num_boost_round

Co-authored-by: James Lamb <jaylamb20@gmail.com>

* Update src/boosting/sample_strategy.cpp

modify warning about setting goss as `boosting_type`

Co-authored-by: James Lamb <jaylamb20@gmail.com>

* Update tests/python_package_test/test_engine.py

replace load_boston() with make_regression()

remove value checks of mean_squared_error in test_sample_strategy_with_boosting()

* Update tests/python_package_test/test_engine.py

add value checks of mean_squared_error in test_sample_strategy_with_boosting()

* Modify warning about using goss as boosting type

* Update tests/python_package_test/test_engine.py

add random_state=42 for make_regression()

reduce the threshold of mean_square_error

* Update src/boosting/sample_strategy.cpp

Co-authored-by: James Lamb <jaylamb20@gmail.com>

* remove goss from boosting types in documentation

* Update src/boosting/bagging.hpp

Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* Update src/boosting/bagging.hpp

Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* Update src/boosting/goss.hpp

Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* Update src/boosting/goss.hpp

Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* rename GOSS with GOSSStrategy

* update doc

* address comments

* fix table in doc

* Update include/LightGBM/config.h

Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* update documentation

* update test case

* revert useless change in test_engine.py

* add tests for evaluation results in test_sample_strategy_with_boosting

* include <string>

* change to assert_allclose in test_goss_boosting_and_strategy_equivalent

* more tolerance in result checking, due to minor difference in results of gpu versions

* change == to np.testing.assert_allclose

* fix test case

* set gpu_use_dp to true

* change --report to --report-level for rstcheck

* use gpu_use_dp=true in test_goss_boosting_and_strategy_equivalent

* revert unexpected changes of non-ascii characters

* revert unexpected changes of non-ascii characters

* remove useless changes

* allocate gradients_pointer_ and hessians_pointer_ when necessary

* add spaces

* remove redundant virtual

* include <LightGBM/utils/log.h> for USE_CUDA

* check for  in test_goss_boosting_and_strategy_equivalent

* check for identity in test_sample_strategy_with_boosting

* remove cuda option in test_sample_strategy_with_boosting

* Update tests/python_package_test/test_engine.py

Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* Update tests/python_package_test/test_engine.py

Co-authored-by: James Lamb <jaylamb20@gmail.com>

* ResetGradientBuffers after ResetSampleConfig

* ResetGradientBuffers after ResetSampleConfig

* ResetGradientBuffers after bagging

* remove useless code

* check objective_function_ instead of gradients

* enable rf with goss

simplify params in test cases

* remove useless changes

* allow rf with feature subsampling alone

* change position of ResetGradientBuffers

* check for dask

* add parameter types for data_sample_strategy

Co-authored-by: Guangda Liu <v-guangdaliu@microsoft.com>
Co-authored-by: Yu Shi <shiyu_k1994@qq.com>
Co-authored-by: GuangdaLiu <90019144+GuangdaLiu@users.noreply.github.com>
Co-authored-by: James Lamb <jaylamb20@gmail.com>
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
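
The net effect of the merged change: GOSS is no longer a boosting type but a value of the new data_sample_strategy parameter ('bagging' or 'goss'), so sampling can be combined with any boosting type. A minimal usage sketch (assuming LightGBM >= 4.0, where #4827 landed; data and hyperparameter values are illustrative):

```python
import lightgbm as lgb
import numpy as np
from sklearn.datasets import make_regression

X, y = make_regression(n_samples=1000, n_features=20, random_state=42)

# dart + GOSS: the combination requested in this issue, previously impossible
# because 'dart' and 'goss' were mutually exclusive boosting types.
dart_goss = lgb.train(
    {"boosting": "dart", "data_sample_strategy": "goss", "verbose": -1},
    lgb.Dataset(X, label=y),
    num_boost_round=20,
)

# Legacy spelling: boosting='goss' still works but emits the warning added in
# this PR and is interpreted as boosting='gbdt' + data_sample_strategy='goss'.
legacy = lgb.train(
    {"boosting": "goss", "verbose": -1},
    lgb.Dataset(X, label=y),
    num_boost_round=20,
)
explicit = lgb.train(
    {"boosting": "gbdt", "data_sample_strategy": "goss", "verbose": -1},
    lgb.Dataset(X, label=y),
    num_boost_round=20,
)

# Mirrors test_goss_boosting_and_strategy_equivalent from the PR: on CPU the
# two spellings should yield (near-)identical models.
np.testing.assert_allclose(legacy.predict(X), explicit.predict(X))
```

Per the later commits in the list above, rf can also be combined with GOSS ("enable rf with goss") or run with feature subsampling alone.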
@github-actions

This issue has been automatically locked because there has not been any recent activity since it was closed.
To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues,
including a reference to this one.

github-actions bot locked as resolved and limited conversation to collaborators Aug 15, 2023