
Investigate possibility of borrowing some features/ideas from Explainable Boosted Machines #3905

Closed
JoshuaC3 opened this issue Feb 3, 2021 · 7 comments

Comments

@JoshuaC3

JoshuaC3 commented Feb 3, 2021

Summary

Borrow ideas from InterpretML's Explainable Boosting Machine (EBM) to make LightGBM more interpretable, as well as more directly comparable to the EBM.

Description

I am sure you are somewhat aware of InterpretML's Explainable Boosting Machine - also a Microsoft innovation!

I have been a long-time user of LightGBM and am now a recent fan of the EBM. Clearly, they are similar in some ways and each has its strengths and weaknesses; there are trade-offs in choosing either. That said, I have been playing around with some settings in LightGBM to make the final models behave more like EBMs. The reasons for this are twofold:

  1. Increase LGBM interpretability
  2. Allow more direct comparisons of results/functionality.

Some tips, tricks and findings are as follows:

Setting n_estimators high, learning_rate low, and num_leaves low begins to mimic some of the behaviours of the EBM: iteratively building LOTS of VERY shallow trees, VERY slowly. I add interaction constraints so that each tree is univariate. Pairwise interactions could also be added where needed. This allows the model to learn incrementally small amounts from each feature, rather like the EBM. Essentially, it replicates a model of the form:

lgb = F0(X0) + F1(X1) + ... + Fn(Xn)

which is effectively a GLM/GAM.

lgb = LGBMRegressor(
    n_estimators=5000,   # large: many boosting rounds
    learning_rate=0.01,  # small: learn slowly
    num_leaves=4,        # shallow trees
    # one constraint group per feature => every tree is univariate;
    # for pairwise terms use e.g.
    # [[i, j] for i in range(X.shape[1]) for j in range(X.shape[1]) if i != j]
    interaction_constraints=[[i] for i in range(X.shape[1])],
    # feature_contri=[10 for _ in range(X.shape[1])],  # use to reduce single-feature dependence?
    # monotone_constraints=[1] * (X.shape[1] - 1) + [-1],  # use for expert knowledge;
    #     e.g. in my temperature vs gas use case, gas use decreases monotonically with temperature
)

To me, however, there seem to be two relatively simple features that could be added to facilitate this further.

  1. The addition of a Constant Term.
  2. Schedulable Feature Cycling.

The Constant Term

This would allow the model to remove the extra/redundant initial "value"* from the first set of splits on the first feature. This is the same as shifting your y-variable (e.g. lgb.fit(X, y - C)), but until the initial run, you do not know what C is.

Schedulable Feature Cycling

This could be done in many ways, but essentially, all that is needed is the ability to cycle through each feature in turn. With the current model params this is done at random. Combined with the above issue, this means that any features disproportionately picked at random in the first few trees would have an artificially inflated "value"*.
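To make the two requests concrete, here is a toy NumPy sketch (not LightGBM internals; `fit_stump` and `cyclic_boost` are names invented for this illustration) of boosting that starts from the mean of y as the constant term and then visits features round-robin instead of at random:

```python
import numpy as np

def fit_stump(x, residual):
    """Best single-feature split (a depth-1 'tree') minimising squared error."""
    order = np.argsort(x)
    xs, rs = x[order], residual[order]
    best_sse, best = np.inf, (None, rs.mean(), rs.mean())
    for i in range(1, len(xs)):
        if xs[i] == xs[i - 1]:
            continue  # cannot split between identical values
        left, right = rs[:i].mean(), rs[i:].mean()
        sse = ((rs[:i] - left) ** 2).sum() + ((rs[i:] - right) ** 2).sum()
        if sse < best_sse:
            best_sse, best = sse, ((xs[i - 1] + xs[i]) / 2, left, right)
    return best  # (threshold, left value, right value)

def cyclic_boost(X, y, n_rounds=60, lr=0.1):
    """Toy GAM-style boosting: start from mean(y) (the 'constant term'),
    then fit one univariate stump per round, cycling through features
    deterministically instead of picking them at random."""
    const = y.mean()
    pred = np.full(len(y), const)
    stumps = []
    for r in range(n_rounds):
        j = r % X.shape[1]  # round-robin feature schedule
        thr, lv, rv = fit_stump(X[:, j], y - pred)
        if thr is None:     # constant feature: fall back to a constant update
            update = np.full(len(y), lv)
        else:
            update = np.where(X[:, j] <= thr, lv, rv)
        pred += lr * update
        stumps.append((j, thr, lr * lv, lr * rv))
    return const, stumps, pred
```

Summing each feature's stump contributions recovers per-feature shape functions Fj(Xj), which is what makes the EBM-style additive model easy to plot and inspect.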

Hopefully it is clear how this would help improve the understanding of how an LGBM model is behaving.

This is aimed at being an ongoing discussion, so please chime in. Any questions, please ask!!

Motivation

  1. Increase LGBM interpretability.
  2. Allow more direct comparisons of results/functionality.
  3. Borrow other ideas from EBMs.
  4. Open discussion on how to do this.

References

*"value" as defined in the table generated by tree = lgb.booster_.trees_to_dataframe().
InterpretML: A Unified Framework for Machine Learning Interpretability
InterpretML: A toolkit for understanding machine learning models
InterpretMLs Explainable Boosting Machine

@StrikerRUS StrikerRUS changed the title Bringing LightGBM inline with Explainable Boosted Machines Investigate possibility of borrowing some features from Explainable Boosted Machines Mar 7, 2021
@StrikerRUS
Collaborator

Closed in favor of #2302. We decided to keep all feature requests in one place.

Contributions welcome! Please re-open this issue (or post a comment if you are not the topic starter) if you are actively working on implementing this feature.

@StrikerRUS StrikerRUS changed the title Investigate possibility of borrowing some features from Explainable Boosted Machines Investigate possibility of borrowing some features/ideas from Explainable Boosted Machines Mar 7, 2021
@JoshuaC3
Author

JoshuaC3 commented Mar 8, 2021

The Constant Term
This would allow the model to remove the extra/redundant initial "value"* from the first set of splits on the first feature. This is the same as shifting your y-variable (e.g. lgb.fit(X, y - C)), but until the initial run, you do not know what C is.

This constant term is simply the mean of y in EBMs. Is this just setting init_score to be the mean of y for all observations?

@JoshuaC3
Author

JoshuaC3 commented Mar 8, 2021

Schedulable Feature Cycling
This could be done in many ways, but essentially, all that is needed is the ability to cycle through each feature in turn. With the current model params this is done at random. Combined with the above issue, this means that any features disproportionately picked at random in the first few trees would have an artificially inflated "value"*.

@StrikerRUS this is conceptually a very easy thing to implement, though I imagine it would sit at the internal (C++ code) level. It would simply require a parameter, let's say feature_sampling, with the options random or cyclic.
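For illustration, the proposed option might be spelled like this (hypothetical only — feature_sampling is NOT an existing LightGBM parameter):

```python
# Hypothetical sketch: "feature_sampling" does not exist in LightGBM today;
# it shows how the proposed parameter could look alongside real params.
params = {
    "objective": "regression",
    "num_leaves": 4,
    "learning_rate": 0.01,
    "feature_sampling": "cyclic",  # proposed values: "random" (current behaviour) or "cyclic"
}
```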

Can I add this as a separate feature request and then add it under the New features section? I feel it would be incredibly easy to implement, so filing it under New Algorithm makes it seem too big a task.

@StrikerRUS
Collaborator

@JoshuaC3

This constant term is simply the mean of y in EBMs.

For the mean value, I think the boost_from_average param should help. For any other custom values, init_score can be used, as you correctly mentioned.

Can I add this as a separate feature request and then add it under the New features section? I feel it would be incredibly easy to implement so being under New Algorithm makes it seem a bit too big a task.

Sure! As you are quite familiar with EBM, feel free to split this big issue into multiple smaller, self-contained feature requests that will not require a lot of effort in terms of writing new code. I believe it will help involve more people in the process of improving LightGBM.

@JoshuaC3
Author

@StrikerRUS Thanks! I will do this.

In Python, how do you return the init_score after the model is trained? I cannot find any functions or attributes for it. Thanks!!

@StrikerRUS
Collaborator

Do you mean something like this?

import numpy as np
import lightgbm as lgb

from sklearn.datasets import load_diabetes

X, y = load_diabetes(return_X_y=True)
lgb_data = lgb.Dataset(X, y, init_score=np.ones_like(y))
lgb_data.get_init_score()

In LightGBM, init_score is tied to the Dataset object.

@JoshuaC3
Author

Precisely that. Thanks. I was looking on the Booster class and via the sklearn API.

I will add "Expose get_init_score on the Booster class" as a feature request, and propose exposing it in the sklearn API too. I think I might be able to do this, as it looks like Python all the way up.
