
Regarding the calculation of split gain #6243

Open
yuanqingye opened this issue Dec 21, 2023 · 1 comment

@yuanqingye
Hi,
I'm trying to figure out how the split gain is calculated here, since it is a key measure.

I noticed that in issue #1230 a supporter wrote: "The split gain and leaf output is calculated by sum_grad / sum_hess."

I want to know why. The split gain seems to depend on how we measure impurity (Gini, entropy, etc.). In the entropy case, I recall the split gain should be H(Y) - H(Y|X), so how does that relate to sum_grad / sum_hess?

Also, shouldn't it differ between classification and regression? It seems the two cases should be calculated differently; but if the formula really is the same, then the same logic can measure impurity in both.

Any material regarding this is welcome.
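For context, the formula the supporter is referring to comes from the second-order (gradient/hessian) approximation of the loss used by GBDT implementations such as XGBoost and LightGBM, not from an entropy-style impurity. Below is a minimal sketch of that formula as given in the XGBoost paper; the function and parameter names (`lam` for the L2 regularizer lambda, `gamma` for the complexity penalty) are my own, not taken from either library's source.

```python
def leaf_weight(g_sum, h_sum, lam=1.0):
    # Optimal leaf output under the second-order approximation:
    # w* = -G / (H + lambda), where G and H are the sums of
    # gradients and hessians of the instances in the leaf.
    return -g_sum / (h_sum + lam)

def split_gain(g_left, h_left, g_right, h_right, lam=1.0, gamma=0.0):
    # Gain = 1/2 * [ G_L^2/(H_L+lam) + G_R^2/(H_R+lam)
    #                - (G_L+G_R)^2/(H_L+H_R+lam) ] - gamma
    # i.e. how much the regularized loss drops by splitting a node
    # into a left and right child, minus the cost of the extra leaf.
    def score(g, h):
        return g * g / (h + lam)

    return 0.5 * (score(g_left, h_left)
                  + score(g_right, h_right)
                  - score(g_left + g_right, h_left + h_right)) - gamma
```

Because the gradients and hessians come from whatever loss function is plugged in (squared error for regression, log loss for classification), the same gain formula covers both cases; only g and h change.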

@yuanqingye (Author)

xgboost math explanation
I think this article explains the details very well. The only big part it doesn't cover is shrinkage, i.e. the learning rate. The discussion here may also provide some insight:
Discussion regarding learning rate
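On the shrinkage point: the learning rate simply scales each tree's raw leaf output before it is added to the running prediction, so later trees retain room to correct earlier ones. A minimal sketch of that accumulation (names are illustrative, not from either library's API):

```python
def boosted_prediction(base_score, tree_leaf_outputs, learning_rate=0.1):
    # Shrinkage: each tree contributes learning_rate * (its raw leaf
    # output for this instance) to the final raw prediction.
    pred = base_score
    for leaf_output in tree_leaf_outputs:
        pred += learning_rate * leaf_output
    return pred
```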
