Custom Loss Function for LGBMRegressor #5256

Closed
jdtrebbien opened this issue May 31, 2022 · 6 comments
@jdtrebbien

jdtrebbien commented May 31, 2022

I want to use a custom loss function for LGBMRegressor, but I can't find any documentation on it. If I understand it correctly, I need to use the params 'objective' and 'metric' to completely change the loss function in training and evaluation: the function for 'objective' returns (grad, hess), and the function for 'metric' returns ('<loss_name>', loss, is_higher_better). I am looking for the two functions that are used when the default objective 'regression' (l2 loss) is active, so I can reproduce and then modify them. I already found the C++ code for the regression objective, but I am unable to reproduce it with two custom functions written in Python.

This would be my approach:

def custom_l2_loss(y_true, y_pred):
    ...
    return grad, hess  # per-sample gradient and hessian arrays

def custom_l2_eval(y_true, y_pred):
    ...
    return 'l2', loss.mean(), False  # (name, value, is_higher_better)


model = LGBMRegressor(
    objective=custom_l2_loss,
    metric=custom_l2_eval
)

model.fit(
    X_train,
    y_train,
    eval_set=[
        (X_val, y_val),
    ]
)

Does anyone know how to reproduce the l2 loss in Python?

@jmoralez
Collaborator

Hi @jdtrebbien, thank you for your interest in LightGBM. If you only want a custom objective, it is enough to specify objective; you can find an example of l2 here:

def test_regression_with_custom_objective():
    X, y = load_boston(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.1, random_state=42)
    gbm = lgb.LGBMRegressor(n_estimators=50, verbose=-1, objective=objective_ls)
    gbm.fit(X_train, y_train, eval_set=[(X_test, y_test)], callbacks=[lgb.early_stopping(5)])
    ret = mean_squared_error(y_test, gbm.predict(X_test))
    assert ret < 7.0
    assert gbm.evals_result_['valid_0']['l2'][gbm.best_iteration_ - 1] == pytest.approx(ret)
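
(For context, objective_ls is defined in LightGBM's test suite. As a hedged sketch, an l2 objective in that style could look like the following, assuming NumPy arrays and the sklearn-API signature (y_true, y_pred).)

import numpy as np

def objective_ls(y_true, y_pred):
    # For the l2 loss 0.5 * (y_pred - y_true)**2, the gradient with
    # respect to the prediction is (y_pred - y_true) and the hessian
    # is a constant 1 for every sample.
    grad = y_pred - y_true
    hess = np.ones(len(y_true))
    return grad, hess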

Keep in mind that the results will be a bit different at first because of the different init score (#5114 (comment)), but if you train for enough iterations you should get the same results.
You can also use the built-in metrics if you want; there's no need for a custom metric if a built-in one already exists.

Please let us know if this helps.

@jdtrebbien
Author

jdtrebbien commented May 31, 2022

Thank you very much, that is basically what I was looking for. But are you sure I don't need to change metric or something to also change the loss function used on the validation set for early_stopping? It still says valid_0's l2: <num> for the validation set, not the name of my custom function.

One more thing, if you have the time: when using the custom l2 loss from this, any idea how to get the l1/2 loss? That is what I need to implement, since it's not one of the built-in metrics. The loss would be (sum((x - y) ** 0.5)) ** 2 instead of the usual l2 loss (sum((x - y) ** 2)) ** 0.5, but I am a bit stuck when it comes to defining grad and hess for it...

@jmoralez
Collaborator

That's the default metric for regression. If you do want to use a custom metric as well, you have to set metric='None' and pass your custom metric to the fit method, like here:

# custom metric (disable default metric)
gbm = lgb.LGBMRegressor(metric='None',
                        **params).fit(eval_metric=constant_metric, **params_fit)
assert len(gbm.evals_result_['training']) == 1
assert 'error' in gbm.evals_result_['training']
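
(To make that concrete for this thread: a minimal sketch of a custom eval function, assuming the sklearn-API eval_metric signature (y_true, y_pred) and reusing X_train, y_train, X_val, y_val from the first post; the name l2_eval is just for illustration.)

import lightgbm as lgb
import numpy as np

def l2_eval(y_true, y_pred):
    # Must return a (metric_name, value, is_higher_better) tuple.
    return 'custom_l2', np.mean((y_true - y_pred) ** 2), False

gbm = lgb.LGBMRegressor(metric='None')  # disable the default l2 metric
gbm.fit(X_train, y_train, eval_set=[(X_val, y_val)], eval_metric=l2_eval)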

TBH I don't know about the l1/2 loss; the square root is only defined for non-negative numbers, so I don't think it'll make a good objective, but maybe someone else here can comment on it.

@StatMixedML
Contributor

StatMixedML commented Jun 14, 2022

@jdtrebbien You can find an end-to-end example of how to use a custom loss and evaluation function in my LightGBMLSS repo.

For the linked example, I use PyTorch's autograd, so you can derive gradients and hessians for any user-defined loss.
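
(Not code from the repo, but a rough sketch of that idea, assuming an elementwise loss; autograd_objective is a name made up for this illustration.)

import numpy as np
import torch

def autograd_objective(loss_fn):
    # Wrap an elementwise PyTorch loss loss_fn(pred, target) -> per-sample
    # losses into a LightGBM-style objective returning (grad, hess).
    def objective(y_true, y_pred):
        pred = torch.tensor(y_pred, requires_grad=True)
        target = torch.tensor(y_true, dtype=pred.dtype)
        loss = loss_fn(pred, target).sum()
        # First derivative of the loss w.r.t. each prediction.
        grad = torch.autograd.grad(loss, pred, create_graph=True)[0]
        # Second derivative: differentiate the summed gradient again.
        hess = torch.autograd.grad(grad.sum(), pred)[0]
        return grad.detach().numpy(), hess.detach().numpy()
    return objective

# e.g. recovering the plain l2 objective:
# model = lgb.LGBMRegressor(objective=autograd_objective(lambda p, t: 0.5 * (p - t) ** 2))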

Let me know if that is useful.

@github-actions

github-actions bot commented Jul 5, 2022

This issue has been automatically closed because it has been awaiting a response for too long. When you have time to work with the maintainers to resolve this issue, please post a new comment and it will be re-opened. If the issue has been locked for editing by the time you return to it, please open a new issue and reference this one. Thank you for taking the time to improve LightGBM!

@github-actions github-actions bot closed this as completed Jul 5, 2022
@github-actions

This issue has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 19, 2023