Failure to converge with gamma objective #4792

Open
murphyhopfensperger opened this issue Nov 11, 2021 · 0 comments
murphyhopfensperger commented Nov 11, 2021

I've written the following toy example after seeing weird behavior on a real-life dataset.
I was looking at the model residuals from various objectives (l1, l2, gamma, huber, fair, ...).
The gamma objective in particular had a weird fit -- I thought "wow, my data really isn't gamma distributed!"
But later I used XGBoost to fit the data with a gamma objective, and it worked fine.
I'm not sure whether this is a bug, an artifact of lower-precision floats, a difference in how the learning_rate parameter is interpreted, or something else I haven't thought of. Please let me know if you have a good explanation.

(I'm not sure if this example even captures exactly what's going wrong in my real-life dataset, but it seems interesting in its own right.)

import numpy as np
import lightgbm as lgb
import xgboost as xgb

param = {
    "objective": "gamma",
    "n_estimators": 120,
    "learning_rate": 0.1,
    "n_jobs": 12,
}
lgbmr = lgb.LGBMRegressor(**param)


rng = np.random.default_rng(seed=0)
n, d = 10_000, 2
X = rng.standard_normal(size=(n, d))

print("Fitting LGB with smaller-scale data.")
beta = np.array([0.1, 0.1])
scale = np.exp(X @ beta)
y = rng.gamma(shape=1, scale=scale, size=n)
lgbmr.fit(X, y, eval_set=[(X, y)])
print()

print("Fitting LGB with larger-scale data -- blows up, gives no warning.")
beta = np.array([1, 1])
scale = np.exp(X @ beta)
y = rng.gamma(shape=1, scale=scale, size=n)
lgbmr.fit(X, y, eval_set=[(X, y)])
print()

print("Fitting XGB with larger-scale data -- seems to work fine.")
param["objective"] = "reg:gamma"
xgbr = xgb.XGBRegressor(**param)
xgbr.fit(X, y, eval_set=[(X, y)])
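To make the "lower-precision floats" guess concrete, here is a standalone sketch (my own, not taken from LightGBM's source) of how a gamma-style gradient term of the form 1 - y * exp(-score) can overflow in float32 while float64 stays finite. The specific label/score values are made up for illustration:

```python
import numpy as np

# Hypothetical illustration: if labels and scores are held in float32,
# the exp term of a gamma-style gradient can overflow to inf, even
# though the same computation is finite in float64.
y_big = np.float32(2000.0)
score = np.float32(-100.0)

grad64 = 1.0 - float(y_big) * np.exp(-float(score))  # large but finite
grad32 = np.float32(1.0) - y_big * np.exp(-score)    # overflows to -inf

print(grad64)
print(grad32)
```

If something like this happens inside the boosting loop, it could explain the sudden jump in the training loss with no warning.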

The output was as follows:

Fitting LGB with smaller-scale data.
[1]	training's gamma: 0.995646
[2]	training's gamma: 0.992686
[3]	training's gamma: 0.990159
[4]	training's gamma: 0.987979
[5]	training's gamma: 0.986115
[6]	training's gamma: 0.984338
[7]	training's gamma: 0.982616
[8]	training's gamma: 0.981248
[9]	training's gamma: 0.979855
[10]	training's gamma: 0.978606
[11]	training's gamma: 0.977463
[12]	training's gamma: 0.976517
[13]	training's gamma: 0.975383
[14]	training's gamma: 0.974455
[15]	training's gamma: 0.973503
[16]	training's gamma: 0.972654
[17]	training's gamma: 0.971636
[18]	training's gamma: 0.970804
[19]	training's gamma: 0.969939
[20]	training's gamma: 0.969261
[21]	training's gamma: 0.9685
[22]	training's gamma: 0.967748
[23]	training's gamma: 0.966967
[24]	training's gamma: 0.966194
[25]	training's gamma: 0.965429
[26]	training's gamma: 0.964798
[27]	training's gamma: 0.964224
[28]	training's gamma: 0.96343
[29]	training's gamma: 0.962718
[30]	training's gamma: 0.96202
[31]	training's gamma: 0.961464
[32]	training's gamma: 0.960952
[33]	training's gamma: 0.960191
[34]	training's gamma: 0.959657
[35]	training's gamma: 0.95906
[36]	training's gamma: 0.958654
[37]	training's gamma: 0.957875
[38]	training's gamma: 0.957091
[39]	training's gamma: 0.956517
[40]	training's gamma: 0.956008
[41]	training's gamma: 0.955579
[42]	training's gamma: 0.955121
[43]	training's gamma: 0.954588
[44]	training's gamma: 0.95406
[45]	training's gamma: 0.953735
[46]	training's gamma: 0.953209
[47]	training's gamma: 0.952507
[48]	training's gamma: 0.951984
[49]	training's gamma: 0.951473
[50]	training's gamma: 0.951067
[51]	training's gamma: 0.950776
[52]	training's gamma: 0.950218
[53]	training's gamma: 0.949764
[54]	training's gamma: 0.949266
[55]	training's gamma: 0.94886
[56]	training's gamma: 0.94847
[57]	training's gamma: 0.947966
[58]	training's gamma: 0.947551
[59]	training's gamma: 0.946995
[60]	training's gamma: 0.946652
[61]	training's gamma: 0.94624
[62]	training's gamma: 0.945568
[63]	training's gamma: 0.945195
[64]	training's gamma: 0.944863
[65]	training's gamma: 0.944243
[66]	training's gamma: 0.943886
[67]	training's gamma: 0.943555
[68]	training's gamma: 0.943183
[69]	training's gamma: 0.942717
[70]	training's gamma: 0.942475
[71]	training's gamma: 0.942146
[72]	training's gamma: 0.941657
[73]	training's gamma: 0.941363
[74]	training's gamma: 0.940961
[75]	training's gamma: 0.940545
[76]	training's gamma: 0.940093
[77]	training's gamma: 0.93984
[78]	training's gamma: 0.939344
[79]	training's gamma: 0.938802
[80]	training's gamma: 0.938202
[81]	training's gamma: 0.937892
[82]	training's gamma: 0.937617
[83]	training's gamma: 0.937154
[84]	training's gamma: 0.936727
[85]	training's gamma: 0.936241
[86]	training's gamma: 0.93598
[87]	training's gamma: 0.935576
[88]	training's gamma: 0.93516
[89]	training's gamma: 0.934821
[90]	training's gamma: 0.934372
[91]	training's gamma: 0.934051
[92]	training's gamma: 0.933707
[93]	training's gamma: 0.933503
[94]	training's gamma: 0.932964
[95]	training's gamma: 0.932546
[96]	training's gamma: 0.932308
[97]	training's gamma: 0.931935
[98]	training's gamma: 0.931565
[99]	training's gamma: 0.931239
[100]	training's gamma: 0.930936
[101]	training's gamma: 0.930384
[102]	training's gamma: 0.93014
[103]	training's gamma: 0.929847
[104]	training's gamma: 0.9294
[105]	training's gamma: 0.928906
[106]	training's gamma: 0.928356
[107]	training's gamma: 0.928088
[108]	training's gamma: 0.927662
[109]	training's gamma: 0.92743
[110]	training's gamma: 0.927176
[111]	training's gamma: 0.926836
[112]	training's gamma: 0.926553
[113]	training's gamma: 0.926183
[114]	training's gamma: 0.925815
[115]	training's gamma: 0.92556
[116]	training's gamma: 0.925322
[117]	training's gamma: 0.925128
[118]	training's gamma: 0.924945
[119]	training's gamma: 0.924617
[120]	training's gamma: 0.924366

Fitting LGB with larger-scale data -- blows up, gives no warning.
[1]	training's gamma: 2339.74
[2]	training's gamma: 2117.2
[3]	training's gamma: 1915.84
[4]	training's gamma: 1733.65
[5]	training's gamma: 1568.8
[6]	training's gamma: 1419.64
[7]	training's gamma: 1284.68
[8]	training's gamma: 1162.56
[9]	training's gamma: 1052.03
[10]	training's gamma: 952.018
[11]	training's gamma: 861.534
[12]	training's gamma: 779.665
[13]	training's gamma: 705.59
[14]	training's gamma: 638.567
[15]	training's gamma: 577.924
[16]	training's gamma: 523.054
[17]	training's gamma: 473.408
[18]	training's gamma: 428.489
[19]	training's gamma: 387.845
[20]	training's gamma: 351.071
[21]	training's gamma: 317.799
[22]	training's gamma: 287.694
[23]	training's gamma: 260.455
[24]	training's gamma: 235.81
[25]	training's gamma: 213.511
[26]	training's gamma: 193.336
[27]	training's gamma: 175.083
[28]	training's gamma: 158.567
[29]	training's gamma: 143.625
[30]	training's gamma: 130.106
[31]	training's gamma: 117.875
[32]	training's gamma: 106.809
[33]	training's gamma: 96.7976
[34]	training's gamma: 87.74
[35]	training's gamma: 79.5456
[36]	training's gamma: 2.19389e+07
[37]	training's gamma: 1.98511e+07
[38]	training's gamma: 1.7962e+07
[39]	training's gamma: 1.62527e+07
[40]	training's gamma: 1.47061e+07
[41]	training's gamma: 1.33066e+07
[42]	training's gamma: 1.20403e+07
[43]	training's gamma: 1.08945e+07
[44]	training's gamma: 9.85777e+06
[45]	training's gamma: 8.91968e+06
[46]	training's gamma: 8.07086e+06
[47]	training's gamma: 7.30281e+06
[48]	training's gamma: 6.60786e+06
[49]	training's gamma: 5.97904e+06
[50]	training's gamma: 5.41006e+06
[51]	training's gamma: 4.89522e+06
[52]	training's gamma: 4.42938e+06
[53]	training's gamma: 4.00787e+06
[54]	training's gamma: 3.62647e+06
[55]	training's gamma: 3.28137e+06
[56]	training's gamma: 2.9691e+06
[57]	training's gamma: 2.68656e+06
[58]	training's gamma: 2.4309e+06
[59]	training's gamma: 2.19957e+06
[60]	training's gamma: 1.99025e+06
[61]	training's gamma: 1.80085e+06
[62]	training's gamma: 1.62948e+06
[63]	training's gamma: 1.47441e+06
[64]	training's gamma: 1.33411e+06
[65]	training's gamma: 1.20715e+06
[66]	training's gamma: 1.09227e+06
[67]	training's gamma: 988330
[68]	training's gamma: 894279
[69]	training's gamma: 809177
[70]	training's gamma: 732174
[71]	training's gamma: 662499
[72]	training's gamma: 599454
[73]	training's gamma: 542409
[74]	training's gamma: 490792
[75]	training's gamma: 444087
[76]	training's gamma: 401827
[77]	training's gamma: 363588
[78]	training's gamma: 328989
[79]	training's gamma: 297682
[80]	training's gamma: 269354
[81]	training's gamma: 243722
[82]	training's gamma: 220529
[83]	training's gamma: 199543
[84]	training's gamma: 180554
[85]	training's gamma: 163373
[86]	training's gamma: 147826
[87]	training's gamma: 133759
[88]	training's gamma: 121031
[89]	training's gamma: 109513
[90]	training's gamma: 99092.2
[91]	training's gamma: 89662.7
[92]	training's gamma: 81130.6
[93]	training's gamma: 73410.4
[94]	training's gamma: 66424.9
[95]	training's gamma: 60104.1
[96]	training's gamma: 54384.9
[97]	training's gamma: 49209.9
[98]	training's gamma: 44527.4
[99]	training's gamma: 40290.5
[100]	training's gamma: 36456.8
[101]	training's gamma: 32987.9
[102]	training's gamma: 29849.1
[103]	training's gamma: 27009.1
[104]	training's gamma: 24439.3
[105]	training's gamma: 22114
[106]	training's gamma: 20010.1
[107]	training's gamma: 18106.3
[108]	training's gamma: 16383.8
[109]	training's gamma: 14825.1
[110]	training's gamma: 13414.8
[111]	training's gamma: 7.56148e+42
[112]	training's gamma: 7.56148e+42
[113]	training's gamma: 7.56148e+42
[114]	training's gamma: 7.56148e+42
[115]	training's gamma: 7.56148e+42
[116]	training's gamma: 7.56148e+42
[117]	training's gamma: 7.56148e+42
[118]	training's gamma: 7.56148e+42
[119]	training's gamma: 7.56148e+42
[120]	training's gamma: 7.56148e+42

Fitting XGB with larger-scale data -- seems to work fine.
[0]	validation_0-gamma-nloglik:4.24067
[1]	validation_0-gamma-nloglik:3.86973
[2]	validation_0-gamma-nloglik:3.54339
[3]	validation_0-gamma-nloglik:3.25407
[4]	validation_0-gamma-nloglik:2.99679
[5]	validation_0-gamma-nloglik:2.76785
[6]	validation_0-gamma-nloglik:2.56404
[7]	validation_0-gamma-nloglik:2.38233
[8]	validation_0-gamma-nloglik:2.22064
[9]	validation_0-gamma-nloglik:2.07680
[10]	validation_0-gamma-nloglik:1.94883
[11]	validation_0-gamma-nloglik:1.83493
[12]	validation_0-gamma-nloglik:1.73388
[13]	validation_0-gamma-nloglik:1.64406
[14]	validation_0-gamma-nloglik:1.56443
[15]	validation_0-gamma-nloglik:1.49390
[16]	validation_0-gamma-nloglik:1.43141
[17]	validation_0-gamma-nloglik:1.37609
[18]	validation_0-gamma-nloglik:1.32697
[19]	validation_0-gamma-nloglik:1.28383
[20]	validation_0-gamma-nloglik:1.24574
[21]	validation_0-gamma-nloglik:1.21177
[22]	validation_0-gamma-nloglik:1.18203
[23]	validation_0-gamma-nloglik:1.15571
[24]	validation_0-gamma-nloglik:1.13272
[25]	validation_0-gamma-nloglik:1.11236
[26]	validation_0-gamma-nloglik:1.09457
[27]	validation_0-gamma-nloglik:1.07899
[28]	validation_0-gamma-nloglik:1.06511
[29]	validation_0-gamma-nloglik:1.05307
[30]	validation_0-gamma-nloglik:1.04224
[31]	validation_0-gamma-nloglik:1.03291
[32]	validation_0-gamma-nloglik:1.02450
[33]	validation_0-gamma-nloglik:1.01722
[34]	validation_0-gamma-nloglik:1.01060
[35]	validation_0-gamma-nloglik:1.00510
[36]	validation_0-gamma-nloglik:1.00016
[37]	validation_0-gamma-nloglik:0.99591
[38]	validation_0-gamma-nloglik:0.99218
[39]	validation_0-gamma-nloglik:0.98879
[40]	validation_0-gamma-nloglik:0.98562
[41]	validation_0-gamma-nloglik:0.98274
[42]	validation_0-gamma-nloglik:0.98015
[43]	validation_0-gamma-nloglik:0.97810
[44]	validation_0-gamma-nloglik:0.97611
[45]	validation_0-gamma-nloglik:0.97423
[46]	validation_0-gamma-nloglik:0.97258
[47]	validation_0-gamma-nloglik:0.97108
[48]	validation_0-gamma-nloglik:0.96975
[49]	validation_0-gamma-nloglik:0.96850
[50]	validation_0-gamma-nloglik:0.96734
[51]	validation_0-gamma-nloglik:0.96617
[52]	validation_0-gamma-nloglik:0.96520
[53]	validation_0-gamma-nloglik:0.96400
[54]	validation_0-gamma-nloglik:0.96308
[55]	validation_0-gamma-nloglik:0.96223
[56]	validation_0-gamma-nloglik:0.96146
[57]	validation_0-gamma-nloglik:0.96076
[58]	validation_0-gamma-nloglik:0.95999
[59]	validation_0-gamma-nloglik:0.95938
[60]	validation_0-gamma-nloglik:0.95852
[61]	validation_0-gamma-nloglik:0.95782
[62]	validation_0-gamma-nloglik:0.95731
[63]	validation_0-gamma-nloglik:0.95669
[64]	validation_0-gamma-nloglik:0.95585
[65]	validation_0-gamma-nloglik:0.95542
[66]	validation_0-gamma-nloglik:0.95477
[67]	validation_0-gamma-nloglik:0.95436
[68]	validation_0-gamma-nloglik:0.95389
[69]	validation_0-gamma-nloglik:0.95345
[70]	validation_0-gamma-nloglik:0.95296
[71]	validation_0-gamma-nloglik:0.95247
[72]	validation_0-gamma-nloglik:0.95174
[73]	validation_0-gamma-nloglik:0.95151
[74]	validation_0-gamma-nloglik:0.95077
[75]	validation_0-gamma-nloglik:0.95050
[76]	validation_0-gamma-nloglik:0.94980
[77]	validation_0-gamma-nloglik:0.94961
[78]	validation_0-gamma-nloglik:0.94912
[79]	validation_0-gamma-nloglik:0.94859
[80]	validation_0-gamma-nloglik:0.94842
[81]	validation_0-gamma-nloglik:0.94798
[82]	validation_0-gamma-nloglik:0.94774
[83]	validation_0-gamma-nloglik:0.94759
[84]	validation_0-gamma-nloglik:0.94732
[85]	validation_0-gamma-nloglik:0.94697
[86]	validation_0-gamma-nloglik:0.94606
[87]	validation_0-gamma-nloglik:0.94569
[88]	validation_0-gamma-nloglik:0.94554
[89]	validation_0-gamma-nloglik:0.94514
[90]	validation_0-gamma-nloglik:0.94468
[91]	validation_0-gamma-nloglik:0.94454
[92]	validation_0-gamma-nloglik:0.94399
[93]	validation_0-gamma-nloglik:0.94355
[94]	validation_0-gamma-nloglik:0.94259
[95]	validation_0-gamma-nloglik:0.94227
[96]	validation_0-gamma-nloglik:0.94169
[97]	validation_0-gamma-nloglik:0.94070
[98]	validation_0-gamma-nloglik:0.94061
[99]	validation_0-gamma-nloglik:0.94021
[100]	validation_0-gamma-nloglik:0.93976
[101]	validation_0-gamma-nloglik:0.93884
[102]	validation_0-gamma-nloglik:0.93871
[103]	validation_0-gamma-nloglik:0.93853
[104]	validation_0-gamma-nloglik:0.93795
[105]	validation_0-gamma-nloglik:0.93715
[106]	validation_0-gamma-nloglik:0.93655
[107]	validation_0-gamma-nloglik:0.93595
[108]	validation_0-gamma-nloglik:0.93588
[109]	validation_0-gamma-nloglik:0.93577
[110]	validation_0-gamma-nloglik:0.93546
[111]	validation_0-gamma-nloglik:0.93517
[112]	validation_0-gamma-nloglik:0.93444
[113]	validation_0-gamma-nloglik:0.93372
[114]	validation_0-gamma-nloglik:0.93347
[115]	validation_0-gamma-nloglik:0.93334
[116]	validation_0-gamma-nloglik:0.93323
[117]	validation_0-gamma-nloglik:0.93269
[118]	validation_0-gamma-nloglik:0.93234
[119]	validation_0-gamma-nloglik:0.93197
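In case it's useful, one workaround I'm considering (an assumption on my part, not something from the LightGBM docs): the gamma family is closed under scaling, so dividing the target by its mean before fitting shrinks the numeric range without changing the shape, and predictions can be multiplied back afterwards. A numpy-only illustration of the rescaling step:

```python
import numpy as np

# Generate a target similar to the larger-scale one in the script above.
rng = np.random.default_rng(seed=0)
n, d = 10_000, 2
X = rng.standard_normal(size=(n, d))
scale = np.exp(X @ np.array([1.0, 1.0]))
y = rng.gamma(shape=1, scale=scale, size=n)

# y / c is still gamma-distributed with the same shape parameter, so
# fitting on the rescaled target and multiplying predictions by y.mean()
# afterwards should be equivalent up to numerics.
y_scaled = y / y.mean()
print(y.max(), y_scaled.max())  # the rescaled target spans a smaller range
```

With lgbmr.fit(X, y_scaled) the booster would see values of order 1; whether that actually avoids the blow-up on my real dataset is untested here.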