[python-package] Remove output_margin from XGBClassifier.predict_proba argument list. #3343

yanboliang · 2018-05-27T05:14:18Z

XGBClassifier.predict_proba predicts class probabilities, so it doesn't make sense for it to support output_margin. Users who want the margin can call predict(data, output_margin=True) instead.

Some users reported misusing this argument when calling XGBClassifier.predict_proba in #3308. If output_margin=True is passed to predict_proba by mistake, the result is confusing and meaningless.

For example, this is the correct result (binary classification; each column is the probability that the sample belongs to the given class):

>>> model.predict_proba(X_test, output_margin=False)[0:10]
array([[0.9545844 , 0.04541559],
       [0.05245447, 0.9475455 ],
       [0.41897488, 0.5810251 ],
       [0.9831998 , 0.0168002 ],
       [0.4119159 , 0.5880841 ],
       [0.31113452, 0.6888655 ],
       [0.9705527 , 0.02944732],
       [0.93274003, 0.06725994],
       [0.11494881, 0.8850512 ],
       [0.6501156 , 0.34988442]], dtype=float32)

And this is the wrong result:

>>> model.predict_proba(X_test, output_margin=True)[0:10]
array([[ 4.0454206 , -3.0454206 ],
       [-1.8939297 ,  2.8939297 ],
       [ 0.67301685,  0.32698315],
       [ 5.069422  , -4.069422  ],
       [ 0.64394915,  0.35605082],
       [ 0.20517927,  0.7948207 ],
       [ 4.4952626 , -3.4952626 ],
       [ 3.6295617 , -2.6295617 ],
       [-1.0411582 ,  2.0411582 ],
       [ 1.6195472 , -0.61954725]], dtype=float32)
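
The numbers in the two arrays above appear to be related in a simple way, which the following minimal sketch checks for the first three rows. It assumes a binary logistic objective, so that the probability of the positive class is the sigmoid of the raw margin, and it assumes the two-column output is built as [1 - value, value]; both are inferences from the printed values, not statements taken from this PR. The names `margins`, `wrong_rows` and `proba_rows` are illustrative only.

```python
import math

# Second column of the output_margin=True result quoted above (first three rows).
margins = [-3.0454206, 2.8939297, 0.32698315]


def sigmoid(z):
    """Logistic function mapping a raw margin to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


# Assumption: the two-column output is assembled as [1 - value, value].
# Applied to raw margins, this gives rows that are not probabilities.
wrong_rows = [(1.0 - m, m) for m in margins]

# Applying the sigmoid first gives rows that can be compared with the
# output_margin=False result quoted above.
proba_rows = [(1.0 - sigmoid(m), sigmoid(m)) for m in margins]

for wrong, proba in zip(wrong_rows, proba_rows):
    print("margin row:", wrong, "probability row:", proba)
```

Under these assumptions, the first column of the output_margin=True array is just one minus a raw margin, which is why it contains values outside [0, 1].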

codecov-io · 2018-05-27T06:13:00Z

Codecov Report

Merging #3343 into master will not change coverage.
The diff coverage is 100%.

@@            Coverage Diff            @@
##             master    #3343   +/-   ##
=========================================
  Coverage     45.69%   45.69%           
  Complexity      228      228           
=========================================
  Files           166      166           
  Lines         12972    12972           
  Branches        466      466           
=========================================
  Hits           5927     5927           
  Misses         6853     6853           
  Partials        192      192

| Impacted Files | Coverage Δ | Complexity Δ |
|---|---|---|
| python-package/xgboost/sklearn.py | `87.36% <100%> (ø)` | `0 <0> (ø)` |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 480e3fd...a776be5.

hcho3 · 2018-05-28T17:30:49Z

I went ahead and merged the PR. Thanks!

Remove output_margin from XGBClassifier.predict_proba argument list.

a776be5

yanboliang mentioned this pull request May 27, 2018

Unexpected behaviour in predict_proba with output_margin = True #3308

Closed

hcho3 merged commit b018ef1 into dmlc:master May 28, 2018

yanboliang deleted the fix-predict-proba branch May 29, 2018 00:36

Sonchiwon mentioned this pull request Aug 29, 2018

Getting margin scores with 'XGBClassifier.predict(output_margin=True)' doesn't work as intended #3648

Closed

lock bot locked as resolved and limited conversation to collaborators Jan 18, 2019
