
Compute metrics not on each iteration but with some fixed step #4107

Closed

normanwang92 opened this issue Mar 25, 2021 · 15 comments

Comments

@normanwang92

Summary

Motivation

Description

References

@StrikerRUS
Collaborator

@normanwang92 Hi! No, there is no such possibility in LightGBM. Custom objectives and metrics can only be used from a specific language wrapper. How do you think a custom function for the CLI should look?

@normanwang92
Author

What I really want is to apply metric_freq/eval_freq in lgb.train. What I found is that evaluating custom metrics at every single iteration slows down training quite a lot, and it seems to drag down the GPU utilization rate. I tried using lgb.predict every x steps, but it is still quite inefficient even when you build the prediction incrementally. In some cases the incremental prediction takes longer than a grid search.

It would be great if we had an efficient way of getting custom evaluation metrics every x steps.
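
(For illustration only: roughly what the "evaluate with predict every x steps" approach looks like when expressed as a custom callback. The helper name and the raw validation arrays are assumptions for this sketch, not something LightGBM provides.)

import numpy as np

def periodic_custom_eval(X_valid, y_valid, period=10):
    # hypothetical callback: compute a custom metric only every `period` iterations
    def _callback(env):
        # env is a lightgbm.callback.CallbackEnv; env.model is the Booster being trained
        if (env.iteration + 1) % period == 0:
            preds = env.model.predict(X_valid)
            rmse = np.sqrt(np.mean((preds - y_valid) ** 2))
            print(f"[{env.iteration + 1}] valid custom_rmse: {rmse:.6f}")
    return _callback

# usage sketch: lgb.train(params, train_set, callbacks=[periodic_custom_eval(X_valid, y_valid, 10)])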

@jameslamb
Collaborator

Sure, but how would you like to define the custom metric? In the Python package, for example, you can pass in a Python function. How would you like to be able to define the custom metric for use with the CLI?
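
(For reference: in the Python package a custom metric is a function with the feval signature (preds, eval_data) -> (name, value, is_higher_better). A minimal sketch, with illustrative names:)

import numpy as np

def custom_rmse(preds, eval_data):
    # eval_data is a lightgbm.Dataset; return (metric_name, value, is_higher_better)
    labels = eval_data.get_label()
    return 'custom_rmse', float(np.sqrt(np.mean((preds - labels) ** 2))), False

# passed as: lgb.train(params, train_set, valid_sets=[valid_set], feval=custom_rmse)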

@normanwang92
Author

I'm not sure, really. In my case, all I'd like to have is an efficient way of evaluating custom metrics every x steps. If metric_freq can be made available in the Python API, then that solves my problem! The reason I asked this question is that metric_freq is CLI-only and I wasn't sure whether the CLI accepts a custom feval. I haven't really thought about how I'd define a custom metric for the CLI, sorry.

@StrikerRUS
Collaborator

StrikerRUS commented Mar 26, 2021

Here is the corresponding code for the training routine:

# start training
for i in range(init_iteration, init_iteration + num_boost_round):
    for cb in callbacks_before_iter:
        cb(callback.CallbackEnv(model=booster,
                                params=params,
                                iteration=i,
                                begin_iteration=init_iteration,
                                end_iteration=init_iteration + num_boost_round,
                                evaluation_result_list=None))
    booster.update(fobj=fobj)
    evaluation_result_list = []
    # check evaluation result.
    if valid_sets is not None:
        if is_valid_contain_train:
            evaluation_result_list.extend(booster.eval_train(feval))
        evaluation_result_list.extend(booster.eval_valid(feval))

As a quick workaround, I think you can add a condition like i % period == 0 to the following if statement:
if valid_sets is not None:
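
(Spelled out, the suggested workaround amounts to something like the sketch below. Note that period is not an existing lgb.train argument; in this sketch it is assumed to be pulled out of params, as in the modified function posted further down.)

# before the training loop: pull the (hypothetical) evaluation frequency out of params
period = params.pop('period', 1)

# inside the training loop: only evaluate every `period` iterations
if valid_sets is not None and i % period == 0:
    if is_valid_contain_train:
        evaluation_result_list.extend(booster.eval_train(feval))
    evaluation_result_list.extend(booster.eval_valid(feval))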

StrikerRUS changed the title from "Is there a way to use custom eval metrics in CLI version? Thanks!" to "Compute metrics not on each iteration but with some fixed step" on Mar 27, 2021
@StrikerRUS
Collaborator

StrikerRUS commented Mar 27, 2021

@normanwang92 Does the new issue heading reflect your real needs correctly?

@normanwang92
Author

It does, thank you! I'll definitely try it out!

@normanwang92
Author

Here's my attempt to modify lgb.train (by passing period via params and adding the i % period == 0 check later). It seems to run fine, but somehow the verbose info is missing: I set verbose to 0 and verbose_eval to 50 and trained for 1000 iterations, but the eval metrics were never printed, except when the early stopping condition was triggered (at around 800 rounds). It's not a deal-breaker, but I wonder if I did anything wrong or if there's a better way to implement this logic.

def train(params, train_set, num_boost_round=100,
          valid_sets=None, valid_names=None,
          fobj=None, feval=None, init_model=None,
          feature_name='auto', categorical_feature='auto',
          early_stopping_rounds=None, evals_result=None,
          verbose_eval=True, learning_rates=None,
          keep_training_booster=False, callbacks=None):
    # create predictor first
    params = copy.deepcopy(params)
    period = params.pop('period') if 'period' in params.keys() else 1
    if fobj is not None:
        for obj_alias in _ConfigAliases.get("objective"):
            params.pop(obj_alias, None)
        params['objective'] = 'none'
    for alias in _ConfigAliases.get("num_iterations"):
        if alias in params:
            num_boost_round = params.pop(alias)
            _log_warning("Found `{}` in params. Will use it instead of argument".format(alias))
    params["num_iterations"] = num_boost_round
    for alias in _ConfigAliases.get("early_stopping_round"):
        if alias in params:
            early_stopping_rounds = params.pop(alias)
            _log_warning("Found `{}` in params. Will use it instead of argument".format(alias))
    params["early_stopping_round"] = early_stopping_rounds
    first_metric_only = params.get('first_metric_only', False)

    if num_boost_round <= 0:
        raise ValueError("num_boost_round should be greater than zero.")
    if isinstance(init_model, str):
        predictor = _InnerPredictor(model_file=init_model, pred_parameter=params)
    elif isinstance(init_model, Booster):
        predictor = init_model._to_predictor(dict(init_model.params, **params))
    else:
        predictor = None
    init_iteration = predictor.num_total_iteration if predictor is not None else 0
    # check dataset
    if not isinstance(train_set, Dataset):
        raise TypeError("Training only accepts Dataset object")

    train_set._update_params(params) \
             ._set_predictor(predictor) \
             .set_feature_name(feature_name) \
             .set_categorical_feature(categorical_feature)

    is_valid_contain_train = False
    train_data_name = "training"
    reduced_valid_sets = []
    name_valid_sets = []
    if valid_sets is not None:
        if isinstance(valid_sets, Dataset):
            valid_sets = [valid_sets]
        if isinstance(valid_names, str):
            valid_names = [valid_names]
        for i, valid_data in enumerate(valid_sets):
            # reduce cost for prediction training data
            if valid_data is train_set:
                is_valid_contain_train = True
                if valid_names is not None:
                    train_data_name = valid_names[i]
                continue
            if not isinstance(valid_data, Dataset):
                raise TypeError("Training only accepts Dataset object")
            reduced_valid_sets.append(valid_data._update_params(params).set_reference(train_set))
            if valid_names is not None and len(valid_names) > i:
                name_valid_sets.append(valid_names[i])
            else:
                name_valid_sets.append('valid_' + str(i))
    # process callbacks
    if callbacks is None:
        callbacks = set()
    else:
        for i, cb in enumerate(callbacks):
            cb.__dict__.setdefault('order', i - len(callbacks))
        callbacks = set(callbacks)

    # Most of legacy advanced options becomes callbacks
    if verbose_eval is True:
        callbacks.add(callback.print_evaluation())
    elif isinstance(verbose_eval, int):
        callbacks.add(callback.print_evaluation(verbose_eval))

    if early_stopping_rounds is not None and early_stopping_rounds > 0:
        callbacks.add(callback.early_stopping(early_stopping_rounds, first_metric_only, verbose=bool(verbose_eval)))

    if learning_rates is not None:
        callbacks.add(callback.reset_parameter(learning_rate=learning_rates))

    if evals_result is not None:
        callbacks.add(callback.record_evaluation(evals_result))

    callbacks_before_iter = {cb for cb in callbacks if getattr(cb, 'before_iteration', False)}
    callbacks_after_iter = callbacks - callbacks_before_iter
    callbacks_before_iter = sorted(callbacks_before_iter, key=attrgetter('order'))
    callbacks_after_iter = sorted(callbacks_after_iter, key=attrgetter('order'))

    # construct booster
    try:
        booster = Booster(params=params, train_set=train_set)
        if is_valid_contain_train:
            booster.set_train_data_name(train_data_name)
        for valid_set, name_valid_set in zip(reduced_valid_sets, name_valid_sets):
            booster.add_valid(valid_set, name_valid_set)
    finally:
        train_set._reverse_update_params()
        for valid_set in reduced_valid_sets:
            valid_set._reverse_update_params()
    booster.best_iteration = 0

    # start training
    for i in range(init_iteration, init_iteration + num_boost_round):
        for cb in callbacks_before_iter:
            cb(callback.CallbackEnv(model=booster,
                                    params=params,
                                    iteration=i,
                                    begin_iteration=init_iteration,
                                    end_iteration=init_iteration + num_boost_round,
                                    evaluation_result_list=None))

        booster.update(fobj=fobj)

        evaluation_result_list = []
        # check evaluation result.
        if valid_sets is not None and i % period == 0:
            if is_valid_contain_train:
                evaluation_result_list.extend(booster.eval_train(feval))
            evaluation_result_list.extend(booster.eval_valid(feval))
        try:
            for cb in callbacks_after_iter:
                cb(callback.CallbackEnv(model=booster,
                                        params=params,
                                        iteration=i,
                                        begin_iteration=init_iteration,
                                        end_iteration=init_iteration + num_boost_round,
                                        evaluation_result_list=evaluation_result_list))
        except callback.EarlyStopException as earlyStopException:
            booster.best_iteration = earlyStopException.best_iteration + 1
            evaluation_result_list = earlyStopException.best_score
            break
    booster.best_score = collections.defaultdict(collections.OrderedDict)
    for dataset_name, eval_name, score, _ in evaluation_result_list:
        booster.best_score[dataset_name][eval_name] = score
    if not keep_training_booster:
        booster.model_from_string(booster.model_to_string(), False).free_dataset()
    return booster

@shiyu1994
Collaborator

@normanwang92 In the code above, the iteration index i starts from 0. So if you set period = 10, the metric will be evaluated at iterations 1, 11, 21, ... (counting iterations from ONE instead of ZERO). With verbose_eval=50 (which means iterations 50, 100, 150, ...), the printing iterations never coincide with the evaluation iterations controlled by period.
Thus the condition should be (i + 1) % period == 0 instead of i % period == 0; this works fine for me.

@StrikerRUS
Collaborator

Closed in favor of #2302. We decided to keep all feature requests in one place.

You are welcome to contribute this feature! Please re-open this issue (or post a comment if you are not the topic starter) if you are actively working on implementing it.

@TremaMiguel
Contributor

TremaMiguel commented Jan 27, 2022

@jameslamb, @StrikerRUS I'm open to developing this. Just to double-check, these are the high-level changes needed:

  1. train accepts a period argument.

  2. If period is in the parameters, use the log_evaluation callback. The callback will print the evaluation results to the console every period iterations.

if "period" in params:
    callbacks_set.add(
        callback.log_evaluation(
            ...
        )
    )
  3. Modify train() to run evaluation only every period iterations:

# check evaluation result.
if valid_sets is not None and (i + 1) % period == 0:
    if is_valid_contain_train:
        evaluation_result_list.extend(booster.eval_train(feval))
    evaluation_result_list.extend(booster.eval_valid(feval))

@StrikerRUS
Collaborator

@TremaMiguel I think we can reuse the metric_freq CLI parameter for this purpose instead of adding one more argument to the train() function and making function signatures incredibly difficult to understand, especially for newcomers.
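
(A rough sketch of what reusing metric_freq inside train() could look like, assuming the value is simply read from params with a default of 1; this is not an agreed-upon design.)

# hypothetical: reuse the CLI parameter instead of introducing a new `period` argument
metric_freq = params.get("metric_freq", 1)

# inside the training loop
if valid_sets is not None and (i + 1) % metric_freq == 0:
    if is_valid_contain_train:
        evaluation_result_list.extend(booster.eval_train(feval))
    evaluation_result_list.extend(booster.eval_valid(feval))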

@TremaMiguel
Contributor

OK, this issue is marked as uncompleted in #2302; should it be marked as completed or removed from the list? Is there any open issue you would suggest I work on?

@StrikerRUS
Collaborator

@TremaMiguel I mean, this issue is still relevant and some users may benefit from it being implemented. I just meant that we can take the metric_freq parameter (instead of a new period argument) that is currently used for the CLI and adapt it for use in the Python interface.
