
Compute metrics not on each iteration but with some fixed step #4107

Closed

normanwang92 opened this issue Mar 25, 2021 · 15 comments

Comments

@normanwang92

Summary

Motivation

Description

References

@StrikerRUS
Collaborator

@normanwang92 Hi! No, there is no such possibility in LightGBM. Custom objectives and metrics can only be used from a specific language wrapper. How do you think a custom function for the CLI should look?

@normanwang92
Author

What I really want is to apply metric_freq/eval_freq in lgb.train. What I found is that evaluating custom metrics at every single iteration slows down training quite a lot, and it seems to drag down the GPU utilization rate. I tried using lgb.predict every x steps, but it is still quite inefficient even when you build the prediction incrementally. In some cases the incremental prediction takes longer than a grid search.

It would be great if we had an efficient way of getting custom evaluation metrics every x steps.
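
(For illustration only: roughly what the "evaluate with predict every x steps" approach looks like when expressed as a custom callback. The helper name and the raw validation arrays are assumptions for this sketch, not something LightGBM provides.)

import numpy as np

def periodic_custom_eval(X_valid, y_valid, period=10):
    # hypothetical callback: compute a custom metric only every `period` iterations
    def _callback(env):
        # env is a lightgbm.callback.CallbackEnv; env.model is the Booster being trained
        if (env.iteration + 1) % period == 0:
            preds = env.model.predict(X_valid)
            rmse = np.sqrt(np.mean((preds - y_valid) ** 2))
            print(f"[{env.iteration + 1}] valid custom_rmse: {rmse:.6f}")
    return _callback

# usage sketch: lgb.train(params, train_set, callbacks=[periodic_custom_eval(X_valid, y_valid, 10)])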

@jameslamb
Collaborator

Sure, but how would you like to define the custom metric? In the Python package, for example, you can pass in a Python function. How would you like to be able to define the custom metric for use with the CLI?
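
(For reference: in the Python package a custom metric is a function with the feval signature (preds, eval_data) -> (name, value, is_higher_better). A minimal sketch, with illustrative names:)

import numpy as np

def custom_rmse(preds, eval_data):
    # eval_data is a lightgbm.Dataset; return (metric_name, value, is_higher_better)
    labels = eval_data.get_label()
    return 'custom_rmse', float(np.sqrt(np.mean((preds - labels) ** 2))), False

# passed as: lgb.train(params, train_set, valid_sets=[valid_set], feval=custom_rmse)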

@normanwang92
Author

I'm not sure, really. In my case, all I'd like to have is an efficient way of evaluating custom metrics every x steps. If metric_freq can be made available in the Python API, then that solves my problem! The reason I asked this question is that metric_freq is CLI-only and I wasn't sure whether the CLI accepts a custom feval. I haven't really thought about how I'd define a custom metric for the CLI, sorry.

@StrikerRUS
Collaborator

StrikerRUS commented Mar 26, 2021

Here is the corresponding code for the training routine:

# start training
for i in range(init_iteration, init_iteration + num_boost_round):
    for cb in callbacks_before_iter:
        cb(callback.CallbackEnv(model=booster,
                                params=params,
                                iteration=i,
                                begin_iteration=init_iteration,
                                end_iteration=init_iteration + num_boost_round,
                                evaluation_result_list=None))
    booster.update(fobj=fobj)
    evaluation_result_list = []
    # check evaluation result.
    if valid_sets is not None:
        if is_valid_contain_train:
            evaluation_result_list.extend(booster.eval_train(feval))
        evaluation_result_list.extend(booster.eval_valid(feval))

As a quick workaround, I think you can add a condition like i % period == 0 to the following if statement:
if valid_sets is not None:
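
(Spelled out, the suggested workaround amounts to something like the sketch below. Note that period is not an existing lgb.train argument; in this sketch it is assumed to be pulled out of params, as in the modified function posted further down.)

# before the training loop: pull the (hypothetical) evaluation frequency out of params
period = params.pop('period', 1)

# inside the training loop: only evaluate every `period` iterations
if valid_sets is not None and i % period == 0:
    if is_valid_contain_train:
        evaluation_result_list.extend(booster.eval_train(feval))
    evaluation_result_list.extend(booster.eval_valid(feval))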

StrikerRUS changed the title from "Is there a way to use custom eval metrics in CLI version? Thanks!" to "Compute metrics not on each iteration but with some fixed step" on Mar 27, 2021
@StrikerRUS
Collaborator

StrikerRUS commented Mar 27, 2021

@normanwang92 Does the new issue heading reflect your real needs correctly?

@normanwang92
Author

It does, thank you! I'll definitely try it out!

@normanwang92
Author

Here's my attempt to modify lgb.train (by passing period via params and adding the i % period == 0 check later). It seems to run fine, but somehow the verbose info is missing: I set verbose to 0 and verbose_eval to 50 and trained for 1000 iterations, but the eval metrics were never printed, except when the early stopping condition was triggered (at around 800 rounds). It's not a deal-breaker, but I wonder if I did anything wrong or if there's a better way to implement this logic.

def train(params, train_set, num_boost_round=100,
          valid_sets=None, valid_names=None,
          fobj=None, feval=None, init_model=None,
          feature_name='auto', categorical_feature='auto',
          early_stopping_rounds=None, evals_result=None,
          verbose_eval=True, learning_rates=None,
          keep_training_booster=False, callbacks=None):
    # create predictor first
    params = copy.deepcopy(params)
    period = params.pop('period') if 'period' in params.keys() else 1
    if fobj is not None:
        for obj_alias in _ConfigAliases.get("objective"):
            params.pop(obj_alias, None)
        params['objective'] = 'none'
    for alias in _ConfigAliases.get("num_iterations"):
        if alias in params:
            num_boost_round = params.pop(alias)
            _log_warning("Found `{}` in params. Will use it instead of argument".format(alias))
    params["num_iterations"] = num_boost_round
    for alias in _ConfigAliases.get("early_stopping_round"):
        if alias in params:
            early_stopping_rounds = params.pop(alias)
            _log_warning("Found `{}` in params. Will use it instead of argument".format(alias))
    params["early_stopping_round"] = early_stopping_rounds
    first_metric_only = params.get('first_metric_only', False)

    if num_boost_round <= 0:
        raise ValueError("num_boost_round should be greater than zero.")
    if isinstance(init_model, str):
        predictor = _InnerPredictor(model_file=init_model, pred_parameter=params)
    elif isinstance(init_model, Booster):
        predictor = init_model._to_predictor(dict(init_model.params, **params))
    else:
        predictor = None
    init_iteration = predictor.num_total_iteration if predictor is not None else 0
    # check dataset
    if not isinstance(train_set, Dataset):
        raise TypeError("Training only accepts Dataset object")

    train_set._update_params(params) \
             ._set_predictor(predictor) \
             .set_feature_name(feature_name) \
             .set_categorical_feature(categorical_feature)

    is_valid_contain_train = False
    train_data_name = "training"
    reduced_valid_sets = []
    name_valid_sets = []
    if valid_sets is not None:
        if isinstance(valid_sets, Dataset):
            valid_sets = [valid_sets]
        if isinstance(valid_names, str):
            valid_names = [valid_names]
        for i, valid_data in enumerate(valid_sets):
            # reduce cost for prediction training data
            if valid_data is train_set:
                is_valid_contain_train = True
                if valid_names is not None:
                    train_data_name = valid_names[i]
                continue
            if not isinstance(valid_data, Dataset):
                raise TypeError("Training only accepts Dataset object")
            reduced_valid_sets.append(valid_data._update_params(params).set_reference(train_set))
            if valid_names is not None and len(valid_names) > i:
                name_valid_sets.append(valid_names[i])
            else:
                name_valid_sets.append('valid_' + str(i))
    # process callbacks
    if callbacks is None:
        callbacks = set()
    else:
        for i, cb in enumerate(callbacks):
            cb.__dict__.setdefault('order', i - len(callbacks))
        callbacks = set(callbacks)

    # Most of legacy advanced options becomes callbacks
    if verbose_eval is True:
        callbacks.add(callback.print_evaluation())
    elif isinstance(verbose_eval, int):
        callbacks.add(callback.print_evaluation(verbose_eval))

    if early_stopping_rounds is not None and early_stopping_rounds > 0:
        callbacks.add(callback.early_stopping(early_stopping_rounds, first_metric_only, verbose=bool(verbose_eval)))

    if learning_rates is not None:
        callbacks.add(callback.reset_parameter(learning_rate=learning_rates))

    if evals_result is not None:
        callbacks.add(callback.record_evaluation(evals_result))

    callbacks_before_iter = {cb for cb in callbacks if getattr(cb, 'before_iteration', False)}
    callbacks_after_iter = callbacks - callbacks_before_iter
    callbacks_before_iter = sorted(callbacks_before_iter, key=attrgetter('order'))
    callbacks_after_iter = sorted(callbacks_after_iter, key=attrgetter('order'))

    # construct booster
    try:
        booster = Booster(params=params, train_set=train_set)
        if is_valid_contain_train:
            booster.set_train_data_name(train_data_name)
        for valid_set, name_valid_set in zip(reduced_valid_sets, name_valid_sets):
            booster.add_valid(valid_set, name_valid_set)
    finally:
        train_set._reverse_update_params()
        for valid_set in reduced_valid_sets:
            valid_set._reverse_update_params()
    booster.best_iteration = 0

    # start training
    for i in range(init_iteration, init_iteration + num_boost_round):
        for cb in callbacks_before_iter:
            cb(callback.CallbackEnv(model=booster,
                                    params=params,
                                    iteration=i,
                                    begin_iteration=init_iteration,
                                    end_iteration=init_iteration + num_boost_round,
                                    evaluation_result_list=None))

        booster.update(fobj=fobj)

        evaluation_result_list = []
        # check evaluation result.
        if valid_sets is not None and i % period == 0:
            if is_valid_contain_train:
                evaluation_result_list.extend(booster.eval_train(feval))
            evaluation_result_list.extend(booster.eval_valid(feval))
        try:
            for cb in callbacks_after_iter:
                cb(callback.CallbackEnv(model=booster,
                                        params=params,
                                        iteration=i,
                                        begin_iteration=init_iteration,
                                        end_iteration=init_iteration + num_boost_round,
                                        evaluation_result_list=evaluation_result_list))
        except callback.EarlyStopException as earlyStopException:
            booster.best_iteration = earlyStopException.best_iteration + 1
            evaluation_result_list = earlyStopException.best_score
            break
    booster.best_score = collections.defaultdict(collections.OrderedDict)
    for dataset_name, eval_name, score, _ in evaluation_result_list:
        booster.best_score[dataset_name][eval_name] = score
    if not keep_training_booster:
        booster.model_from_string(booster.model_to_string(), False).free_dataset()
    return booster

@shiyu1994
Collaborator

@normanwang92 In the code above, the iteration index i starts from 0. So if you set period = 10, the metric will be evaluated at iterations 1, 11, 21, ... (counting iterations from ONE instead of ZERO). With verbose_eval=50 (which means iterations 50, 100, 150, ...), the printing iterations never coincide with the evaluation iterations controlled by period.
Thus the condition should be (i + 1) % period == 0 instead of i % period == 0; this works fine for me.

@StrikerRUS
Collaborator

Closed in favor of #2302. We decided to keep all feature requests in one place.

You are welcome to contribute this feature! Please re-open this issue (or post a comment if you are not the topic starter) if you are actively working on implementing it.

@TremaMiguel
Contributor

TremaMiguel commented Jan 27, 2022

@jameslamb, @StrikerRUS I'm open to developing this. Just to double-check, these are the high-level changes needed:

  1. train accepts a period argument.

  2. If period is in the parameters, use the log_evaluation callback. The callback will print the evaluation results to the console every period iterations.

if "period" in params:
    callbacks_set.add(
        callback.log_evaluation(
            ...
        )
    )
  3. Modify train() to run evaluation only every period iterations:

# check evaluation result.
if valid_sets is not None and (i + 1) % period == 0:
    if is_valid_contain_train:
        evaluation_result_list.extend(booster.eval_train(feval))
    evaluation_result_list.extend(booster.eval_valid(feval))

@StrikerRUS
Collaborator

@TremaMiguel I think we can reuse the metric_freq CLI parameter for this purpose instead of adding one more argument to the train() function and making function signatures incredibly difficult to understand, especially for newcomers.
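
(A rough sketch of what reusing metric_freq inside train() could look like, assuming the value is simply read from params with a default of 1; this is not an agreed-upon design.)

# hypothetical: reuse the CLI parameter instead of introducing a new `period` argument
metric_freq = params.get("metric_freq", 1)

# inside the training loop
if valid_sets is not None and (i + 1) % metric_freq == 0:
    if is_valid_contain_train:
        evaluation_result_list.extend(booster.eval_train(feval))
    evaluation_result_list.extend(booster.eval_valid(feval))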

@TremaMiguel
Contributor

OK, this issue is marked as uncompleted in #2302; should it be marked as completed or removed from the list? Is there any open issue you would suggest I work on?

@StrikerRUS
Collaborator

@TremaMiguel I mean, this issue is still relevant and some users may benefit from it being implemented. I just meant that we can take the metric_freq parameter (instead of a new period argument) that is currently used for the CLI and adapt it for use in the Python interface.
