[R-package] Add function to generate a list of parameters #4195

Closed
david-cortes opened this issue Apr 18, 2021 · 5 comments

Comments

@david-cortes
Contributor

In the R package, when one wants to fit a model, the functions have an argument params, where one passes an R list.

This is inconvenient and makes it hard to find the right parameter names. Would be better if either:

  • There was a function lgb.Params which would simply be a constructor wrapper returning a list of the parameters but with documented arguments, so that one could get autocomplete and pop-up documentation in IDEs by doing something like this:
model <- lightgbm(
  data,
  params=lgb.Params(<tab here>)
)
  • The functions accepted and documented their arguments without needing to pass them as a list.
@jameslamb
Collaborator

Thanks very much for the recommendation!

I'm strongly opposed to removing the interface of passing parameters as a list to params. At this point, that would be a very impactful breaking change that would affect most users of LightGBM.

However, I'm open to more discussion on the idea of a function which can make the process of setting parameters easier. So far we've been hesitant to take that on because it introduces some maintenance burden and because copying the details of the LightGBM core library into wrapper packages introduces a risk of inconsistencies (like the one you've already noted in #4196).

Some questions about such a function:

@david-cortes
Contributor Author

But it doesn't need to be a breaking change. What I suggest would be simply to make a function with documented arguments that would return a list with the same elements and values - e.g.

Parameters <- function(arg1=10, arg2="a") {
    return(as.list(environment()))
}

But with those arguments documented and already set to their defaults (or perhaps to NULL; I guess the effect is the same), so that it would play well with the documentation browser, tooltips, and autocomplete in RStudio and other IDEs.
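
For readers skimming the thread, the idea above could look roughly like the following. This is only a sketch under stated assumptions: lgb_params_sketch is a hypothetical name, only three of LightGBM's parameters are shown, and the one-line descriptions are paraphrased for illustration rather than copied from the official documentation. It uses the NULL-default variant and drops anything the caller did not set.

```r
#' Build a list of training parameters (illustrative sketch, hypothetical name)
#'
#' @param learning_rate Shrinkage rate.
#' @param num_leaves Maximum number of leaves in one tree.
#' @param num_iterations Number of boosting iterations.
#' @return A named list holding only the arguments that were supplied.
lgb_params_sketch <- function(learning_rate = NULL,
                              num_leaves = NULL,
                              num_iterations = NULL) {
  # Collect every documented argument by name so the order is predictable.
  params <- list(
    learning_rate = learning_rate,
    num_leaves = num_leaves,
    num_iterations = num_iterations
  )
  # Drop arguments left at NULL so the core library's own defaults apply.
  Filter(Negate(is.null), params)
}

params <- lgb_params_sketch(learning_rate = 0.05, num_leaves = 15L)
print(names(params))
```

The returned value is a plain list, so it could be passed unchanged to the existing params argument, which is what keeps the proposal non-breaking.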

I think it should be possible to auto-generate such a file from the html or the documentation system of the C++ library, maybe adding a note that it's autogenerated.
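
One way to picture the auto-generation step is a small script that turns parameter metadata into R source text. The sketch below assumes the metadata has already been extracted into a data frame; param_spec, generate_constructor_source, and params_generated are all hypothetical names, and the two rows are illustrative rather than a faithful copy of the core library's documentation.

```r
# Hypothetical parameter metadata. In practice this would be extracted from
# the core library's sources or documentation rather than typed by hand.
param_spec <- data.frame(
  name = c("learning_rate", "num_leaves"),
  default = c("0.1", "31L"),
  description = c("Shrinkage rate.", "Maximum number of leaves in one tree."),
  stringsAsFactors = FALSE
)

# Append a comma to every line except the last one.
with_commas <- function(lines) {
  paste0(lines, ifelse(seq_along(lines) < length(lines), ",", ""))
}

# Build the source text of a constructor with one documented argument per row.
generate_constructor_source <- function(spec, fn_name = "params_generated") {
  c(
    "# This file is auto-generated. Do not edit it by hand.",
    "#' Generate a list of parameters",
    sprintf("#' @param %s %s", spec$name, spec$description),
    sprintf("%s <- function(", fn_name),
    with_commas(sprintf("    %s = %s", spec$name, spec$default)),
    ") {",
    "    list(",
    with_commas(sprintf("        %s = %s", spec$name, spec$name)),
    "    )",
    "}"
  )
}

source_lines <- generate_constructor_source(param_spec)
cat(source_lines, sep = "\n")

# Evaluate the generated text to confirm that it is valid R.
eval(parse(text = source_lines))
```

Writing source_lines to a file with writeLines() would give IDEs static source to index, which a function built only at run time would not.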

And yes, would be nice if the R package documentation would mention everything that's possible to pass.

@jameslamb
Collaborator

But it doesn't need to be a breaking change

Right, but the other proposal in #4195 (comment), "The functions accepted their arguments without needing to pass them as a list", would be a breaking change.

What I suggest would be simply to make a function with documented arguments that would return a list with the same elements and values

I'm open to the idea of a function like lgb.generate_parameters() that is similar to the one you've described, and it would be interesting to try to auto-generate it. I understand how that could make the experience in an IDE a bit easier and how that could be done in a way that is non-breaking.

  1. Are you interested in working on a feature like that?
  2. Have you worked with other machine learning libraries that do something like that, which you think would serve as a good reference?

@david-cortes
Contributor Author

david-cortes commented Apr 18, 2021

I'll pass on making a parser from the html, but can provide you some example libraries which have a parameters constructor:

  • C50 (C5.0Control)
  • caret (trainControl)
  • partykit (ctree_control)

Although those are all R-only packages, and their functions are not auto-generated.

@jameslamb
Collaborator

Alright, thanks.

Since you're not interested in contributing this, I'll add it to this project's backlog of feature requests. In this project, we manage feature requests in a single issue (see #2302). I'll move this there, close this issue, and update the title to more accurately reflect the feature request.


Feature description

For anyone arriving at this issue, please comment if you'd like to contribute this feature. It can be summarized as follows.

Add a new exported function, lgb.generate_parameters(), to the R package with the following characteristics:

Since the goal of this function would be primarily to support interactive use, it does not need to support aliases for parameters (e.g. n_iter instead of num_iterations).

If possible, this function and its documentation should be automatically generated from https://github.com/microsoft/LightGBM/blob/7ea2bc4dbca53c03eaae2aa2b22f9b30447caea4/include/LightGBM/config.h. You can see https://github.com/microsoft/LightGBM/blob/7ea2bc4dbca53c03eaae2aa2b22f9b30447caea4/helpers/parameter_generator.py for inspiration. At a minimum, to ensure that this function remains consistent with LightGBM's core library, tests should be added which will break if any inconsistencies are detected, similar to lines 50 to 51 of LightGBM/.ci/test.sh in 7ea2bc4:

diff $BUILD_DIRECTORY/docs/Parameters-backup.rst $BUILD_DIRECTORY/docs/Parameters.rst || exit -1
diff $BUILD_DIRECTORY/src/io/config_auto-backup.cpp $BUILD_DIRECTORY/src/io/config_auto.cpp || exit -1
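
A consistency test of the kind described here might compare the constructor's argument names against the names declared in the core library. The following is a minimal sketch, not a proposal for the actual implementation: header_lines is a hypothetical, heavily simplified stand-in for declarations in config.h (the real file is more involved), and extract_param_names, wrapper_params, and find_inconsistencies are made-up names.

```r
# Hypothetical, simplified declarations standing in for the core header.
header_lines <- c(
  "// desc = shrinkage rate",
  "double learning_rate = 0.1;",
  "int num_leaves = 31;",
  "int num_iterations = 100;"
)

# Pull out the identifier that sits between the type and the "=" sign.
# Lines that do not look like "type name = value;" are skipped.
extract_param_names <- function(lines) {
  pattern <- paste0(
    "^[[:space:]]*[A-Za-z_][A-Za-z0-9_]*[[:space:]]+",
    "([A-Za-z_][A-Za-z0-9_]*)[[:space:]]*=.*;[[:space:]]*$"
  )
  sub(pattern, "\\1", lines[grepl(pattern, lines)])
}

# Hypothetical wrapper that has fallen behind the core library.
wrapper_params <- function(learning_rate = NULL, num_leaves = NULL) {
  Filter(Negate(is.null), list(learning_rate = learning_rate, num_leaves = num_leaves))
}

# Report names present on one side but not on the other.
find_inconsistencies <- function(fn, lines) {
  core_names <- extract_param_names(lines)
  wrapper_names <- names(formals(fn))
  list(
    missing_in_wrapper = setdiff(core_names, wrapper_names),
    unknown_in_wrapper = setdiff(wrapper_names, core_names)
  )
}

result <- find_inconsistencies(wrapper_params, header_lines)
print(result$missing_in_wrapper)  # "num_iterations"
```

In a test suite, both vectors would be asserted to be empty, so that adding a parameter to the core library without regenerating the R function makes the build fail.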

@jameslamb changed the title from "R: passing train parameters is inconvenient" to "[R-package] Add function to generate a list of parameters" on Apr 18, 2021