Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove several top-level functions #7016

Closed
zundertj opened this issue Feb 19, 2023 · 3 comments
Closed

Remove several top-level functions #7016

zundertj opened this issue Feb 19, 2023 · 3 comments
Labels
breaking Change that breaks backwards compatibility enhancement New feature or an improvement of an existing feature python Related to Python Polars

Comments

@zundertj
Copy link
Collaborator

Problem description

Most of our api is behind DataFrame, Series, and Expr. For the functions that are not, that is mostly ok, but I think the below list is better placed elsewhere, so I think they can be removed? Unless I am missing something here. See also https://pola-rs.github.io/polars/py-polars/html/reference/functions.html for all top-level functions (except utils).

  • pl.arg_where: operates on Series and Expressions, returns index where condition is True. Usage of this should probably be discouraged, and we could, if we want to continue supporting this, move to Series.arg_where and Expr.arg_where? I.e. I dont see why this needs to be special vs all other methods.
  • pl.cut: should be on Series & Expr
  • pl.get_dummies: what exactly is the difference with DataFrame.to_dummies()? The docstrings are unable to make that clear. The docstring here: https://pola-rs.github.io/polars/py-polars/html/reference/api/polars.get_dummies.html even suggests it is broken, why is "ham" not different from the other two columns, although it is not in the parameter list?
  • pl.ones & pl.zeros: we have broadcasting now, so how useful is this? I.e. pl.lit(1, pl.Int64).
@zundertj zundertj added python Related to Python Polars enhancement New feature or an improvement of an existing feature breaking Change that breaks backwards compatibility labels Feb 19, 2023
@ritchie46
Copy link
Member

ritchie46 commented Feb 19, 2023

pl.ones & pl.zeros: we have broadcasting now, so how useful is this? I.e. pl.lit(1, pl.Int64).

It only broadcast if it is selected with multiple expression, not if it is the only selection. It also has no eager option.

pl.arg_where: operates on Series and Expressions, returns index where condition is True. Usage of this should probably be discouraged, and we could, if we want to continue supporting this, move to Series.arg_where and Expr.arg_where? I.e. I dont see why this needs to be special vs all other methods.

I like the readability of arg_where(condition), I think it has merit as this function. We could also add it to Expr and Series, but I want to keep the to level function as well.

pl.get_dummies: what exactly is the difference with DataFrame.to_dummies()? The docstrings are unable to make that

Seems fine by me to only expose this on the DataFrame.

@stinodego
Copy link
Member

stinodego commented Feb 19, 2023

About get_dummies: it's not broken but the docstring example is wrong, it calls to_dummies twice.

I agree that I don't see the point of having it as a separate function, over just a DataFrame method (which it already is).

cut cannot be an Expr method because it returns a DataFrame, but it should probably be a Series method.

Agreed that arg_where should be a Series and Expr method. I don't think we should remove it completely.

@stinodego
Copy link
Member

I believe this issue has been resolved. Please open a new issue for any unresolved leftover issues.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking Change that breaks backwards compatibility enhancement New feature or an improvement of an existing feature python Related to Python Polars
Projects
None yet
Development

No branches or pull requests

3 participants