Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor(python): Rename pearson_corr & spearman_rank_corr #7014

Merged
merged 56 commits into from
Mar 2, 2023

Conversation

josemasar
Copy link
Contributor

  • Renaming the .pearson_corr and .spearman_rank_corr methods to a much shorted .corr, with a method='pearson' / 'spearman'.

Issue:
#6802

@github-actions github-actions bot added the refactor Code improvements that do not change functionality label Feb 19, 2023
@josemasar josemasar changed the title refactor: rename pearson_corr & spearman_rank_corr refactor(python): rename pearson_corr & spearman_rank_corr Feb 19, 2023
@github-actions github-actions bot added the python Related to Python Polars label Feb 19, 2023
@josemasar josemasar changed the title refactor(python): rename pearson_corr & spearman_rank_corr refactor(python): Rename pearson_corr & spearman_rank_corr Feb 19, 2023
"""
if isinstance(a, str):
a = col(a)
if isinstance(b, str):
b = col(b)
return pli.wrap_expr(pypearson_corr(a._pyexpr, b._pyexpr, ddof))
if method == "pearson":
return pli.wrap_expr(pypearson_corr(a._pyexpr, b._pyexpr, ddof))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this method not support propagate_nans? That is not documented in the docstring currently, although I guess the best course of action is to support that actually.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added clarification in the docstrings.

@@ -7407,7 +7407,7 @@ def unnest(self, names: str | Sequence[str]) -> Self:
return self._from_pydf(self._df.unnest(names))

@typing.no_type_check
def pearson_corr(self, **kwargs: Any) -> Self:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As on expressions, we need to keep this name and raise a deprecation warning.

b: str | pli.Expr,
method: CorrelationMethod = "pearson",
ddof: int = 1,
propagate_nans: bool = False,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

method, ddof and propogate_nans should be keyword only imo, but I guess that is what @stinodego is working on in another PR?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, this is a new function, so let's get it right straight away. Indeed those should be keyword only.

pl.pearson_corr("a", "b").alias("pearson"),
pl.spearman_rank_corr("a", "b").alias("spearman"),
pl.corr("a", "b").alias("pearson"),
pl.corr("a", "b", "spearman").alias("spearman"),
Copy link
Collaborator

@zundertj zundertj Feb 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should call method as a keyword argument, i.e. corr("a", "b", method="spearman"). Same below.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now it´s a keyword argument.

@zundertj
Copy link
Collaborator

Im in favor of this change, as the Spearman rank correlation, although useful, is much less used vs the Pearson one, so it makes sense to hide that behind a kwarg.

@josemasar
Copy link
Contributor Author

@stinodego @zundertj I´ve got some trouble resolving the merge conflicts. Can you please help me on this?

@stinodego
Copy link
Member

stinodego commented Feb 22, 2023

I don't think I can solve it for you, since you're on a fork. I'd recommend updating your fork and then do an interactive rebase on master, only selecting the commits you need, and then fix conflicts for those.

@josemasar
Copy link
Contributor Author

@stinodego Thanks for the tip! The PR is now ready for the final review.

@ritchie46
Copy link
Member

Thanks for the PR and the reviews. 👍

@ritchie46 ritchie46 merged commit dc19709 into pola-rs:master Mar 2, 2023
@stinodego
Copy link
Member

stinodego commented Mar 2, 2023

@ritchie46 I don't think this was ready to be merged, the old methods weren't deprecated correctly. Didn't get around to properly reviewing this yet. I can make a follow-up PR to fix it.

@josemasar josemasar deleted the rename_pearson_corr_to_corr branch March 2, 2023 18:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
python Related to Python Polars refactor Code improvements that do not change functionality
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants