-
-
Notifications
You must be signed in to change notification settings - Fork 1.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: add ewm_mean_by #15638
feat: add ewm_mean_by #15638
Conversation
CodSpeed Performance ReportMerging #15638 will not alter performanceComparing Summary
|
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #15638 +/- ##
=======================================
Coverage ? 81.34%
=======================================
Files ? 1374
Lines ? 176473
Branches ? 2544
=======================================
Hits ? 143549
Misses ? 32443
Partials ? 481 ☔ View full report in Codecov by Sentry. |
160c01d
to
0ac478c
Compare
4f6f59a
to
8a56c16
Compare
8a56c16
to
a2111dd
Compare
3314c48
to
9c0844b
Compare
9c0844b
to
ad8bb68
Compare
_ => None, | ||
}; | ||
if half_life.negative() { | ||
polars_bail!(InvalidOperation: "half_life cannot be negative"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: We could do the assert immediately in polars_ensure
I know I am a little late to the party, but would it be a better approach to give the ability to provide the half_life as an expression instead? This would allow the above behaviour with something like: (taking liberties with the half life dtype) def ewm_mean_by(input_column, times_column, alpha):
return input_column.ewm_mean(alpha = alpha * times_column.diff()) (haven't checked the math here), It could be used to perform the operation in |
could you open a new issue for this please? |
closes #10794
Demo of why this matters: for non-evenly spaced points, there's a noticeable difference compared with plain
ewm_mean
:question - should it sort on behalf of users, or raise?
What I've tried introducing here is a
check_sorted
argument:check_sorted=False
, then Polars assumes your data to beTrying to follow @orlp 's guidance here
Other options could be:
.over
, and sopl.col('a').ewm_mean_by('times', half_life='4d').over('groups')
would always warn. I'd find that quite annoying as a user