Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dt.round() slow/fast path use different rounding #18239

Closed
2 tasks done
mcrumiller opened this issue Aug 16, 2024 · 2 comments · Fixed by #18245
Closed
2 tasks done

dt.round() slow/fast path use different rounding #18239

mcrumiller opened this issue Aug 16, 2024 · 2 comments · Fixed by #18245
Labels
bug Something isn't working needs triage Awaiting prioritization by a maintainer python Related to Python Polars

Comments

@mcrumiller
Copy link
Contributor

mcrumiller commented Aug 16, 2024

Checks

  • I have checked that this issue has not already been reported.
  • I have confirmed this bug exists on the latest version of Polars.

Reproducible example

@MarcoGorelli
This is an example case produced in the hypothesis test for test_dt_round_fast_path_vs_slow_path:

from datetime import datetime
import polars as pl

datetimes = [
    datetime(1980, 1, 1, 0, 0, 0, 0),
    datetime(1969, 1, 1, 0, 0, 0, 1),
    datetime(1969, 1, 1, 0, 0, 0, 2),
    datetime(1969, 1, 1, 0, 0, 0, 3),
    datetime(1969, 1, 1, 0, 0, 0, 4),
    datetime(1969, 1, 1, 0, 0, 0, 5),
]
every = "2us"
s = pl.Series(datetimes)

out = pl.select(
    s.dt.round(every).alias("fast"),
    s.dt.round(pl.Series([every] * len(datetimes))).alias("slow"),
)

print(out)
shape: (6, 2)
┌────────────────────────────┬────────────────────────────┐
│ fast                       ┆ slow                       │
│ ---                        ┆ ---                        │
│ datetime[μs]               ┆ datetime[μs]               │
╞════════════════════════════╪════════════════════════════╡
│ 1980-01-01 00:00:00        ┆ 1980-01-01 00:00:00        │
│ 1969-01-01 00:00:00        ┆ 1969-01-01 00:00:00.000002 │
│ 1969-01-01 00:00:00.000002 ┆ 1969-01-01 00:00:00.000002 │
│ 1969-01-01 00:00:00.000002 ┆ 1969-01-01 00:00:00.000004 │
│ 1969-01-01 00:00:00.000004 ┆ 1969-01-01 00:00:00.000004 │
│ 1969-01-01 00:00:00.000004 ┆ 1969-01-01 00:00:00.000006 │
└────────────────────────────┴────────────────────────────┘

Issue description

The fast path rounds down and the slow path rounds up.

Expected behavior

Both should use same rounding.

Installed versions

main

@mcrumiller mcrumiller added bug Something isn't working needs triage Awaiting prioritization by a maintainer python Related to Python Polars labels Aug 16, 2024
@mcrumiller
Copy link
Contributor Author

mcrumiller commented Aug 16, 2024

The different rounding routines are here:

Do we want to:

  1. round toward zero;
  2. round away from zero;
  3. round toward the even digit, as most rounding functions do?

(1) is the easiest to implement, (3) is more universally implemented.

@MarcoGorelli
Copy link
Collaborator

MarcoGorelli commented Aug 16, 2024

thanks @mcrumiller for spotting this! yeah this needs fixing

in Series.round Polars rounds away from zero, so perpahs .dt.round should be consistent with that

EDIT: having said, should the user really care about where "zero" (1970-01-01) is? A consistent "round down" approach might be better here?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working needs triage Awaiting prioritization by a maintainer python Related to Python Polars
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants