feat(python): huge speedup of scalar-to-array expansion on frame init from dict #6111

Merged

@alexander-beedie (Collaborator) commented Jan 7, 2023

Spotted that Series.extend_constant exists and reworked the approach from PR #6034 to use it; the result is dramatically faster.

Example:

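import polars as pl

# "a" sets the frame length; the scalars "b", "c" and "d" are expanded to match.
df = pl.DataFrame({
    "a": range(50_000_000),
    "b": 1234567890,
    "c": "tuvwxyz",
    "d": None,
})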

# shape: (50000000, 4)
# ┌──────────┬────────────┬─────────┬──────┐
# │ a        ┆ b          ┆ c       ┆ d    │
# │ ---      ┆ ---        ┆ ---     ┆ ---  │
# │ i64      ┆ i64        ┆ str     ┆ f64  │
# ╞══════════╪════════════╪═════════╪══════╡
# │ 0        ┆ 1234567890 ┆ tuvwxyz ┆ null │
# │ 1        ┆ 1234567890 ┆ tuvwxyz ┆ null │
# │ 2        ┆ 1234567890 ┆ tuvwxyz ┆ null │
# │ 3        ┆ 1234567890 ┆ tuvwxyz ┆ null │
#  ...

Timings:

# Before: 7.1974 secs
# After:  0.8025 secs

Also:

  • Fixed scalar expansion in conjunction with columns override.
  • Fixed error when trying to expand scalars in the presence of struct columns.
  • Some minor typing improvements, allowing removal of some # type: ignore directives.
  • Additional test coverage for all of the above.

@github-actions bot added the "enhancement" (New feature or an improvement of an existing feature) and "python" (Related to Python Polars) labels Jan 7, 2023
@stinodego (Member)

Nice! I thought there had to be a better way than to multiply a list.

@alexander-beedie (Collaborator, Author)

> Nice! I thought there had to be a better way than to multiply a list.

I've returned to the scene of the crime... :)
