perf: Don't rechunk when converting DataFrame to numpy/ndarray #16288

Merged: 1 commit, May 17, 2024

Conversation

ritchie46
Member

closes #16267

github-actions bot added the labels performance (Performance issues or improvements), python (Related to Python Polars), and rust (Related to Rust Polars) on May 17, 2024

codecov bot commented May 17, 2024

Codecov Report

Attention: Patch coverage is 95.45455%; 1 line in your changes is missing coverage. Please review.

Project coverage is 80.79%. Comparing base (bbe73bd) to head (f520057).

| Files | Patch % | Lines |
| --- | --- | --- |
| crates/polars-core/src/chunked_array/ndarray.rs | 95.45% | 1 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main   #16288      +/-   ##
==========================================
- Coverage   80.81%   80.79%   -0.02%     
==========================================
  Files        1393     1393              
  Lines      179337   179329       -8     
  Branches     2921     2921              
==========================================
- Hits       144923   144897      -26     
- Misses      33911    33929      +18     
  Partials      503      503              


Contributor

abstractqqq left a comment

The `polars_ensure!` call now happens before `none_to_nan`, which means a Float64 column with nulls can cause an error. The docs say the opposite: nulls should be filled with NaN automatically. Previously this was avoided because the columns were pre-processed before the check ran.
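
To illustrate the ordering concern, here is a minimal sketch in plain Python rather than the actual Rust code. Columns are modelled as lists where `None` stands for a null. The helper names `ensure_no_nulls`, `to_rows_check_first`, and `to_rows_fill_first`, and the error message, are hypothetical. The sketch assumes the check rejects any column that still contains nulls; it is not taken from the Polars source.

```python
import math
from typing import List, Optional

# Stand-in for a Float64 column; None models a null value.
Column = List[Optional[float]]


def none_to_nan(col: Column) -> List[float]:
    """Replace nulls with NaN (simplified stand-in for the real helper)."""
    return [math.nan if v is None else v for v in col]


def ensure_no_nulls(col: Column) -> None:
    """Hypothetical stand-in for the validation done via polars_ensure!."""
    if any(v is None for v in col):
        raise ValueError("column contains nulls")  # message is illustrative


def to_rows_check_first(columns: List[Column]) -> List[List[float]]:
    # Ordering described in the review: validate, then fill.
    # A float column with nulls fails before it can be converted to NaN.
    for col in columns:
        ensure_no_nulls(col)
    filled = [none_to_nan(col) for col in columns]
    return [list(row) for row in zip(*filled)]


def to_rows_fill_first(columns: List[Column]) -> List[List[float]]:
    # Ordering the docs imply: fill nulls with NaN, then validate.
    filled = [none_to_nan(col) for col in columns]
    for col in filled:
        ensure_no_nulls(col)
    return [list(row) for row in zip(*filled)]
```

With `[[1.0, None], [3.0, 4.0]]` as input, `to_rows_fill_first` returns rows in which the null has become NaN, while `to_rows_check_first` raises before the fill step is reached.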

Successfully merging this pull request may close these issues.

Polars to_numpy slower with chunked array than going via pandas