feat(python,rust): Improve usability of Null type. #7136

Merged: 1 commit, Feb 24, 2023

Collaborator

@ghuls ghuls commented Feb 23, 2023

Improve usability of pl.Null type:

  • Creating a Null array with repeat in eager works now: s = pl.repeat(None, 7, eager=True). Also, the bounds checks for converting to i32 series are slightly tweaked for lazy.
  • Converting Polars DataFrames with Null Series to PyArrow tables works now: pl.DataFrame([pl.Series("null", [None, None], dtype=pl.Null)]).to_arrow()

@github-actions bot added the enhancement, python, and rust labels Feb 23, 2023
Collaborator Author

ghuls commented Feb 23, 2023

Some other things I noticed:

  • Formatting of dataframe columns with pl.List(pl.Null) does not work properly.
  • A plain Polars Null series does not get printed like other series.
>>> tbl = pa.table({"a": [None, None], "b": [[None, None], [None, None]]})
>>> pl.from_arrow(tbl)
shape: (2, 2)
┌──────┬─────────────────┐
│ a    ┆ b               │
│ ---  ┆ ---             │
│ null ┆ list[null]      │
╞══════╪═════════════════╡
│ null ┆ fmt implemented │
│ null ┆ fmt implemented │
└──────┴─────────────────┘
>>> df = pl.from_arrow(tbl)
>>> df["a"]
nullarray

pl.repeat does not behave the same in eager and lazy mode. For integers, lazy returns pl.Int32 or pl.Int64 depending on whether the value fits within the int32 range, while eager always returns pl.Int64.
@ritchie46 is this intentional?
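
To make the difference concrete, here is a pure-Python sketch of the dtype selection as described in this comment. The helper name inferred_int_dtype and the string return values are hypothetical. This is not Polars' implementation, and later releases may behave differently.

```python
# Bounds of a signed 32-bit integer.
I32_MIN = -(2**31)
I32_MAX = 2**31 - 1


def inferred_int_dtype(value: int, eager: bool) -> str:
    """Hypothetical model of the behavior described in the comment above."""
    if eager:
        # Eager is described as always returning Int64.
        return "Int64"
    # Lazy is described as picking Int32 when the value fits, else Int64.
    return "Int32" if I32_MIN <= value <= I32_MAX else "Int64"


print(inferred_int_dtype(1, eager=False))
print(inferred_int_dtype(2**31, eager=False))
print(inferred_int_dtype(1, eager=True))
```

Under this model the same call, such as repeating the value 1, yields different integer widths depending on the mode, which is the inconsistency being asked about.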

@ghuls force-pushed the feat_python_rust_improve_null_type branch from 15cab52 to 06f4fa1 on February 24, 2023 09:00
Member

ritchie46 commented Feb 24, 2023

Nope, this is not intentional. Can you make an issue? Yep, I think we should implement the fmt for the nullarray as well.

Collaborator Author

ghuls commented Feb 24, 2023

I made an issue for the fmt problem: #7153
