feat(python): improved exceptions on attempt to use invalid schema/dtypes #6653

Merged

alexander-beedie (Collaborator) commented Feb 3, 2023

Closes #6648.

Enables centralised dtype sanity checking for schema params inside _unpack_schema, and raises more helpful errors in general for attempts to use invalid dtypes.


Examples:

  • Use of invalid dtype in schema.

    pl.DataFrame(
        data={"words": [["hello", "hi"], ["polar", "bears"]]},
        # pl.list (lowercase) builds an expression; the dtype is pl.List
        schema={"words": pl.list(pl.Categorical)},
    )

    Before

    ValueError: Since Expr are lazy, the truthiness of an Expr is ambiguous. Hint: use '&' or '|' to logically combine Expr, not 'and'/'or', and use 'x.is_in([y,z])' instead of 'x in [y,z]' to check membership.
    

    After

    ValueError: Cannot infer dtype from 'COLUMN OF DTYPE: [Categorical(None)].list()' (type: Expr)
    
  • Use of invalid dtype in cast operation.

    df1 = pl.DataFrame({"words": [["hello", "hi"], ["polar", "bears"]]})
    df1.select(pl.col("words").cast(pl.list(pl.Categorical)))

    Before

    ValueError: could not convert value 'Unknown' as a Literal
    

    After

    ValueError: Cannot infer dtype from 'COLUMN OF DTYPE: [Categorical(None)].list()' (type: Expr)
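
To illustrate the idea of centralised dtype sanity checking, here is a minimal, self-contained sketch. It does not reproduce the actual `_unpack_schema` code: the classes `DataType`, `Categorical`, `Utf8`, `List` and `Expr` are hypothetical stand-ins (not the Polars classes), and `validate_dtype` / `check_schema_dtypes` are names invented for this example. Only the error message format is taken from the "After" output above.

```python
# Hypothetical stand-ins: NOT the Polars classes, just enough structure
# to illustrate checking every schema dtype in a single place.
class DataType:
    """Stand-in base class for dtypes."""


class Categorical(DataType):
    pass


class Utf8(DataType):
    pass


class List(DataType):
    """Stand-in nested dtype that wraps an inner dtype."""

    def __init__(self, inner):
        self.inner = inner


class Expr:
    """Stand-in for a lazy expression, which is not a dtype."""

    def __repr__(self):
        return "<Expr>"


def validate_dtype(value):
    """Raise ValueError unless `value` is a dtype class or instance."""
    if isinstance(value, List):
        # Nested dtype: the inner dtype must be valid as well.
        validate_dtype(value.inner)
        return
    is_dtype_class = isinstance(value, type) and issubclass(value, DataType)
    if not (is_dtype_class or isinstance(value, DataType)):
        # Message format mirrors the "After" examples in this PR.
        raise ValueError(
            f"Cannot infer dtype from {str(value)!r} "
            f"(type: {type(value).__name__})"
        )


def check_schema_dtypes(schema):
    """Validate all dtypes of a {column name: dtype} mapping up front."""
    for dtype in schema.values():
        if dtype is None:
            continue  # assumption: None means "infer this dtype later"
        validate_dtype(dtype)


# A valid schema passes silently; an expression in place of a dtype fails early.
check_schema_dtypes({"words": List(Categorical), "id": Utf8})
try:
    check_schema_dtypes({"words": Expr()})
except ValueError as exc:
    print(exc)
```

Running every schema through one validation function like this means the user sees the same descriptive error regardless of which entry point received the invalid dtype, rather than an unrelated error surfacing further downstream.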
    

github-actions bot added labels Feb 3, 2023: enhancement (New feature or an improvement of an existing feature), python (Related to Python Polars)
ritchie46 merged commit d9fe8ff into pola-rs:master Feb 3, 2023
alexander-beedie deleted the sanity-check-schema-dtypes branch February 3, 2023 18:05
Vincenthays pushed a commit to Vincenthays/polars that referenced this pull request Feb 9, 2023
Successfully merging this pull request may close these issues.

Confusion between pl.list and pl.List