Series dtype inference based on the type of the first anyvalue #7212
Comments
We should error. For |
I think it works because the primitive type for datetimes is int64, and extracting ints out of floats does work somewhere along the way. Plus, the conversion logic (at least one branch of it) is based on the type of the first non-null element. That's pretty unobvious for the user: swap the order and you get a different result.
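To make the point about the int64 representation concrete, here is a minimal stdlib-only sketch. It assumes a datetime is stored as microseconds since the Unix epoch; the helper names (`to_physical`, `from_physical`, `float_as_datetime`) are hypothetical and are not polars functions. It only illustrates why a float could silently pass as a datetime once its integer part is extracted.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_physical(dt: datetime) -> int:
    """Hypothetical helper: datetime -> integer microseconds since the epoch."""
    return (dt - EPOCH) // timedelta(microseconds=1)


def from_physical(us: int) -> datetime:
    """Hypothetical helper: integer microseconds since the epoch -> datetime."""
    return EPOCH + timedelta(microseconds=us)


def float_as_datetime(x: float) -> datetime:
    """Truncate the float to an int and reinterpret it as a timestamp.

    This mimics the suspected lossy path: no error is raised, the
    fractional part is simply dropped.
    """
    return from_physical(int(x))


one_second = datetime(1970, 1, 1, 0, 0, 1, tzinfo=timezone.utc)
print(to_physical(one_second))
print(float_as_datetime(1_000_000.7))
```

Under these assumptions the float `1_000_000.7` becomes a valid timestamp one second after the epoch, which is the kind of silent reinterpretation the comment above describes.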
I also find the conversion logic quite confusing. This code runs and prints a result:

    import polars as pl
    from datetime import datetime
    series = pl.Series([1, True, "a", datetime.now(), None])
    print(list(series))

On the other hand, this code raises an error:

    import polars as pl
    from datetime import datetime
    series = pl.Series([1, True, datetime.now(), "a", None])
    print(list(series))

Explicitly specifying the dtype works:

    import polars as pl
    from datetime import datetime
    series = pl.Series([1, True, datetime.now(), "a", None], dtype=pl.Object)
    print(list(series))  # [1, True, datetime.datetime(2023, 4, 22, 12, 16, 42, 870943), 'a', None]

In any case, I would expect that the description of the |
Closing in favor of #11156
Currently, it seems that when a Series is constructed from any-values, it simply grabs the first non-null value and uses its type to convert all other values. This may lead to some weird examples like: (pandas would simply cast both to 'object')
Should it just always raise an error in cases like this? Is there a better way to handle these kinds of cases?
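The order dependence described above can be sketched with a toy model in plain Python. This is not how polars is implemented; `infer_from_first_non_null` and `build_series` are hypothetical names, and Python types stand in for dtypes. The `strict` flag models the "always raise an error" alternative.

```python
def infer_from_first_non_null(values):
    """Toy inference: use the Python type of the first non-null value."""
    for v in values:
        if v is not None:
            return type(v)
    return type(None)


def build_series(values, strict=False):
    """Convert every value to the inferred type.

    With strict=False, mismatched values are cast where possible and
    replaced by None otherwise. With strict=True, any mismatch raises.
    """
    target = infer_from_first_non_null(values)
    out = []
    for v in values:
        if v is None or type(v) is target:
            out.append(v)
        elif strict:
            raise TypeError(
                f"expected {target.__name__}, got {type(v).__name__}: {v!r}"
            )
        else:
            try:
                out.append(target(v))
            except (TypeError, ValueError):
                out.append(None)
    return out


print(build_series([1, "2", None]))   # int is inferred first
print(build_series(["2", 1, None]))   # str is inferred first
```

In this model the same two values give `[1, 2, None]` in one order and `['2', '1', None]` in the other, while `build_series([1, "2"], strict=True)` raises a `TypeError` regardless of order.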