Unified Function Signature type coercion handling for Nulls #12698

Open
alamb opened this issue Oct 1, 2024 · 5 comments
Labels
enhancement New feature or request

Comments

@alamb
Contributor

alamb commented Oct 1, 2024

Is your feature request related to a problem or challenge?

The code in #12308 from @mesejo added explicit DataType::Null handling to type coercion for certain functions.

However, this may have introduced a regression in the handling of Dictionary types (#12670).

Describe the solution you'd like

I would like the "normal" function signature resolution to handle Null coercion, rather than requiring functions to provide a custom coerce method (as was done in #12308).

Comments from #12670 (comment)

Describe alternatives you've considered

Ideally, I think the coercion logic should accept Null in place of any data type a signature asks for.

So given a signature like

    Exact(vec![Utf8View, Utf8View]),

I would expect the coercion logic to handle inputs like the following (by casting Null to Utf8View):

(Null, Null)
(Null, Utf8View) 
(Utf8View, Null)
(Utf8View, Utf8View)
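
A minimal sketch of the behaviour described above, not DataFusion's actual implementation. The `DataType` enum here is a hypothetical stand-in for arrow's type enum, and `coerce_exact` is an illustrative helper name that does not exist in the codebase. It only shows the rule "Null may be cast to whatever the signature expects":

```rust
/// Hypothetical stand-in for arrow's `DataType`, reduced to what the sketch needs.
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Null,
    Utf8View,
    Int64,
}

/// Illustrative helper (an assumption, not a DataFusion API): match `inputs`
/// against the types of one `Exact` signature. Returns the types each argument
/// should be cast to, or `None` if the inputs do not fit.
fn coerce_exact(expected: &[DataType], inputs: &[DataType]) -> Option<Vec<DataType>> {
    if expected.len() != inputs.len() {
        return None;
    }
    expected
        .iter()
        .zip(inputs)
        .map(|(want, got)| {
            // An untyped NULL can be cast to any target type;
            // every other input must match the signature exactly.
            if got == want || *got == DataType::Null {
                Some(want.clone())
            } else {
                None
            }
        })
        .collect()
}
```

With the signature `[Utf8View, Utf8View]`, all four input combinations listed above resolve to `[Utf8View, Utf8View]`, while something like `(Int64, Utf8View)` is rejected.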

Additional context

No response

@jayzhan211
Contributor

We can introduce TypeSignature::String, similar to the Numeric one.

@findepi
Member

findepi commented Oct 2, 2024

Would that allow us to avoid using Signature::user_defined? If functions use user_defined, the same functionality has to be implemented independently in each of them, and inconsistencies are inevitable.
Can we make it a goal not to require Signature::user_defined to implement useful things?

BTW, this is related to two topics

@alamb
Contributor Author

alamb commented Oct 2, 2024

> Can we make it a goal not to require Signature::user_defined to implement useful things?

Yes, I think this should be the goal

The use case for Signature::user_defined that I recall was to implement coercion rules for very specific functions (like coalesce).

> We can introduce TypeSignature::String, similar to the Numeric one.

I still don't fully understand why we can't just adjust the existing coercion rules to coerce Null to any needed type

@findepi
Member

findepi commented Oct 2, 2024

> > Can we make it a goal not to require Signature::user_defined to implement useful things?
>
> Yes, I think this should be the goal

Filed #12725 for this.

> I still don't fully understand why we can't just adjust the existing coercion rules to coerce Null to any needed type

I would assume this is already the case today.
#12712 changes strpos to accept various string types (but not Null), yet it works for null input values too.

@jayzhan211
Contributor

jayzhan211 commented Oct 2, 2024

> Would that allow us to avoid using Signature::user_defined? If functions use user_defined, the same functionality has to be implemented independently in each of them, and inconsistencies are inevitable. Can we make it a goal not to require Signature::user_defined to implement useful things?
>
> BTW, this is related to two topics

Ideally, Signature::user_defined is used only for special functions. It is used in many places today because we don't have a nice signature system yet.

I have an old issue about the coercion rules and type signatures: #10507

I think the downside of coerce_from is that, since all the coercion rules live in one place, any change to it has unknown effects and is therefore hard to maintain. It is also not extensible. This is why we came up with many other TypeSignature variants that avoid going through coerce_from.

Ideally, we should have general signatures like TypeSignature::Numeric or TypeSignature::String for most functions. Only when a rule is too specific (used in only one function) should we use Signature::user_defined.
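
To make the idea concrete, here is a rough sketch of how a general string signature could resolve arguments, including Null. Everything in it is an assumption for illustration: the `DataType` and `TypeSignature` enums are simplified stand-ins, `resolve` and `string_rank` are hypothetical helpers, and the "pick the highest-ranked string type" rule is a placeholder rather than the coercion order DataFusion uses.

```rust
/// Hypothetical stand-in for arrow's `DataType`.
#[derive(Debug, Clone, Copy, PartialEq)]
enum DataType {
    Null,
    Utf8,
    LargeUtf8,
    Utf8View,
    Int64,
}

/// Simplified signature kinds. `String(n)` models the proposed
/// "n arguments of any string type" signature.
enum TypeSignature {
    Exact(Vec<DataType>),
    String(usize),
}

/// Assumed preference order among string types (placeholder rule).
fn string_rank(t: DataType) -> Option<u8> {
    match t {
        DataType::Utf8 => Some(1),
        DataType::LargeUtf8 => Some(2),
        DataType::Utf8View => Some(3),
        _ => None,
    }
}

/// Returns the types the arguments should be cast to, or `None` on mismatch.
fn resolve(sig: &TypeSignature, inputs: &[DataType]) -> Option<Vec<DataType>> {
    match sig {
        TypeSignature::Exact(expected) => {
            if expected.len() != inputs.len() {
                return None;
            }
            expected
                .iter()
                .zip(inputs)
                // Null is accepted for any expected type.
                .map(|(want, got)| (got == want || *got == DataType::Null).then_some(*want))
                .collect()
        }
        TypeSignature::String(n) => {
            if inputs.len() != *n {
                return None;
            }
            let mut best: Option<(u8, DataType)> = None;
            for &t in inputs {
                if t == DataType::Null {
                    continue; // Null takes whatever string type is chosen.
                }
                let rank = string_rank(t)?; // Non-string input: no match.
                if best.map_or(true, |(r, _)| rank > r) {
                    best = Some((rank, t));
                }
            }
            // All-Null input falls back to Utf8 (an arbitrary choice here).
            let target = best.map_or(DataType::Utf8, |(_, t)| t);
            Some(vec![target; *n])
        }
    }
}
```

The point of the sketch is that Null handling lives once in the shared resolution code, so a function declaring `String(2)` gets it without writing its own coercion.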
