-
Notifications
You must be signed in to change notification settings - Fork 3.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix greatest/least function non-vectorized processing to ignore null argument types #16649
fix greatest/least function non-vectorized processing to ignore null argument types #16649
Conversation
I think it would be great to have consistency check testcase which ensures that from now on the contract that I understand that this patch fixes a consistency issue - but I'm a bit confused to see that it will be allowed to have for example: with the proposed changes the following
is that okay? couldn't the same happen for |
Technically neither of these functions are vectorized yet and just rely on the fallback stuff introduced in #16366. There are such tests that fit this purpose, but none of the fallback stuff has been added here https://github.com/apache/druid/blob/master/processing/src/test/java/org/apache/druid/math/expr/VectorExprSanityTest.java https://github.com/apache/druid/blob/master/sql/src/test/java/org/apache/druid/sql/calcite/SqlVectorizedExpressionSanityTest.java because its basically like.. all the functions i guess.
So, the The reason the output type is double in your first test is because the result is 1.0, while the reason it is string in the second one is a sort of legacy behavior of non-vectorized expressions where the output type of null values is |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thank you for the explanation!
one more question: is it true that a null
will always be of type STRING
?
Null constants always will be currently, while null values from columns will be whatever column type they came from (e.g. long column make |
Description
Fixes a flaw in
greatest
andleast
functions non-vectorized expression processing to ignorenull
valued arguments for determining the output type. For example, given an expression likeleast(1.0, 1, null)
, thegetOutputType
method correctly choosesDOUBLE
as the output type, however during non-vectorized processing, we incorrectly consider it, which ends up with the defaultSTRING
output type, resulting in incorrectly using theSTRING
comparator instead of the double comparator as expected. This can cause odd effects and incorrect results because values like1.0
and1
are considered distinct when using the string comparator, while the same when using a numeric comparator.The added tests fail without the changes in this PR.
This PR has: