-
Notifications
You must be signed in to change notification settings - Fork 3.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SCALAR_IN_ARRAY: Optimization and behavioral follow-ups. #16311
Conversation
1) Align behavior for `null` scalars to the behavior of the native `in` and `inType` filters: return `true` if the array itself contains null, else return `null`. 2) Rename the class to more closely match the function name. 3) Add a specialization for constant arrays, where we build a `HashSet`. 4) Use `castForEqualityComparison` to properly handle cross-type comparisons. Additional tests verify comparisons between LONG and DOUBLE are now handled properly.
return ExprEval.ofLongBoolean(Arrays.asList(array).contains(scalarExpr.value())); | ||
|
||
if (scalarEval.value() == null) { | ||
return Arrays.asList(array).contains(null) ? ExprEval.ofLongBoolean(true) : ExprEval.ofLong(null); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if the array
contains null
- then doesn't all places when false
would have been returned should return null
? for example:
c IN (2,null)
c = 2 OR c = null
c = 2 OR null
this last rewrite essentially mean that false
is no more
....other way around is to think about =
as IS NOT DISTINCT FROM
; however in that case the return value will never be null
as <=>
is a 2-valued
this is not a blocking comment; I was just wondering...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idea is that if the array contains null
then the comparison acts like IS NOT DISTINCT FROM
(always returns true
or false
), whereas if the array does not contain null
, then the comparison acts like =
(returns null
if the lhs is null
). It's the same way the native in
and inType
filters behave, so this is designed to align with those.
A future patch would convert from SQL IN
to this SCALAR_IN_ARRAY
function. We'll need to make sure to handle NULL
appropriately in that patch-- it would not be ok to rewrite c IN (2, NULL)
to SCALAR_IN_ARRAY(c, ARRAY[2, NULL])
.
ExprEval doApply(final ExprEval arrayExpr, final ExprEval scalarEval) | ||
{ | ||
if (matchType == null) { | ||
return ExprEval.ofLong(null); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
note: matchType
is known at asSingleThreaded
call time; a strategy pattern could create a class returning only null
unconditionally
it may make it a bit more readable and matchType
may not be null
- other benefits are possibly negligable...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a specialization for WithNullArray
, so now the WithConstantArray
specialization requires a non-null constant array.
While doing this I also realized I had implemented WithConstantArray
inefficiently: by overriding doApply
rather than apply
, the constant array expr would still have eval()
called on it. I changed it to override apply
instead.
* Four changes to scalar_in_array as follow-ups to apache#16306: 1) Align behavior for `null` scalars to the behavior of the native `in` and `inType` filters: return `true` if the array itself contains null, else return `null`. 2) Rename the class to more closely match the function name. 3) Add a specialization for constant arrays, where we build a `HashSet`. 4) Use `castForEqualityComparison` to properly handle cross-type comparisons. Additional tests verify comparisons between LONG and DOUBLE are now handled properly. * Fix spelling. * Adjustments from review.
…6398) * Four changes to scalar_in_array as follow-ups to #16306: 1) Align behavior for `null` scalars to the behavior of the native `in` and `inType` filters: return `true` if the array itself contains null, else return `null`. 2) Rename the class to more closely match the function name. 3) Add a specialization for constant arrays, where we build a `HashSet`. 4) Use `castForEqualityComparison` to properly handle cross-type comparisons. Additional tests verify comparisons between LONG and DOUBLE are now handled properly. * Fix spelling. * Adjustments from review. Co-authored-by: Gian Merlino <gianmerlino@gmail.com>
Four changes to scalar_in_array as follow-ups to #16306:
Align behavior for
null
scalars to the behavior of the nativein
andinType
filters: returntrue
if the array itself contains null, else returnnull
.Rename the class to more closely match the function name.
Add a specialization for constant arrays, where we build a
HashSet
.Use
castForEqualityComparison
to properly handle cross-type comparisons.Additional tests verify comparisons between LONG and DOUBLE are now
handled properly.