Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix NestedCommonFormatColumnHandler to use nullable comparator when castToType is set #15921

Merged

Conversation

clintropolis
Copy link
Member

Description

Fixes a bug when the undocumented castToType parameter is set on 'auto' column schema, which should have been using the 'nullable' comparator to allow null values to be present when merging columns, but wasn't which would lead to null pointer exceptions. Also fixes an issue I noticed while adding tests that if 'FLOAT' type was specified for the castToType parameter it would be an exception because that type is not expected to be present, since 'auto' uses the native expressions to determine the input types and expressions don't have direct support for floats, only doubles.

In the future I should probably split this functionality out of the 'auto' schema (maybe even have a simpler version of the auto indexer dedicated to handling non-nested data) but still have the same results of writing out the newer 'nested common format' columns used by 'auto', but I haven't taken that on in this PR.


This PR has:

  • been self-reviewed.
  • added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • been tested in a test Druid cluster.

@@ -100,7 +100,7 @@ public Comparator<ColumnValueSelector> getEncodedValueSelectorComparator()
{
if (castTo != null) {
return (s1, s2) ->
castTo.getStrategy().compare(
castTo.getNullableStrategy().compare(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: you can get this once outside of the lambda instead of calling the getter over and over again.

@cryptoe cryptoe added this to the 29.0.1 milestone Feb 21, 2024
@abhishekagarwal87 abhishekagarwal87 merged commit cc5964f into apache:master Feb 22, 2024
82 of 83 checks passed
@clintropolis clintropolis deleted the fix-auto-indexer-comparator branch February 22, 2024 19:02
cryptoe pushed a commit to cryptoe/druid that referenced this pull request Mar 6, 2024
…astToType is set (apache#15921)

Fixes a bug when the undocumented castToType parameter is set on 'auto' column schema, which should have been using the 'nullable' comparator to allow null values to be present when merging columns, but wasn't which would lead to null pointer exceptions. Also fixes an issue I noticed while adding tests that if 'FLOAT' type was specified for the castToType parameter it would be an exception because that type is not expected to be present, since 'auto' uses the native expressions to determine the input types and expressions don't have direct support for floats, only doubles.

In the future I should probably split this functionality out of the 'auto' schema (maybe even have a simpler version of the auto indexer dedicated to handling non-nested data) but still have the same results of writing out the newer 'nested common format' columns used by 'auto', but I haven't taken that on in this PR.
cryptoe added a commit that referenced this pull request Mar 11, 2024
…astToType is set (#15921) (#16053)

Fixes a bug when the undocumented castToType parameter is set on 'auto' column schema, which should have been using the 'nullable' comparator to allow null values to be present when merging columns, but wasn't which would lead to null pointer exceptions. Also fixes an issue I noticed while adding tests that if 'FLOAT' type was specified for the castToType parameter it would be an exception because that type is not expected to be present, since 'auto' uses the native expressions to determine the input types and expressions don't have direct support for floats, only doubles.

In the future I should probably split this functionality out of the 'auto' schema (maybe even have a simpler version of the auto indexer dedicated to handling non-nested data) but still have the same results of writing out the newer 'nested common format' columns used by 'auto', but I haven't taken that on in this PR.

Co-authored-by: Clint Wylie <cwylie@apache.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants