-
Notifications
You must be signed in to change notification settings - Fork 3.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
INSERT/REPLACE can omit clustering when catalog has default #16260
INSERT/REPLACE can omit clustering when catalog has default #16260
Conversation
…ql-type-inference
sql/src/test/java/org/apache/druid/sql/calcite/CalciteCatalogIngestionDmlTest.java
Fixed
Show fixed
Hide fixed
sql/src/test/java/org/apache/druid/sql/calcite/CalciteCatalogIngestionDmlTest.java
Fixed
Show fixed
Hide fixed
sql/src/main/java/org/apache/druid/sql/calcite/planner/DruidSqlValidator.java
Outdated
Show resolved
Hide resolved
sql/src/main/java/org/apache/druid/sql/calcite/planner/DruidSqlValidator.java
Show resolved
Hide resolved
…' into use-catalog-clustering-columns
…' into use-catalog-clustering-columns
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 ; just left some questions :)
final SqlIdentifier colIdent = new SqlIdentifier( | ||
Collections.singletonList(keyCol.expr()), | ||
null, SqlParserPos.ZERO, | ||
Collections.singletonList(SqlParserPos.ZERO) | ||
); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was wondering what will happen in the following case:
- say colunmn
c
is aclusterKey
- we are selecting from a join which has column
c
on both sides
but it seems like the column in the select list will take precedence.
one more thing I was wondering about: do we have a check that all keyCols
are present in the selected column list?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
About whether there is a check that al keyCols are present in the selected column list, see the following tests:
testInsertTableWithClusteringWithClusteringOnNewColumnFromQuery
testInsertTableWithClusteringWithClusteringOnBadColumn
Do these cover the cases you are talking about?
About the join issue, do you have a concrete query in example, just to clarify?
final IdentifierNamespace insertNs = (IdentifierNamespace) targetNamespace; | ||
SqlIdentifier identifier = insertNs.getId(); | ||
SqlValidatorTable catalogTable = getCatalogReader().getTable(identifier.names); | ||
if (catalogTable != null) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wouldn't the fall-thru from this condtional will cause that the CLUSTER BY
on the ingestNode
will not be applied (line399 right now); even if its there - is that okay?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if the ingestNode already has the clustering columns, they will be used. There are existing tests which test that the clustering columns are used in the plan returned from dml query, when clustering is defined at query time, and the table is / it not in catalog. Let me know if this covers the issue that think could occur.
Description
This PR contains a portion of the changes from the inactive draft PR for integrating the catalog with the Calcite planner #13686 from @paul-rogers, allowing for tables that are defined in the catalog to have any defined clustering columns used in DML INSERT/REPLACE operations without needing to be specified at query time. If the user specified a clustering columns at query time, these columns are preferred to the catalog defined clustering columns.
This PR has: