Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce dynamic table append #15897

Merged
merged 77 commits into from
Mar 1, 2024
Merged

Conversation

kgyrtkirk
Copy link
Member

@kgyrtkirk kgyrtkirk commented Feb 13, 2024

Dynamic table appaned TABLE(APPEND(...))

This is essentially an SQL level syntactic sugar which matches columns by name from multiple tables.
Suppose you have 3 tables:

  • table1 has column1
  • table2 has column2
  • table3 has column1, column2, column3

It is possible to create an union view of the above tables is possible by using UNION ALL:

SELECT column1,NULL AS column2,NULL AS column3 FROM table1
UNION ALL
SELECT NULL AS column1,column2,NULL AS column3 FROM table2
UNION ALL
SELECT column1,column2,column3 FROM table3

However depending on the size of the table's schema it might be quite complicated to do that; TABLE(APPEND('table1','table2','table3')) represents the same in a more compact form.

It adds one additional thing

  • if a column may have conflicting types (same column defined with different types in the inputs) - ColumnType.leastRestrictive is used to get a common type for them

Release note

Introduce dynamic table append - SELECT * FROM TABLE(APPEND('table1','table2')) implicitly creates an union based on the table schemas.

This PR has:

  • been self-reviewed.
  • added documentation for new or modified features or behaviors.
  • a release note entry in the PR description.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added or updated version, license, or notice information in licenses.yaml
  • added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • added integration tests.
  • been tested in a test Druid cluster.

Copy link
Member

@rohangarg rohangarg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks so much for the changes @kgyrtkirk :)
I have left a few comments on the changes, please let me know your thoughts

Copy link
Member

@vtlim vtlim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new docs look great. I left some suggestions to improve scanning and readability.

docs/querying/datasource.md Show resolved Hide resolved
@@ -211,6 +219,50 @@ Inline datasources are not available in Druid SQL.
Refer to the [Query execution](query-execution.md#inline) page for more details on how queries are executed when you
use inline datasources.

### Dynamic table append `TABLE(APPEND(...))`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Since this has the same native query syntax as union should this content go within the existing union section?
  • Suggest removing the SQL syntax from the header title, so just have #### Dynamic table append

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've moved the section to be the next after union

what do you think:

  • should this be made a subsection of union?
  • should I remove the native as that matches the standard union's behaviour?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like both of those ideas. Since the native union behavior is the same, the new content can go in there, and removing the repeated native section won't have people wondering what the difference is.

docs/querying/datasource.md Outdated Show resolved Hide resolved
docs/querying/datasource.md Outdated Show resolved Hide resolved
docs/querying/datasource.md Outdated Show resolved Hide resolved
docs/querying/datasource.md Outdated Show resolved Hide resolved
docs/querying/datasource.md Outdated Show resolved Hide resolved
Copy link
Member

@vtlim vtlim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One comment on heading styling. With that and the changes in #15897 (comment), docs should be all set.

docs/querying/datasource.md Outdated Show resolved Hide resolved
Copy link
Member

@vtlim vtlim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM for docs

Copy link
Member

@rohangarg rohangarg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 🚀

@rohangarg rohangarg merged commit bf0995f into apache:master Mar 1, 2024
84 checks passed
@adarshsanjeev adarshsanjeev added this to the 30.0.0 milestone May 6, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants