How to execute a substrait plan? #485

Closed
bianhq opened this issue Apr 2, 2023 · 3 comments

Comments

bianhq commented Apr 2, 2023

Hi,
After reading the documentation on the official website, I am still confused about:

  1. How do I convert a Substrait plan into the query plan of a specific query engine? As I understand it, Substrait does not do any query optimization; it is only a cross-engine plan format (correct me if I am wrong). Are there any guidelines or official examples for converting a Substrait plan into an engine's query plan? For example, should I map a Substrait plan to an early-stage logical plan that the engine's planner and optimizer then consume (so that we benefit from the engine's query optimization), or to the physical plan that the executors run (as mentioned in some talks)? And how can this be done if the query engine is not open source, or does not have a well-defined intermediate plan representation?

  2. Are there any existing integrations (converters) from Substrait to query engines such as Spark and Trino? The official website uses Spark and Trino as example use cases, so I assumed that integrations for them were already implemented and tested. I have found the DuckDB integration, but no official or third-party open-source integrations for Trino or Spark. Do I need to implement these integrations myself?

Thanks a lot!

bianhq (Author) commented Apr 2, 2023

I found that DuckDB's Substrait extension converts the Substrait plan into DuckDB's parse tree (i.e., the input of the logical planner).

westonpace (Member) commented Apr 3, 2023 via email

bianhq (Author) commented Apr 3, 2023

Thank you for the detailed and clear reply!
It would be nice if there were more third-party or official integrations for popular query engines such as Spark and Trino.

bianhq closed this as completed Apr 6, 2023