Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Sqllogictests for INSERT INTO external table #7294

Merged
merged 2 commits into from
Aug 17, 2023

Conversation

devinjdangelo
Copy link
Contributor

@devinjdangelo devinjdangelo commented Aug 15, 2023

Which issue does this PR close?

Closes #7228
(^ this issue was closed prematurely with #7276 by accident, but this one should close it for real)

Rationale for this change

Existing Sqllogic tests for insert into only cover memory tables. Expanding this to cover external tables will improve our test coverage and validation of changes to FileSinks.

What changes are included in this PR?

  • Adds insert_to_external.slt which replicates many of the tests in insert.slt but for external ListingTables
  • Refactor the logic for creating local file paths if not existing in Copy To physical plan to a new method ListingTableUrl::parse_create_local_if_not_exists
  • Add new option for external table creation to allow creating local path

Examples:

Create a single csv file and append values

statement ok
CREATE EXTERNAL TABLE
single_file_test(a bigint, b bigint)
STORED AS csv
LOCATION 'test_files/scratch/single_csv_table.csv'
OPTIONS(
create_local_path 'true',
single_file 'true',
);

query II
INSERT INTO single_file_test values (1, 2), (3, 4);
----
2

query II
select * from single_file_test;
----
1 2
3 4

Create a table backed by a directory of parquet files, insert new files

statement ok
CREATE EXTERNAL TABLE
directory_test(a bigint, b bigint)
STORED AS parquet
LOCATION 'test_files/scratch/external_parquet_table_q0'
OPTIONS(
create_local_path 'true',
);

query II
INSERT INTO directory_test values (1, 2), (3, 4);
----
2

query II
select * from directory_test;
----
1 2
3 4

Are these changes tested?

Yes

Are there any user-facing changes?

More options supported!

@github-actions github-actions bot added sql SQL Planner logical-expr Logical plan and expressions optimizer Optimizer rules core Core DataFusion crate sqllogictest SQL Logic Tests (.slt) labels Aug 15, 2023
@github-actions github-actions bot removed sql SQL Planner optimizer Optimizer rules labels Aug 16, 2023
Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @devinjdangelo -- I think this looks good to me

I am going to merge this one to keep things moving . I also filed #7319 to track adding some documentation

@@ -145,6 +145,36 @@ impl TableProviderFactory for ListingTableFactory {
},
}?;

let create_local_path_mode = cmd
.options
.get("create_local_path")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will be great to eventually get the Options into a structured form (#7283 (comment)) so we can have the compiler check values like this rather than relying on string matching. I have it on my list today to file a ticket about that

@alamb alamb merged commit 6a775b7 into apache:main Aug 17, 2023
21 of 22 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core Core DataFusion crate logical-expr Logical plan and expressions sqllogictest SQL Logic Tests (.slt)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Create new empty external table
2 participants