-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Core: Fix create v1 table on REST Catalog #10369
Conversation
Hi @nastra, can you please take a look at this fix when available? Thanks a lot! |
@@ -861,6 +862,19 @@ public static Builder buildFromEmpty() { | |||
return new Builder(); | |||
} | |||
|
|||
public static Builder buildFromMetadataUpdates(List<MetadataUpdate> metadataUpdates) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The fix looks correct but I don't think we should expose this in TableMetadata
. It should be fine to have this inside CatalogHandlers#create
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your suggestion, that seems better, fixed!
SCHEMA, | ||
PartitionSpec.unpartitioned(), | ||
ImmutableMap.of("format-version", formatVersion)); | ||
assertThatNoException().isThrownBy(() -> createTableTransaction1.commitTransaction()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rather than checking that no exception was thrown it would be better to just load the table and ensure it exists.
Something like this short piece should be enough:
@ParameterizedTest
@ValueSource(ints = {1, 2})
public void createTableTransaction(int formatVersion) {
RESTCatalogAdapter adapter = Mockito.spy(new RESTCatalogAdapter(backendCatalog));
RESTCatalog catalog = catalog(adapter);
if (requiresNamespaceCreate()) {
catalog.createNamespace(NS);
}
catalog
.newCreateTableTransaction(
TABLE,
SCHEMA,
PartitionSpec.unpartitioned(),
ImmutableMap.of("format-version", String.valueOf(formatVersion)))
.commitTransaction();
BaseTable table = (BaseTable) catalog.loadTable(TABLE);
assertThat(table.operations().current().formatVersion()).isEqualTo(formatVersion);
}
Can you please also update the other test accordingly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In fact we should move this to CatalogTests
, so that we make sure this works across different catalog implementations:
@ParameterizedTest
@ValueSource(ints = {1, 2})
public void createTableTransaction(int formatVersion) {
if (requiresNamespaceCreate()) {
catalog().createNamespace(NS);
}
catalog()
.newCreateTableTransaction(
TABLE,
SCHEMA,
PartitionSpec.unpartitioned(),
ImmutableMap.of("format-version", String.valueOf(formatVersion)))
.commitTransaction();
BaseTable table = (BaseTable) catalog().loadTable(TABLE);
assertThat(table.operations().current().formatVersion()).isEqualTo(formatVersion);
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tests have been moved to CatalogTests
, and modified as your suggestion. Please take a look when available, thanks a lot.
@@ -374,8 +374,7 @@ private static TableMetadata create(TableOperations ops, UpdateTableRequest requ | |||
// the only valid requirement is that the table will be created | |||
request.requirements().forEach(requirement -> requirement.validate(ops.current())); | |||
|
|||
TableMetadata.Builder builder = TableMetadata.buildFromEmpty(); | |||
request.updates().forEach(update -> update.applyTo(builder)); | |||
TableMetadata.Builder builder = TableMetadata.buildFromMetadataUpdates(request.updates()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would just inline the fix here when iterating updates
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done!
core/src/test/java/org/apache/iceberg/catalog/CatalogTests.java
Outdated
Show resolved
Hide resolved
@@ -991,7 +991,7 @@ public Builder assignUUID(String newUuid) { | |||
|
|||
// it is only safe to set the format version directly while creating tables | |||
// in all other cases, use upgradeFormatVersion | |||
private Builder setInitialFormatVersion(int newFormatVersion) { | |||
public Builder setInitialFormatVersion(int newFormatVersion) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the changes in this PR LGTM, but we need to be careful about making this method public (see also #8381 (comment)). Let me think a bit about this to see whether there's a slightly different way to approach this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
one alternative would be to lazily initialize formatVersion
inside TableMetadata.Builder
so that calling setInitialFormatVersion
wouldn't be necessary from CatalogHandlers
during table creation. Let's see what others think here
/cc @aokolnychyi @rdblue @amogh-jahagirdar @RussellSpitzer
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should not be made public.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@hantangwangd I've raised a PR to your branch https://github.com/hantangwangd/iceberg/pull/1/files, but missed @nastra on the lazy formatVersion initialization. Not sure entirely what that meant, but in my approach we add a new buildFromEmpty(int formatVersion
API so that the format version is only initialized once and never mutated (unless going through an actual upgrade API call)
.forEach( | ||
update -> { | ||
if (update instanceof UpgradeFormatVersion) { | ||
builder.setInitialFormatVersion(((UpgradeFormatVersion) update).formatVersion()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is the right way to set the format version. The initial metadata needs to be created as v1, rather than dangerously setting the version to 1 after constructing it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@rdblue Thanks for your comment. Do you mean that we should expose something like buildFromSpecifiedVersion(formatVersion)
in TableMetadata
for getting a builder with specified format version, and check the updates list in advance here to get the format version before creating the builder?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or should we use a builder that lazily initialize formatVersion
as @nastra described above?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I stepped through the code/debugger with the new tests, I think the right solution is to use the TableMetadata#newTableMetadata API in the create
method rather than buildFromEmpty and apply the updates. That will build the initial metadata as V1 rather than trying to mutate it later after it's built, which is what I think @rdblue is saying to avoid. I'll let him confirm though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or an alternative is to add a buildFromEmpty(int formatVersion)
API and apply the updates. Either way, agree with the idea that the format version should be defined once upfront when building a new table metadata instance for creation rather than mutate it later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@hantangwangd I've raised a PR to your branch https://github.com/hantangwangd/iceberg/pull/1/files to show what I mean. cc @nastra @rdblue @aokolnychyi
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@amogh-jahagirdar Thank you for the change, that makes sense. I absorbed it into this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great, thanks for fixing @hantangwangd and @amogh-jahagirdar
This PR fix the problems of
Cannot downgrade v2 table to v1
when creating v1 table on REST Catalog.Fix issue: #8756