Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AWS: Retain Glue Catalog table description after updating Iceberg table #10199

Merged
merged 7 commits into from
May 15, 2024

Conversation

aajisaka
Copy link
Member

@aajisaka aajisaka commented Apr 22, 2024

Problem

In AWS Glue Catalog, user can set arbitrary description to the table and its columns via Web UI or even from Amazon Athena. However, the descriptions will be removed after upgrading the table.

What this PR do

This patch is to keep the original table description when calling UpdateTable API. Keeping column description is hard and I think we can open a different PR to do this.

Testing

Manually tested. Also, added an integration test and it passed on my environment.

@github-actions github-actions bot added the AWS label Apr 22, 2024
@geruh
Copy link
Contributor

geruh commented Apr 22, 2024

Awesome, thanks for contributing this @aajisaka! Manually did this change work? For adding some tests, if the unit test logic in this module is difficult. I'd suggest leveraging this modules integration test workflow to update an created table with the glue client directly.

https://github.com/apache/iceberg/blob/main/aws/src/integration/java/org/apache/iceberg/aws/glue/TestGlueCatalogTable.java

@aajisaka
Copy link
Member Author

Thank you for your suggestion @geruh

I'll test this patch manually and update the class.

@aajisaka
Copy link
Member Author

aajisaka commented Apr 25, 2024

I ran the integration test using my AWS account and the testUpdateTable() was successful.

export AWS_TEST_ACCOUNT_ID=<test account id>
export AWS_REGION=<test region>
export AWS_TEST_BUCKET=<test bucket>
 ./gradlew :iceberg-aws:integrationTest

@aajisaka
Copy link
Member Author

Also, I built the jars and manually tested on AWS Glue as a Spark runtime.

@aajisaka
Copy link
Member Author

Hi @geruh would you review this PR?

@aajisaka
Copy link
Member Author

aajisaka commented May 8, 2024

Hi @geruh @amogh-jahagirdar would you review the latest change?

Copy link
Contributor

@amogh-jahagirdar amogh-jahagirdar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks great @aajisaka, just some nits and then I think it's good to merge!

@geruh
Copy link
Contributor

geruh commented May 15, 2024

Awesome work @aajisaka, looks good to me!

@amogh-jahagirdar amogh-jahagirdar merged commit 2058053 into apache:main May 15, 2024
41 checks passed
@aajisaka aajisaka deleted the keep-description branch May 15, 2024 00:15
@aajisaka
Copy link
Member Author

Thank you @geruh @amogh-jahagirdar

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants