Alter table to generic types #2791

Open
fcomuniz opened this issue Jul 7, 2021 · 2 comments

fcomuniz commented Jul 7, 2021

Hi, this project is awesome. We have started using Iceberg to store our data, and it is working very well.

One issue we have encountered is changing a column's data type from int to string so that a more generic type can hold the incoming values. This comes up when the source schema changes and we don't want to lose the data already in that column.

For example, from spark-sql

create table iceberg.bronze.test__test (id int) using iceberg;
alter table iceberg.bronze.test__test alter column id type string;

This gives the following error:

        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.IllegalArgumentException: Cannot change column type: id: long -> string
        at org.apache.iceberg.relocated.com.google.common.base.Preconditions.checkArgument(Preconditions.java:459)
        at org.apache.iceberg.SchemaUpdate.updateColumn(SchemaUpdate.java:244)
        at org.apache.iceberg.spark.Spark3Util.applySchemaChanges(Spark3Util.java:160)
        at org.apache.iceberg.spark.SparkCatalog.commitChanges(SparkCatalog.java:432)
        at org.apache.iceberg.spark.SparkCatalog.alterTable(SparkCatalog.java:216)
        at org.apache.iceberg.spark.SparkCatalog.alterTable(SparkCatalog.java:79)
        at org.apache.spark.sql.execution.datasources.v2.AlterTableExec.run(AlterTableExec.scala:37)
        ... 44 more

I would like some pointers on how I could change the data type of this column, and an idea of how this could be implemented in the alter table statement.

Thanks for any help you can give.
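
One possible workaround, as a rough sketch rather than a confirmed recommendation: add a new string column, backfill it with a cast of the old one, drop the old column, then rename the new one. The helper name `widen_column_statements` and the temporary column name are made up for illustration, and the `UPDATE` step assumes a Spark/Iceberg setup with row-level update support (it rewrites the affected data files). The helper only builds the Spark SQL statements; it does not run them.

```python
from typing import List


def widen_column_statements(table: str, column: str, new_type: str = "string") -> List[str]:
    """Build Spark SQL statements that replace `column` with a copy cast to `new_type`.

    Hypothetical helper: it only returns SQL text, nothing is executed.
    """
    # Temporary column name is an arbitrary choice for this sketch.
    tmp = f"{column}__{new_type}"
    return [
        # 1. Add the new, more generic column.
        f"ALTER TABLE {table} ADD COLUMN {tmp} {new_type}",
        # 2. Backfill it from the old column (assumes row-level UPDATE is available).
        f"UPDATE {table} SET {tmp} = CAST({column} AS {new_type})",
        # 3. Drop the old column.
        f"ALTER TABLE {table} DROP COLUMN {column}",
        # 4. Give the new column the original name.
        f"ALTER TABLE {table} RENAME COLUMN {tmp} TO {column}",
    ]


if __name__ == "__main__":
    for stmt in widen_column_statements("iceberg.bronze.test__test", "id"):
        print(stmt + ";")
```

Each statement could then be pasted into spark-sql or passed to `spark.sql(...)` one at a time. Data written by older snapshots keeps the old column, so time travel to those snapshots would still show `id` with its original type.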

@paulboocock

We have two common type evolutions that we'd like to see:

int -> float
int -> string

Actually, for the latter it's more like:
numeric -> string

This is for reasons similar to the initial comment: the source data starts as one type, but as it changes we don't want to lose the historical data.
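
For context, here is a simplified model of the primitive type promotions that, as far as I understand the Iceberg table spec (v1/v2), are permitted in place: int to long, float to double, and widening a decimal's precision at the same scale. The function name `is_promotion_allowed` and the string type names are assumptions for this sketch, not Iceberg's actual code, and later spec versions may allow more. It shows why both int -> float and numeric -> string are rejected today.

```python
import re
from typing import Optional, Tuple

_DECIMAL = re.compile(r"decimal\((\d+),\s*(\d+)\)")


def _decimal(type_name: str) -> Optional[Tuple[int, int]]:
    """Return (precision, scale) for a 'decimal(P,S)' string, else None."""
    m = _DECIMAL.fullmatch(type_name)
    return (int(m.group(1)), int(m.group(2))) if m else None


def is_promotion_allowed(from_type: str, to_type: str) -> bool:
    """Simplified check of in-place promotions (assumed v1/v2 spec rules)."""
    if from_type == to_type:
        return True
    # Widening within the same numeric family.
    if (from_type, to_type) in {("int", "long"), ("float", "double")}:
        return True
    # Decimals: scale must match and precision must not shrink.
    src, dst = _decimal(from_type), _decimal(to_type)
    if src is not None and dst is not None:
        return src[1] == dst[1] and dst[0] >= src[0]
    return False


if __name__ == "__main__":
    for pair in [("int", "long"), ("int", "float"), ("int", "string"), ("long", "string")]:
        print(pair, is_promotion_allowed(*pair))
```

Under this model, the two evolutions requested here would both need either a change to the allowed promotions or a column-replacement workaround.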


github-actions bot commented Jul 5, 2024

This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs. To permanently prevent this issue from being considered stale, add the label 'not-stale', but commenting on the issue is preferred when possible.

@github-actions github-actions bot added the stale label Jul 5, 2024