[SPARK-33558][SQL][TESTS] Unify v1 and v2 ALTER TABLE .. ADD PARTITION tests #30685

Closed

MaxGekk
Member

@MaxGekk MaxGekk commented Dec 9, 2020

What changes were proposed in this pull request?

  1. Move the ALTER TABLE .. ADD PARTITION parsing tests to AlterTableAddPartitionParserSuite
  2. Move the v1 tests for ALTER TABLE .. ADD PARTITION from DDLSuite and the v2 tests from AlterTablePartitionV2SQLSuite into the common trait AlterTableAddPartitionSuiteBase, so that the tests run for the V1, Hive V1, and V2 data sources.
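
As a rough illustration of the shared-trait idea, the sketch below keeps the common logic in one trait and lets each implementation supply only what differs. It is not the code from this PR: the trait and object names are hypothetical, the `USING` clauses are assumed values, and ScalaTest and Spark are left out so the snippet stays self-contained. The version-dependent timestamp separator is taken from the diff excerpt quoted in the review comments.

```scala
// Hypothetical, simplified stand-in for a shared base trait (no Spark, no ScalaTest).
trait AddPartitionChecksBase {
  // Each concrete implementation supplies its own flavour.
  def version: String
  def defaultUsing: String

  // Shared logic: the DDL is built the same way for every implementation.
  def createTableSql(t: String): String =
    s"CREATE TABLE $t (id bigint, data string) $defaultUsing PARTITIONED BY (id)"

  // Expected rendering of a timestamp partition value; only the separator differs.
  def expectedTimestampPartition: String =
    s"2020-11-23${if (version == "V2") " " else "T"}22:13:10.123456"
}

// Assumed values for illustration only.
object V1Checks extends AddPartitionChecksBase {
  val version = "V1"
  val defaultUsing = "USING parquet"
}

object V2Checks extends AddPartitionChecksBase {
  val version = "V2"
  val defaultUsing = "USING _"
}
```

With this shape, a check written once in the trait runs against every object that extends it, and any divergence between implementations has to be spelled out explicitly.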

Why are the changes needed?

  • The unification allows running the common ALTER TABLE .. ADD PARTITION tests for DSv1, Hive DSv1, and DSv2.
  • We can detect missing features and differences between DSv1 and DSv2 implementations.

Does this PR introduce any user-facing change?

No

How was this patch tested?

By running new test suites:

$ build/sbt -Phive-2.3 -Phive-thriftserver "test:testOnly *AlterTableAddPartitionSuite"

"part6" -> "abc",
"part7" -> "true",
"part8" -> "2020-11-23",
"part9" -> s"2020-11-23${if (version == "V2") " " else "T"}22:13:10.123456")
Member Author

Here is the behavior diff between V1 and V2 in showing partitions.

Member Author

Actually, V1 doesn't check the correctness of partition values. We can create partitions with any garbage, such as 2020-11-23 --22:13:10.123456. V1 takes the string as is and creates the partition, without checking that the partition value matches the partition field type.
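
To make the missing check concrete, here is a minimal sketch of what validating a timestamp partition value could look like. It is an illustration only and not how Spark validates anything: the object name, the method name, and the assumed timestamp layout are all hypothetical, and only `java.time` from the standard library is used.

```scala
import java.time.LocalDateTime
import java.time.format.DateTimeFormatter
import scala.util.Try

// Hypothetical helper, not part of Spark.
object PartitionValueCheck {
  // Assumed layout for a timestamp partition value with microsecond precision.
  private val tsFormat = DateTimeFormatter.ofPattern("uuuu-MM-dd HH:mm:ss.SSSSSS")

  // True only if the raw partition string parses as a timestamp in the assumed layout.
  def isValidTimestamp(raw: String): Boolean =
    Try(LocalDateTime.parse(raw, tsFormat)).isSuccess
}
```

Under this check, `2020-11-23 22:13:10.123456` is accepted while the garbage value `2020-11-23 --22:13:10.123456` is rejected, which is the kind of validation the comment says V1 does not perform.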

sql(s"CREATE TABLE $t (id bigint, data string) $defaultUsing PARTITIONED BY (id)")
sql(s"ALTER TABLE $t ADD PARTITION (id=2) LOCATION 'loc1'")

val errMsg = intercept[AnalysisException] {
Member Author

Here is one more difference: Hive's exception is not handled and is propagated to the upper layer, whereas V1 (in-memory) and V2 throw PartitionsAlreadyExistException.

Contributor

We should try-catch the hive exception and throw PartitionsAlreadyExistException
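
The suggestion amounts to translating one exception type into another at the boundary. A minimal sketch of that pattern follows, using hypothetical stand-in exception classes rather than the real Hive or Spark types, so the constructor shapes and names here are assumptions and not the actual APIs.

```scala
// Hypothetical stand-ins; these are NOT the real Hive or Spark exception classes.
final class HiveLikeAlreadyExistsException(msg: String) extends RuntimeException(msg)

final class PartitionsAlreadyExistLikeException(msg: String, cause: Throwable)
  extends RuntimeException(msg, cause)

object ExceptionTranslation {
  // Runs `body` and rethrows the backend-specific failure as the common exception,
  // keeping the original as the cause so no diagnostic detail is lost.
  def translating[T](table: String)(body: => T): T =
    try body
    catch {
      case e: HiveLikeAlreadyExistsException =>
        throw new PartitionsAlreadyExistLikeException(
          s"Partitions already exist in table $table", e)
    }
}
```

Callers then see the same exception type regardless of which catalog implementation sits underneath, which is what lets one shared test cover all of them.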

Member Author

Sure, I will open a JIRA ticket for that and fix it separately, since this PR is about test changes.

Member Author

Here is the fix: #30711

.set(s"spark.sql.catalog.$catalog", classOf[InMemoryPartitionTableCatalog].getName)
.set(s"spark.sql.catalog.non_part_$catalog", classOf[InMemoryTableCatalog].getName)

override protected def checkLocation(
Member Author

We will remove this after SPARK-33393

@MaxGekk
Member Author

MaxGekk commented Dec 9, 2020

@cloud-fan @HyukjinKwon Please have a look at this PR.

@SparkQA

SparkQA commented Dec 9, 2020

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/37095/

@SparkQA

SparkQA commented Dec 9, 2020

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/37095/

@SparkQA

SparkQA commented Dec 9, 2020

Test build #132493 has finished for PR 30685 at commit 2ce716b.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • class AlterTableAddPartitionSuite extends AlterTableAddPartitionSuiteBase with SharedSparkSession

protected def withNsTable(ns: String, tableName: String)(f: String => Unit): Unit = {
withNamespace(ns) {
sql(s"CREATE NAMESPACE $ns")
val t = s"$ns.$tableName"
Contributor

@cloud-fan cloud-fan Dec 9, 2020

nit: the caller side can just pass in the namespace, and here we return $catalog.$ns.$tableName

Contributor

an alternative is: def withNsTable(ns: String, tableName: String, cat: String = catalog)
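
A sketch of how that alternative signature might look is given below. It is illustrative only: the enclosing object, the default catalog name, and the simplified body are assumptions, and the namespace creation and cleanup done by the real helper are deliberately left out so the snippet runs without Spark.

```scala
// Hypothetical container; the real helper lives in the test suite.
object NsTableSketch {
  // Assumed default catalog name; the real suite defines its own `catalog`.
  val catalog: String = "test_catalog"

  // Builds the fully qualified table name and hands it to the test body.
  // Namespace setup and teardown from the original helper are omitted here.
  def withNsTable(ns: String, tableName: String, cat: String = catalog)(f: String => Unit): Unit = {
    val t = s"$cat.$ns.$tableName"
    f(t)
  }
}
```

Most callers would pass only the namespace and table name, while a test that targets a different catalog overrides the third argument.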

@SparkQA

SparkQA commented Dec 9, 2020

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/37111/

@SparkQA

SparkQA commented Dec 9, 2020

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/37111/

@SparkQA

SparkQA commented Dec 9, 2020

Test build #132509 has finished for PR 30685 at commit f2c4ecc.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Dec 10, 2020

Test build #132513 has finished for PR 30685 at commit 50c59aa.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan
Contributor

thanks, merging to master/3.1! (since it's test only)

@cloud-fan cloud-fan closed this in af37c7f Dec 10, 2020
cloud-fan pushed a commit that referenced this pull request Dec 10, 2020
…N tests

Closes #30685 from MaxGekk/unify-alter-table-add-partition-tests.

Authored-by: Max Gekk <max.gekk@gmail.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(cherry picked from commit af37c7f)
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
@cloud-fan
Contributor

I've reverted it from 3.1, as the tests failed. It's probably because some v2 PRs go to master only.

@MaxGekk MaxGekk deleted the unify-alter-table-add-partition-tests branch February 19, 2021 15:04