
[SPARK-33641][SQL][DOC][FOLLOW-UP] Add migration guide for CHAR VARCHAR types #30654

Closed
wants to merge 5 commits into from

Conversation

@yaooqinn (Member) commented Dec 7, 2020

What changes were proposed in this pull request?

Add migration guide for CHAR VARCHAR types

Why are the changes needed?

For migration.

Does this PR introduce any user-facing change?

Doc change.

How was this patch tested?

Passing CI.


SparkQA commented Dec 7, 2020

Test build #132386 has finished for PR 30654 at commit 3cb8724.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

github-actions bot added the DOCS label Dec 7, 2020

SparkQA commented Dec 7, 2020

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/36987/


SparkQA commented Dec 7, 2020

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/36987/

@@ -54,6 +54,8 @@ license: |

- In Spark 3.1, creating or altering a view will capture runtime SQL configs and store them as view properties. These configs will be applied during the parsing and analysis phases of the view resolution. To restore the behavior before Spark 3.1, you can set `spark.sql.legacy.useCurrentConfigsForView` to `true`.

- In Spark 3.1, CHAR/CHARACTER and VARCHAR types become individual types from string. By default, they can only be used in table schema, not functions/operators. To restore the behavior before Spark 3.1, where treats them as string with length parameter simply ignored, you can set `spark.sql.legacy.charVarcharAsString` to `true`.
Member

how about CHAR/CHARACTER and VARCHAR types become individual types from string -> we support CHAR/CHARACTER and VARCHAR types in our type system framework instead of replacing them with STRING types?

Member

nit: table schema -> a table schema

Member

, where treats them as string with length parameter simply ignored, -> , which treats them as STRING types and ignores a length parameter (e.g., CHAR(4)), ?

Member Author

thanks, updated

@@ -54,6 +54,8 @@ license: |

- In Spark 3.1, creating or altering a view will capture runtime SQL configs and store them as view properties. These configs will be applied during the parsing and analysis phases of the view resolution. To restore the behavior before Spark 3.1, you can set `spark.sql.legacy.useCurrentConfigsForView` to `true`.

- In Spark 3.1, we support CHAR/CHARACTER and VARCHAR types in our type system framework instead of replacing them with STRING types. By default, they can only be used in table schema, not functions/operators. To restore the behavior before Spark 3.1, which treats them as STRING types and ignores a length parameter, e.g. `CHAR(4)`, you can set `spark.sql.legacy.charVarcharAsString` to `true`.
Member

Ur, one more nit comment: By default, they can... -> Currently, they can...?

@maropu (Member) left a comment

cc: @cloud-fan


SparkQA commented Dec 8, 2020

Test build #132396 has finished for PR 30654 at commit 80d4421.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.


SparkQA commented Dec 8, 2020

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/36996/


SparkQA commented Dec 8, 2020

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/36996/

@@ -54,6 +54,8 @@ license: |

- In Spark 3.1, creating or altering a view will capture runtime SQL configs and store them as view properties. These configs will be applied during the parsing and analysis phases of the view resolution. To restore the behavior before Spark 3.1, you can set `spark.sql.legacy.useCurrentConfigsForView` to `true`.

- In Spark 3.1, we support CHAR/CHARACTER and VARCHAR types in our type system framework instead of replacing them with STRING types. Currently, they can only be used in a table schema, not functions/operators. To restore the behavior before Spark 3.1, which treats them as STRING types and ignores a length parameter, e.g. `CHAR(4)`, you can set `spark.sql.legacy.charVarcharAsString` to `true`.
Member

I'm not sure about the nuance here. I'll leave this to @cloud-fan .

@@ -54,6 +54,8 @@ license: |

- In Spark 3.1, creating or altering a view will capture runtime SQL configs and store them as view properties. These configs will be applied during the parsing and analysis phases of the view resolution. To restore the behavior before Spark 3.1, you can set `spark.sql.legacy.useCurrentConfigsForView` to `true`.

- In Spark 3.1, we support CHAR/CHARACTER and VARCHAR types in our type system framework instead of replacing them with STRING types. Currently, they can only be used in a table schema, not functions/operators. To restore the behavior before Spark 3.1, which treats them as STRING types and ignores a length parameter, e.g. `CHAR(4)`, you can set `spark.sql.legacy.charVarcharAsString` to `true`.
Contributor

Since Spark 3.1, CHAR/CHARACTER and VARCHAR types are supported in the table schema. Table scan/insertion
will respect the char/varchar semantic. If char/varchar is used in places other than table schema,
an exception will be thrown (CAST is an exception that simply treats char/varchar as string like before).
To restore ...
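
As a hedged illustration of the semantics being documented here (assumes a Spark 3.1 session named `spark`; the table name `t` and the values are purely illustrative):

```scala
// Minimal sketch of the Spark 3.1 CHAR/VARCHAR semantics described above.
// Assumes a Spark 3.1 SparkSession named `spark`; table name is illustrative.
spark.sql("CREATE TABLE t (c CHAR(4)) USING parquet") // OK: CHAR in a table schema
spark.sql("INSERT INTO t VALUES ('ab')")              // insertion respects the semantic: padded to length 4
spark.sql("SELECT CAST('ab' AS CHAR(4))")             // OK: CAST keeps the pre-3.1 string behavior
// Using CHAR/VARCHAR in other places (e.g. a function or operator) throws by default.

// Restore the pre-3.1 behavior (treated as STRING, length ignored):
spark.conf.set("spark.sql.legacy.charVarcharAsString", "true")
```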

@cloud-fan (Contributor)

thanks, merging to master/3.1! (doc only change, no need to wait for jenkins)

cloud-fan closed this in c88edda Dec 9, 2020
cloud-fan pushed a commit that referenced this pull request Dec 9, 2020
[SPARK-33641][SQL][DOC][FOLLOW-UP] Add migration guide for CHAR VARCHAR types

### What changes were proposed in this pull request?

Add migration guide for CHAR VARCHAR types

### Why are the changes needed?

for migration

### Does this PR introduce _any_ user-facing change?

doc change

### How was this patch tested?

passing ci

Closes #30654 from yaooqinn/SPARK-33641-F.

Authored-by: Kent Yao <yaooqinn@hotmail.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(cherry picked from commit c88edda)
Signed-off-by: Wenchen Fan <wenchen@databricks.com>

SparkQA commented Dec 9, 2020

Test build #132466 has finished for PR 30654 at commit dc88a0b.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.


SparkQA commented Dec 9, 2020

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/37068/


SparkQA commented Dec 9, 2020

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/37068/

ahshahid added a commit to ahshahid/spark that referenced this pull request Dec 9, 2020
* [SPARK-33641][SQL][DOC][FOLLOW-UP] Add migration guide for CHAR VARCHAR types

### What changes were proposed in this pull request?

Add migration guide for CHAR VARCHAR types

### Why are the changes needed?

for migration

### Does this PR introduce _any_ user-facing change?

doc change

### How was this patch tested?

passing ci

Closes apache#30654 from yaooqinn/SPARK-33641-F.

Authored-by: Kent Yao <yaooqinn@hotmail.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>

* [SPARK-33669] Wrong error message from YARN application state monitor when sc.stop in yarn client mode

### What changes were proposed in this pull request?
This change makes InterruptedIOException be treated as InterruptedException when closing YarnClientSchedulerBackend, so that it no longer logs an error like "YARN application has exited unexpectedly xxx".

### Why are the changes needed?
In YARN client mode, when stopping YarnClientSchedulerBackend, Spark first tries to interrupt the YARN application monitor thread. In MonitorThread.run(), it catches InterruptedException to respond gracefully to the stop request.

But the client.monitorApplication method can also throw InterruptedIOException while a Hadoop RPC call is in progress. In that case MonitorThread does not know it was interrupted: a failed YARN app is reported, and "Failed to contact YARN for application xxxxx; YARN application has exited unexpectedly with state xxxxx" is logged at error level, which confuses users a lot.
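
A self-contained sketch of the pattern this fix describes; `MonitorThreadSketch` and `monitorApp` are stand-in stubs for the real YarnClientSchedulerBackend internals, not the actual Spark source:

```scala
import java.io.InterruptedIOException

// Sketch: treat InterruptedIOException from the Hadoop RPC layer like
// InterruptedException, so a stop request reads as a graceful shutdown
// instead of an unexpected YARN application failure.
object MonitorThreadSketch {
  // Stands in for client.monitorApplication, which can throw
  // InterruptedIOException mid-RPC when the thread is interrupted.
  private def monitorApp(appId: String): Unit = Thread.sleep(Long.MaxValue)

  def main(args: Array[String]): Unit = {
    val appId = "application_xxxxx" // illustrative
    val monitor = new Thread(() => {
      try monitorApp(appId)
      catch {
        // Both exception types now mean "we were asked to stop": log quietly.
        case _: InterruptedException | _: InterruptedIOException =>
          println("YARN application monitor interrupted; shutting down")
        case e: Exception =>
          println(s"Failed to contact YARN for application $appId: $e")
      }
    })
    monitor.start()
    monitor.interrupt() // what sc.stop() effectively triggers in yarn-client mode
    monitor.join()
  }
}
```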

### Does this PR introduce _any_ user-facing change?
Yes

### How was this patch tested?
Very simple patch, so it seems no tests are needed?

Closes apache#30617 from sqlwindspeaker/yarn-client-interrupt-monitor.

Authored-by: suqilong <suqilong@qiyi.com>
Signed-off-by: Mridul Muralidharan <mridul<at>gmail.com>

* [SPARK-33655][SQL] Improve performance of processing FETCH_PRIOR

### What changes were proposed in this pull request?
Currently, when a client sends FETCH_PRIOR to Thriftserver, Thriftserver re-iterates the result from the start position. Because Thriftserver caches a query result in an array when the THRIFTSERVER_INCREMENTAL_COLLECT feature is off, FETCH_PRIOR can be implemented without re-iterating the result. A trait FetchIterator is added in order to separate the implementations for an iterator and an array. FetchIterator also supports moving the cursor to an absolute position, which will be useful for implementing FETCH_RELATIVE and FETCH_ABSOLUTE.
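
A simplified sketch of this design (the trait and class below approximate the PR's approach and are not the actual Spark source):

```scala
// With the result cached in an array, FETCH_PRIOR and absolute positioning
// become O(1) cursor moves instead of re-iterating from the start.
trait FetchIterator[A] extends Iterator[A] {
  def fetchAbsolute(pos: Long): Unit // move the cursor to an absolute offset
  def fetchPrior(offset: Long): Unit // move the cursor backwards (FETCH_PRIOR)
}

class ArrayFetchIterator[A](rows: Array[A]) extends FetchIterator[A] {
  private var cursor = 0
  override def hasNext: Boolean = cursor < rows.length
  override def next(): A = { val r = rows(cursor); cursor += 1; r }
  override def fetchAbsolute(pos: Long): Unit =
    cursor = math.max(0L, math.min(pos, rows.length.toLong)).toInt
  override def fetchPrior(offset: Long): Unit = fetchAbsolute(cursor - offset)
}
```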

### Why are the changes needed?
For better performance of Thriftserver.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
FetchIteratorSuite

Closes apache#30600 from Dooyoung-Hwang/refactor_with_fetch_iterator.

Authored-by: Dooyoung Hwang <dooyoung.hwang@sk.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>

* [SPARK-33719][DOC] Add make_date/make_timestamp/make_interval into the doc of ANSI Compliance

### What changes were proposed in this pull request?

Add make_date/make_timestamp/make_interval into the doc of ANSI Compliance

### Why are the changes needed?

So that users know these functions throw runtime exceptions under ANSI mode when the result is not valid.
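
A hedged illustration of the documented behavior (assumes a Spark 3.1 session named `spark`; the example values are made up):

```scala
spark.conf.set("spark.sql.ansi.enabled", "true")
spark.sql("SELECT make_date(2020, 12, 9)").show()    // valid input: 2020-12-09
// spark.sql("SELECT make_date(2020, 13, 1)").show() // invalid month: with ANSI
// mode on, this throws a runtime exception instead of returning NULL
```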
### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

Build doc and check it in browser:
![image](https://user-images.githubusercontent.com/1097932/101608930-34a79e80-39bb-11eb-9294-9d9b8c3f6faa.png)

Closes apache#30683 from gengliangwang/improveDoc.

Authored-by: Gengliang Wang <gengliang.wang@databricks.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>

* [SPARK-33071][SPARK-33536][SQL][FOLLOW-UP] Rename deniedMetadataKeys to nonInheritableMetadataKeys in Alias

### What changes were proposed in this pull request?

This PR is a followup of apache#30488. This PR proposes to rename `Alias.deniedMetadataKeys` to `Alias.nonInheritableMetadataKeys` to make it less confusing.

### Why are the changes needed?

To make it easier to maintain and read.

### Does this PR introduce _any_ user-facing change?

No. This is rather a code cleanup.

### How was this patch tested?

Ran the unit tests written in the previous PR manually. Jenkins and GitHub Actions in this PR should also run them.

Closes apache#30682 from HyukjinKwon/SPARK-33071-SPARK-33536.

Authored-by: HyukjinKwon <gurwls223@apache.org>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>

* [SPARK-33722][SQL] Handle DELETE in ReplaceNullWithFalseInPredicate

### What changes were proposed in this pull request?

This PR adds `DeleteFromTable` to supported plans in `ReplaceNullWithFalseInPredicate`.

### Why are the changes needed?

This change allows Spark to optimize delete conditions like we optimize filters.
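
A hedged illustration of the kind of rewrite this enables (the table `events` and column `category` are hypothetical; assumes a v2 table that supports DELETE):

```scala
// In a predicate position, a NULL branch filters rows exactly like FALSE, so
// ReplaceNullWithFalseInPredicate can rewrite the delete condition below to
// `category = 'obsolete'`, which is simpler and easier to push down.
spark.sql("""
  DELETE FROM events
  WHERE IF(category = 'obsolete', TRUE, NULL)
""")
```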

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

This PR extends the existing test cases to also cover `DeleteFromTable`.

Closes apache#30688 from aokolnychyi/spark-33722.

Authored-by: Anton Okolnychyi <aokolnychyi@apple.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>

Co-authored-by: Kent Yao <yaooqinn@hotmail.com>
Co-authored-by: suqilong <suqilong@qiyi.com>
Co-authored-by: Dooyoung Hwang <dooyoung.hwang@sk.com>
Co-authored-by: Gengliang Wang <gengliang.wang@databricks.com>
Co-authored-by: HyukjinKwon <gurwls223@apache.org>
Co-authored-by: Anton Okolnychyi <aokolnychyi@apple.com>