-
Notifications
You must be signed in to change notification settings - Fork 28.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-33655][SQL] Improve performance of processing FETCH_PRIOR #30600
[SPARK-33655][SQL] Improve performance of processing FETCH_PRIOR #30600
Conversation
@juliuszsompolski cc : @HyukjinKwon , @dongjoon-hyun , @cloud-fan |
ok to test |
cc @wangyum as well |
Test build #132179 has finished for PR 30600 at commit
|
Kubernetes integration test starting |
Kubernetes integration test status failure |
Jenkins, retest this please |
retest this please |
Test build #132191 has finished for PR 30600 at commit
|
Kubernetes integration test starting |
Kubernetes integration test status success |
- Remove setRelativePosition & setAbsolutePosition. - Add fetchPrior, fetchAbsolute and getFetchStart.
b490f49
to
2b1b700
Compare
Test build #132228 has finished for PR 30600 at commit
|
Kubernetes integration test starting |
Kubernetes integration test status failure |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me.
Thanks @Dooyoung-Hwang
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks fine to me too. @wangyum can you do a final check when you find some time?
if (order.equals(FetchOrientation.FETCH_FIRST)) iter.fetchAbsolute(0) | ||
else if (order.equals(FetchOrientation.FETCH_PRIOR)) iter.fetchPrior(maxRowsL) | ||
else iter.fetchNext() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add curly braces? https://github.com/databricks/scala-style-guide/blob/master/README.md#curly
cc @maropu
- Add curly braces for if-else
Test build #132328 has finished for PR 30600 at commit
|
retest this please. |
Test build #132368 has finished for PR 30600 at commit
|
Kubernetes integration test starting |
Kubernetes integration test status failure |
Merged to master. |
Thanks!! |
* [SPARK-33641][SQL][DOC][FOLLOW-UP] Add migration guide for CHAR VARCHAR types ### What changes were proposed in this pull request? Add migration guide for CHAR VARCHAR types ### Why are the changes needed? for migration ### Does this PR introduce _any_ user-facing change? doc change ### How was this patch tested? passing ci Closes apache#30654 from yaooqinn/SPARK-33641-F. Authored-by: Kent Yao <yaooqinn@hotmail.com> Signed-off-by: Wenchen Fan <wenchen@databricks.com> * [SPARK-33669] Wrong error message from YARN application state monitor when sc.stop in yarn client mode ### What changes were proposed in this pull request? This change make InterruptedIOException to be treated as InterruptedException when closing YarnClientSchedulerBackend, which doesn't log error with "YARN application has exited unexpectedly xxx" ### Why are the changes needed? For YarnClient mode, when stopping YarnClientSchedulerBackend, it first tries to interrupt Yarn application monitor thread. In MonitorThread.run() it catches InterruptedException to gracefully response to stopping request. But client.monitorApplication method also throws InterruptedIOException when the hadoop rpc call is calling. In this case, MonitorThread will not know it is interrupted, a Yarn App failed is returned with "Failed to contact YARN for application xxxxx; YARN application has exited unexpectedly with state xxxxx" is logged with error level. which confuse user a lot. ### Does this PR introduce _any_ user-facing change? Yes ### How was this patch tested? very simple patch, seems no need? Closes apache#30617 from sqlwindspeaker/yarn-client-interrupt-monitor. Authored-by: suqilong <suqilong@qiyi.com> Signed-off-by: Mridul Muralidharan <mridul<at>gmail.com> * [SPARK-33655][SQL] Improve performance of processing FETCH_PRIOR ### What changes were proposed in this pull request? Currently, when a client requests FETCH_PRIOR to Thriftserver, Thriftserver reiterates from the start position. Because Thriftserver caches a query result with an array when THRIFTSERVER_INCREMENTAL_COLLECT feature is off, FETCH_PRIOR can be implemented without reiterating the result. A trait FeatureIterator is added in order to separate the implementation for iterator and an array. Also, FeatureIterator supports moves cursor with absolute position, which will be useful for the implementation of FETCH_RELATIVE, FETCH_ABSOLUTE. ### Why are the changes needed? For better performance of Thriftserver. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? FetchIteratorSuite Closes apache#30600 from Dooyoung-Hwang/refactor_with_fetch_iterator. Authored-by: Dooyoung Hwang <dooyoung.hwang@sk.com> Signed-off-by: HyukjinKwon <gurwls223@apache.org> * [SPARK-33719][DOC] Add make_date/make_timestamp/make_interval into the doc of ANSI Compliance ### What changes were proposed in this pull request? Add make_date/make_timestamp/make_interval into the doc of ANSI Compliance ### Why are the changes needed? Users can know that these functions throw runtime exceptions under ANSI mode if the result is not valid. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Build doc and check it in browser: ![image](https://user-images.githubusercontent.com/1097932/101608930-34a79e80-39bb-11eb-9294-9d9b8c3f6faa.png) Closes apache#30683 from gengliangwang/improveDoc. Authored-by: Gengliang Wang <gengliang.wang@databricks.com> Signed-off-by: HyukjinKwon <gurwls223@apache.org> * [SPARK-33071][SPARK-33536][SQL][FOLLOW-UP] Rename deniedMetadataKeys to nonInheritableMetadataKeys in Alias ### What changes were proposed in this pull request? This PR is a followup of apache#30488. This PR proposes to rename `Alias.deniedMetadataKeys` to `Alias.nonInheritableMetadataKeys` to make it less confusing. ### Why are the changes needed? To make it easier to maintain and read. ### Does this PR introduce _any_ user-facing change? No. This is rather a code cleanup. ### How was this patch tested? Ran the unittests written in the previous PR manually. Jenkins and GitHub Actions in this PR should also test them. Closes apache#30682 from HyukjinKwon/SPARK-33071-SPARK-33536. Authored-by: HyukjinKwon <gurwls223@apache.org> Signed-off-by: HyukjinKwon <gurwls223@apache.org> * [SPARK-33722][SQL] Handle DELETE in ReplaceNullWithFalseInPredicate ### What changes were proposed in this pull request? This PR adds `DeleteFromTable` to supported plans in `ReplaceNullWithFalseInPredicate`. ### Why are the changes needed? This change allows Spark to optimize delete conditions like we optimize filters. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? This PR extends the existing test cases to also cover `DeleteFromTable`. Closes apache#30688 from aokolnychyi/spark-33722. Authored-by: Anton Okolnychyi <aokolnychyi@apple.com> Signed-off-by: Dongjoon Hyun <dongjoon@apache.org> Co-authored-by: Kent Yao <yaooqinn@hotmail.com> Co-authored-by: suqilong <suqilong@qiyi.com> Co-authored-by: Dooyoung Hwang <dooyoung.hwang@sk.com> Co-authored-by: Gengliang Wang <gengliang.wang@databricks.com> Co-authored-by: HyukjinKwon <gurwls223@apache.org> Co-authored-by: Anton Okolnychyi <aokolnychyi@apple.com>
![pan3793](https://badgen.net/badge/Hello/pan3793/green) [![Closes #370](https://badgen.net/badge/Preview/Closes%20%23370/blue)](https://github.com/yaooqinn/kyuubi/pull/370) ![332](https://badgen.net/badge/%2B/332/red) ![24](https://badgen.net/badge/-/24/green) ![8](https://badgen.net/badge/commits/8/yellow) ![Feature](https://badgen.net/badge/Label/Feature/) ![Bug](https://badgen.net/badge/Label/Bug/) [❨?❩](https://pullrequestbadge.com/?utm_medium=github&utm_source=yaooqinn&utm_campaign=badge_info)<!-- PR-BADGE: PLEASE DO NOT REMOVE THIS COMMENT --> <!-- Thanks for sending a pull request! Here are some tips for you: 1. If this is your first time, please read our contributor guidelines: https://kyuubi.readthedocs.io/en/latest/community/contributions.html 2. If the PR is related to an issue in https://github.com/yaooqinn/kyuubi/issues, add '[KYUUBI #XXXX]' in your PR title, e.g., '[KYUUBI #XXXX] Your PR title ...'. 3. If the PR is unfinished, add '[WIP]' in your PR title, e.g., '[WIP][KYUUBI #XXXX] Your PR title ...'. --> ### _Why are the changes needed?_ <!-- Please clarify why the changes are needed. For instance, 1. If you add a feature, you can talk about the use case of it. 2. If you fix a bug, you can clarify why it is a bug. --> close #360 Ref: apache/spark#30600 ### _How was this patch tested?_ - [x] Add some test cases that check the changes thoroughly including negative and positive cases if possible - [ ] Add screenshots for manual tests if appropriate - [x] [Run test](https://kyuubi.readthedocs.io/en/latest/tools/testing.html#running-tests) locally before make a pull request Closes #370 from pan3793/KYUUBI-360. e79b8cb [Cheng Pan] [KYUUBI #360] comments 0fae3db [Cheng Pan] fix import 3d1b2a6 [Cheng Pan] [KYUUBI #360] fix ut eda3e59 [Cheng Pan] [KYUUBI #360] fix import 16178d6 [Cheng Pan] [KYUUBI #360] ut 179404d [Cheng Pan] [KYUUBI #360] nit 455af6b [Cheng Pan] [KYUUBI #360] correct getNextRowSet with FETCH_PRIOR FETCH_FIRST 2307f1f [Cheng Pan] [KYUUBI #360] move ThriftUtils to kyuubi-common Authored-by: Cheng Pan <379377944@qq.com> Signed-off-by: Kent Yao <yao@apache.org>
What changes were proposed in this pull request?
Currently, when a client requests FETCH_PRIOR to Thriftserver, Thriftserver reiterates from the start position. Because Thriftserver caches a query result with an array when THRIFTSERVER_INCREMENTAL_COLLECT feature is off, FETCH_PRIOR can be implemented without reiterating the result. A trait FeatureIterator is added in order to separate the implementation for iterator and an array. Also, FeatureIterator supports moves cursor with absolute position, which will be useful for the implementation of FETCH_RELATIVE, FETCH_ABSOLUTE.
Why are the changes needed?
For better performance of Thriftserver.
Does this PR introduce any user-facing change?
No
How was this patch tested?
FetchIteratorSuite