[FEA] Implement lower/upper_bound for struct-typed columns #7690

gerashegalov · 2021-03-23T18:32:27Z

Is your feature request related to a problem? Please describe.
Distributed sort in Spark-RAPIDS requires Range Partitioning.

Describe the solution you'd like
Range Partitioning is implemented using lower/upper_bound calls. We would like lower/upper_bound to work for struct columns.
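To illustrate why range partitioning depends on a bound search, here is a minimal sketch in plain Python. It is not cuDF or Spark-RAPIDS code: struct rows are modelled as tuples compared lexicographically, `assign_partitions` and `split_points` are hypothetical names, and the standard-library `bisect_right` stands in for an upper-bound search. Which side a row equal to a split point lands on is an assumption of this sketch, not something stated in this issue.

```python
from bisect import bisect_right


def assign_partitions(rows, split_points):
    """Map each struct row (modelled as a tuple) to a partition index.

    split_points must be sorted. With upper-bound semantics, a row equal
    to a split point is placed in the partition to the right of it.
    """
    return [bisect_right(split_points, row) for row in rows]


# Two split points give three partitions: 0, 1 and 2.
split_points = [(1, "m"), (3, "a")]
rows = [(0, "z"), (1, "a"), (1, "m"), (2, "q"), (3, "a"), (5, "b")]

partitions = assign_partitions(rows, split_points)
print(partitions)  # [0, 0, 1, 1, 2, 2]
```

Without a bound search that understands struct rows, this mapping step cannot be computed for struct-typed sort keys, which is what this request is about.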

Describe alternatives you've considered
For datasets that fit on one GPU, the number of shuffle partitions can be set to 1 in Spark, which bypasses the calls to lower/upper_bound. This is not generally applicable.

Additional context
See cuDF PR #7422 and NVIDIA/spark-rapids#1883

This PR adds support for `lower_bound` and `upper_bound` binary searches on struct columns. This closes #7690. In addition to adding binary search for structs, I also refactored `tests/search/search_test.cpp`, extracting the dictionary search tests from it. As a result, the basic search tests, dictionary search tests, and (new) struct search tests now live in separate source files, which makes them easier to navigate and maintain.

Authors:
- Nghia Truong (https://github.com/ttnghia)

Approvers:
- Mike Wilson (https://github.com/hyperbolic2346)
- David Wendt (https://github.com/davidwendt)
- Keith Kraus (https://github.com/kkraus14)

URL: #7865
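As a rough illustration of what the two searches return for struct rows, the sketch below uses plain Python tuples and the standard-library `bisect` module. It is not the libcudf API: `haystack` and `needles` are hypothetical names, and lexicographic field-by-field comparison with no nulls is assumed.

```python
from bisect import bisect_left, bisect_right

# A sorted "struct column": each row is a (number, string) pair.
haystack = [(1, "a"), (1, "b"), (2, "a"), (2, "a"), (3, "c")]
needles = [(0, "z"), (2, "a"), (2, "b"), (4, "a")]

# lower bound: first index where the needle could be inserted
# while keeping the haystack sorted.
lower = [bisect_left(haystack, n) for n in needles]

# upper bound: last such index (one past any rows equal to the needle).
upper = [bisect_right(haystack, n) for n in needles]

print(lower)  # [0, 2, 4, 5]
print(upper)  # [0, 4, 4, 5]
```

The two results differ only for `(2, "a")`, which occurs twice in the haystack: the lower bound points at the first matching row and the upper bound points just past the last one.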

gerashegalov added the feature request (New feature or request) and Needs Triage (Need team to review and classify) labels Mar 23, 2021

revans2 added the Spark (Functionality that helps Spark RAPIDS) label Mar 23, 2021

ttnghia self-assigned this Mar 25, 2021

gerashegalov mentioned this issue Mar 26, 2021

[FEA] Allow RangePartitioning to work with structs NVIDIA/spark-rapids#1607

Closed

kkraus14 added the libcudf (Affects libcudf (C++/CUDA) code.) label and removed the Needs Triage (Need team to review and classify) label Mar 26, 2021

ttnghia mentioned this issue Apr 5, 2021

Struct binary search (lower_bound/upper_bound) #7865

Merged

ttnghia linked a pull request Apr 5, 2021 that will close this issue

Struct binary search (lower_bound/upper_bound) #7865

Merged

rapids-bot bot closed this as completed in #7865 Apr 19, 2021
