Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

support GpuSize #1972

Merged
merged 2 commits into from
Mar 19, 2021
Merged

support GpuSize #1972

merged 2 commits into from
Mar 19, 2021

Conversation

sperlingxx
Copy link
Collaborator

This PR is to support GpuSize.
The intention is to fix test failures on GpuExplode under spark 3.1. The root cause of the failure is that in spark 3.1 newly added rule InferFiltersFromGenerate will insert a group by expressions include GPU unsupported expression Size.

Signed-off-by: sperlingxx lovedreamf@gmail.com

Signed-off-by: sperlingxx <lovedreamf@gmail.com>
Signed-off-by: sperlingxx <lovedreamf@gmail.com>
@sperlingxx
Copy link
Collaborator Author

build

Copy link
Collaborator

@revans2 revans2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code itself looks fine, but I just would prefer for us to do the right thing instead of putting in a hack like this.

expr[Size](
"The size of an array or a map",
ExprChecks.unaryProjectNotLambda(TypeSig.INT, TypeSig.INT,
(TypeSig.ARRAY + TypeSig.MAP).nested(TypeSig.all),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to removed UDT from this for the non-spark use case. This is just to make the docs correct, because we don't support UDT anywhere.

// In JNI layer, we add 1 to the child index when fetching child column of ListType to keep
// alignment.
// So, in JVM layer, we have to use -1 as index to fetch the real first child of list_column.
withResource(inputBase.getChildColumnView(-1)) { offset =>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets just add in JNI for this instead.

https://github.com/rapidsai/cudf/blob/6fe8b1b7be71c0f3587d3ef43c535aec376d1e45/cpp/include/cudf/lists/count_elements.hpp#L49

It does everything we want, except for the legacy null behavior, and we can fix that up without too much work.

var ret = intput.getBase.countElements()
if (legacySizeOfNull) {
    withResource(input.getBase.isNull()) { isNull =>
        withResource(GpuScalar.from(-1)) { negOne =>
            withResource(ret) { count =>
                ret = isNull.ifElse(negOne, count)
            }
        }
    }
}

Copy link
Collaborator

@revans2 revans2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I missed the part about this is blocking issues with the nightly tests. Lets merge this in as is and have a follow on to make ti work correctly. I'll get a PR up for the JNI code ASAP so we can get a real fix in soon too.

@revans2
Copy link
Collaborator

revans2 commented Mar 19, 2021

I filed #1974 as a follow on issue for this.

@revans2 revans2 merged commit 4b7c78e into NVIDIA:branch-0.5 Mar 19, 2021
@sameerz sameerz added the task Work required that improves the product but is not user facing label Mar 21, 2021
@sperlingxx sperlingxx deleted the gpu_size branch March 22, 2021 07:59
nartal1 pushed a commit to nartal1/spark-rapids that referenced this pull request Jun 9, 2021
Signed-off-by: sperlingxx <lovedreamf@gmail.com>
nartal1 pushed a commit to nartal1/spark-rapids that referenced this pull request Jun 9, 2021
Signed-off-by: sperlingxx <lovedreamf@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
task Work required that improves the product but is not user facing
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants