Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove validation that requires introspection #11938

Merged
merged 14 commits into from
Oct 20, 2022

Conversation

vyasr
Copy link
Contributor

@vyasr vyasr commented Oct 17, 2022

Description

This PR removes optional validation for some APIs. Performing these validations requires data introspection, which we do not want. This PR resolves #5505.

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@vyasr vyasr added 3 - Ready for Review Ready for review by team libcudf Affects libcudf (C++/CUDA) code. improvement Improvement / enhancement to an existing function breaking Breaking change labels Oct 17, 2022
@vyasr vyasr requested a review from a team as a code owner October 17, 2022 22:38
@vyasr vyasr self-assigned this Oct 17, 2022
@vyasr vyasr requested a review from a team as a code owner October 17, 2022 22:38
@github-actions github-actions bot added the Python Affects Python cuDF API. label Oct 17, 2022
Copy link
Member

@harrism harrism left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Just a couple of comments.

cpp/include/cudf/filling.hpp Outdated Show resolved Hide resolved
cpp/include/cudf/copying.hpp Show resolved Hide resolved
Copy link
Contributor

@galipremsagar galipremsagar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cython/Python approval.

@vyasr vyasr requested a review from davidwendt October 18, 2022 20:29
Copy link
Contributor

@davidwendt davidwendt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have to love negative lines of code.

@vyasr
Copy link
Contributor Author

vyasr commented Oct 18, 2022

rerun tests

@codecov
Copy link

codecov bot commented Oct 18, 2022

Codecov Report

Base: 87.40% // Head: 86.87% // Decreases project coverage by -0.53% ⚠️

Coverage data is based on head (aafe0d2) compared to base (f72c4ce).
Patch coverage: 88.75% of modified lines in pull request are covered.

❗ Current head aafe0d2 differs from pull request most recent head aef6e1e. Consider uploading reports for the commit aef6e1e to get more accurate results

Additional details and impacted files
@@               Coverage Diff                @@
##           branch-22.12   #11938      +/-   ##
================================================
- Coverage         87.40%   86.87%   -0.54%     
================================================
  Files               133      133              
  Lines             21833    21900      +67     
================================================
- Hits              19084    19025      -59     
- Misses             2749     2875     +126     
Impacted Files Coverage Δ
python/cudf/cudf/core/dataframe.py 93.73% <ø> (-0.05%) ⬇️
python/cudf/cudf/core/indexed_frame.py 92.03% <ø> (ø)
python/cudf/cudf/core/udf/__init__.py 50.00% <ø> (ø)
python/cudf/cudf/io/orc.py 92.94% <ø> (-0.09%) ⬇️
python/cudf/cudf/testing/dataset_generator.py 72.83% <ø> (-0.42%) ⬇️
...thon/dask_cudf/dask_cudf/tests/test_distributed.py 18.86% <ø> (+4.94%) ⬆️
python/cudf/cudf/core/_base_index.py 82.20% <43.75%> (-3.35%) ⬇️
python/cudf/cudf/io/text.py 91.66% <66.66%> (-8.34%) ⬇️
python/strings_udf/strings_udf/__init__.py 84.31% <76.00%> (-12.57%) ⬇️
python/cudf/cudf/core/index.py 92.96% <95.71%> (+0.33%) ⬆️
... and 24 more

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

@vyasr
Copy link
Contributor Author

vyasr commented Oct 18, 2022

@jlowe I can fix the java errors, but before I do so do you know if this removal will cause any problems? Does anything in the Spark plugin etc currently rely on this behavior that might need to add in an extra layer to do the check now?

@jlowe
Copy link
Member

jlowe commented Oct 19, 2022

Does anything in the Spark plugin etc currently rely on this behavior that might need to add in an extra layer to do the check now?

No, I don't see any cases where the plugin is relying on the check for proper behavior. However this change will break the Java APIs and thus the Spark plugin build when it merges. I'll work on a Spark plugin PR that can be tested against this one to help avoid surprises and reduce the time the plugin build is broken.

@vyasr
Copy link
Contributor Author

vyasr commented Oct 19, 2022

Awesome thanks @jlowe. What is the best path forward here? Should I go ahead and update the JNI code in this PR so that you can build directly against it, or is that not something that you need?

@jlowe
Copy link
Member

jlowe commented Oct 19, 2022

What is the best path forward here?

Go ahead and make the Java changes in this PR. I've already posted a plugin PR at NVIDIA/spark-rapids#6860 with the assumption the check parameter will be removed from Table.scatter and Table.repeat. After you post the Java changes to this PR, I'll do a build against the cudf PR to verify the Spark plugin is still happy with my plugin changes to fix the build. Assuming all goes well, we'll be ready to quickly fix the Spark plugin build when this PR merges and is published in the nightly snapshot.

@github-actions github-actions bot added the Java Affects Java cuDF API. label Oct 19, 2022
@vyasr
Copy link
Contributor Author

vyasr commented Oct 19, 2022

Yup I saw the PR. I've made the changes here and TableJni.cpp now compiles for me locally, so I think it's ready.

@vyasr
Copy link
Contributor Author

vyasr commented Oct 19, 2022

Ah there appear to be some tests that I need to update. That shouldn't affect your ability to build and test the plugin against this though. I'll get on those tests.

@vyasr vyasr requested a review from a team as a code owner October 19, 2022 23:05
Copy link
Member

@jlowe jlowe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Built this against NVIDIA/spark-rapids#6860 and verified the integration tests for joins and arrays passed (i.e.: the places where we use Table.scatter and Table.repeat).

@vyasr
Copy link
Contributor Author

vyasr commented Oct 20, 2022

@gpucibot merge

@rapids-bot rapids-bot bot merged commit ff41841 into rapidsai:branch-22.12 Oct 20, 2022
@vyasr vyasr deleted the task/remove_validation branch October 20, 2022 23:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3 - Ready for Review Ready for review by team breaking Breaking change improvement Improvement / enhancement to an existing function Java Affects Java cuDF API. libcudf Affects libcudf (C++/CUDA) code. Python Affects Python cuDF API.
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

[DISCUSSION] libcudf should not introspect input data to perform error checking
7 participants