Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add in negative size checks for columns #12118

Merged
merged 3 commits into from
Nov 15, 2022

Conversation

revans2
Copy link
Contributor

@revans2 revans2 commented Nov 10, 2022

Description

This fixes #12116
This just adds in a few checks for negative sizes to avoid any issues with rounding errors and also helps us detect errors sooner. It will not fix small negative allocations for device buffers directly.

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@revans2 revans2 added bug Something isn't working 3 - Ready for Review Ready for review by team libcudf Affects libcudf (C++/CUDA) code. Spark Functionality that helps Spark RAPIDS non-breaking Non-breaking change labels Nov 10, 2022
@revans2 revans2 self-assigned this Nov 10, 2022
@revans2 revans2 requested a review from a team as a code owner November 10, 2022 16:44
@@ -133,6 +133,7 @@ class column {
_null_count{null_count},
_children{std::move(children)}
{
CUDF_EXPECTS(size >= 0, "Column size cannot be negative.");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be the only one you need. The other ones are redundant since the all call through to here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is correct, but I added the other checks as a way to skip allocating negative data and null mask buffers with a negative size if we know early on that it is not needed. If you want me to remove the other checks I will.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope, that makes sense. We should leave those in.
The other thing to do here is to add @throw tag to the doxygen comments for each of these

* @throws cudf::logic_error if `size < 0`

The column_factories ones go into cpp/include/cudf/column/column_factories.hpp

@codecov
Copy link

codecov bot commented Nov 10, 2022

Codecov Report

Base: 87.47% // Head: 88.12% // Increases project coverage by +0.64% 🎉

Coverage data is based on head (21d0ab2) compared to base (f817d96).
Patch has no changes to coverable lines.

Additional details and impacted files
@@               Coverage Diff                @@
##           branch-22.12   #12118      +/-   ##
================================================
+ Coverage         87.47%   88.12%   +0.64%     
================================================
  Files               133      135       +2     
  Lines             21826    22133     +307     
================================================
+ Hits              19093    19504     +411     
+ Misses             2733     2629     -104     
Impacted Files Coverage Δ
python/cudf/cudf/core/column/interval.py 85.45% <0.00%> (-9.10%) ⬇️
python/cudf/cudf/io/text.py 91.66% <0.00%> (-8.34%) ⬇️
python/cudf/cudf/core/_base_index.py 81.28% <0.00%> (-4.27%) ⬇️
python/cudf/cudf/io/json.py 92.06% <0.00%> (-2.68%) ⬇️
python/cudf/cudf/utils/utils.py 89.91% <0.00%> (-0.69%) ⬇️
python/cudf/cudf/core/column/timedelta.py 90.17% <0.00%> (-0.58%) ⬇️
python/cudf/cudf/core/column/datetime.py 89.21% <0.00%> (-0.51%) ⬇️
python/cudf/cudf/core/column/column.py 87.96% <0.00%> (-0.46%) ⬇️
python/dask_cudf/dask_cudf/core.py 73.72% <0.00%> (-0.41%) ⬇️
python/cudf/cudf/io/parquet.py 90.45% <0.00%> (-0.39%) ⬇️
... and 43 more

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

Copy link
Contributor

@karthikeyann karthikeyann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍

@revans2
Copy link
Contributor Author

revans2 commented Nov 14, 2022

@davidwendt thanks for the review. I added in the doxegen tags.

@davidwendt
Copy link
Contributor

@davidwendt thanks for the review. I added in the doxegen tags.

Can you move the @throw tags up in the doxygen next to the other @throw tags?
Guidelines for reference: https://github.com/rapidsai/cudf/blob/branch-22.12/cpp/doxygen/developer_guide/DOCUMENTATION.md#function-parameters

@revans2
Copy link
Contributor Author

revans2 commented Nov 15, 2022

@davidwendt sorry about getting the placement wrong. Hopefully it is fixed now.

@revans2
Copy link
Contributor Author

revans2 commented Nov 15, 2022

@gpucibot merge

@rapids-bot rapids-bot bot merged commit 186e129 into rapidsai:branch-22.12 Nov 15, 2022
@revans2 revans2 deleted the add_neg_size_checks branch November 15, 2022 19:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3 - Ready for Review Ready for review by team bug Something isn't working libcudf Affects libcudf (C++/CUDA) code. non-breaking Non-breaking change Spark Functionality that helps Spark RAPIDS
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BUG] Allocating columns with small negative lengths succeeds
4 participants