Fixes up the overflowed fixed-point round on nullable column #10316

Merged
merged 4 commits into from
Feb 22, 2022
Changes from 2 commits
22 changes: 19 additions & 3 deletions cpp/src/round/round.cu
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2020, NVIDIA CORPORATION.
+ * Copyright (c) 2020-2022, NVIDIA CORPORATION.
  *
  * Licensed under the Apache License, Version 2.0 (the "License");
  * you may not use this file except in compliance with the License.
@@ -16,6 +16,8 @@

 #include <cudf/column/column_factories.hpp>
 #include <cudf/copying.hpp>
+#include <cudf/detail/copy_range.cuh>
+#include <cudf/detail/iterator.cuh>
 #include <cudf/detail/null_mask.hpp>
 #include <cudf/detail/nvtx/ranges.hpp>
 #include <cudf/detail/round.hpp>
@@ -259,8 +261,22 @@ std::unique_ptr<column> round_with(column_view const& input,
     // overflow. Under this circumstance, we can simply output a zero column because no digits can
     // survive such a large scale movement.
     if (scale_movement > cuda::std::numeric_limits<Type>::digits10) {
-      auto zero_scalar = make_fixed_point_scalar<T>(0, scale_type{-decimal_places});
-      detail::fill_in_place(out_view, 0, out_view.size(), *zero_scalar, stream);
+      if (input.nullable()) {
+        auto device_view = column_device_view::create(out_view, stream);
+        detail::copy_range(thrust::make_constant_iterator(static_cast<Type>(0)),
+                           detail::make_validity_iterator(*device_view),
+                           out_view,
+                           out_view.offset(),
+                           out_view.offset() + out_view.size(),
+                           stream);
+      } else {
+        detail::copy_range(thrust::make_constant_iterator(static_cast<Type>(0)),
+                           thrust::make_constant_iterator(false),
Contributor

Wait, if the input has no nulls, will the output be a column of all nulls?

Contributor Author

Oh, you are right. It is unnecessary to set the null mask if the input doesn't have one.

Contributor

If so, then we can just use thrust::uninitialized_fill to fill the output with zeros.

Contributor Author

@sperlingxx sperlingxx Feb 21, 2022

Done. Though I am quite curious why we use thrust::uninitialized_fill instead of thrust::fill here. Is it because thrust::fill incurs an extra cost by initializing the data before assigning the value? @ttnghia

Contributor

That's very interesting. I believe that for plain types they are the same. Otherwise, thrust::uninitialized_fill calls the copy constructor while thrust::fill calls the assignment operator.
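
The distinction drawn in this reply can be illustrated on the host with the standard-library counterparts, std::uninitialized_fill and std::fill. This is only a sketch: it assumes the Thrust algorithms follow the same construct-versus-assign semantics as the std ones, and the Probe type and both helper functions are hypothetical names introduced for the illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical probe type that counts how its instances are written.
struct Probe {
  static int copy_constructions;
  static int copy_assignments;
  int value;

  Probe() : value{0} {}
  explicit Probe(int v) : value{v} {}
  Probe(Probe const& other) : value{other.value} { ++copy_constructions; }
  Probe& operator=(Probe const& other)
  {
    value = other.value;
    ++copy_assignments;
    return *this;
  }
};
int Probe::copy_constructions = 0;
int Probe::copy_assignments   = 0;

// Fills raw, unconstructed storage: each element is copy-constructed in place.
inline int constructions_with_uninitialized_fill(int n)
{
  assert(n > 0);
  Probe::copy_constructions = 0;
  Probe::copy_assignments   = 0;
  std::allocator<Probe> alloc;
  Probe* raw = alloc.allocate(n);
  std::uninitialized_fill(raw, raw + n, Probe{7});
  int const result = Probe::copy_constructions;
  for (int i = 0; i < n; ++i) { raw[i].~Probe(); }
  alloc.deallocate(raw, n);
  return result;
}

// Fills objects that already exist: each element is copy-assigned.
inline int assignments_with_fill(int n)
{
  assert(n > 0);
  std::vector<Probe> objects(n);  // elements are default-constructed first
  Probe::copy_constructions = 0;
  Probe::copy_assignments   = 0;
  std::fill(objects.begin(), objects.end(), Probe{7});
  return Probe::copy_assignments;
}
```

For a trivially copyable representation type such as int32_t both paths amount to writing the same bytes, which matches the "for plain types they are the same" remark.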

Contributor

In addition, it would be nice if you can add a unit test for this case 😃

Contributor Author

That's interesting!
The unit test for validity was added to one of the existing test blocks.

+                           out_view,
+                           out_view.offset(),
+                           out_view.offset() + out_view.size(),
+                           stream);
+      }
     } else {
       Type const n = std::pow(10, scale_movement);
       thrust::transform(rmm::exec_policy(stream),
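
The behavior the review converged on for this branch (zero every value, keep whatever validity the input had) can be sketched on the host as follows. This is not the cudf implementation: HostColumn, exceeds_max_precision, zero_fill_keep_validity and null_count are hypothetical names, and a std::vector<bool> stands in for the device null mask.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical host-side stand-in for a fixed-point column (not a cudf type).
struct HostColumn {
  std::vector<std::int32_t> values;
  std::vector<bool> validity;  // empty means the column has no null mask
};

// Mirrors the guard in the hunk: more digits are shifted out than Rep can hold.
template <typename Rep>
bool exceeds_max_precision(int scale_movement)
{
  return scale_movement > std::numeric_limits<Rep>::digits10;
}

// Zero every value but carry the input validity over unchanged, so null rows
// stay null and a column without a mask stays without one.
inline HostColumn zero_fill_keep_validity(HostColumn const& input)
{
  assert(input.validity.empty() || input.validity.size() == input.values.size());
  HostColumn out;
  out.values.assign(input.values.size(), 0);
  out.validity = input.validity;
  return out;
}

// Number of null rows; a column without a mask has none.
inline std::size_t null_count(HostColumn const& col)
{
  std::size_t nulls = 0;
  for (bool valid : col.validity) {
    if (!valid) { ++nulls; }
  }
  return nulls;
}
```

Passing a constant false validity for a non-nullable input, as the revision under review does, would instead mark every row null, which is the problem raised in the first review comment.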
51 changes: 28 additions & 23 deletions cpp/tests/round/round_tests.cpp
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2020-2021, NVIDIA CORPORATION.
+ * Copyright (c) 2020-2022, NVIDIA CORPORATION.
  *
  * Licensed under the Apache License, Version 2.0 (the "License");
  * you may not use this file except in compliance with the License.
@@ -311,36 +311,41 @@ TYPED_TEST(RoundTestsFixedPointTypes, TestForBlog)
   CUDF_TEST_EXPECT_COLUMNS_EQUAL(expected, result->view());
 }

-TEST_F(RoundTests, TestScaleMovementExceedingMaxPrecision)
+TYPED_TEST(RoundTestsFixedPointTypes, TestScaleMovementExceedingMaxPrecision)
 {
   using namespace numeric;
-  using dec32_wrapper = cudf::test::fixed_point_column_wrapper<int32_t>;
-  using dec64_wrapper = cudf::test::fixed_point_column_wrapper<int64_t>;
-  using dec128_wrapper = cudf::test::fixed_point_column_wrapper<__int128_t>;
+  using decimalXX = TypeParam;
+  using RepType = cudf::device_storage_type_t<decimalXX>;
+  using fp_wrapper = cudf::test::fixed_point_column_wrapper<RepType>;

   // max precision of int32 = 9
   // scale movement = -(-11) -1 = 10 > 9
-  auto const input_32 =
-    dec32_wrapper{{14, 15, 16, 24, 25, 26, -14, -15, -16, -24, -25, -26}, scale_type{1}};
-  auto const expected_32 = dec32_wrapper{{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0}, scale_type{11}};
-  auto const result_32 = cudf::round(input_32, -11, cudf::rounding_method::HALF_UP);
-  CUDF_TEST_EXPECT_COLUMNS_EQUAL(expected_32, result_32->view());
-
   // max precision of int64 = 18
   // scale movement = -(-20) -1 = 19 > 18
-  auto const input_64 =
-    dec64_wrapper{{14, 15, 16, 24, 25, 26, -14, -15, -16, -24, -25, -26}, scale_type{1}};
-  auto const expected_64 = dec64_wrapper{{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0}, scale_type{20}};
-  auto const result_64 = cudf::round(input_64, -20, cudf::rounding_method::HALF_EVEN);
-  CUDF_TEST_EXPECT_COLUMNS_EQUAL(expected_64, result_64->view());
-
   // max precision of int128 = 38
-  // scale movement = -(-40) -1 = 39 > 18
-  auto const input_128 =
-    dec128_wrapper{{14, 15, 16, 24, 25, 26, -14, -15, -16, -24, -25, -26}, scale_type{1}};
-  auto const expected_128 = dec128_wrapper{{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0}, scale_type{40}};
-  auto const result_128 = cudf::round(input_128, -40, cudf::rounding_method::HALF_UP);
-  CUDF_TEST_EXPECT_COLUMNS_EQUAL(expected_128, result_128->view());
+  // scale movement = -(-40) -1 = 39 > 38
+  auto const target_scale = cuda::std::numeric_limits<RepType>::digits10 + 1 + 1;
+
+  auto const input =
+    fp_wrapper{{14, 15, 16, 24, 25, 26, -14, -15, -16, -24, -25, -26}, scale_type{1}};
+  auto const expected = fp_wrapper{{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0}, scale_type{target_scale}};
+  auto const result = cudf::round(input, -target_scale, cudf::rounding_method::HALF_UP);
+  CUDF_TEST_EXPECT_COLUMNS_EQUAL(expected, result->view());
+
+  auto const input_even =
+    fp_wrapper{{14, 15, 16, 24, 25, 26, -14, -15, -16, -24, -25, -26}, scale_type{1}};
+  auto const expected_even =
+    fp_wrapper{{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0}, scale_type{target_scale}};
+  auto const result_even = cudf::round(input, -target_scale, cudf::rounding_method::HALF_EVEN);
+  CUDF_TEST_EXPECT_COLUMNS_EQUAL(expected_even, result_even->view());
+
+  const std::initializer_list<bool> validity = {1, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 0};
+  auto const input_null =
+    fp_wrapper{{14, 15, 16, 24, 25, 26, -14, -15, -16, -24, -25, -26}, validity, scale_type{1}};
+  auto const expected_null =
+    fp_wrapper{{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0}, validity, scale_type{target_scale}};
+  auto const result_null = cudf::round(input_null, -target_scale, cudf::rounding_method::HALF_UP);
+  CUDF_TEST_EXPECT_COLUMNS_EQUAL(expected_null, result_null->view());
 }

 TYPED_TEST(RoundTestsFloatingPointTypes, SimpleFloatingPointTestHalfUp0)
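
The arithmetic behind target_scale in the typed test can be checked with a small host-side sketch. The scale-movement formula is an assumption read off the test comments ("-(-11) -1 = 10"), std::numeric_limits stands in for cuda::std::numeric_limits, and all function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Scale used by the typed test: two more than the decimal digits Rep can hold.
template <typename Rep>
constexpr int target_scale()
{
  return std::numeric_limits<Rep>::digits10 + 1 + 1;
}

// Assumed formula, inferred from the test comments: the number of digits
// dropped when rounding a column of scale input_scale to decimal_places.
constexpr int scale_movement(int decimal_places, int input_scale)
{
  return -decimal_places - input_scale;
}

// True when the test parameters (input scale 1, decimal_places = -target_scale)
// push the scale movement past the representable digits, i.e. the zero-fill path.
template <typename Rep>
constexpr bool test_hits_overflow_branch()
{
  return scale_movement(-target_scale<Rep>(), 1) > std::numeric_limits<Rep>::digits10;
}

// Usage sketch: the int32 case spelled out in the test comments.
inline void check_int32_case()
{
  assert(target_scale<std::int32_t>() == 11);
  assert(scale_movement(-11, 1) == 10);
  assert(test_hits_overflow_branch<std::int32_t>());
}
```

Deriving the scale from digits10 lets one TYPED_TEST body cover every representation type, where the removed TEST_F hard-coded 11, 20 and 40.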