Fix orc reader assert on create data_type (#8174)
The following error appears in gtests/ORC_TEST when it is run against a Debug build of libcudf.
```
[ RUN      ] OrcWriterNumericTypeTest/0.SingleColumn
ORC_TEST: ../include/cudf/types.hpp:267: cudf::data_type::data_type(cudf::type_id, int32_t): Assertion `id == type_id::DECIMAL32 || id == type_id::DECIMAL64' failed.
Aborted (core dumped)
```
The assert occurs in the `data_type` constructor meant for fixed-point types only.
https://github.com/rapidsai/cudf/blob/96c0706ad2b2dd788608540a97a5938d57cf3a44/cpp/include/cudf/types.hpp#L265-L268

The ORC reader implementation uses this constructor for all types, passing 0 as the scale when the type is not fixed-point.
The assert fires only in a debug build, since `assert` is compiled out of release builds.

This PR changes the logic to call the appropriate `data_type` constructor depending on the type-id of the parsed column.
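
The constructor selection described above can be sketched in isolation. This is a minimal illustration and not the libcudf source: `type_id` and `data_type` here are simplified stand-ins for the real `cudf` types, and `make_column_type` is a hypothetical helper that does not exist in the reader.

```cpp
#include <cassert>
#include <cstdint>

// Simplified stand-ins for cudf::type_id and cudf::data_type (assumptions,
// not the real library definitions).
enum class type_id : int32_t { INT32, FLOAT64, STRING, DECIMAL32, DECIMAL64 };

class data_type {
 public:
  // General-purpose constructor: takes no scale.
  explicit data_type(type_id id) : _id{id} {}

  // Fixed-point-only constructor, mirroring the assertion in the error above.
  data_type(type_id id, int32_t scale) : _id{id}, _scale{scale}
  {
    assert(id == type_id::DECIMAL32 || id == type_id::DECIMAL64);
  }

  type_id id() const { return _id; }
  int32_t scale() const { return _scale; }

 private:
  type_id _id;
  int32_t _scale{0};
};

// Hypothetical helper: pick the constructor from the parsed column's type id,
// so the two-argument form is reached only for fixed-point columns.
inline data_type make_column_type(type_id id, int32_t cudf_scale)
{
  if (id == type_id::DECIMAL64) { return data_type{id, cudf_scale}; }
  return data_type{id};
}
```

With this shape, `make_column_type(type_id::INT32, 0)` never reaches the asserting constructor, which is the behavior the Debug build requires.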

Authors:
  - David Wendt (https://github.com/davidwendt)

Approvers:
  - Devavret Makkar (https://github.com/devavret)
  - Vukasin Milovanovic (https://github.com/vuule)
  - Ram (Ramakrishna Prabhu) (https://github.com/rgsl888prabhu)

URL: #8174
davidwendt authored May 7, 2021
1 parent 57a8ad2 commit 245d8c1
Showing 1 changed file with 9 additions and 6 deletions.
15 changes: 9 additions & 6 deletions cpp/src/io/orc/reader_impl.cu
@@ -430,12 +430,15 @@ table_with_metadata reader::impl::read(size_type skip_rows,
 // Remove this once we support Decimal128 data type
 CUDF_EXPECTS((col_type != type_id::DECIMAL64) or (_metadata->ff.types[col].precision <= 18),
              "Decimal data has precision > 18, Decimal64 data type doesn't support it.");
-// sign of the scale is changed since cuDF follows c++ libraries like CNL
-// which uses negative scaling, but liborc and other libraries
-// follow positive scaling.
-auto scale =
-  (col_type == type_id::DECIMAL64) ? -static_cast<int32_t>(_metadata->ff.types[col].scale) : 0;
-column_types.emplace_back(col_type, scale);
+if (col_type == type_id::DECIMAL64) {
+  // sign of the scale is changed since cuDF follows c++ libraries like CNL
+  // which uses negative scaling, but liborc and other libraries
+  // follow positive scaling.
+  auto const scale = -static_cast<int32_t>(_metadata->ff.types[col].scale);
+  column_types.emplace_back(col_type, scale);
+} else {
+  column_types.emplace_back(col_type);
+}

 // Map each ORC column to its column
 orc_col_map[col] = column_types.size() - 1;
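
The sign flip mentioned in the code comment can be shown with a small worked example. This is a sketch under stated assumptions: `to_cudf_scale` and `decimal_value` are hypothetical helpers, and the stored ORC scale is assumed to be an unsigned count of fractional digits. Under the negative-scaling convention, the represented value is the unscaled integer times 10 raised to the scale.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical helper: convert a positive, ORC-style scale (number of
// fractional digits) to the negative-scaling convention used by cuDF.
inline int32_t to_cudf_scale(uint32_t orc_scale) { return -static_cast<int32_t>(orc_scale); }

// Hypothetical helper: value = unscaled * 10^cudf_scale.
// Uses double only for illustration; real fixed-point code stays in integers.
inline double decimal_value(int64_t unscaled, int32_t cudf_scale)
{
  return static_cast<double>(unscaled) * std::pow(10.0, cudf_scale);
}
```

For instance, an unscaled value of 12345 stored with an ORC scale of 2 maps to a cuDF scale of -2 and represents approximately 123.45.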