Commit

Merge branch 'branch-0.11' of github.com:rapidsai/cudf into fea-median_with_null
karthikeyann committed Nov 19, 2019
2 parents dc1f1dc + d640dd8 commit beb9539
Showing 46 changed files with 4,382 additions and 390 deletions.
9 changes: 9 additions & 0 deletions CHANGELOG.md
@@ -26,13 +26,15 @@
- PR #3278 Add `to_host` utility to copy `column_view` to host
- PR #3087 Add new cudf::experimental bool8 wrapper
- PR #3219 Construct column from column_view
- PR #3229 Define and implement new search APIs
- PR #3308 java add API for memory usage callbacks
- PR #2691 Row-wise reduction and scan operations via CuPy
- PR #3291 Add normalize_nans_and_zeros
- PR #3344 java split API
- PR #2791 Add `groupby.std()`
- PR #3368 Enable dropna argument in dask_cudf groupby
- PR #3298 add null replacement iterator for column_device_view
- PR #3396 Update device_atomics with new bool8 and timestamp specializations

## Improvements

@@ -93,6 +95,7 @@
- PR #3350 Port NVStrings booleans convert functions
- PR #3231 Add `column::release()` to give up ownership of contents.
- PR #3157 Use enum class rather than enum for mask_allocation_policy
- PR #3232 Port NVStrings datetime conversion to cudf strings column
- PR #3136 Define and implement new transpose API
- PR #3237 Define and implement new transform APIs
- PR #3245 Move binaryop files to legacy
@@ -115,13 +118,18 @@
- PR #3294 Update to arrow-cpp and pyarrow 0.15.1
- PR #3310 Add `row_hasher` and `element_hasher` utilities
- PR #3286 Clean up the starter code on README
- PR #3322 Port NVStrings pad operations to cudf strings column
- PR #3345 Add cache member for number of characters in string_view class
- PR #3299 Define and implement new `is_sorted` APIs
- PR #3328 Partition by stripes in dask_cudf ORC reader
- PR #3243 Use upstream join code in dask_cudf
- PR #3371 Add `select` method to `table_view`
- PR #3309 Add java and JNI bindings for search bounds
- PR #3380 Concatenate columns of strings
- PR #3382 Add fill function for strings column
- PR #3391 Move device_atomics_tests.cu files to legacy
- PR #3389 Move quantiles.hpp + group_quantiles.hpp files to legacy
- PR #3398 Move reshape.hpp files to legacy

## Bug Fixes

@@ -170,6 +178,7 @@
- PR #3383 Fix : properly compute null counts for rolling_window.
- PR #3386 Removing external includes from `column_view.hpp`
- PR #3369 Add write_partition to dask_cudf to fix to_parquet bug
- PR #3388 Support getitem with bools when DataFrame has a MultiIndex


# cuDF 0.10.0 (16 Oct 2019)
2 changes: 2 additions & 0 deletions conda/recipes/libcudf/meta.yaml
@@ -48,6 +48,7 @@ test:
- test -f $PREFIX/lib/libcudftestutil.a
- test -f $PREFIX/include/cudf/legacy/bitmask.hpp
- test -f $PREFIX/include/cudf/legacy/column.hpp
- test -f $PREFIX/include/cudf/legacy/reshape.hpp
- test -f $PREFIX/include/cudf/legacy/table.hpp
- test -f $PREFIX/include/cudf/utilities/legacy/nvcategory_util.hpp
- test -f $PREFIX/include/cudf/utilities/legacy/type_dispatcher.hpp
@@ -72,6 +73,7 @@ test:
- test -f $PREFIX/include/cudf/legacy/merge.hpp
- test -f $PREFIX/include/cudf/legacy/join.hpp
- test -f $PREFIX/include/cudf/legacy/predicates.hpp
- test -f $PREFIX/include/cudf/legacy/quantiles.hpp
- test -f $PREFIX/include/cudf/legacy/reduction.hpp
- test -f $PREFIX/include/cudf/legacy/replace.hpp
- test -f $PREFIX/include/cudf/legacy/rolling.hpp
23 changes: 16 additions & 7 deletions cpp/CMakeLists.txt
@@ -400,8 +400,8 @@ add_library(cudf
src/datetime/legacy/datetime_ops.cu
src/datetime/datetime_util.cpp
src/hash/legacy/hashing.cu
src/quantiles/quantiles.cu
src/quantiles/group_quantiles.cu
src/quantiles/legacy/quantiles.cu
src/quantiles/legacy/group_quantiles.cu
src/reductions/legacy/reductions.cu
src/reductions/legacy/min.cu
src/reductions/legacy/max.cu
@@ -416,8 +416,8 @@ add_library(cudf
src/reductions/legacy/group_std.cu
src/reductions/legacy/scan.cu
src/replace/legacy/replace.cu
src/replace/replace.cu
src/reshape/stack.cu
src/replace/replace.cu
src/reshape/legacy/stack.cu
src/transpose/transpose.cu
src/transpose/legacy/transpose.cu
src/merge/legacy/merge.cu
@@ -477,6 +477,7 @@ add_library(cudf
src/filling/legacy/repeat.cu
src/filling/legacy/tile.cu
src/search/legacy/search.cu
src/search/search.cu
src/column/column.cu
src/column/column_view.cpp
src/column/column_device_view.cu
@@ -488,19 +489,27 @@ add_library(cudf
src/sort/sort.cu
src/column/legacy/interop.cpp
src/strings/attributes.cu
src/strings/copying/copying.cu
src/strings/copying/concatenate.cu
src/strings/sorting/sorting.cu
src/strings/substring.cu
src/strings/combine.cu
src/strings/char_types/char_types.cu
src/strings/case.cu
src/strings/char_types/char_types.cu
src/strings/combine.cu
src/strings/convert/convert_integers.cu
src/strings/convert/convert_booleans.cu
src/strings/convert/convert_datetime.cu
src/strings/convert/convert_floats.cu
src/strings/convert/convert_integers.cu
src/strings/copying/copying.cu
src/strings/filling/fill.cu
src/strings/find.cu
src/strings/filling/fill.cu
src/strings/padding.cu
src/strings/sorting/sorting.cu
src/strings/strings_column_factories.cu
src/strings/strings_column_view.cu
src/strings/strings_scalar_factories.cpp
src/strings/sorting/sorting.cu
src/strings/substring.cu
src/strings/utilities.cu
src/scalar/scalar.cpp
12 changes: 6 additions & 6 deletions cpp/benchmarks/CMakeLists.txt
@@ -96,15 +96,15 @@ ConfigureBench(TYPE_DISPATCHER_BENCH "${TD_BENCH_SRC}")
###################################################################################################
# - quantiles benchmark ---------------------------------------------------------------------------

set(QUANTILES_BENCH_SRC
"${CMAKE_CURRENT_SOURCE_DIR}/quantiles/group_quantiles_benchmark.cu")
set(LEGACY_QUANTILES_BENCH_SRC
"${CMAKE_CURRENT_SOURCE_DIR}/quantiles/legacy/group_quantiles_benchmark.cu")

ConfigureBench(QUANTILES_BENCH "${QUANTILES_BENCH_SRC}")
ConfigureBench(LEGACY_QUANTILES_BENCH "${LEGACY_QUANTILES_BENCH_SRC}")

###################################################################################################
# - stack benchmark -------------------------------------------------------------------------------

set(STACK_BENCH_SRC
"${CMAKE_CURRENT_SOURCE_DIR}/reshape/stack_benchmark.cu")
set(LEGACY_STACK_BENCH_SRC
"${CMAKE_CURRENT_SOURCE_DIR}/reshape/legacy/stack_benchmark.cu")

ConfigureBench(STACK_BENCH "${STACK_BENCH_SRC}")
ConfigureBench(LEGACY_STACK_BENCH "${LEGACY_STACK_BENCH_SRC}")
@@ -16,7 +16,7 @@

#include <tests/utilities/legacy/column_wrapper.cuh>

#include <cudf/quantiles.hpp>
#include <cudf/legacy/quantiles.hpp>
#include <random>

#include <benchmark/benchmark.h>
@@ -17,7 +17,7 @@
#include <tests/utilities/legacy/column_wrapper.cuh>
#include <tests/utilities/legacy/column_wrapper_factory.hpp>

#include <cudf/reshape.hpp>
#include <cudf/legacy/reshape.hpp>
#include <cudf/types.h>

#include <benchmarks/fixture/benchmark_fixture.hpp>
2 changes: 1 addition & 1 deletion cpp/include/cudf/legacy/groupby.hpp
@@ -19,7 +19,7 @@

#include <cudf/cudf.h>
#include <cudf/types.hpp>
#include <cudf/quantiles.hpp>
#include <cudf/legacy/quantiles.hpp>

#include <tuple>
#include <vector>
1 change: 0 additions & 1 deletion cpp/include/cudf/legacy/io_types.hpp
@@ -21,7 +21,6 @@
#include <memory>
#include <utility>

#include <cudf/cudf.h>
#include <cudf/types.hpp>
#include <cudf/legacy/table.hpp>
#include <cudf/legacy/io_types.h>
@@ -16,8 +16,7 @@

#pragma once

#include "cudf.h"
#include "types.hpp"
#include <cudf/types.hpp>

namespace cudf {

@@ -16,7 +16,7 @@

#pragma once

#include "cudf.h"
#include <cudf/cudf.h>

namespace cudf {

7 changes: 6 additions & 1 deletion cpp/include/cudf/scalar/scalar.hpp
@@ -131,13 +131,18 @@ class fixed_width_scalar : public scalar {
*
* @param stream The CUDA stream to do the operation in
*/
T value(cudaStream_t stream = 0) { return _data.value(stream); }
T value(cudaStream_t stream = 0) const { return _data.value(stream); }

/**
* @brief Returns a raw pointer to the value in device memory
*/
T* data() { return _data.data(); }

/**
* @brief Returns a raw pointer to the value in device memory
*/
T const* data() const { return _data.data(); }

protected:
rmm::device_scalar<T> _data{}; ///< device memory containing the value
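
The hunk above const-qualifies `value()` and adds a const overload of `data()`. The sketch below suggests why that matters for callers that only hold a `const&` to the scalar. It is a host-only illustration: `host_scalar` and `read_through_const_ref` are hypothetical names invented here, and the value lives in ordinary host memory rather than in an `rmm::device_scalar<T>`.

```cpp
#include <cassert>

// Hypothetical host-side stand-in for fixed_width_scalar<T>. The real class
// stores its value in device memory; this one uses a plain member instead.
template <typename T>
class host_scalar {
 public:
  explicit host_scalar(T v) : _data{v} {}

  // const-qualified, so it can be called through a const reference.
  T value() const { return _data; }

  // Mutable pointer, available on non-const objects only.
  T* data() { return &_data; }

  // Read-only pointer, selected when the object is const.
  T const* data() const { return &_data; }

 private:
  T _data{};
};

// Takes the scalar by const reference. This compiles only because value()
// is const-qualified and data() has a const overload.
template <typename T>
T read_through_const_ref(host_scalar<T> const& s) {
  assert(s.data() != nullptr);
  return s.value();
}
```

Without the `const` qualifier on `value()`, the call inside `read_through_const_ref` would be rejected by the compiler, which is the situation a function taking `scalar const&` would otherwise run into.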

135 changes: 135 additions & 0 deletions cpp/include/cudf/search.hpp
@@ -0,0 +1,135 @@
/*
* Copyright (c) 2019, NVIDIA CORPORATION.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

#pragma once

#include <cudf/types.hpp>
#include <cudf/column/column.hpp>
#include <cudf/scalar/scalar.hpp>
#include <cudf/table/table.hpp>

#include <vector>

namespace cudf {
namespace experimental {

/**---------------------------------------------------------------------------*
* @brief Find smallest indices in a sorted table where values should be
* inserted to maintain order
*
* For each row v in @p values, find the first index in @p t where
* inserting the row will maintain the sort order of @p t
*
* Example:
*
* Single column:
* idx 0 1 2 3 4
* column = { 10, 20, 20, 30, 50 }
* values = { 20 }
* result = { 1 }
*
* Multi Column:
* idx 0 1 2 3 4
* t = {{ 10, 20, 20, 20, 20 },
* { 5.0, .5, .5, .7, .7 },
* { 90, 77, 78, 61, 61 }}
* values = {{ 20 },
* { .7 },
* { 61 }}
* result = { 3 }
*
* @param t Table to search
* @param values Find insert locations for these values
* @param column_order Vector of column sort order
* @param null_precedence Vector of null_precedence enums
* values
* @param mr Device memory resource to use for device memory allocation
* @return std::unique_ptr<column> A non-nullable column of cudf::size_type elements
* containing the insertion points.
*---------------------------------------------------------------------------**/
std::unique_ptr<column> lower_bound(table_view const& t,
table_view const& values,
std::vector<order> const& column_order,
std::vector<null_order> const& null_precedence,
rmm::mr::device_memory_resource* mr = rmm::mr::get_default_resource());

/**---------------------------------------------------------------------------*
* @brief Find largest indices in a sorted table where values should be
* inserted to maintain order
*
* For each row v in @p values, find the last index in @p t where
* inserting the row will maintain the sort order of @p t
*
* Example:
*
* Single Column:
* idx 0 1 2 3 4
* column = { 10, 20, 20, 30, 50 }
* values = { 20 }
* result = { 3 }
*
* Multi Column:
* idx 0 1 2 3 4
* t = {{ 10, 20, 20, 20, 20 },
* { 5.0, .5, .5, .7, .7 },
* { 90, 77, 78, 61, 61 }}
* values = {{ 20 },
* { .7 },
* { 61 }}
* result = { 5 }
*
* @param t Table to search
* @param values Find insert locations for these values
* @param column_order Vector of column sort order
* @param null_precedence Vector of null_precedence enums
* values
* @param mr Device memory resource to use for device memory allocation
* @return std::unique_ptr<column> A non-nullable column of cudf::size_type elements
* containing the insertion points.
*---------------------------------------------------------------------------**/
std::unique_ptr<column> upper_bound(table_view const& t,
table_view const& values,
std::vector<order> const& column_order,
std::vector<null_order> const& null_precedence,
rmm::mr::device_memory_resource* mr = rmm::mr::get_default_resource());

/**---------------------------------------------------------------------------*
* @brief Find if the `value` is present in the `col`
*
* @throws cudf::logic_error
* If `col.type() != value.type()`
*
* @example:
*
* Single Column:
* idx 0 1 2 3 4
* col = { 10, 20, 20, 30, 50 }
* Scalar:
* value = { 20 }
* result = true
*
* @param col A column object
* @param value A scalar value to search for in `col`
* @param mr Device memory resource to use for device memory allocation
*
* @return bool true if `value` is found in `col`, false otherwise.
*---------------------------------------------------------------------------**/
bool contains(column_view const& col, scalar const& value,
rmm::mr::device_memory_resource* mr = rmm::mr::get_default_resource());

} // namespace experimental
} // namespace cudf
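
The examples in the header comments can be checked on the host with the standard library. The following is a rough analogy, not the libcudf implementation: it assumes ascending order on every column and no nulls, so it does not model `column_order` or `null_precedence`. The names `row`, `host_lower_bound`, `host_upper_bound`, `host_contains`, and `example_table` are hypothetical and exist only for this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <tuple>
#include <vector>

// Hypothetical row type: one tuple per table row, compared lexicographically.
using row = std::tuple<int, double, int>;

// First index at which `v` can be inserted while keeping `sorted` ordered.
template <typename T>
std::size_t host_lower_bound(std::vector<T> const& sorted, T const& v) {
  assert(std::is_sorted(sorted.begin(), sorted.end()));
  auto it = std::lower_bound(sorted.begin(), sorted.end(), v);
  return static_cast<std::size_t>(it - sorted.begin());
}

// Last index at which `v` can be inserted while keeping `sorted` ordered.
template <typename T>
std::size_t host_upper_bound(std::vector<T> const& sorted, T const& v) {
  assert(std::is_sorted(sorted.begin(), sorted.end()));
  auto it = std::upper_bound(sorted.begin(), sorted.end(), v);
  return static_cast<std::size_t>(it - sorted.begin());
}

// True if `value` occurs anywhere in `col`; the column need not be sorted.
template <typename T>
bool host_contains(std::vector<T> const& col, T const& value) {
  return std::find(col.begin(), col.end(), value) != col.end();
}

// The multi-column table from the header comments, stored row by row.
std::vector<row> example_table() {
  return {row{10, 5.0, 90}, row{20, 0.5, 77}, row{20, 0.5, 78},
          row{20, 0.7, 61}, row{20, 0.7, 61}};
}
```

For the single column `{ 10, 20, 20, 30, 50 }` and the value `20`, these helpers give insertion points 1 and 3. For the multi-column table and the row `(20, .7, 61)`, they give 3 and 5, matching the documented results.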

