Example to build custom application and link to libcudf #7671

Merged
merged 33 commits into from
Jun 18, 2021
Changes from 31 commits
Commits
c128957
Migrating from link_to_libcudf repo
isVoid Mar 22, 2021
fa688f1
Doc, dockerfile revise
isVoid Mar 24, 2021
7ef5fe0
Multiple library usage, code style fixes.
isVoid Mar 24, 2021
abf20ab
several code style updates after merge
isVoid Mar 24, 2021
43ebd65
further trim down codes
isVoid Mar 24, 2021
eed062d
Just using basic rmm mr
isVoid Mar 24, 2021
22d8bcb
remove unused imports
isVoid Mar 24, 2021
a9a3bb5
simpler cmakefile
isVoid Mar 24, 2021
d3b789c
removed ccmake dependency
isVoid Mar 24, 2021
d956daa
Reorder includes and stale line remove
isVoid Mar 24, 2021
9957a7f
Remove cuda mr init
isVoid Mar 24, 2021
6f0b9ce
simpler dockerfile, readme update
isVoid Mar 26, 2021
8f6a98e
Organized into basic example
isVoid Mar 26, 2021
b2fdb37
creating an agg now a func
isVoid Apr 2, 2021
5ebf755
Update build description to include pre-built case
isVoid Apr 2, 2021
736e440
Moving buildargs below to avoid rerun apt-installs
isVoid Apr 5, 2021
9494537
cuda 11.2 and main branch
isVoid Apr 22, 2021
78e1d7c
removing dockerfile
isVoid Apr 22, 2021
efd897d
readme
isVoid Apr 22, 2021
8a49e83
Root readme, add to CI?
isVoid Apr 22, 2021
906154c
Updated with new build script
isVoid Apr 22, 2021
b80a7de
example build script bugs
isVoid Apr 22, 2021
de82e6e
Merge branch 'branch-0.20' of https://github.com/rapidsai/cudf into b…
isVoid May 5, 2021
079b88e
Merge branch 'branch-21.06' of https://github.com/rapidsai/cudf into …
isVoid May 24, 2021
feb77cc
revert changes made to system level build scripts
isVoid May 24, 2021
6418342
Fixing bugs in build.sh
isVoid May 24, 2021
7d5f8c2
add examples into gpuci
isVoid May 24, 2021
d799c01
Comments
isVoid May 24, 2021
972f1a1
Return table_with_metadata instead
isVoid May 24, 2021
a87475e
readme
isVoid May 24, 2021
66dd82c
Abs path in build.sh
isVoid Jun 8, 2021
145a96e
Merge branch 'branch-21.08' of https://github.com/rapidsai/cudf into …
isVoid Jun 17, 2021
41a6e3f
Use latest branch and configure auto update
isVoid Jun 17, 2021
4 changes: 4 additions & 0 deletions ci/gpu/build.sh
@@ -189,6 +189,10 @@ else
else
$WORKSPACE/build.sh cudf dask_cudf cudf_kafka -l --ptds
fi

# If the examples grow too large to build here, they should move to the CPU side
gpuci_logger "Building libcudf examples"
$WORKSPACE/cpp/examples/build.sh
fi

# Both regular and Project Flash proceed here
8 changes: 8 additions & 0 deletions cpp/examples/README.md
@@ -0,0 +1,8 @@
# Libcudf Examples

This folder contains examples to demonstrate libcudf use cases. Running `build.sh` builds all
libcudf examples.

Current examples:

- Basic: demonstrates a basic libcudf use case and how to build a custom application that links against libcudf.
21 changes: 21 additions & 0 deletions cpp/examples/basic/4stock_5day.csv
@@ -0,0 +1,21 @@
Company,Date,Open,High,Low,Close,Volume
MSFT,2021-03-03,232.16000366210938,233.5800018310547,227.25999450683594,227.55999755859375,33950400.0
MSFT,2021-03-04,226.74000549316406,232.49000549316406,224.25999450683594,226.72999572753906,44584200.0
MSFT,2021-03-05,229.52000427246094,233.27000427246094,226.4600067138672,231.60000610351562,41842100.0
MSFT,2021-03-08,231.3699951171875,233.3699951171875,227.1300048828125,227.38999938964844,35245900.0
MSFT,2021-03-09,232.8800048828125,235.3800048828125,231.6699981689453,233.77999877929688,33034000.0
GOOG,2021-03-03,2067.2099609375,2088.51806640625,2010.0,2026.7099609375,1483100.0
GOOG,2021-03-04,2023.3699951171875,2089.239990234375,2020.27001953125,2049.090087890625,2116100.0
GOOG,2021-03-05,2073.1201171875,2118.110107421875,2046.4150390625,2108.5400390625,2193800.0
GOOG,2021-03-08,2101.1298828125,2128.81005859375,2021.6099853515625,2024.1700439453125,1646000.0
GOOG,2021-03-09,2070.0,2078.0400390625,2047.8299560546875,2052.699951171875,1696400.0
AMZN,2021-03-03,3081.179931640625,3107.780029296875,2995.0,3005.0,3967200.0
AMZN,2021-03-04,3012.0,3058.1298828125,2945.429931640625,2977.570068359375,5458700.0
AMZN,2021-03-05,3005.0,3009.0,2881.0,3000.4599609375,5383400.0
AMZN,2021-03-08,3015.0,3064.590087890625,2951.31005859375,2951.949951171875,4178500.0
AMZN,2021-03-09,3017.989990234375,3090.9599609375,3005.14990234375,3062.85009765625,4023500.0
AAPL,2021-03-03,124.80999755859375,125.70999908447266,121.83999633789062,122.05999755859375,112430400.0
AAPL,2021-03-04,121.75,123.5999984741211,118.62000274658203,120.12999725341797,177275300.0
AAPL,2021-03-05,120.9800033569336,121.94000244140625,117.56999969482422,121.41999816894531,153590400.0
AAPL,2021-03-08,120.93000030517578,121.0,116.20999908447266,116.36000061035156,153918600.0
AAPL,2021-03-09,119.02999877929688,122.05999755859375,118.79000091552734,121.08999633789062,129159600.0
34 changes: 34 additions & 0 deletions cpp/examples/basic/CMakeLists.txt
@@ -0,0 +1,34 @@
cmake_minimum_required(VERSION 3.18)

project(basic_example VERSION 0.0.1 LANGUAGES C CXX CUDA)

set(CMAKE_CXX_STANDARD 14)
set(CMAKE_CUDA_ARCHITECTURES "")
set(CMAKE_EXPORT_COMPILE_COMMANDS ON)

set(CPM_DOWNLOAD_VERSION 0.27.2)
set(CPM_DOWNLOAD_LOCATION "${CMAKE_BINARY_DIR}/cmake/CPM_${CPM_DOWNLOAD_VERSION}.cmake")

if(NOT (EXISTS ${CPM_DOWNLOAD_LOCATION}))
  message(STATUS "Downloading CPM.cmake")
  file(DOWNLOAD https://github.com/TheLartians/CPM.cmake/releases/download/v${CPM_DOWNLOAD_VERSION}/CPM.cmake ${CPM_DOWNLOAD_LOCATION})
endif()

include(${CPM_DOWNLOAD_LOCATION})

CPMAddPackage(NAME cudf
        GIT_REPOSITORY https://github.com/rapidsai/cudf
        GIT_TAG main
        GIT_SHALLOW TRUE
        SOURCE_SUBDIR cpp
        OPTIONS "BUILD_TESTS OFF"
                "BUILD_BENCHMARKS OFF"
                "ARROW_STATIC_LIB ON"
                "JITIFY_USE_CACHE ON"
                "CUDA_STATIC_RUNTIME ON"
                "DISABLE_DEPRECATION_WARNING ON"
)

# Configure your project here
add_executable(${PROJECT_NAME} "src/process_csv.cpp")
target_link_libraries(${PROJECT_NAME} cudf::cudf)
23 changes: 23 additions & 0 deletions cpp/examples/basic/README.md
@@ -0,0 +1,23 @@
# Basic Standalone libcudf C++ application

This C++ example demonstrates a basic libcudf use case and provides a minimal
example of building your own application based on libcudf using CMake.

The example source code loads a CSV file that contains stock prices for 4
companies over 5 days, computes the average closing price for each company,
and writes the result in CSV format.
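
As a rough CPU-side cross-check of what the example computes, the per-company mean of the `Close` column can be sketched with `awk`. This is not part of the PR and does not use libcudf. The function name `average_close` is hypothetical, the column positions are taken from the header of `4stock_5day.csv`, and the output format (no header row, six decimal places, sorted by company) is an assumption that will likely differ from the file the example writes.

```bash
#!/bin/bash

# Hypothetical helper (not part of the PR): mean of column 6 (Close),
# grouped by column 1 (Company), skipping the header row.
average_close() {
  awk -F, '
    NR > 1 { sum[$1] += $6; count[$1]++ }
    END    { for (c in sum) printf "%s,%.6f\n", c, sum[c] / count[c] }
  ' "$1" | LC_ALL=C sort
}

# Run against the example input only when it is present.
if [ -f 4stock_5day.csv ]; then
  average_close 4stock_5day.csv
fi
```

Run from `cpp/examples/basic`, this prints one `Company,mean` line per company, which can be compared by eye against the values the libcudf example produces.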

## Compile and execute

```bash
# Configure project
cmake -S . -B build/
# Build
cmake --build build/ --parallel $PARALLEL_LEVEL
# Execute
build/basic_example
```

Review comment (Member): Could mention that they can alternatively build using `build.sh`.

If your machine does not have a pre-built libcudf binary, expect the first
build to take some time, because libcudf itself is built on the host machine.
Setting `PARALLEL_LEVEL` to a suitable number of parallel jobs may speed it up.
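
One way to choose `PARALLEL_LEVEL` is to fall back to the machine's core count when the variable is not already set. The following is a minimal sketch, assuming GNU `nproc` is available and using 4 as a fallback (the default in `cpp/examples/build.sh`). The helper name `default_parallel_level` is hypothetical, and a lower value may be needed if compilation runs out of memory.

```bash
#!/bin/bash

# Hypothetical helper: number of available cores, or 4 if nproc is missing.
default_parallel_level() {
  if command -v nproc >/dev/null 2>&1; then
    nproc
  else
    echo 4
  fi
}

# Keep a value the caller already set; otherwise use the helper.
PARALLEL_LEVEL=${PARALLEL_LEVEL:-$(default_parallel_level)}

# Print the build command rather than running it, so the sketch has no side effects.
echo "cmake --build build/ --parallel ${PARALLEL_LEVEL}"
```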
68 changes: 68 additions & 0 deletions cpp/examples/basic/src/process_csv.cpp
@@ -0,0 +1,68 @@
#include <cudf/aggregation.hpp>
#include <cudf/groupby.hpp>
#include <cudf/io/csv.hpp>
#include <cudf/table/table.hpp>

#include <memory>
#include <string>
#include <utility>
#include <vector>

cudf::io::table_with_metadata read_csv(std::string const& file_path)
{
  auto source_info = cudf::io::source_info(file_path);
  auto builder = cudf::io::csv_reader_options::builder(source_info);
  auto options = builder.build();
  return cudf::io::read_csv(options);
}

void write_csv(cudf::table_view const& tbl_view, std::string const& file_path)
{
  auto sink_info = cudf::io::sink_info(file_path);
  auto builder = cudf::io::csv_writer_options::builder(sink_info, tbl_view);
  auto options = builder.build();
  cudf::io::write_csv(options);
}

std::vector<cudf::groupby::aggregation_request> make_single_aggregation_request(
  std::unique_ptr<cudf::aggregation>&& agg, cudf::column_view value)
{
  std::vector<cudf::groupby::aggregation_request> requests;
  requests.emplace_back(cudf::groupby::aggregation_request());
  requests[0].aggregations.push_back(std::move(agg));
  requests[0].values = value;
  return requests;
}

std::unique_ptr<cudf::table> average_closing_price(cudf::table_view stock_info_table)
{
  // Schema: | Company | Date | Open | High | Low | Close | Volume |
  auto keys = cudf::table_view{{stock_info_table.column(0)}}; // Company
  auto val = stock_info_table.column(5); // Close

  // Compute the average of each company's closing price with entire column
  cudf::groupby::groupby grpby_obj(keys);
  auto requests = make_single_aggregation_request(cudf::make_mean_aggregation(), val);

  auto agg_results = grpby_obj.aggregate(requests);

  // Assemble the result
  auto result_key = std::move(agg_results.first);
  auto result_val = std::move(agg_results.second[0].results[0]);
  std::vector<cudf::column_view> columns{result_key->get_column(0), *result_val};
  return std::make_unique<cudf::table>(cudf::table_view(columns));
}

int main(int argc, char** argv)
{
  // Read data
  auto stock_table_with_metadata = read_csv("4stock_5day.csv");

  // Process
  auto result = average_closing_price(*stock_table_with_metadata.tbl);

  // Write out result
  write_csv(*result, "4stock_5day_avg_close.csv");

  return 0;
}
22 changes: 22 additions & 0 deletions cpp/examples/build.sh
@@ -0,0 +1,22 @@
#!/bin/bash

# Copyright (c) 2021, NVIDIA CORPORATION.

# libcudf examples build script

# Add libcudf examples build scripts down below

# Parallelism control
PARALLEL_LEVEL=${PARALLEL_LEVEL:-4}
Review comment (@ttnghia, Contributor, May 26, 2021): I'm not an expert in cmake/shell, so I'm not sure whether this will result in -j 4 or -j -4.

Reply (Member): Seems to be the same as here:

export PARALLEL_LEVEL=${PARALLEL_LEVEL:-4}

Reply (@ttnghia, Contributor, May 26, 2021): I got it! We will have 4, not -4, so everything will be fine. The shell syntax is a bit confusing:

${parameter:-word}
    If parameter is unset or null, the expansion of word is substituted. 
    Otherwise, the value of parameter is substituted.

Ref: https://www.gnu.org/software/bash/manual/html_node/Shell-Parameter-Expansion.html
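
The cases described in the quoted manual text can be checked directly. This is a small sketch; the helper name `resolve_parallel_level` is hypothetical and only wraps the same `${PARALLEL_LEVEL:-4}` expansion that the script uses.

```bash
#!/bin/bash

# Hypothetical helper wrapping the expansion used in build.sh.
resolve_parallel_level() {
  echo "${PARALLEL_LEVEL:-4}"
}

unset PARALLEL_LEVEL
resolve_parallel_level   # unset: prints 4 (the default, not -4)

PARALLEL_LEVEL=""
resolve_parallel_level   # null (empty): prints 4

PARALLEL_LEVEL=8
resolve_parallel_level   # set: prints 8
```

The `:-` is part of the expansion operator rather than a minus sign applied to the value, so with the variable unset `-j${PARALLEL_LEVEL}` expands to `-j4`.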


EXAMPLES_DIR=${WORKSPACE}/cpp/examples

################################################################################
# Basic example
BASIC_EXAMPLE_DIR=${EXAMPLES_DIR}/basic
BASIC_EXAMPLE_BUILD_DIR=${BASIC_EXAMPLE_DIR}/build

# Configure
cmake -S ${BASIC_EXAMPLE_DIR} -B ${BASIC_EXAMPLE_BUILD_DIR}
# Build
cmake --build ${BASIC_EXAMPLE_BUILD_DIR} -j${PARALLEL_LEVEL}