
Use cuFile for Parquet IO when available #7444

Merged: 74 commits into rapidsai:branch-0.19 on Mar 17, 2021

Conversation

@vuule (Contributor) commented Feb 24, 2021

Adds optional cuFile integration:

  • cufile.h is included in the build when available.
  • libcufile.so is loaded at runtime if the LIBCUDF_CUFILE_POLICY environment variable is set to "ALWAYS" or "GDS".
  • cuFile compatibility mode is controlled through the same policy variable: "ALWAYS" means on, "GDS" means off.
  • cuFile is currently used only in the Parquet reader/writer and the CSV writer.
  • The device_read/device_write API can be used with the file datasource/data_sink.
  • Added a CUDA stream parameter to device_read.
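A minimal sketch of how a user would opt in, based only on the policy values described above (the `cudf.read_parquet` call in the comments is the standard cudf API; the file path is hypothetical):

```python
import os

# Policy values per the PR description:
#   unset      -> cuFile is not used (existing host IO path)
#   "ALWAYS"   -> load libcufile.so, cuFile compatibility mode ON
#   "GDS"      -> load libcufile.so, cuFile compatibility mode OFF
os.environ["LIBCUDF_CUFILE_POLICY"] = "GDS"

# libcudf picks the policy up from the environment, so set it before
# importing/initializing cudf, e.g.:
#   import cudf
#   df = cudf.read_parquet("data.parquet")   # hypothetical file path
print(os.environ["LIBCUDF_CUFILE_POLICY"])
```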

@kkraus14 kkraus14 requested review from devavret and removed request for trxcllnt and jrhemstad March 11, 2021 20:56
@kkraus14 (Collaborator): reassigned review to @devavret since it's cuIO related

@devavret (Contributor) left a comment:

Obvious partial review but I wanted to say this sooner.

On this diff excerpt:

```cpp
      cudaFreeHost};
  }
  }();
  auto host_bfr = pinned_buffer<uint8_t>{[](size_t size) {
```
@devavret (Contributor):

cudaMallocHost is expensive. There was an optimization to not do it when using device write. Why was it removed?

@vuule (author):

Not intentional; this change was here before the optimization and I messed up a merge. Will restore the behavior.

@vuule (author):

Fixed, now the buffer is lazily allocated on first use.
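The lazy-allocation fix described here can be sketched generically. This is an illustrative Python analogy, not libcudf's C++ code: a plain `bytearray` stands in for the expensive `cudaMallocHost` call, and the class name is invented for the example.

```python
class LazyPinnedBuffer:
    """Illustrative stand-in for the pinned host buffer in the writer.

    The expensive allocation (cudaMallocHost in libcudf) happens only on
    first use, so the device-write path never pays for it.
    """

    def __init__(self, size):
        self._size = size
        self._buf = None          # nothing allocated yet
        self.alloc_count = 0      # instrumentation for the example

    def get(self):
        if self._buf is None:     # first use triggers the allocation
            self.alloc_count += 1
            self._buf = bytearray(self._size)  # stands in for cudaMallocHost
        return self._buf


buf = LazyPinnedBuffer(1024)
# device-write path: buffer never touched, so no allocation cost is paid
assert buf.alloc_count == 0
# host path: first access allocates, later accesses reuse the same buffer
buf.get()
buf.get()
assert buf.alloc_count == 1
```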

@vuule vuule requested a review from devavret March 13, 2021 01:30
@mike-wendt (Contributor):

Updating the gpuci/rapidsai and gpuci/rapidsai-driver images used for builds and testing to include cufile.h. Will post here once they are distributed and available for testing this PR.

@mike-wendt (Contributor):
@kkraus14 @vuule FYI I just merged rapidsai/gpuci-build-environment#166 - it should take ~1hr for the images to build and be available for use in gpuCI

@vuule (author) commented Mar 17, 2021:

Rerun tests

@kkraus14 (Collaborator):
@gpucibot merge

@rapids-bot rapids-bot bot merged commit 39ad863 into rapidsai:branch-0.19 Mar 17, 2021
@vuule vuule deleted the fea-gds-parquet branch March 17, 2021 17:16
@lmeyerov:

@vuule awesome! So we should be able to try on the nightlies tomorrow?

@kkraus14 (Collaborator):

The nightlies available now can be tested. You'll need to install GDS from here: https://developer.nvidia.com/gpudirect-storage

Then set the LIBCUDF_CUFILE_POLICY environment variable to ALWAYS or GDS:

  • ALWAYS will use GDS if possible (meaning the filesystem is capable and the GDS driver is loaded), and fall back to cuFile compatibility mode if either the filesystem is not capable or the GDS driver isn't loaded.
  • GDS will use GDS if possible (see above), and fall back to our existing IO code if either the filesystem isn't capable or the GDS driver isn't loaded.

By default cuFile / GDS will not be used and our existing IO code will be used.
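The fallback rules just described can be summarized in a small decision sketch. This is an illustration of the stated behavior, not libcudf's actual code; the function name and return labels are invented for the example.

```python
def choose_io_path(policy, fs_capable, driver_loaded):
    """Sketch of the LIBCUDF_CUFILE_POLICY fallback rules described above."""
    if policy not in ("ALWAYS", "GDS"):
        return "existing-io"              # default: cuFile/GDS not used
    if fs_capable and driver_loaded:
        return "gds"                      # direct GPU <-> storage path
    # GDS unusable: ALWAYS falls back to cuFile compatibility mode,
    # GDS falls back to the existing host IO code.
    return "cufile-compat" if policy == "ALWAYS" else "existing-io"


print(choose_io_path("ALWAYS", fs_capable=False, driver_loaded=True))  # cufile-compat
print(choose_io_path("GDS", fs_capable=False, driver_loaded=True))     # existing-io
print(choose_io_path(None, fs_capable=True, driver_loaded=True))       # existing-io
```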

@lmeyerov:

Amazing, thanks all :) Starting on the build process here; we'll hopefully have our first benchmarks over the weekend or early next week. Will share on Slack when up.

Labels: CMake (CMake build issue), cuIO (cuIO issue), improvement (enhancement to an existing function), libcudf (affects libcudf C++/CUDA code), non-breaking (non-breaking change), Performance (performance related issue)
7 participants