Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fallback to CPU when encoding is not supported for JSON reader #4622

Merged
merged 2 commits into from
Jan 26, 2022

Conversation

wbo4958
Copy link
Collaborator

@wbo4958 wbo4958 commented Jan 25, 2022

Falllback to CPU when encoding is not UTF-8 or US-ASCII

@wbo4958 wbo4958 requested a review from revans2 January 25, 2022 01:06
@wbo4958
Copy link
Collaborator Author

wbo4958 commented Jan 25, 2022

build

@sameerz sameerz added the task Work required that improves the product but is not user facing label Jan 25, 2022
Copy link
Collaborator

@revans2 revans2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a little late to be doing this for 22.02. But it is a simple bug fix.

@@ -138,6 +140,11 @@ object GpuJsonScan {
meta.willNotWorkOnGpu("GpuJsonScan only supports \"\\n\" as a line separator")
}

parsedOptions.encoding.foreach(enc =>
if (enc != StandardCharsets.UTF_8.name() && enc != StandardCharsets.US_ASCII.name()) {
meta.willNotWorkOnGpu("GpuJsonScan only supports UTF8 or US-ASCII encoded data")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we have the error message include the encoding that is causing us to fall back? This will make it easier for the user to debug.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thx for the comment, I'd like to create a follow-up targeting 22.04

@@ -138,6 +140,11 @@ object GpuJsonScan {
meta.willNotWorkOnGpu("GpuJsonScan only supports \"\\n\" as a line separator")
}

parsedOptions.encoding.foreach(enc =>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: could we use exists here instead since this is an Option? Something like:

  if (parsedOptions.encoding.exists(enc =>
      enc != StandardCharsets.UTF_8.name() && enc != StandardCharsets.US_ASCII.name())) {

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thx for the comment, I'd like to create a follow-up targeting 22.04

@sameerz sameerz added this to the Jan 10 - Jan 28 milestone Jan 25, 2022
@wbo4958 wbo4958 merged commit 871a1df into NVIDIA:branch-22.02 Jan 26, 2022
@wbo4958 wbo4958 deleted the json-encoding branch January 26, 2022 00:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
task Work required that improves the product but is not user facing
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants