Update from_json to use null as line delimiter #11499

Closed
wants to merge 1 commit into from

Conversation

revans2
Collaborator

@revans2 revans2 commented Sep 25, 2024

I am leaving this as a draft for two reasons.

  1. It does not fix having \r in the JSON. See [BUG] \n is not considered whitespace when tokenizing JSON rapidsai/cudf#16915
  2. There is a very large performance hit from going to a regexp to strip the characters from the input. I am a bit conflicted here, because we do need a fix for this at some point even without trying to support \r and \n in the data: a line containing only \t is not treated as an empty line and will fail.
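
To make the idea in the title concrete, here is a minimal Python sketch of using a null character as the record delimiter. It is only an illustration of the concept, not the code in this PR: the helper names `join_rows` and `parse_buffer` are hypothetical, and Python's standard `json` module stands in for the real tokenizer. It also uses a regexp to detect whitespace-only rows, which is the same kind of step that reason 2 describes as expensive.

```python
import json
import re

# Hypothetical names for illustration only; none of this comes from the PR.
NUL = "\0"
# JSON insignificant whitespace: space, tab, line feed, carriage return.
_WS_ONLY = re.compile(r"[ \t\r\n]*")


def join_rows(rows):
    """Concatenate per-row JSON strings, using NUL as the record delimiter."""
    cleaned = []
    for row in rows:
        if row is None or _WS_ONLY.fullmatch(row):
            # Null and whitespace-only rows (including a lone \t) become empty records.
            cleaned.append("")
        elif NUL in row:
            # A raw NUL inside a row would be mistaken for a delimiter.
            raise ValueError("row contains the delimiter character")
        else:
            cleaned.append(row)
    return NUL.join(cleaned)


def parse_buffer(buf):
    """Split on NUL and parse each record; empty records map to None."""
    return [json.loads(rec) if rec else None for rec in buf.split(NUL)]
```

Because the delimiter is no longer \n, a row such as `'{"b":\n2}'` stays in one record, and a row that is only `'\t'` yields `None` instead of a parse failure.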

Signed-off-by: Robert (Bobby) Evans <bobby@apache.org>
@revans2
Collaborator Author

revans2 commented Oct 9, 2024

We are going to do this a different way.

@revans2 revans2 closed this Oct 9, 2024