Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add json reader support #4485

Merged
merged 12 commits into from
Jan 20, 2022
Merged

Add json reader support #4485

merged 12 commits into from
Jan 20, 2022

Conversation

wbo4958
Copy link
Collaborator

@wbo4958 wbo4958 commented Jan 10, 2022

This PR is trying to support the json reader on basic type support.

Signed-off-by: Bobby Wang <wbo4958@gmail.com>
@wbo4958
Copy link
Collaborator Author

wbo4958 commented Jan 10, 2022

build

@wbo4958
Copy link
Collaborator Author

wbo4958 commented Jan 11, 2022

build

@wbo4958
Copy link
Collaborator Author

wbo4958 commented Jan 11, 2022

build

@sameerz sameerz added the feature request New feature or request label Jan 11, 2022
@wbo4958
Copy link
Collaborator Author

wbo4958 commented Jan 11, 2022

build

Copy link
Collaborator

@revans2 revans2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is looking good. My main concerns at this point is adding some documentation to docs/compatibility.md. Even if that is just that this is a very experimental feature and has a lot of problems. More specifics about what does not work is better.

I also wanted to make sure that @tgravescs and @nartal1 are okay with JSON reads not being flagged in the qualification tool any more. We generate that file automatically, but I am not 100% sure with our current level of support that we want JSON to be on the truly supported list.

@tgravescs
Copy link
Collaborator

since the supported data sources file has it as CO the qualification tool should still report it as not supported properly, but we should test to make sure @nartal1 can you test?

@tgravescs
Copy link
Collaborator

sorry, I didn't fully look at the PR, looks like the qualification results were update, only diff is it reports each type. They are properly listed under not supported still:
=> Unsupported Read File Formats and Types,Unsupported Write Data Format

@nartal1
Copy link
Collaborator

nartal1 commented Jan 11, 2022

sorry, I didn't fully look at the PR, looks like the qualification results were update, only diff is it reports each type. They are properly listed under not supported still: => Unsupported Read File Formats and Types,Unsupported Write Data Format

Thanks @tgravescs for looking into it. Yes, Instead of json[*], it's now reporting the types which are not supported which is good.

@wbo4958 wbo4958 changed the title [WIP] Add json reader support Add json reader support Jan 12, 2022
@wbo4958
Copy link
Collaborator Author

wbo4958 commented Jan 12, 2022

build

@wbo4958
Copy link
Collaborator Author

wbo4958 commented Jan 12, 2022

build

Copy link
Collaborator

@revans2 revans2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a few comments about the docs.

docs/compatibility.md Outdated Show resolved Hide resolved
docs/compatibility.md Show resolved Hide resolved
@wbo4958
Copy link
Collaborator Author

wbo4958 commented Jan 13, 2022

build

@wbo4958
Copy link
Collaborator Author

wbo4958 commented Jan 14, 2022

build


### JSON supporting types

The nested types(array, map and struct) are not supported yet in current version.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this mean that it is not possible to access primitive types stored within an array, map, and struct, or just that we cannot return a column of type array, map, or struct?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this mean that it is not possible to access primitive types stored within an array, map, and struct,
Yes
or just that we cannot return a column of type array, map, or struct?
Spark does not treat them differently as far as I can tell. CUDF sort of does but not in any way that is compatible with what Spark does and it is only for top level arrays.

revans2
revans2 previously approved these changes Jan 14, 2022
Copy link
Collaborator

@revans2 revans2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But I would like others to look at this too

@wbo4958
Copy link
Collaborator Author

wbo4958 commented Jan 17, 2022

@tgravescs Could you help to review this PR ?

@wbo4958
Copy link
Collaborator Author

wbo4958 commented Jan 20, 2022

build

@wbo4958 wbo4958 merged commit 7eb6097 into NVIDIA:branch-22.02 Jan 20, 2022
@wbo4958 wbo4958 deleted the json-reader branch January 20, 2022 23:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature request New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants