[BEAM-14020] Adding SchemaTransform, SchemaTransformProvider, TypedSchemaTransformProvider, and PCollectionRowTuple #16958
Run Java PreCommit
*
* // Create an empty PCollectionTuple:
* Pipeline p = ...;
* PCollectionTuple pcs2 = PCollectionTuple.empty(p);
nit: looks like you need to s/PCollectionTuple/PCollectionRowTuple/
It's too bad there's so much duplication from PCollectionTuple, but I can't think of a way to structure this that avoids it. Maybe @kennknowles has an idea (but I know he dislikes inheritance, so maybe he prefers it this way :P)
Yeah, I wanted that too; it seems like it should be easy to do. But I think unless we template the underlying class there's no real way to make this work. We could maybe make a PCollectionTupleBase and then have PCollectionTuple extend Object and PCollectionRowTuple extend Row. But then we'd end up with Object instead of ?. I'm not actually sure how well that would work. I could give it a try, but it still seems kind of messy.
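For illustration only, here is a rough sketch of how I read the "template the underlying class" idea. It assumes the base would be parameterized by element type, once with Object and once with Row. None of these classes exist in Beam: Coll, ToyRow, TupleBase, RowTuple, and AnyTuple are hypothetical stand-ins for PCollection, Row, PCollectionTupleBase, PCollectionRowTuple, and PCollectionTuple.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in for a PCollection; only the type parameter matters here. */
final class Coll<T> {
    final String name;

    Coll(String name) {
        this.name = name;
    }
}

/** Hypothetical stand-in for Row. */
final class ToyRow {}

/** Hypothetical shared base: the tag-to-collection bookkeeping is written once. */
abstract class TupleBase<T> {
    private final Map<String, Coll<T>> members = new LinkedHashMap<>();

    /** Adds a collection under a tag, rejecting duplicate tags. */
    TupleBase<T> and(String tag, Coll<T> coll) {
        if (members.containsKey(tag)) {
            throw new IllegalArgumentException("duplicate tag: " + tag);
        }
        members.put(tag, coll);
        return this;
    }

    /** Looks up a collection by tag. */
    Coll<T> get(String tag) {
        Coll<T> coll = members.get(tag);
        if (coll == null) {
            throw new IllegalArgumentException("unknown tag: " + tag);
        }
        return coll;
    }

    int size() {
        return members.size();
    }
}

/** Row-only variant: every member is a Coll<ToyRow>. */
final class RowTuple extends TupleBase<ToyRow> {}

/**
 * "Anything" variant, parameterized with Object. Because Java generics are
 * invariant, new AnyTuple().and("words", new Coll<String>("words")) does not
 * compile: a Coll<String> is not a Coll<Object>. A map of Coll<?> would accept it.
 */
final class AnyTuple extends TupleBase<Object> {}
```

The Row-only side works out cleanly, but the Object-parameterized side only accepts collections declared as Coll<Object>, which is one way to read the "Object instead of ?" concern above.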
.../java/core/src/main/java/org/apache/beam/sdk/schemas/transforms/SchemaTransformProvider.java
LGTM, just one minor nit. Thank you!
.../core/src/main/java/org/apache/beam/sdk/schemas/transforms/TypedSchemaTransformProvider.java
Run Java PreCommit
List<Row> inputs = toRows(Arrays.asList(3, -42, 77), intSchema);

PCollection<Row> mainInput = pipeline.apply(Create.of(inputs));
PCollection<Row> secondInput = pipeline.apply(Create.of(inputs));
It looks like this is causing an actual failure in Java PreCommit:
java.lang.IllegalStateException: Pipeline update will not be possible because the following transforms do not have stable unique names: Create.Values.
Conflicting instances:
- name=Create.Values:
- Create.Values
- Create.Values
You can fix it adding a name when you call apply(): pipeline.apply(<name>, <transform>).
at org.apache.beam.sdk.Pipeline.validate(Pipeline.java:619)
at org.apache.beam.sdk.Pipeline.run(Pipeline.java:322)
at org.apache.beam.sdk.testing.TestPipeline.run(TestPipeline.java:399)
at org.apache.beam.sdk.testing.TestPipeline.run(TestPipeline.java:335)
at org.apache.beam.sdk.values.PCollectionRowTupleTest.testComposePCollectionRowTuple(PCollectionRowTupleTest.java:101)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
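To make the failure mode concrete, here is a minimal toy model of the check, not Beam's actual implementation. ToyPipeline and its methods are hypothetical; the only things taken from the error above are the default name Create.Values and the suggested fix of passing a name to apply(). The sketch shows two unnamed applies colliding on the default name, and the same two applies passing once each has its own name.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy stand-in for a pipeline; NOT Beam's Pipeline class. */
final class ToyPipeline {
    private final List<String> appliedNames = new ArrayList<>();

    /** Unnamed apply: the step is registered under the transform's default name. */
    void apply(String defaultTransformName) {
        appliedNames.add(defaultTransformName);
    }

    /** Named apply: the caller-supplied name replaces the default one. */
    void apply(String name, String defaultTransformName) {
        appliedNames.add(name);
    }

    /** Returns every name that was registered more than once, in first-seen order. */
    List<String> conflictingNames() {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String name : appliedNames) {
            counts.merge(name, 1, Integer::sum);
        }
        List<String> conflicts = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            if (entry.getValue() > 1) {
                conflicts.add(entry.getKey());
            }
        }
        return conflicts;
    }

    /** Fails if any step name is shared by more than one step. */
    void validate() {
        List<String> conflicts = conflictingNames();
        if (!conflicts.isEmpty()) {
            throw new IllegalStateException(
                "Transforms do not have stable unique names: " + conflicts);
        }
    }
}
```

In the test itself the equivalent change would be to give the two Create applies distinct names, for example "CreateMainInput" and "CreateSecondInput" (names chosen here only for illustration).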
Run Java_Examples_Dataflow PreCommit
Run Java PreCommit
public static final Schema intSchema = Schema.of(Field.of("int", FieldType.INT32));
public static final Schema stringSchema = Schema.of(Field.of("str", FieldType.STRING));
public static final Schema boolSchema = Schema.of(Field.of("str", FieldType.BOOLEAN));
Checkstyle is failing in Java PreCommit now; it wants these constant names to be ALL_CAPS.
Adding SchemaTransform, SchemaTransformProvider, and PCollectionRowTuple. This is an interface which allows for Schema-aware transforms and will eventually replace SchemaIO.
R:@TheNeuralBit
Doc: https://s.apache.org/beam-schema-transform