Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Convert updates to protocol/message data types #253

Merged
merged 20 commits into from
Jan 25, 2023
Merged

Convert updates to protocol/message data types #253

merged 20 commits into from
Jan 25, 2023

Conversation

alexshtin
Copy link
Member

@alexshtin alexshtin commented Jan 4, 2023

What changed?
Convert updates to protocol/message data types.

Why?
Previous version was very specific to workflow updates. After many discussions between @temporalio/sdk and @temporalio/server teams we came up to the idea to create generic messaging protocol which can be reused for other purposes.

Breaking changes
Yes, but previous version was never released and used.

temporal/api/enums/v1/update.proto Outdated Show resolved Hide resolved
temporal/api/history/v1/message.proto Show resolved Hide resolved
temporal/api/command/v1/message.proto Show resolved Hide resolved
temporal/api/protocol/v1/message.proto Show resolved Hide resolved
temporal/api/protocol/v1/message.proto Outdated Show resolved Hide resolved
temporal/api/update/v1/message.proto Outdated Show resolved Hide resolved
@cretz
Copy link
Member

cretz commented Jan 4, 2023

Can we make sure temporalio/api-go#100 and temporalio/sdk-go#974 land first? Otherwise compatibility issues may be caused with our proxy code gen as they did w/ these types of changes before. For example, see how we reference https://github.com/temporalio/sdk-go/blob/06e474c93e936b71dc4afcec973460b22c13986d/converter/grpc_interceptor.go#L192.

Also @alexshtin can you please review temporalio/api-go#100 so we can move forward on it?

Copy link
Member Author

@alexshtin alexshtin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Integrating with server code.

@mmcshane mmcshane force-pushed the mpm/update branch 3 times, most recently from beed924 to 8f62dfd Compare January 6, 2023 21:19
message Outcome {
oneof value {
Incomplete incomplete = 1;
temporal.api.common.v1.Payloads success = 2;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just gonna swoop in here to say Payloads is my enemy and if we can do just one Payload here to avoid the various footguns that Payloads introduces that'd be super. Forces people into better practices from the get-go.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There might be more than one return value, right?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I mean to say by that is just don't allow it. Force people into the best practice of returning one struct which contains everything they want, and is usually safer to change later than a list of un-named values.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wish we follow this rule everywhere. Basically same way as gRPC does.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can actually do this because right now the reflection code requires that update functions return at either one or two values and one of those is the error.

That being said, it's extremely satisfying that temporal.api.common.v1.Payloads and temporal.api.failure.v1.Failure have the same number of characters so the lines line up just right so

Copy link
Member

@cretz cretz Jan 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, as we're developing many-to-many payload codecs in .NET we discovered that people may want to encode one to many payloads (though they of course and often do encode many to one). It's a small want, but one supported in e.g. TypeScript today. The only place today where we use Payload without Payloads in non-map situations is Failure.encoded_attributes and I think @bergundy regrets that.

IMO We should stay with Payloads in every non-map situation. This gives payload encoders freedom of arity.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went to do this and the sdk appears to be fully built around the plural form. For example there does not exist a utility function to map from commonpb.Payload to converter.EncodedValue. There exists several functions that are correctly named for such a task (NewValue, newEncodedValue, etc) but in each case they for some reason take the plural commonpb.Payloads.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't regret using single Payload for Failure.encoded_attributes, the fact that we use single Payload in map values already makes encoding one-to-many impractical and as you've argued @cretz there's not really a need for that.

I would definitely support single Payload in this use case. Yes, it breaks consistency with other return values but in practice we don't support multiple return values in the SDK and never will for language interoperability.

Copy link
Member

@cretz cretz Jan 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Arguably encoded_attributes (plural) should have been map<string, Payload> to the user could have some alongside ours, but 🤷

As for here, I'm ok with a single payload for return.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure map<string, Payload> would've better there, user can still have their own attributes with Payload and a single Payload is easier to work with and is more flexible than a map.
But what's done is done.

I think we all are okay with single or multi here even though we will always only use single so it's up to Matt to make the call IMHO.

Comment on lines 680 to 683
string accepted_request_message_id = 2;
// The message payload of the original request message that initiated this
// update.
temporal.api.update.v1.Request accepted_request = 3;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: do we need to prefix these with accepted_ in the context of an "Accepted" event?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It reads a little better in the code with the accepted prefix. Otherwise you have just a "request" type and variable in there and you wonder if it's a request about the Acceptance or what the semantics are. With the prefix it's a more clear that this is the request-that-was-accepted rather than a request that needs to be acted upon


message WorkflowExecutionUpdateCompletedEventAttributes {
// The metadata about this update.
temporal.api.update.v1.Meta meta = 1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we spell this out?

Suggested change
temporal.api.update.v1.Meta meta = 1;
temporal.api.update.v1.Metadata metadata = 1;

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does Mark approve?

// The metadata about this update.
temporal.api.update.v1.Meta meta = 1;
// The outcome of executing the workflow update function.
temporal.api.update.v1.Outcome outcome = 2;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For consistency with the other events, I would consider having separate Completed and Failed events instead of this internal outcome field.
I like your version better but would prefer consistency over personal preference.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd question why exactly is it that consistency is important here? There are vanishingly few users making use of our gRPC APIs directly. Consistency for its own sake appears less valuable to me than better typing.

I won't belabor anything here, but I do think it's important to question what the value of staying consistent is here - if we can't name that, then just citing it as a de-facto good seems like something to be avoided.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is mostly for clarity when viewing a brief summary of the workflow history. Yes, it can be handled in our UI but it would be a special case.
Another point worth mentioning is that the Incomplete outcome is not a valid outcome here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, those make sense to me.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With respect guys, we did this conversation. We talked about how using oneof is a divergence and why both the server and the sdk teams were ok with it. This has been settled.

I can remove Incomplete (it's only invalid until we support other wait policies) for this pre-release but it's coming back shortly.

Copy link
Member

@bergundy bergundy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think everyone that commented is okay with what we have now, it's GTM AFAIC.

@mmcshane mmcshane marked this pull request as ready for review January 13, 2023 22:20
@mmcshane mmcshane requested review from a team as code owners January 13, 2023 22:20
@@ -323,6 +325,8 @@ message RespondWorkflowTaskFailedRequest {
// Worker process' unique binary id
string binary_checksum = 5;
string namespace = 6;
// Protocol messages piggybacking on a WFT as a transport
repeated temporal.api.protocol.v1.Message messages = 7;
Copy link
Member

@cretz cretz Jan 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know it can seem late in the process for this, but after SDK team discussion, we believe for responses (so not necessarily accept/reject), it is much clearer to make this a command instead of being expected to maintain the sequence identifier of the event it should be within. The concept of "all ordered happenings in a workflow are sent as commands in their order" is much clearer than "some ordered happenings in a workflow are sent as commands and some sent as messages to be interleaved with commands".

So this can still exist for unordered things (query responses, update accept/reject) and on inbound this type of generic message coming on the task makes plenty of sense, it's just that it is much clearer specifically for ordered responses to be as commands as they always have been.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To clarify, if it's too late for this, it's too late for this.

ModifyWorkflowPropertiesCommandAttributes modify_workflow_properties_command_attributes = 17;
RejectWorkflowUpdateCommandAttributes reject_workflow_update_command_attributes = 18;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per the previous comment, I think we may need some kind of ProtocolMessageCommand or something.

// belongs.
string protocol_instance_id = 2;

// The event ID or command ID after which this message can be delivered. The
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// The event ID or command ID after which this message can be delivered. The
// The event ID or command ID after which this message can be processed. The

?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think delivered is the right word here. It's up to the implementation as to whether or not it gets processed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternatively, "may be processed"

Matt McShane and others added 13 commits January 24, 2023 17:41
The worker will need to recreate an input message from the data
contained in these events.
Enables sequencing on event_id or command_id with strong typing.
Needed for the server to build out the associated Accept/Reject history
event in the case that the original protocol information has been lost
(e.g. due to shard movement)
The possibility for an outcome to be incomplete is not present in the
protocol messages - it is only something that can happen with the RPC so
we move that indication up a level to the RPC response object. This also
prevents us from accidentally storing Incomplete{} as the outcome value
in an update completed event.
Will be used to reference and sequence protocol messages from the
RespondWorkflowTaskCompletedRequest.Messages field.
@@ -29,15 +29,15 @@ $(PROTO_OUT):
mkdir $(PROTO_OUT)

##### Compile proto files for go #####
grpc: buf-lint api-linter buf-breaking gogo-grpc fix-path
grpc: buf-lint api-linter gogo-grpc fix-path
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't forget to bring it back!

// the code that handles this message. Omit to opt out of sequencing.
oneof sequencing_id {
int64 event_id = 3;
int64 command_index = 4;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we ever need this, now that the pointer will be in the command list? Can this just be an optional event_id?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely want to preserve the flexibility to use different things as a sequence number (e.g. could have a dotted version vector or something causal like that) so I want to keep the oneof. Command index may end up being unused and if so we can remove it but at least for update it might be useful to set as part of message sent to point back to the ProtocolMessageCommand so that we can do bidirectional consistency checking.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants