Convert updates to protocol/message data types #253

alexshtin · 2023-01-04T00:10:19Z

What changed?
Convert updates to protocol/message data types.

Why?
The previous version was very specific to workflow updates. After many discussions between the @temporalio/sdk and @temporalio/server teams, we came up with the idea of creating a generic messaging protocol that can be reused for other purposes.

Breaking changes
Yes, but the previous version was never released or used.

temporal/api/enums/v1/update.proto

temporal/api/history/v1/message.proto

temporal/api/command/v1/message.proto

temporal/api/protocol/v1/message.proto

temporal/api/workflowservice/v1/request_response.proto

temporal/api/update/v1/message.proto

cretz · 2023-01-04T17:18:04Z

Can we make sure temporalio/api-go#100 and temporalio/sdk-go#974 land first? Otherwise these changes may cause compatibility issues with our proxy code gen, as similar changes did before. For example, see how we reference https://github.com/temporalio/sdk-go/blob/06e474c93e936b71dc4afcec973460b22c13986d/converter/grpc_interceptor.go#L192.

Also @alexshtin can you please review temporalio/api-go#100 so we can move forward on it?

alexshtin

LGTM. Integrating with server code.

Sushisource · 2023-01-07T00:16:15Z

temporal/api/update/v1/message.proto

+message Outcome {
+    oneof value {
+        Incomplete incomplete = 1;
+        temporal.api.common.v1.Payloads success = 2;


Just gonna swoop in here to say Payloads is my enemy and if we can do just one Payload here to avoid the various footguns that Payloads introduces that'd be super. Forces people into better practices from the get-go.

There might be more than one return value, right?

What I mean to say by that is just don't allow it. Force people into the best practice of returning one struct which contains everything they want, and is usually safer to change later than a list of un-named values.

I wish we followed this rule everywhere, basically the same way gRPC does.

I think we can actually do this because right now the reflection code requires that update functions return either one or two values, and one of those is the error.
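To make the return-value rule above concrete, here is a minimal sketch of such a reflection check. It is not the SDK's actual code: `validateUpdateHandler` and `CartResult` are hypothetical names invented for illustration, and the only assumption taken from the thread is that a handler returns either `(error)` or `(T, error)`.

```go
package main

import (
	"errors"
	"fmt"
	"reflect"
)

// errorType is the reflect.Type of the built-in error interface.
var errorType = reflect.TypeOf((*error)(nil)).Elem()

// validateUpdateHandler is a hypothetical check (not the SDK's real code).
// It accepts only functions whose results are (error) or (T, error).
func validateUpdateHandler(fn interface{}) error {
	t := reflect.TypeOf(fn)
	if t == nil || t.Kind() != reflect.Func {
		return errors.New("handler must be a function")
	}
	n := t.NumOut()
	if n < 1 || n > 2 {
		return fmt.Errorf("handler must return 1 or 2 values, got %d", n)
	}
	if !t.Out(n - 1).Implements(errorType) {
		return errors.New("last return value must be an error")
	}
	return nil
}

// CartResult is a hypothetical result struct: one named container instead of
// a list of unnamed return values.
type CartResult struct {
	ItemCount int
	Total     float64
}

func main() {
	single := func(item string) (CartResult, error) { return CartResult{}, nil }
	multi := func(item string) (int, float64, error) { return 0, 0, nil }

	fmt.Println("single struct:", validateUpdateHandler(single))
	fmt.Println("multiple values:", validateUpdateHandler(multi))
}
```

With a rule like this, a handler that wants to return several things has to wrap them in one struct, which is what makes a single Payload sufficient on the wire.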

That being said, it's extremely satisfying that temporal.api.common.v1.Payloads and temporal.api.failure.v1.Failure have the same number of characters so the lines line up just right so

FWIW, as we're developing many-to-many payload codecs in .NET we discovered that people may want to encode one to many payloads (though they of course can, and often do, encode many to one). It's a small want, but one supported in e.g. TypeScript today. The only place today where we use Payload without Payloads in non-map situations is Failure.encoded_attributes and I think @bergundy regrets that.

IMO We should stay with Payloads in every non-map situation. This gives payload encoders freedom of arity.
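As an illustration of what "freedom of arity" means here, the sketch below shows a one-to-many encoder that only fits a field typed as a list of payloads. The `Payload` struct and `chunkEncode` are hypothetical stand-ins made up for this example; they are not the Temporal codec interfaces, and a real codec would also need metadata to reassemble the chunks.

```go
package main

import "fmt"

// Payload is a hypothetical stand-in for temporal.api.common.v1.Payload.
type Payload struct {
	Data []byte
}

// chunkEncode splits every input payload into chunks of at most size bytes.
// One input payload can become several output payloads, so the result can
// only be stored in a plural (repeated) field.
func chunkEncode(in []Payload, size int) []Payload {
	if size <= 0 {
		return in // nothing sensible to do; leave the payloads untouched
	}
	var out []Payload
	for _, p := range in {
		if len(p.Data) == 0 {
			out = append(out, p)
			continue
		}
		for start := 0; start < len(p.Data); start += size {
			end := start + size
			if end > len(p.Data) {
				end = len(p.Data)
			}
			out = append(out, Payload{Data: p.Data[start:end]})
		}
	}
	return out
}

func main() {
	encoded := chunkEncode([]Payload{{Data: []byte("abcdefgh")}}, 3)
	fmt.Println("payloads after encoding:", len(encoded))
}
```

If the field were a single Payload, an encoder like this would have nowhere to put the extra chunks, which is the trade-off being argued in this thread.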

I went to do this and the SDK appears to be fully built around the plural form. For example, there is no utility function to map from commonpb.Payload to converter.EncodedValue. There exist several functions that are correctly named for such a task (NewValue, newEncodedValue, etc.) but in each case they for some reason take the plural commonpb.Payloads.

I don't regret using a single Payload for Failure.encoded_attributes. The fact that we use a single Payload in map values already makes encoding one-to-many impractical, and as you've argued @cretz, there's not really a need for that.

I would definitely support single Payload in this use case. Yes, it breaks consistency with other return values but in practice we don't support multiple return values in the SDK and never will for language interoperability.

Arguably encoded_attributes (plural) should have been map<string, Payload> so the user could have some alongside ours, but 🤷

As for here, I'm ok with a single payload for return.

Not sure map<string, Payload> would've been better there; the user can still have their own attributes with Payload, and a single Payload is easier to work with and more flexible than a map.
But what's done is done.

I think we all are okay with single or multi here even though we will always only use single so it's up to Matt to make the call IMHO.

bergundy · 2023-01-12T18:50:11Z

temporal/api/history/v1/message.proto

+    string accepted_request_message_id = 2;
+    // The message payload of the original request message that initiated this
+    // update.
+    temporal.api.update.v1.Request accepted_request = 3;


nit: do we need to prefix these with accepted_ in the context of an "Accepted" event?

It reads a little better in the code with the accepted prefix. Otherwise you have just a "request" type and variable in there and you wonder if it's a request about the Acceptance or what the semantics are. With the prefix it's clearer that this is the request-that-was-accepted rather than a request that needs to be acted upon.

bergundy · 2023-01-12T18:51:23Z

temporal/api/history/v1/message.proto

+
+message WorkflowExecutionUpdateCompletedEventAttributes {
+    // The metadata about this update.
+    temporal.api.update.v1.Meta meta = 1;


Can we spell this out?

Suggested change

-    temporal.api.update.v1.Meta meta = 1;
+    temporal.api.update.v1.Metadata metadata = 1;

Does Mark approve?

bergundy · 2023-01-12T18:59:05Z

temporal/api/history/v1/message.proto

+    // The metadata about this update.
+    temporal.api.update.v1.Meta meta = 1;
+    // The outcome of executing the workflow update function.
+    temporal.api.update.v1.Outcome outcome = 2;


For consistency with the other events, I would consider having separate Completed and Failed events instead of this internal outcome field.
I like your version better but would prefer consistency over personal preference.

I'd question why exactly consistency is important here. There are vanishingly few users making use of our gRPC APIs directly. Consistency for its own sake appears less valuable to me than better typing.

I won't belabor anything here, but I do think it's important to question what the value of staying consistent is here - if we can't name that, then just citing it as a de-facto good seems like something to be avoided.

This is mostly for clarity when viewing a brief summary of the workflow history. Yes, it can be handled in our UI but it would be a special case.
Another point worth mentioning is that the Incomplete outcome is not a valid outcome here.

Cool, those make sense to me.

With respect guys, we had this conversation. We talked about how using oneof is a divergence and why both the server and the SDK teams were ok with it. This has been settled.

I can remove Incomplete (it's only invalid until we support other wait policies) for this pre-release but it's coming back shortly.

bergundy

I think everyone that commented is okay with what we have now, it's GTM AFAIC.

cretz · 2023-01-19T15:36:40Z

temporal/api/workflowservice/v1/request_response.proto

@@ -323,6 +325,8 @@ message RespondWorkflowTaskFailedRequest {
    // Worker process' unique binary id
    string binary_checksum = 5;
    string namespace = 6;
+    // Protocol messages piggybacking on a WFT as a transport
+    repeated temporal.api.protocol.v1.Message messages = 7;


I know it can seem late in the process for this, but after SDK team discussion, we believe for responses (so not necessarily accept/reject), it is much clearer to make this a command instead of being expected to maintain the sequence identifier of the event it should be within. The concept of "all ordered happenings in a workflow are sent as commands in their order" is much clearer than "some ordered happenings in a workflow are sent as commands and some sent as messages to be interleaved with commands".

So this can still exist for unordered things (query responses, update accept/reject) and on inbound this type of generic message coming on the task makes plenty of sense, it's just that it is much clearer specifically for ordered responses to be as commands as they always have been.

To clarify, if it's too late for this, it's too late for this.

cretz · 2023-01-19T15:37:46Z

temporal/api/command/v1/message.proto

        ModifyWorkflowPropertiesCommandAttributes modify_workflow_properties_command_attributes = 17;
-        RejectWorkflowUpdateCommandAttributes reject_workflow_update_command_attributes = 18;


Per the previous comment, I think we may need some kind of ProtocolMessageCommand or something.
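One way to picture the idea in the two comments above: the command list stays the single source of ordering, and a command can simply point at a protocol message by ID. The sketch below is an assumption-laden illustration, not the server or SDK implementation; `Message`, `Command`, `MessageID`, and `orderedSteps` are hypothetical names chosen for this example.

```go
package main

import "fmt"

// Message is a hypothetical stand-in for temporal.api.protocol.v1.Message.
type Message struct {
	ID   string
	Body string
}

// Command is a hypothetical stand-in for a workflow command. A non-empty
// MessageID marks it as a pointer to a protocol message (the role a
// ProtocolMessageCommand would play).
type Command struct {
	Kind      string
	MessageID string
}

// orderedSteps walks the commands in order and resolves message references,
// so the position of every message is defined by the command list alone.
func orderedSteps(commands []Command, messages []Message) ([]string, error) {
	byID := make(map[string]Message, len(messages))
	for _, m := range messages {
		byID[m.ID] = m
	}
	steps := make([]string, 0, len(commands))
	for _, c := range commands {
		if c.MessageID == "" {
			steps = append(steps, "command:"+c.Kind)
			continue
		}
		m, ok := byID[c.MessageID]
		if !ok {
			return nil, fmt.Errorf("command references unknown message %q", c.MessageID)
		}
		steps = append(steps, "message:"+m.Body)
	}
	return steps, nil
}

func main() {
	commands := []Command{
		{Kind: "ScheduleActivity"},
		{Kind: "ProtocolMessage", MessageID: "m1"},
		{Kind: "StartTimer"},
	}
	messages := []Message{{ID: "m1", Body: "update-response"}}
	steps, err := orderedSteps(commands, messages)
	fmt.Println(steps, err)
}
```

Messages that no command points at (query responses, update accept/reject) would remain unordered in this picture, which matches the split proposed above.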

alexshtin · 2023-01-20T18:25:15Z

temporal/api/protocol/v1/message.proto

+    // belongs.
+    string protocol_instance_id = 2;
+
+    // The event ID or command ID after which this message can be delivered. The


Suggested change

-    // The event ID or command ID after which this message can be delivered. The
+    // The event ID or command ID after which this message can be processed. The

?

I think delivered is the right word here. It's up to the implementation as to whether or not it gets processed.

Alternatively, "may be processed"

The possibility for an outcome to be incomplete is not present in the protocol messages - it is only something that can happen with the RPC so we move that indication up a level to the RPC response object. This also prevents us from accidentally storing Incomplete{} as the outcome value in an update completed event.

alexshtin · 2023-01-24T22:59:25Z

Makefile

@@ -29,15 +29,15 @@ $(PROTO_OUT):
 	mkdir $(PROTO_OUT)

 ##### Compile proto files for go #####
-grpc: buf-lint api-linter buf-breaking gogo-grpc fix-path
+grpc: buf-lint api-linter gogo-grpc fix-path


Don't forget to bring it back!

Sushisource · 2023-01-24T23:38:24Z

temporal/api/protocol/v1/message.proto

+    // the code that handles this message. Omit to opt out of sequencing.
+    oneof sequencing_id {
+        int64 event_id = 3;
+        int64 command_index = 4;


Do we ever need this, now that the pointer will be in the command list? Can this just be an optional event_id?

Definitely want to preserve the flexibility to use different things as a sequence number (e.g. could have a dotted version vector or something causal like that) so I want to keep the oneof. Command index may end up being unused and if so we can remove it but at least for update it might be useful to set as part of message sent to point back to the ProtocolMessageCommand so that we can do bidirectional consistency checking.
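For readers less familiar with how a oneof keeps this flexibility, the sketch below models `sequencing_id` as a small closed set of Go types. It is a hand-written illustration, not the generated protobuf code: `SequencingID`, `EventID`, `CommandIndex`, and `describe` are hypothetical names, and a nil value stands for "omit to opt out of sequencing".

```go
package main

import "fmt"

// SequencingID models the oneof: exactly one variant is set, or the value is
// nil when the sender opts out of sequencing.
type SequencingID interface {
	isSequencingID()
}

// EventID and CommandIndex mirror the two variants in the diff above.
type EventID int64
type CommandIndex int64

func (EventID) isSequencingID()      {}
func (CommandIndex) isSequencingID() {}

// describe shows how a consumer can branch on the variant with strong typing.
// Adding a new kind of sequence number later means adding one more case.
func describe(id SequencingID) string {
	switch v := id.(type) {
	case nil:
		return "unsequenced"
	case EventID:
		return fmt.Sprintf("after event %d", int64(v))
	case CommandIndex:
		return fmt.Sprintf("after command %d", int64(v))
	default:
		return "unknown sequencing kind"
	}
}

func main() {
	fmt.Println(describe(EventID(42)))
	fmt.Println(describe(CommandIndex(3)))
	fmt.Println(describe(nil))
}
```

A plain optional event_id would be simpler, but it would close the door on other sequence kinds without a schema change, which is the trade-off described in the reply above.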

mmcshane force-pushed the mpm/update branch 3 times, most recently from beed924 to 8f62dfd Compare January 6, 2023 21:19

mmcshane force-pushed the mpm/update branch from 8f62dfd to 0eac326 Compare January 9, 2023 18:10

mmcshane force-pushed the mpm/update branch from b2db17b to f92b6c0 Compare January 12, 2023 19:21

bergundy approved these changes Jan 12, 2023

mmcshane marked this pull request as ready for review January 13, 2023 22:20

mmcshane requested review from a team as code owners January 13, 2023 22:20

mmcshane force-pushed the mpm/update branch from 3d082d7 to 04caea5 Compare January 18, 2023 16:14

Matt McShane added 5 commits January 24, 2023 17:41

Convert updates to protocol/message data types

bf3e94b

Prefer 'stage' to 'event' to avoid vocab conflicts

4ba460f

Re-use old field numbers

c0403c7

Nothing ever shipped that used these fields.

Note that deleted field numbers are ok to reuse

bc20aae

Avoid storing an Any in history

4dbb5d2

Matt McShane and others added 13 commits January 24, 2023 17:41

Fix go package name hint

2e86d5a

Add fields to Accepted and Updated events

8a15246

The worker will need to recreate an input message from the data contained in these events.

Change Message.sequence_id to a oneof

920c13f

Enables sequencing on event_id or command_id with strong typing.

Prefer the singluar Payload where possible

5589634

Normal field ordering: 1 before 2

566505c

Add event sequencing ID to accept/reject events

3e2fc21

Fixes misspelled field name

bf04580

msg ID and seq ID fields on Accept/Reject messages

4352777

Needed for the server to build out the associated Accept/Reject history event in the case that the original protocol information has been lost (e.g. due to shard movement)

Back to plural Payloads

5da5b5b

Comments for UpdateWorkflowExecutionLifecycleStage enum

df201af

Add workflow task failed causes

9cb2092

Add ProtocolMessageCommand

5b6a4f9

Will be used to reference and sequence protocol messages from the RespondWorkflowTaskCompletedRequest.Messages field.

mmcshane force-pushed the mpm/update branch from 2adcfc0 to 5b6a4f9 Compare January 24, 2023 22:41

Matt McShane added 2 commits January 24, 2023 17:47

Improved comment wording on UpdateWorkflowExecutionRequest.request

4ef1a9b

Can't say 'required'

6c3fe16

mmcshane approved these changes Jan 24, 2023

mmcshane merged commit d86d330 into master Jan 25, 2023

mmcshane deleted the mpm/update branch January 25, 2023 00:43

This was referenced Jan 25, 2023

Synchronous workflow update temporalio/temporal#3822

Merged

Messages protocol implementation temporalio/temporal#3843

Merged
