Skip to content
This repository has been archived by the owner on Jul 15, 2018. It is now read-only.

Unify the response protobuf types #165

Closed
ethanfrey opened this issue Dec 21, 2017 · 16 comments
Closed

Unify the response protobuf types #165

ethanfrey opened this issue Dec 21, 2017 · 16 comments

Comments

@ethanfrey
Copy link
Contributor

ethanfrey commented Dec 21, 2017

I would like to simplify the representation of the types, make unified error handling. Write now, some types have a Code and Log field which is used to create IsErr, and IsOk.

I would prefer something more like:

Response = Error(string) | Result(ABCIResult)
ABCIResult = SetOptionResult() | DeliverTxResult(...) | ...

To make this more concrete, now looks something like this:

message Response {
  oneof value{
    ResponseInfo info = 4;
    ResponseDeliverTx deliver_tx = 6;
   // ....
  }
}

message ResponseInfo {
  string data = 1;
  string version = 2;
  int64 last_block_height = 3;
  bytes last_block_app_hash = 4;
}

message ResponseDeliverTx{
  uint32            code        = 1;
  bytes             data        = 2 ;
  string            log         = 3;
  repeated KVPair tags = 4;
}

I would propose something more like

message Response {
  Error error = 1;
  oneof result{
    ResponseInfo info = 2;
    ResponseDeliverTx deliver_tx = 3;
  };
}

message Error {
  uint32 code = 1;
  string log = 2;
}

message ResponseInfo {
  string data = 1;
  string version = 2;
  int64 last_block_height = 3;
  bytes last_block_app_hash = 4;
}

message ResponseDeliverTx{
  bytes    data = 1;
  repeated KVPair tags = 2;
}

// .....
@adrianbrink
Copy link
Contributor

I really like this proposal.

One question I have is: How can I represent a warning or just return some log information to the app in the proposal?

Let's say I want to attach a warning message to ResponseDeliverTx. Currently, I could set Code to a non-error code and just put something into the log field.
Maybe there is a point though that this shouldn't be possible, since you have no way of knowing how the other side will treat that string.

@ethanfrey
Copy link
Contributor Author

ethanfrey commented Dec 21, 2017

If we need this, we could have a log field in the result as well.
I don't know of any real usecase. Tendermint can't read the log messages.
ABCI app can write out to the log file

If it's just for debugging with abci-cli, seems a weak reason.
If there is another reason, sure add that in, and rename Error.log to Error.desc or something
It's the fill-in since error codes don't have hardcoded names anymore

@ethanfrey
Copy link
Contributor Author

There is no warning, Debug, Info, Error.
That was the result from the log refactor.
Either it is broken, or it isn't...

@jaekwon
Copy link
Contributor

jaekwon commented Dec 23, 2017

Hey Ethan, thank you for the proposal and writeup.

BTW, log is just for debugging and as a general conduit for emitting non-deterministic/non-consensus information to any event listeners (e.g. the transaction submitter, or transaction indexer). It's useful as is, so we don't want to change the semantics of this. Think of it as "captured stdout/stderr output for this request". We shouldn't add any additional structure there because we're barely using it... it's in general better to keep things simple/expressive and use conventions, than to enforce types, when we don't know what the types will end up being.

So if we were to split this out, I'd rather it be:

message Response {
  uint32 code = 1;
  string log = 2;
  oneof result{
    ResponseInfo info = 3;
    ResponseDeliverTx deliver_tx = 4;
  };
}

But I'm not sure about leaving the code outside.

The purpose of code is to be a super-short (but expressive) code of limited size, for CheckTx and DeliverTx. It isn't necessarily an error code, as it may also be used to mark non-error transactions.

Here's a future use-case for response codes. Say that we have a very tiny light-client, so small that it can't keep state, but it can do everything else. Also lets say that it's so small that we want to minimize bandwidth and CPU processing to a reasonable degree. Lets call these "nano-clients". An Internet-of-Things little device could be such a nano-client.

Say that this nano-client needs to scan through the transactions of a block and respond to certain events, but it doesn't care about most transactions.

The slow way to do this is to just get Merkle proofs for each transaction response (e.g. the <code,data,tags> tuple that we plan to Merkle-ize into the Header). This list of leaf-proofs doesn't prove the absence of tags in transactions that are absent, so either the nano-client has to receive all the proofs, or, we need to include some sequence in the transaction bytes (this is brittle because the nano-client needs to be upgraded when the tx format changes). So the latter isn't too bad, but it's a bit annoying.

The faster way to do this in many cases is to get the whole list of tx response-codes per block, and to scan those numbers to filter. If we only have OK/error response codes, and a non-OK code meant that the transaction could be ignored, then the nano-client has to parse all OK transactions to filter, or, it's possible that there isn't enough information in the transaction bytes, and so the nano-client needs to consult the response tags to filter, and now we're back to the first problem.

That's why ResponseCheckTx/ResponseDeliverTx has code, and why it's not called error_code.

Query and Commit have codes because we added them a long time ago. Commit shouldn't have a code, we should remove it. SetOption has a code because we wanted a way to respond with an error.

I think we want to make these changes:

  • Replace code uint32 with error string in Query and SetOption because they take user input, and a string is nicer than a number, and we don't need to make the string deterministic, and we don't have the same kind of design considerations as in CheckTx/DeliverTx above.
  • Remove code from Commit because it takes no user input and there's nothing to recover from.

Regarding errors in general, for messages that don't take user input like Flush, Info, InitChain, BeginBlock, EndBlock, and Commit.... There is no way to handle these errors gracefully, so we might as well panic. We have ResponseException exception = 1; reserved, so when there is such an error, the app can just respond with that, and the ABCI client logic can have special logic to deal with that (it doesn't correspond with a request, but that's OK because it's essentially a panic condition and there's no way to recover; often it'll be due to panics in the filesystem or a broken connection, so there's often no correlation with the request.). We used to have this special logic but we removed it during a refactor/cleanup with the intent to add them back in.

@ethanfrey
Copy link
Contributor Author

Okay, as far as I have seen, almost all the code refers to code 0 as okay, and not okay as error.

I thought we wanted error codes to map to strings in the ui. Much better that way for localization than returning a string in the app.

I like your split out of code and log outside of response. I would keep that but just make code int32, and consider 0 or positive a success, and negative an error. And then we can filter easily with future nano clients and also determine errors.

The main reason is to unify the interface, and all calls to application can be

Action(Action Request) (ActionResponse, error)

This unifies a lot, also the proxy app. We also have empty Request and Response structs now exactly for this symmetry. And using that nicer go Form instead of IsErr() will make code more readable and go- like. Also easier to pass generic errors that are not known tied to one response time.

@jaekwon
Copy link
Contributor

jaekwon commented Dec 23, 2017

Okay, as far as I have seen, almost all the code refers to code 0 as okay, and not okay as error.

That's how it is today, but lets keep our options open since it doesn't hurt.

I thought we wanted error codes to map to strings in the ui. Much better that way for localization than returning a string in the app.

That's another nice benefit for error codes. Good point. Now I'm in favor of adding code to ResponseQuery as well.

I like your split out of code and log outside of response.

I don't think code should be common for all messages for the reasons stated.

Regarding logs, it first it seems like the log could go outside for all messages, but it's probably not a good idea.

First, the application really ought to keep its own log in the filesystem, because in general there's a lot going on in an application and it's not always appropriate the fit them into ABCI messages. For example, the app might have more goroutines that log too, and it makes more sense to intersperse the logs resulting from transaction processing, with those other goroutine logs.

Second, it helps little to make Tendermint store them... at first I thought that it might be useful to intersperse, so you know what the App did in relation to other Tendermint logging events, but on second thought I realized that it's not true, it makes logging worse. We want CheckTx logs to match up with the state of the App when the CheckTx log was produced, not when Tendermint happened to receive it from the mempool socket, which would be some unknown time after the event occurred. By delaying the writing of log entries by sending them into a channel, we've lost critical causal/synchrony information. (e.g. this CheckTx was processed before or after this DeliverTx).

It's not any worse to make the App responsible for logging independently of Tendermint. The ABCI interface is designed to make the App a dead simple state machine w/ most messages called in sequential order (e.g. BeginBlock -> DeliverTx -> EndBlock -> Commit -> ...), so it's not difficult to match the lines in the Tendermint log with the App log.

It's (in general) an anti-pattern to push logs from the App to Tendermint, and it's sufficient to log in the app. Ergo, we shouldn't require all ABCI messages to include logs.

Action(Action Request) (ActionResponse, error)

That's an API to be implemented in the client. We already do that. But it doesn't make sense to reflect that in the ABCI wire, since an error in most messages should terminate Tendermint or the ABCI connection, as there is no way to recover. For the same reason why sometimes you want to panic rather than return-error, here we want to push an Exception message instead.

Maybe you'd prefer that we rename log to (human-readible) info. It shouldn't be a log as in logger output, but rather it should be something for consumers to read. In sum, I believe that it's best for Response(DeliverTx/CheckTx/Query) to have code uint32 and info string, but only those messages.

@jaekwon
Copy link
Contributor

jaekwon commented Dec 23, 2017

I think it might make sense to have both log and info in Response(DeliverTx/CheckTx/Query). It is useful to provide both the log and the info to the original requester, and both fields serve different purposes. The log would be logged on the App side in a file as well (but probably not in a file in Tendermint).

@ebuchman
Copy link
Contributor

We should take tendermint/tendermint#999 into account here - ie. we should have types at the ABCI level that delineate which components of the response are deterministic/consensus and which are not.

@jaekwon
Copy link
Contributor

jaekwon commented Dec 26, 2017

types at the ABCI level?
They're all deterministic save for a few exceptions.

  • SetOption is always nondeterministic.
  • Response[CheckTx/DeliverTx] log & info are nondeterministic.
    Lets keep it at that w/ documentation. We come out saying that ABCI apps need to be deterministic, so it's default deterministic unless you see exceptions.

@jaekwon
Copy link
Contributor

jaekwon commented Dec 26, 2017

See the latest SDK2: https://github.com/tendermint/abci/blob/sdk2/types/types.proto

Do we have a use for SetOption right now? If we move fee handling to Tendermint and genesis state to InitChain, I'm not sure what else we'd want it for.

@ebuchman
Copy link
Contributor

Response[CheckTx/DeliverTx] log & info are nondeterministic.

Exactly. I just meant have a type in ABCI that doesn't include the non-deterministic stuff - otherwise we need one in Tendermint for computing merkle roots

@ebuchman
Copy link
Contributor

Do we have a use for SetOption right now?

I don't have strong feelings about it

@ebuchman
Copy link
Contributor

Re sdk2 branch. Looks good. But I don't understand why you dropped data.Bytes for []byte.

@jaekwon
Copy link
Contributor

jaekwon commented Dec 28, 2017

Had a convo with bucky offline. Basically, the whole point of data.Bytes was to help with hex encoding our []byte for a slight, slight benefit of improved debuggability for developers, but it's not worth it if we're going to uglify our .proto files and lord knows what other compatibility problems we'll find later esp when tying ourselves to gogoproto. So... lets simplify.

So, in general, []byte in JSON will always be base64.

We can make some exceptions. Like our addresses, which could still be cmn.HexBytes (renamed from cmn.Bytes). Actually go-crypto sdk2 branch does this now. So in JSON form it's an encoded string. I don't think we should do this for protobuf since nobody (sane) reads protobuf encoded bytes. So then, we can still do the gogoproto extension and use HexBytes/Bae58Bytes, but in the proto file it would be declared primarily as bytes. Thanks @odeke-em, I like your proposal now.

@ebuchman
Copy link
Contributor

I think we probably need to better distinguish the differences between ABCI messages to really clear this up.

In short, only those that take user input (CheckTx/DeliverTx/Query/SetOption) can report over the quality of the execution with codes (ie. codes for OK, Not-OK-Because-X, OK-But-Y). The rest should have no possibility for error, since it's not defined what to do in those cases. Thus it seems right that eg. InitChain, BeginBlock, etc. should panic on error - we just need to make sure the panic is caught in the right place and turned into ResponseException so Tendermint can shutdown gracefully.

@ebuchman
Copy link
Contributor

Some of the changes (ie. log/info) are on develop, and will land in 0.7.

Some, like introducing error, seem to be not on the table.

As for properly handling panics, see #195

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants