Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add draft of v6 specification #80

Merged
merged 10 commits into from
Mar 3, 2018
Merged

Add draft of v6 specification #80

merged 10 commits into from
Mar 3, 2018

Conversation

spenczar
Copy link
Contributor

This is a prerelease draft to explain planned changes coming in v6. The only change
from v5 (so far) is a change of the URL scheme Twirp uses.

Issue #, if available:

Relates to #55, but doesn't implement it yet.

Description of changes:

Here's diff PROTOCOL.md PROTOCOL_V6_DRAFT.md:

< # Twirp Wire Protocol
---
> # EXPERIMENTAL Twirp v6 Wire Protocol
3,4c3,31
< This document defines the Twirp wire protocol over HTTP. The
< current protocol version is v5.
---
> This document is a draft of a new version ("v6") of the Twirp wire
> protocol over HTTP. This draft is EXPERIMENTAL and subject to change.
> v6 of Twirp is not released or considered stable; instead, this
> document lists planned changes which may be available in prerelease
> distributions.
>
> ## Changes from v5
>
> ### URL scheme
>
> In [v5](./PROTOCOL.md), URLs followed this format:
>
> **URL ::= Base-URL "/twirp/" [ Package "." ] Service "/" Method**
>
> In v6, the "/twirp/" prefix is removed:
>
> **URL ::= Base-URL "/" [ Package "." ] Service "/" Method**
>
> The "/twirp/" prefix is removed for three reasons:
>
>  - Trademark concerns: some very large organizations don't want to
>    take any legal risks and are concerned that "twirp" could become
>    trademarked.
>  - Feels like advertising: To some users, putting "twirp" in all your
>    routes feels like it's just supposed to pump Twirp's brand, and
>    provides no value back to users.
>  - Homophonous with "twerp": In some Very Serious settings (like
>    government websites), it's not okay that "twirp" sounds like
>    "twerp", which means something like "insignificant pest."
28c55
< **URL ::= Base-URL "/twirp/" [ Package "." ] Service "/" Method**
---
> **URL ::= Base-URL "/" [ Package "." ] Service "/" Method**
109c136
< POST /twirp/example.echoer.Echo/Hello HTTP/1.1
---
> POST /example.echoer.Echo/Hello HTTP/1.1
120c147
< POST /twirp/example.echoer.Echo/Hello HTTP/1.1
---
> POST /example.echoer.Echo/Hello HTTP/1.1

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

This is a prerelease draft to explain planned changes. The only change
from v5 (so far) is a change of the URL scheme Twirp uses.
@spenczar spenczar added this to the v6 milestone Feb 14, 2018
@spenczar
Copy link
Contributor Author

cc @mocax, @rhysh, @marioizquierdo, @Xe.

docs/spec_v6.md Outdated
provides no value back to users.
- Homophonous with "twerp": In some Very Serious settings (like
government websites), it's not okay that "twirp" sounds like
"twerp", which means something like "insignificant pest."
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a concern only for whoever is updating from v5 to v6, I think it is better to keep the v6 (latest) spec clear without this. It may be enough with adding this info in the Release Notes for v6, or if you want to keep this info in the spec is better to add it in the v5 spec (as a notice that it changed after v6).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I'll move this sort of thing into a changelog doc that will go unpublished until we release v6.

docs/spec_v6.md Outdated
* **Base-URL** is the virtual location of a Twirp API server, which is
typically published via API documentation or service discovery.
Currently, it should only contain URL `scheme` and `authority`. For
example, "https://example.com".
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in v6, the Base-URL should allow a path prefix as well, for example "http://mydomain.com/twirp", specially important to ease backwards compatibility with v5 services.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should remove this restriction. The Base-URL can be any valid URL. There is no need to add any restriction. For example, why do we need to rule out port?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A URL "authority" includes a port, but you're right that we should allow path prefixes.

I don't think we should allow any valid URL. We should not permit query parameters or fragments. I think this should be come "it should only contain a URL scheme, authority, and path, as defined in RFC 3986 Section 3."

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah this would be be great - we use path prefixes to route between our services so twirp is a non-starter for us atm.

`CalendarService`.
* **Method** is the proto `rpc` name for an API method. For example,
`CreateEvent`.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to add an example? It may be useful, specially because Base-URL contains "/", and Package contains "." on it (same symbols used as separators). The example should make it visually clear that Twirp URLs need to be parsed from right to left (extract Method first, then Service, then optional Package, and the rest is the Base-URL).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no need to parse the URL. The server registers a handler against a URL, and all it needs to do is URL lookup using a hash table.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's an example further down that I think should be enough.

I agree that URLs don't need to be parsed like that necessarily. For example, if the user mounts the Twirp server under a path prefix, we could say they should trim the path prefix from request URLs before handing the request off to Twirp in the Go implementation. In any case, this is an implementation-specific concern.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needing to parse right-to-left is worrisome. That will be hard to implement consistently and to explain concisely. It sounds like we're not quite heading in that direction .. but if we are, we should stop.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rhysh, certainly agreed, but I don't think that's what we're headed to. In Go, users would do something like this:

mux.Handle("/userdefined" + HaberdasherPathPrefix, http.StripPrefix("/userdefined", svc))

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😅 I retire the sentence "need to be parsed from right to left". Sorry that was misleading. The URLs can be parsed in multiple ways, one of them is by starting from the right, but it could also be removing the Base-URL first and then using a regexp, etc. What I mean is that one example will make it easier to understand the things that need to be consider when implementing a Twirp router, or just to understand the routes themselves.


**URL ::= Base-URL "/" [ Package "." ] Service "/" Method**

The "/twirp/" prefix is removed for three reasons:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The "/twirp/" prefix is not really removed. It can be part of the Base-URL if the service provider wants to keep it. It is just no longer part of the spec.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, we should definitely allow path prefixes.


In v6, the "/twirp/" prefix is removed:

**URL ::= Base-URL "/" [ Package "." ] Service "/" Method**
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove the first "/" and make it part of Base-URL. The less client logic, the better.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need the leading "/". If it's in the Base-URL, we need to say that the URL must not have an empty path. Otherwise, we'd allow a Base-URL of https://example.com which would lead to this:
https://example.comtwirp.example.Haberdasher/MakeHat

which seems clearly invalid.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't you use the ResolveReference method on *url.URL in order to support both cases?

Eg. https://play.golang.org/p/Zj6ehRCuZ_G

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@peterhellberg That might be a good way to make the API nice for Go users, but this question is on what the spec should be, which is a language-independent question. We're trying to work out what the BNF for the URL scheme should look like, which is probably a bit pedantic, but important to get right.

In [ABNF syntax](https://tools.ietf.org/html/rfc5234), Twirp's URLs
have the following format:

**URL ::= Base-URL "/" [ Package "." ] Service "/" Method**
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This URL schema is identical to gRPC, which seems confusing. At least it would conflict with existing logging, monitoring, routing, load balancing systems. While I am not saying gRPC should own this URL schema, it is better not to be identical between Twirp and gRPC.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought the schema was supposed to be URL ::= Base-URL "/" [ Package "/" ] Service "/" Method

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mocax, In case of conflicts, users could mount Twirp at a different path prefix, so I'm not very worried about that.

@Xe, Right! Don't know how I forgot that part.

docs/spec_v6.md Outdated
* **Base-URL** is the virtual location of a Twirp API server, which is
typically published via API documentation or service discovery.
Currently, it should only contain URL `scheme` and `authority`. For
example, "https://example.com".
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should remove this restriction. The Base-URL can be any valid URL. There is no need to add any restriction. For example, why do we need to rule out port?

`CalendarService`.
* **Method** is the proto `rpc` name for an API method. For example,
`CreateEvent`.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no need to parse the URL. The server registers a handler against a URL, and all it needs to do is URL lookup using a hash table.

@spenczar
Copy link
Contributor Author

All concerns addressed, I believe - please take another look, y'all.

Copy link
Contributor

@rhysh rhysh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm concerned about the complexity of using '/' as a delimiter when the package may be unset.

```

Version 6 changes this format to remove the mandatory `"/twirp"` prefix, and
changes `Package` and `Service` to be delimited with a `/` instead of a `.`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Packages will still contain . characters; we're not changing the internal delimiter. How about this phrasing:

"and changes the separator between Package and Service to be a / instead of a . character"

to set any prefix you like.

If you loved the old `/twirp` prefix, you can still use it by using a base URL
that ends with `/twirp`. You're no longer forced into use it, however.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Users can keep that prefix, but 1) they'll need to remove it before passing the request to the Twirp generated code, and 2) the rest of the URI path won't be the same as v5 because of the .->/ change (unless the package is unset).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, I'll clarify this.


This document is a draft of a new version ("v6") of the Twirp wire
protocol over HTTP. This draft is EXPERIMENTAL and subject to change.
v6 of Twirp is not released or considered stable; instead, this
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"not yet released ...", unless you want something like "even-numbered releases are unstable"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, not yet released - I'm just trying to explain that it's a draft.

`CalendarService`.
* **Method** is the proto `rpc` name for an API method. For example,
`CreateEvent`.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needing to parse right-to-left is worrisome. That will be hard to implement consistently and to explain concisely. It sounds like we're not quite heading in that direction .. but if we are, we should stop.

docs/spec_v6.md Outdated
**Proto Request**

```
POST /userprefix/example.echoer/Echo/Hello HTTP/1.1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using / as a delimiter forces us to deal with cases of overlap. Consider the following two:

One has no package declaration, and describes a service called "Foo" with a method "Hello". It gets requests like POST /userprefix/Foo/Hello.

The other has a package declaration of "package Foo;" .. not matching the normal lower-case pattern, but a package declaration nonetheless. It describes some service, maybe "Bar" with a method "Goodbye". It gets requests like POST /userprefix/Foo/Bar/Goodbye.

How does a user route requests to those two services? Do they need to count the slashes? Peeking into the later parts of the URI path and only partially understanding them is an invitation for maintenance and security problems down the road.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oof, this is an extremely good point, and might torpedo the / idea here. I'm going to remove the . -> / change from this commit.

Let's discuss in #86.


```bnf
**URL ::= Base-URL "/" [ Package "/" ] Service "/" Method**
```
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if there was already a discussion about using "." vs "/" for the package separator, but I am definitely not convinced that doing the . -> / change is a good thing. Let me list the advantages that I can see with each one:

Advantages of "/" separator:

  • Looks nicer when described like this: Base/Package/Service/Method.
  • It is different from gRPC routes. This was mentioned before as a pro, but I don't clearly see the advantage of this; only the very trained eye will be able to tell the difference at a glance, and I don't see the advantage of keeping a Twirp and a gRPC service both in the same Base-URL, only differentiated by the "." vs "/" syntax.
  • If there is no package or the package has no dots, and the Base-URL also has no dots, then the resulting URL will not have dots. Maybe this is easier to recognize by some middleware?

Advantages of "." separator:

  • Somewhat backwards-compatible. With the "/", service implementations will need to explicitly handle both new and old paths. With the "." this is not entirely true, because even if the service didn't support old routes, the user could still add the "/twirp" prefix on the Base-URL to deal with compatibility.
  • Since packages are optional, the "." makes it easier to tell if a URL has a package (blahblah.com/api/myserviceapi.CoolAPI/CreateStuff) or not (blahblah.com/api/CoolAPI/CreateStuff).
  • Avoids service path conflicts. For example, given the services Foo1 with package package and name Foo, and Foo2 with no package and the same name Foo. With the "." a URL like http://fooservice.com/package/Foo/Method is clearly mapping to Foo2 with the Base-URL http://fooservice.com/package, but with the "/" the same URL is ambiguous. This is probably an edge case because in most cases the user controls the Base-URL, but it is still possible.

I may be missing some other point (please comment!), but the advantages of the "/" separator look a little weak to me.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent points, and I agree that we under-designed this. I think PR feedback is a tough place to do design discussion, though. Could you chime in on #86? You can just paste in what you have here :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok I'll add this points to #86. Now I realize they are very similar to what @rhysh was talking about, but presented in a different order. It may help to add clarity to design decisions.

`CalendarService`.
* **Method** is the proto `rpc` name for an API method. For example,
`CreateEvent`.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😅 I retire the sentence "need to be parsed from right to left". Sorry that was misleading. The URLs can be parsed in multiple ways, one of them is by starting from the right, but it could also be removing the Base-URL first and then using a regexp, etc. What I mean is that one example will make it easier to understand the things that need to be consider when implementing a Twirp router, or just to understand the routes themselves.

@mocax
Copy link
Contributor

mocax commented Feb 20, 2018 via email

@spenczar
Copy link
Contributor Author

@mocax, I don't think I agree. It may be extremely rare within Google, but it is not so rare in the wild. We need to make a spec sturdy enough that this sort of thing doesn't cause problems for our users.

We could require package be set (making it an even breakier change!) or we could back away and keep . as the delimiter. Let's discuss that in #86. For now, I'm going to change this PR to only remove the /twirp prefix.

We might end up using / as delimiter, but it needs to be more
thoroughly designed. Design will take place in
#88.
@mocax
Copy link
Contributor

mocax commented Feb 21, 2018 via email

@spenczar
Copy link
Contributor Author

@marioizquierdo and @rhysh, please take another look. I think I've addressed all concerns here.

}
```

### Error Codes
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I very much like this section and how it's formatted

docs/spec_v6.md Outdated

| Twirp Error Code | HTTP Status | Description
| ------------------- | ----------- | -----------
| canceled | 408 | The operation was cancelled.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these error codes sent as strings over the wire?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, they are.

Copy link
Contributor

@marioizquierdo marioizquierdo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! :shipit:

Copy link
Contributor

@rhysh rhysh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM after two protobuf syntax fixes. The other two nits are up to you.

docs/spec_v6.md Outdated
}

message HelloRequest {
string message;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

string message = 1;

docs/spec_v6.md Outdated
}

message HelloResponse {
string message;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

string message = 1;

* **msg**: A human-readable message describing the error
as a string.
* **meta**: (optional) An object with string values holding
arbitrary additional metadata describing the error.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use of "arbitrary" here makes it sound like they don't need to be strings .. even though "string values" makes it clear that they need to be.

Consider s/metadata/string-formatted metadata/

docs/spec_v6.md Outdated

| Twirp Error Code | HTTP Status | Description
| ------------------- | ----------- | -----------
| canceled | 408 | The operation was cancelled.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error code is canceled, which is the spelling the Go project uses. Consider s/cancelled/canceled/ in the description for consistency :)

@spenczar spenczar merged commit 500dc2d into master Mar 3, 2018
@spenczar spenczar deleted the v6_spec_draft branch March 3, 2018 00:20
@spenczar spenczar mentioned this pull request Jun 6, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants