Text format: type more operations #408

jfbastien · 2015-10-13T20:23:01Z

Why isn't return typed, e.g. i32.return?

I get that the type is implicit because of the function's signature, but where are we drawing the line on consistency?

i32.add is typed because that allows us to ignore the operand subtree's expression.

Should other operations be typed? get_local? break?

What about if type=expressions? Shouldn't all typed operations have an explicit type?


lukewagner · 2015-10-13T21:46:31Z

One subtle difference is that return and get_local's types are fully determined by the static function context, not by the types of subexpressions. I expect we'll have an easier time making non-subjective arguments in favor of different points on this type-specificity spectrum once we start defining the binary encoding and writing real decoders.

jfbastien · 2015-10-13T22:46:00Z

I agree that it's known from the function's context, but it's not clear that's particularly useful. When types were added the reasoning was that it made things easier to understand for developers, and I think the same applies for return and get_local.

lukewagner · 2015-10-13T23:38:11Z

The original motivations had more to do with the predicted simplicity of implementation and interactions with binary format which I think are the higher priority. There are still some big pending choices to be made about the binary format (e.g., type-segregated opcode spaces, module-local opcode tables) so it's hard to really evaluate this atm. What is "easiest to understand for developers" sounds like a bikeshedfest since I don't think any of these options will significantly move the dial on overall understandability; for that you want source info.

jfbastien · 2015-10-14T00:11:47Z

I'm only concerned with the textual format on this. It seems easier to read IMO, since it's consistent.

rossberg · 2015-10-14T12:02:36Z

Why are you singling out `return`? You could make the same argument for `call`, `get_local`, `if`, or virtually any control operator. So it's not clear to me what notion of consistency you have in mind. Generally speaking, the fundamental difference between these and arithmetic operators is as follows: arithmetic operators have a different semantics for different types. Consequently, you want to be explicit about indexing them with a type, because it selects an actual behaviour. That is not the case for control operators or ones like `get_local` -- these are fully _parametric_ in their types, i.e., the type does not affect their behaviour. Consequently, there is no reason to include it. Note also that putting types on `return` or other control operators wouldn't easily scale to multiple values.
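A minimal Python sketch of this distinction, for illustration only. The function names are hypothetical, and the arithmetic is an assumption about the intended behaviour (wrapping 32-bit integer addition, IEEE 754 single-precision rounding) rather than anything specified in this thread:

```python
import struct

MASK32 = 0xFFFFFFFF


def i32_add(a: int, b: int) -> int:
    """Integer addition that wraps modulo 2**32."""
    return (a + b) & MASK32


def f32_add(a: float, b: float) -> float:
    """Addition rounded to single precision by a round trip through 4 bytes.

    Assumes both inputs are already representable as f32. struct raises
    OverflowError for results too large for f32; that case is out of scope.
    """
    return struct.unpack("<f", struct.pack("<f", a + b))[0]


def get_local(local_values: list, index: int):
    """Return the stored value unchanged; the value's type is never inspected."""
    return local_values[index]
```

The two `add` functions select different behaviour, so the type acts as part of the operator's name. `get_local` has one body that works for any value, which is the sense in which it is parametric.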

sunfishcode · 2015-10-14T14:50:27Z

Ignoring control constructs, call, and return for a moment, should we put a type on get_local, etc.? Conceptual abstractions aside,

advantage: simple S-expression consumers could determine the types of such nodes without context
disadvantage: it makes the S-expression format more verbose

It's subjective, but it seems like a win to me. The purpose of the S-expression syntax is to be explicit and easy to process with simple tools. Thoughts?
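To make the "without context" point concrete, here is a rough Python sketch of what such a simple consumer could do. The helper name `type_prefix` is hypothetical, and the spelling `i32.get_local` is only an illustration of what a typed variant might look like, not an agreed syntax. Note that the prefix names the operand type, which for comparisons and conversions would differ from the result type; the sketch ignores that.

```python
VALUE_TYPES = {"i32", "i64", "f32", "f64"}


def type_prefix(op: str):
    """Return the value type spelled in an operator name, or None if absent."""
    prefix, dot, _rest = op.partition(".")
    if dot and prefix in VALUE_TYPES:
        return prefix
    return None
```

With this helper, `type_prefix("i32.add")` yields `"i32"` from the name alone, while `type_prefix("get_local")` yields `None`, so the consumer would have to look at the enclosing function to learn the type.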

rossberg · 2015-10-14T15:14:40Z

I suppose my question would be the same: why single out that one? How are, say, calls any different? How does it help a simple consumer to have annotations on some but not the others?

sunfishcode · 2015-10-14T15:40:01Z

I temporarily excluded calls, returns, and other control constructs because I'm wondering if we shouldn't address those at a higher level. I've been thinking about a stack abstraction rather than having values as operands of returns and so on. We'd of course restrict the number and types of things on the stack, and we still have structured control flow, so it'd still be easily statically type-checkable and verifiable, but it'd simplify several things.

For example, the current thinking for multiple result values is call_multiple, which, while better than some alternatives, isn't great for, e.g., forwarding the result values from one call to the result of another. A stack mechanism for passing values around would be a pretty simple mechanism with some nice properties. And it would separate concerns; control instructions could focus on being control instructions and not have operands for things they don't inspect themselves.

It's just an idea at this point, but it is something I'm thinking about.

rossberg · 2015-10-14T15:47:23Z

Yeah, I think that call_multiple is the wrong approach to multiple return values, for the reasons you mention and some others. But of course we wouldn't need an explicit stack to deal with that.

Anyway, probably getting off-topic.

sunfishcode · 2015-10-14T15:57:22Z

Ok, so putting control constructs aside for the moment, should we add types to get_local and set_local for the reasoning above?

jfbastien · 2015-10-14T16:10:05Z

I'm singling out return as an example, and am asking about all operations in general.

lukewagner · 2015-10-14T16:11:03Z

A parser of the s-expr language is already going to need to maintain a static function context (to resolve $foo local names to their associated index) so I don't see that simplifying the s-expr parser's job but, rather, just adding extra work to catch type mismatches.
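A small Python sketch of that argument, under stated assumptions: `FunctionContext`, `declare_local`, and `parse_get_local` are hypothetical names, and the optional `annotated_type` parameter stands in for a typed `get_local` that does not exist in the format today.

```python
from dataclasses import dataclass, field


@dataclass
class FunctionContext:
    """Static per-function information a text-format parser already keeps."""
    local_names: dict = field(default_factory=dict)  # "$x" -> index
    local_types: list = field(default_factory=list)  # index -> value type

    def declare_local(self, name: str, ty: str) -> int:
        """Record a local and return its index."""
        index = len(self.local_types)
        self.local_names[name] = index
        self.local_types.append(ty)
        return index


def parse_get_local(ctx: FunctionContext, name: str, annotated_type=None):
    """Resolve a named local; an annotation only adds a consistency check."""
    index = ctx.local_names[name]      # needed with or without annotations
    actual = ctx.local_types[index]    # the type is already known here
    if annotated_type is not None and annotated_type != actual:
        raise TypeError(f"{name} is {actual}, annotated as {annotated_type}")
    return ("get_local", index, actual)
```

The name lookup happens either way, so the annotation does not remove any work from the parser; it only introduces the mismatch case that has to be reported.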

jfbastien · 2015-10-14T16:17:41Z

Catching type mistakes was one of the justifications to type operations in the first place. Can't use the same reason both ways ;-)

I agree that for a parser this doesn't do much. I'm approaching this from the POV of someone who would read or write this wasm assembly; the consistency seems nicer to me (having just written some examples by hand).

rossberg · 2015-10-14T16:24:27Z

As I argued above, the fundamental difference is that (1) the types don't affect the operational meaning of these operators, so they are redundant information, and (2) they (or some of them) are not conceptually limited to a (finite) set of primitive types, but may eventually evolve to handle user-defined types or multiple values, which don't fit the limited format.

lukewagner · 2015-10-14T17:16:52Z

@jfbastien Someone may have used that argument, but I actually don't think "catching (static) errors" should be a design goal (and it may actively work against other goals). Simplicity of spec, impl and codegen; this is what I think we should design for and I don't see i32.return helping any of those. Everything else being equal, @rossberg-chromium's conceptual argument also makes sense to me.

titzer · 2015-10-14T20:33:17Z

I agree with @lukewagner regarding locals; verifying the AST requires a function context that maps locals to their types, so typing get_local and set_local would be redundant. Same reasoning for return. Calls always reference either an explicit function or a signature, so that signature determines their type.


jfbastien · 2015-10-14T20:53:06Z

I understand that the type isn't strictly required, but that's not the reasoning I offered: I'm approaching this purely from the POV of reading / writing the textual format.

I agree that multi-returns or UDTs would require expanding the signature, but that's true of the semantics of the entire text format. It's pretty easy to spec properly.

The argument on "arithmetic needs the disambiguation, return doesn't" is one of aesthetics. I agree for the binary format (because not having it affects decoding speed), but the textual format doesn't require types at all (it's redundant here too). It makes the text easier to read to have the type. The same thing is true for get_local and return and other operations.

In fact, type does affect the semantics of multi-value return and UDT return.

lukewagner · 2015-10-14T23:05:03Z

If the text format is the extent of your concerns, it seems like we should table this discussion until we start defining the text format.

qwertie · 2015-10-15T04:30:00Z

IMO the text format should be made to contain the same information as the binary format does, unless there is a really good reason not to, so that the text format gives users some intuition about how the binary format works. Also, if I'm writing Wasm code, I don't want to write information that the assembler will immediately throw away.

jfbastien · 2015-10-15T06:14:32Z

It sounds like the consensus is: redundant type checks aren't a design goal of the text format?

rossberg · 2015-10-15T06:39:26Z

Yes, I think so. I'd say the design goal is for the text format to contain the same amount of information as the binary format will (modulo naming things).

rossberg · 2015-10-15T07:42:02Z

On 14 October 2015 at 22:53, JF Bastien wrote:

> The argument on "arithmetic needs the disambiguation, return doesn't" is one of aesthetics. I agree for the binary format (because not having it affects decoding speed), but the textual format doesn't require types at all (it's redundant here too). It makes the text easier to read to have the type. The same thing is true for get_local and return and other operations.

Maybe the difference becomes clearer if you don't think of the type in i32.add vs f32.add as a type annotation. It's part of the operator name, and different types imply different operators. If you didn't have the type in there you'd essentially introduce operator overloading into Wasm. Contrast that with return, which is one operator, agnostic to types, and always "doing the same".

> In fact, type does affect the semantics of multi-value return and UDT return.

Only if we screw it up. It may very well be that implementations choose to handle them differently, but that should not be semantically observable.

sunfishcode · 2015-10-22T18:14:10Z

I agree that there's inconsistency here, but right now it's just in the S-expression format. We can talk about whether the text format should contain the same amount of information as the binary format when we're designing a real text and binary format :-).

lukewagner · 2015-12-02T17:42:52Z

Just a note: I realized that another specific example of what Andreas was mentioning above is the "opaque reference type" idea mentioned in GC.md: the only operations you'd be able to perform on these opaque types are precisely the ops like get_local which don't have type annotations since they just shuffle their values around like a black box.

jfbastien added the question label Oct 13, 2015

jfbastien added this to the MVP milestone Oct 13, 2015

lukewagner mentioned this issue Oct 22, 2015

Add type to getglobal/setglobal/getlocal/setlocal opcodes #311

Closed

sunfishcode closed this as completed Oct 22, 2015

AndrewScheidecker mentioned this issue Nov 18, 2015

Call return types and multiple passes? WebAssembly/spec#182

Closed

lukewagner mentioned this issue Dec 24, 2015

Re-consider the indexing of local variables, giving each type its own dense index space. #509

Closed
