
Add JS Memory and Table API, support dynamic linking #682

Merged: 22 commits, Jun 28, 2016
Conversation

lukewagner (Member)

This PR adds first-class WebAssembly.Memory and WebAssembly.Table objects along with the ability for modules to import and export any number of these from a given module. This generalizes the current linear memory / table design wherein a module cannot import linear memory and is limited to 1 linear memory / table for internal use and export. This PR also generalizes tables to be resizable (symmetric to memory) and parameterized by an element type which is, in the MVP, restricted to either "any callable function" or "any callable function with a given signature". Since there can be multiple memories/tables, the memory and table opcodes are given an index immediate.

With this PR, it should be possible to implement dynamic linking in a toolchain so this PR also updates DynamicLinking.md to explain how. See the note at the bottom of DynamicLinking.md for an obvious omission which we should consider adding (viz., constant-value imports/exports, which would be used to position data segments and function-table-elements at load time).

This PR also punts on the precise way to describe a signature in JS when calling the WebAssembly.Table constructor, saving that for its own PR where we can bikeshed; see TODO in ToTypeDescriptor.
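For illustration, the first-class objects this PR introduces ended up in the JS API roughly as follows (a sketch against the API as eventually standardized, which differs in detail from the PR-era draft):

```javascript
// Construct a resizable linear memory: sizes are in 64 KiB pages.
const memory = new WebAssembly.Memory({ initial: 1, maximum: 4 });
console.log(memory.buffer.byteLength); // 65536

// Construct a resizable table of callable functions ("anyfunc").
const table = new WebAssembly.Table({ element: "anyfunc", initial: 2 });
console.log(table.length); // 2

// Both can be grown from JS; grow() returns the previous size.
memory.grow(1); // memory is now 2 pages
table.grow(1);  // table is now 3 elements
console.log(memory.buffer.byteLength, table.length); // 131072 3
```

Memories are sized in wasm pages (64 KiB) and tables in elements; both objects can then be passed through a module's import object, which is the mechanism the dynamic-linking section below builds on.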

lukewagner added this to the MVP milestone May 5, 2016
kripken (Member) commented May 5, 2016

I am very excited about Tables! :) That's all we need for some useful forms of dynamic linking.

But I am worried about Memories. First, a concern here is a binary code size increase due to adding an immediate to loads and stores, which are very common (those two are over 13% of opcodes, and adding one byte for each leads to a 6% increase in binary size). Second, imported memories are not strictly needed for dynamic linking; emscripten supports dlopen() and shared libraries right now without multiple memories, for example. The cost is copying the memory init during startup, but it's not that bad - is there some other benefit to multiple Memories that I am missing?
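That estimate can be sanity-checked with back-of-envelope arithmetic; the ~2.2-byte average instruction length below is an assumption of mine, not a figure from the thread:

```javascript
// Back-of-envelope check of the code-size estimate above: if loads
// and stores are 13% of opcodes and each gains a one-byte memory
// index, the relative growth is 0.13 / (average encoded length).
const loadStoreShare = 0.13;  // fraction of all opcodes
const extraBytesPerOp = 1;    // the new memory-index immediate
const avgBytesPerInsn = 2.2;  // assumed average encoding length
const relativeIncrease = (loadStoreShare * extraBytesPerOp) / avgBytesPerInsn;
console.log((relativeIncrease * 100).toFixed(1) + "%"); // "5.9%"
```

With those assumptions the result lands right at the ~6% figure quoted in the comment.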

Overall, Tables and Memories seem like separate features. Perhaps it makes sense to discuss them separately?

lukewagner (Member, Author)

With partial specialization of immediates in the opcode table, the one opcode entry (per load/store op) that would already be allocated to partially specialize natural alignment would also specify the heap index, so this doesn't necessarily impact code size. I expect we'll discuss this PR for a while, during which time we can figure out opcode tables and have more concrete details, but I wouldn't argue against this feature now based only on presumed binary size.

Multiple memories are basically an obvious generalization that I think practically every person in the wasm group has independently proposed over the last year. The use cases involve some of the ones given for having memory-imports at all in previous discussions: when implementing a VM, you could have separate host and guest memories. Separate memories might also be a good tool to avoid OOMs due to 32-bit fragmentation (by splitting 1 huge contiguous range into multiple) and could be used by special subsystems like file i/o which can statically separate where different data lives. Lastly, in the future, we might allow imports of other, non-WebAssembly.Memory objects (e.g., (Shared)ArrayBuffer, something to alias the bits of a <canvas>, ...), allowing fast and flexible access to these via memory operations.

* tables may not be directly mutated or resized from WebAssembly code;
initially this must be done through the host environment (e.g., the
`WebAssembly` [JavaScript API](JS.md#webassemblytable-objects)).


Seems a small step to allow a table specialized to a particular function signature, thus allowing instances of call_table to do validation checks when compiling or loading and thus avoid dynamic signature checks. We could just generalize the current function table, adding a signature field that has a wildcard value when not specialized, and allow multiple function tables in the MVP.

jfbastien (Member)

I'm quite a fan of memories. It'll also make it easier to separate stuff out of the current single heap, such as the user stack, reducing user-side bugs and security issues, and it'll potentially also reduce fragmentation if the compiler is clever about certain things (I think that'll be a longer-term gain).

`(offset, length)` ranges of a given table, specified by its
[table index](#index-spaces). The elements are specified by their
[function index](#index-spaces). This section is the table analogue of the
linear memory [data section](#data-section).


Sorry, I don't understand: what does the "elements section" contain? What are the "element segments"? Are they sub-ranges of table index spaces? Or is this a section with all the elements from all tables, with the elements grouped into ranges as in the tables? Why is this needed and how is it to be used?

lukewagner (Member, Author)

Basically, what this PR is doing is splitting one section (the table section) into two halves: two ways to "get" a table (import or define), and one way to stick your module's table elements into any table you "got". The best way to think about this is as symmetric to memory imports/definitions and data segments. I rewrote this section to be clearer, though.

ghost commented May 6, 2016

@jfbastien The stack is a high-frequency access area, and flipping between opcodes with different segments is going to degrade the file size, so developers are not going to be fans of code using that style. If developers want a bounds-checked stack they can emit explicit checks; this use case does not need explicit support in wasm.

Bounds checking lots of small segments would be hard to optimize. Each is going to need a large chunk of VM address space for a memory-protection scheme, so they would need large base offsets, which either use a lot of immediate code memory or increase register pressure.

I doubt the 'fragmentation' benefits would be realized in most cases anyway, as the main use of these segments would likely be small areas of memory per module, and modules will be using global memory for general use.

Lastly, C code is generally not written in a segmented memory model these days. I recall the old days with all the different memory models, and I would not wish this on the next generation of web developers.

kripken (Member) commented May 6, 2016

@lukewagner: Agreed that we can keep code size issues for later in the discussion of this PR.

Regarding the use cases presented by you and @jfbastien:

  1. They could also be done by a GC reference to an ArrayBuffer, which is a feature on the roadmap anyhow. There may be a perf difference, but for several of those use cases it probably wouldn't matter (e.g., for a separate file I/O space, emscripten has an option for that now with the filesystem stored outside of main memory, and it works reasonably well). For others, we should measure.
  2. Using multiple memory spaces in a C or C++ program is not going to directly work (since each pointer needs to be able to refer to anything), so I don't see how the user stack, as suggested, could be done in another space. This feature will be very limited in scope.

Given those two, I don't see Memories as well-motivated, certainly not for the MVP - in what way are we not viable otherwise?

lukewagner (Member, Author)

@kripken C++ could be given access to multiple memories through address space annotations. This prohibits use of more than 1 general purpose memory (whose pointers could escape "into the wild", where static address space is lost), so unfortunately, I don't think this could be used for, e.g., the user stack. But this could be used by self-contained subsystems, like the in-memory file-system and interfacing with APIs in the future. Using GC references is a further generalization but would lose some optimization potential.

As I said in a comment above, if we thought this would be a major complexity burden, we could save ourselves some time in the MVP by constraining to 0..1. But given the work I think we all will want to do to support better code sharing (this type of stuff), I think the work required to support N heaps won't be too bad. Even if we constrain to 0..1 in the MVP, though, I'd like to have the design for N understood/documented since it's very easy to make a design that bakes in assumptions of 1 and would then require duplicating a bunch of things.

lukewagner (Member, Author)

(Interested to hear what other engines think about the work required for multiple memories, of course; constraining to 0..1 is certainly an acceptable option if it looked prohibitive.)

rossberg (Member) commented May 9, 2016

I wonder whether a call_table operator is still the right thing to have at this point. If we plan to add other table entry types in the future, wouldn't it make more sense to decompose it into get_table and call_ptr? We want to add function pointers anyway, so maybe the logical step would be to do that now and avoid a redundant special case?

lukewagner (Member, Author)

@rossberg-chromium That's an interesting idea, but if we add function pointers as a real value type, it seems like we'll end up needing full GC safe-point support, which I've been enjoying not having to mess with. I also wonder if we might accidentally rush the feature (b/c we only need this sliver of it) in a way that we might regret if we waited until we understood the whole GC problem domain. Lastly, I'm interested in preserving the option that, even when we have first-class references, wasm manipulates those references only indirectly (through indices into tables), so that a wasm worker (or even webapp) need not create a GC heap or any of the related machinery; from this POV, you want the ability to pass in the index of the thing you want to operate on, which lines up nicely with call_table.

rossberg (Member)

@lukewagner, hm, I was referring to plain, C-style function pointers, not closures. How would that involve GC?

ghost commented May 10, 2016

fwiw I think there should be a distinct variant of wasm that works without garbage collection, even after support for GC objects has been added. It could be a declared variant that allows the runtime to optimize for it. Threads running such variants could run in parallel with garbage collection on other threads serving different instances.

@rossberg-chromium Some examples of call_ptr might help? If they are just C function pointers then this use case seems to be handled in wasm by call_indirect or call_table. If they are boxed opaque objects then are they a small set of pinned objects not subject to GC?

@lukewagner If external GC objects can be referenced by an index then the external GC needs to know about these references. It might be more generally useful to be able to import boxed objects defined to be pinned in memory and reference counted, so that the wasm thread need not be concerned about them being garbage collected yet still be able to pass them around as boxed objects.

rossberg (Member) commented May 10, 2016

@JSStats, very straightforward: introduce a new value type, say, funcptr; values can be created by an operator (addressof <funcid>), and consumed by the call operator (call_ptr <typeid> <ptr> <arg>*).

Moreover, you can introduce typed variants, i.e., value types (funcptr <typeid>), where the call operator does not need to perform any runtime check.

In fact, I would prefer to avoid misleading terminology like "pointer" and "address" for this, since these are opaque references. That's how they differ from the table: the only reason we need that is to compile C, where function pointers have to be numeric, because (a) C allows stupid casts, and (b) you need to be able to store them in linear memory. For other purposes, this indirection is completely unnecessary.

ghost commented May 10, 2016

@rossberg-chromium It appears that, without involving garbage collection, these funcptrs would need to be interned objects anyway, so they could be referenced by an index; it's not clear what has been gained. And since these objects could not be stored in linear memory, which is a loss, for a wasm without garbage collection they appear to be a net loss?

This PR proposes typed function index spaces to avoid runtime signature checks.

Being able to store function references in linear memory seems a very compelling use case to me.

rossberg (Member)

@JSStats, no interned objects needed, just plain code addresses.


[hosted libraries]: https://developers.google.com/speed/libraries/
[service workers]: https://www.w3.org/TR/service-workers/
WebAssembly supports load-time and run-time (`dlopen`) dynamic linking in the

Supports isn't really the right term here. It's more "enables" or "allows", since WebAssembly itself only provides mechanisms, and the rest is done in user space.

lukewagner (Member, Author)

Agreed, changed.

titzer commented May 10, 2016

I'm not sold on multiple memories, or why we actually want them quite yet. It's a tempting generalization, but I don't see the problem they actually solve. Actually, I think they create a far worse problem by providing a statically segmented address space as the programming model. Would we want dynamically indexed memories in the future? Then why the restriction to static indexing now?

There are larger questions we need to discuss here, like what the role of memory will be in the future programming model, which has yet to really emerge. In particular, a future with managed-data extensions makes adding more generality with multiple memories less important, and tantalizingly suggests that memory (or memories) will be relegated to I/O duties rather than serving as the main program storage.

ghost commented May 10, 2016

@rossberg-chromium Thank you. I guess raw unboxed code addresses could be passed around and stored in local variables just as floating-point numbers are, and I guess they could be prohibited from escaping, or could be boxed if they do escape (and perhaps then qualified by the instance too). But the inability to store them in linear memory is still a show stopper.

rossberg (Member)

@JSStats, for that we continue to have tables. But they're only needed in some cases. And it's pretty clear that we will eventually get other opaque types that cannot be stored in linear memory anyway.

titzer commented May 11, 2016

One criterion that was considered important for the current design of function pointers was to avoid having local types for function pointers, because we weren't (and IMO still aren't) ready to cross that boundary for the MVP.

lukewagner (Member, Author)

@titzer Assuming we get first-class Memory references some time in the future (by which I mean: references you could store in a local and access dynamically), I don't see the fundamental difference/cost (performance, implementation, complexity) with allowing multiple memory imports. It seems unnecessarily irregular if we have multiple imports/exports of functions and tables (and, in the future: values, global variables, thread-local variables, ...) but only single imports/exports of memories. It raises the question: why not memory too?

As for "why the static restriction now?" I think it's the same answer for tables (assuming we get first-class Table references too): because it allows you to declaratively compose modules (via imports/exports) that share these things. E.g., if one has a worker containing only modules that import/export memory/tables, and therefore don't pass around first-class GC references, you don't need a GC in that worker (and you can know this a priori).

Again, if we think it causes undue impl burden in the MVP, I'd be fine to simply constrain the number of memory imports to <=1, as we've done with the number of return types in signatures. I'm not trying to create more MVP work; I'm trying to avoid long-term irregularity that isn't pragmatically necessary.
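A sketch of that declarative composition, using the wasm binary format as finally shipped (the module bytes are hand-assembled for illustration and are not part of this PR):

```javascript
// A minimal module that does nothing but import a memory as
// ("env", "mem"): magic, version, then a single import section.
const bytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, // "\0asm" magic
  0x01, 0x00, 0x00, 0x00, // binary version 1
  0x02, 0x0c,             // import section, 12-byte payload
  0x01,                   // one import
  0x03, 0x65, 0x6e, 0x76, // module name "env"
  0x03, 0x6d, 0x65, 0x6d, // field name "mem"
  0x02,                   // import kind: memory
  0x00, 0x01,             // limits: no maximum, minimum 1 page
]);
const module = new WebAssembly.Module(bytes);

// One Memory declaratively shared by two instances: no GC
// references cross the boundary, just the import object.
const memory = new WebAssembly.Memory({ initial: 1 });
const a = new WebAssembly.Instance(module, { env: { mem: memory } });
const b = new WebAssembly.Instance(module, { env: { mem: memory } });
console.log(memory.buffer.byteLength); // 65536
```

Because the sharing is expressed entirely through import declarations, an embedder can see up front that these instances need no GC machinery, which is the point being made above.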

ghost commented May 12, 2016

@lukewagner But what is the benefit of being able 'to declaratively compose modules'? The claimed reason seems to be that 'you don't need a GC in that worker', but having boxed objects and garbage collection seem somewhat orthogonal: a worker could accept a boxed buffer reference without having to worry about garbage collection, and would just need some guarantee that the object is not going to be moved or reclaimed, which could be a burden for the supplier of the object rather than for the worker.

Practically, most implementations will need to keep these buffer bases in memory and load them as needed whether they are defined declaratively or dynamically, so what has been achieved? Perhaps 32-bit x86 could be the exception, where it might be possible to bake the bases into the instruction immediate offsets.

If there is a benefit, then perhaps it is something else. For example, knowing the size of a buffer at compile time might help optimize accesses. Knowing that these buffers are in an area of VM using memory protection to help avoid bounds checks might help, but then perhaps there could just be a dynamic buffer type with these properties too: buffer types with constraints on their size and with different memory-protection guard zones, etc.

sunfishcode added a commit that referenced this pull request Jul 8, 2016
This is renamed to "anyfunc" in AstSemantics.md in #682.

Also, fix grammar in the Elements Section.
lukewagner pushed a commit that referenced this pull request Jul 9, 2016
This is renamed to "anyfunc" in AstSemantics.md in #682.

Also, fix grammar in the Elements Section.
lukewagner pushed commits that referenced this pull request Jul 18, 2016
lukewagner pushed a commit that referenced this pull request Jul 27, 2016
kisg pushed a commit to paul99/v8mips that referenced this pull request Jul 28, 2016
This patch updates internal data structures used by V8 to support
multiple indirect function tables (WebAssembly/design#682). But, since
this feature is post-MVP, the functionality is not directly exposed and
parsing/generation of WebAssembly is left unchanged. Nevertheless, it
is being used in an experiment to implement fine-grained control flow
integrity based on C/C++ types.

BUG=

Review-Url: https://codereview.chromium.org/2174123002
Cr-Commit-Position: refs/heads/master@{#38110}
@sunfishcode sunfishcode deleted the dynamic-linking branch August 9, 2016 19:49
rossberg added a commit to WebAssembly/spec that referenced this pull request Aug 12, 2016
Implements global declarations as of WebAssembly/design#682. Still missing: mutability, import/export.
rossberg added a commit to WebAssembly/spec that referenced this pull request Aug 12, 2016
As of WebAssembly/design#682. Still misses import/export abilities.

This changes & extends the S-expression syntax for tables and memories in the following way:
```
<elem_type>: anyfunc
<table>:     (table <nat> <nat>? <elem_type>)
<memory>:    (memory <nat> <nat>?)
<elem>:      (elem <expr> <var>*)
<data>:      (data <expr> <string>*)
```
In particular, memory segments are no longer part of the `memory` definition, in anticipation of the ability to import memory. Same for tables. This also mirrors the Wasm section structure more closely.

However, it is pretty tedious to count table elements in the common case, so the following shorthand is available:
```
(table <elem_type> (elem <var>*))    ;; = (table <size> <size> <elem_type>) (elem (i32.const 0) <var>*)
```
which pretty much behaves like the previous table syntax.
For symmetry, I introduced the analogous shorthand for memories, which turns out to be useful in tests as well:
```
(memory (data <string>*))   ;; = (memory <size> <size>) 
(data (i32.const 0) <string>*)
```
where `<size>` is the strings' total length in (rounded up) page units.
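The page-unit rounding mentioned here is straightforward (illustrative JavaScript, not part of the commit; wasm memory pages are 64 KiB):

```javascript
// The shorthand's <size> is the data strings' total byte length
// rounded up to whole 64 KiB wasm pages.
const PAGE_SIZE = 65536;
const pagesFor = (totalBytes) => Math.ceil(totalBytes / PAGE_SIZE);

console.log(pagesFor(5));     // 1
console.log(pagesFor(65536)); // 1
console.log(pagesFor(65537)); // 2
```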

In the future, we can extend this syntax to name tables and memories, e.g.:
```
<table>:   (table <name>? <nat> <nat>? <elem_type>)
<memory>:  (memory <name>? <nat> <nat>?)
<elem>:    (elem <var>? <expr> <var>*)
<data>:    (data <var>? <expr> <string>*)
```
titzer pushed a commit that referenced this pull request Sep 29, 2016
* Clarify that wasm may be viewed as either an AST or a stack machine. (#686)

* Clarify that wasm may be viewed as either an AST or a stack machine.

* Reword the introductory paragraph.

* Add parens, remove "typed".

* Make opcode 0x00 `unreachable`. (#684)

Make opcode 0x00 `unreachable`, and move `nop` to a non-zero opcode.

All-zeros is one of the more common patterns of corrupted data. This
change makes it more likely that code that is accidentally zeroed, in
whole or in part, will be noticed when executed rather than silently
running through a nop slide.

Obviously, this doesn't matter when an opcode table is present, but
if there is a default opcode table, it would presumably use the
opcodes defined here.

* BinaryEncoding.md changes implied by #682

* Fix thinko in import section

* Rename definition_kind to external_kind for precision

* Rename resizable_definition to resizable_limits

* Add  opcode delimiter to init_expr

* Add Elem section to ToC and move it before Data section to reflect Table going before Memory

* Add missing init_expr to global variables and undo the grouped representation of globals

* Note that only immutable globals can be exported

* Change the other 'mutability' flag to 'varuint1'

* Give 'anyfunc' its own opcode

* Add note about immutable global import requirement

* Remove explicit 'default' flag; make memory/table default by default

* Change (get|set)_global opcodes

* Add end opcode to functions

* Use section codes instead of section names

(rebasing onto 0xC instead of master)

This PR proposes using section codes for known sections, which is more compact and easier to check in a decoder.
It allows for user-defined sections that have string names to be encoded in the same manner as before.
The scheme of using negative numbers proposed here also has the advantage of allowing a single decoder to accept the old (0xB) format and the new (0xC) format for the time being.

* Use LEB for br_table (#738)

* Describe operand order of call_indirect (#758)

* Remove arities from call/return (#748)

* Limit varint sizes in Binary Encoding. (#764)

* Global section (#771)

global-variable was a broken anchor and the type of count was an undefined reference and inconsistent with all the rest of the sections.

* Make name section a user-string section.

* Update BinaryEncoding.md

* Update BinaryEncoding.md

* Use positive section code byte

* Remove specification of name strings for unknown sections

* Update BinaryEncoding.md

* Remove repetition in definition of var(u)int types (#768)

* Fix typo (#781)

* Move the element section before the code section (#779)

* Binary format identifier is out of date (#785)

* Update BinaryEncoding.md to reflect the ml-proto encoding of the memory and table sections. (#800)

* Add string back

* Block signatures (#765)

* Replace branch arities with block and if signatures.

Moving arities to blocks has the nice property of giving implementations
useful information up front, however some anticipated uses of this
information would really want to know the types up front too.

This patch proposes replacing block arities with function signature indices,
which would provide full type information about a block up front.

* Remove the arity operand from br_table too.

* Remove mentions of "arguments".

* Make string part of the payload

* Remove references to post-order AST in BinaryEncoding.md (#801)

* Simplify loop by removing its exit label.

This removes loop's bottom label.

* Move description of `return` to correct column (#804)

* type correction and missing close quote (#805)

* Remove more references to AST (#806)

* Remove reference to AST in JS.md

Remove a reference to AST in JS.md. Note that the ml-proto spec still uses the name `Ast.Module` and has files named `ast.ml`, etc, so leaving those references intact for now.

* Use "instruction" instead of "AST operator"

* Update rationale for stack machine

* Update Rationale.md

* Update discussion of expression trees

* Update MVP.md

* Update Rationale.md

* Update Rationale.md

* Remove references to expressions

* Update Rationale.md

* Update Rationale.md

* Address review comments

* Address review comments

* Address review comments

* Delete h
7 participants