Idea for getting rid of extra indirection level for calls through virtual function table #312

AndrewScheidecker · 2015-08-26T12:34:29Z

At the moment we have function tables where every element has the same type, and a call_indirect operation that takes a static table index and dynamic element index. This means C++ vftables translate to static data in VM memory. Calling a virtual function must first load the address of the vftable, then load the function index from the vftable, then make the indirect call, which has to validate the function index because it came from VM memory. Compared to native there's an extra level of indirection because the vtable in untrusted VM memory must store a function index instead of a direct pointer.

I propose that we add heterogeneously typed function tables (HTFT), i.e. function tables that allow each element to have a different type, and interface tables (IT), i.e. tables of HTFTs with the same types.

Also add an operation that calls a static element index of a dynamic HTFT index of a static IT. This should cost roughly the same as the present call_indirect: it can statically validate the element index and IT index, and at runtime must mask the HTFT index, load the function pointer from a flat table for the IT*, and call it.

These changes would allow C++ vftables to be stored in trusted memory, so only the vftable index comes from untrusted VM memory. That means there isn't any indirection necessary beyond what is otherwise implied by calling through a statically unknown vftable.

So that new operation has the same cost as call_indirect, but it can implement a call to vftable without first loading an index from VM memory. That should allow implementing C++ virtual function calls with roughly the same performance as native.

To avoid having too many concepts, you could replace the existing homogeneously typed function tables as an interface table containing function tables with only a single function.

This can be polyfilled in ASM.JS if it can allocate space (see #306) to store the ITs on the heap. It would be storing indices into the function tables rather than direct pointers, so it wouldn't be a perf win, but it shouldn't be any slower than the current implementation.

lukewagner · 2015-08-26T13:52:42Z

We've actually discussed a similar optimization that doesn't require anything new (it'd work in asm.js today). The idea is to have the vtable live in the function pointer array by giving the application explicit control over the order of functions in the table. Then, a C++ object's vtableptr is just the index into the function table array and the vtable dispatch is (call_indirect (vtableptr + offset) ...) and there are the same number of indirections as native. The downside is that the vtable will only be able to store functions so any data that you'd want to stick in the vtable has to be turned into a function that is called to return the data. However, afaik, the only uses for data in vtables is RTTI which isn't super-fast anyway so I'm not too worried about slowing it down.

AndrewScheidecker · 2015-08-26T14:49:57Z

You could use the vtable index from either scheme to address static data for RTTI. I can also see how you could map my approach onto the approach you describe for single module situations (e.g. to translate to ASM.JS).

It seems like the primary difference between the approach you describe and my idea is how dynamic linking would work. The way I'd do dynamic linking in the approach you describe is to give the function tables some name that prevents them from being linked with function tables with a different name. Function tables for vtable slices could then be given a name derived from the vtable signature+slice index. As long as any module with a function table corresponding to a vtable slice has function tables for all the slices of the vtable, it should allow linking them with a single per-module fixup to vtable indices. A downside to this scheme would be that there's a namespace for function tables that is completely invisible to people who just use a C++ compiler.

You could do dynamic linking in the approach I described by saying that interface tables will be linked with any other interface tables that have the same signature. Indices into the interface table would need a link-time per-module fixup. The signature of the interface table is equivalent to the name in the other scheme.

I hadn't thought of the other approach, so I'm glad to see there's a way to save the extra indirection in the current spec. I do think my idea is still relevant as a way to look at dynamic linking of function tables that doesn't break that approach.

AndrewScheidecker · 2015-08-26T15:04:54Z

Ah, never mind about the need to name the function tables. There can be a link-time offset per-table, so the indices don't need to remain consistent across all slices of a vtable. You just need to ensure that any function tables from other modules with the same signature will be linked into a single table.

I'll close this, thanks for listening. :)

titzer · 2015-08-26T15:08:20Z

Dynamic linking is going to impact function pointers in ways that we need
to think about.

One other approach that doesn't require heterogenous function pointer
tables is to use MTables. The idea is that you arrange the class hierarchy
as a tree and order the classes using a numbering scheme (preorder works)
so that every leaf class gets an id and every non-leaf class gets a range
of ids. Then you can compute method tables per-virtual function that are
indexed by class id. Each mtable has a single signature. The mtables are
dense, but they are not zero based, so it requires subtracting the id of
root class that introduced a method. In Virgil I fold the subtraction into
the constant for the mtable start, so it requires only a single
indirection, just like a vtable. The advantage is that class membership
checks are constant type (just compare a range), and the class id can
double as the index into the method table.

On Wed, Aug 26, 2015 at 4:50 PM, Andrew Scheidecker <
notifications@github.com> wrote:

You could use the vtable index from either scheme to address static data
for RTTI. I can also see how you could map my approach onto the approach
you describe for single module situations (e.g. to translate to ASM.JS).

It seems like the primary difference between the approach you describe and
my idea is how dynamic linking would work. The way I'd do dynamic linking
in the approach you describe is to give the function tables some name that
prevents them from being linked with function tables with a different name.
Function tables for vtable slices could then be given a name derived from
the vtable signature+slice index. As long as any module with a function
table corresponding to a vtable slice has function tables for all the
slices of the vtable, it should allow linking them with a single per-module
fixup to vtable indices. A downside to this scheme would be that there's a
namespace for function tables that is completely invisible to people who
just use a C++ compiler.

You could do dynamic linking in the approach I described by saying that
interface tables will be linked with any other interface tables that have
the same signature. Indices into the interface table would need a link-time
per-module fixup. The signature of the interface table is equivalent to the
name in the other scheme.

I hadn't thought of the other approach, so I'm glad to see there's a way
to save the extra indirection in the current spec. I do think my idea is
still relevant as a way to look at dynamic linking of function tables that
doesn't break that approach.

—
Reply to this email directly or view it on GitHub
#312 (comment).

lukewagner · 2015-08-26T15:56:41Z

@AndrewScheidecker You make a good point that, in addition to global variables that represent constant pointers into the heap (to support relocatable global data), we also need global variables that represent offsets into the function-pointer tables. That is, a shared module would have (1) a local func-ptr-table (containing all the vtables and aliased functions defined by that module) that would be merged into the global func-ptr-table at dynamic-link-time and (2) globals that refer to an offset in the local func-ptr-table that receive the final offset in the global func-ptr-table. (2) would be used to initialize the vtable-ptr of C++ objects on construction.

@titzer Interesting alternative strategy. I don't think C++ could use the numbering scheme to implement dynamic_cast since dyamic_cast is super-complicated, but I do see why it's valuable to reduce table size and also why dynamic linking would break the ordering property. Ignoring that problem (assuming the tables contain big holes of dead space when a dynamically-loaded module extends another module's base class), it seems like the raw building blocks wasm should provide to allow that scheme would be having multiple named func-ptr-tables with explicit control over ordering (like asm.js). Since these MTables will have homogeneous signatures, that suggests typed func-ptr-tables (so just like asm.js). The downside of typed func-ptr-tables is that, for plain C++ function pointers, you either have to:

accept that different-typed functions will have equal pointers when compared as ints (which Emscripten experience shows occasionally breaks real code).
bloat func-ptr-tables with throwing stubs so that only one func-ptr-table contains a real function for a given index.

Emscripten gives the user an option to control this with a compiler flag. However:

if virtual functions are kept in separate tables from generic func-ptrs (in both the vtable and mtable strategies), this will drastically reduce the size of the generic func-ptr tables, so scheme (2) will not cost so much.
we could additionally apply the offset optimization you described for MTables to further reduce the table size (although holes would appear when dynamically linking)
the compiler can still offer an option to overlap func-ptr indices (I don't know whether depending on different-signature func identity is UB or not, so whether this can be a default-on optimization or a "you can break me" opt-in like -ffast-math or -fno-rtti).

So it sounds like multiple named, typed func ptr tables with explicit ordering have a pretty good story for all three ways of implementing virtual functions (generic func ptrs, vtables-in-func-tables, MTables). I'm now in favor.

AndrewScheidecker · 2015-08-26T16:02:34Z

Oops, about 2 minutes after I closed this I realized I was wrong about being able to dynamically link vtable slices. The offset has to be applied when the index is generated rather than when it's used to access a specific table in order for it to be usable by calls in other modules, so a per-table per-module offset won't work if you count on indices remaining consistent between tables (unless you have something like the name scheme to constrain how different function tables are linked).

I think my idea could be simplified to just making function tables into 2D arrays, with one index static and one dynamic. The table index + static index could map to the signature of a virtual function, and the dynamic index maps to a specific implementation of it. The generated code doesn't need to maintain a distinct table index and static index, as each static index of a table can be thought of as a separate table. However, the additional structure of the function table definitions would make it possible to put multiple vtable slices in one function table and ensure that you can use the same dynamic index to access different slices, even though new dynamic indices in the table may be allocated for dynamically linked modules.

The MTable stuff made me realize this isn't enough, though (for either scheme). The linker needs to be aware of the inheritance hierarchy. I think you might be able to do it if you define subtyping relationships between function table signatures and define dynamic linking of them so you can use the same dynamic index in multiple function tables with a subtype relationship. What a mess!

AndrewScheidecker · 2015-08-26T16:15:56Z

@lukewagner (2) globals that refer to an offset in the local func-ptr-table that receive the final offset in the global func-ptr-table

That's along the lines of what I was thinking. Same thing for data segment addresses. I think they need a way to define immutable globals though so the code generator can assume they won't be written to.

accept that different-typed functions will have equal pointers when compared as ints (which Emscripten experience shows occasionally breaks real code).

Seems like that could be handled by putting the expected table index in the bits of the function index that will be masked off for the table lookup. That would also provide some redundant information for a debugger to catch inconsistencies in. Seems like requiring those semantics might be a little too constraining on implementations though.

titzer · 2015-08-26T16:25:07Z

On Wed, Aug 26, 2015 at 6:16 PM, Andrew Scheidecker <
notifications@github.com> wrote:

@lukewagner https://github.com/lukewagner (2) globals that refer to an
offset in the local func-ptr-table that receive the final offset in the
global func-ptr-table

That's along the lines of what I was thinking. Same thing for data segment
addresses. I think they need a way to define immutable globals though so
the code generator can assume they won't be written to.

+1 for immutable globals...but how are they initialized? Only by linking?

Also, for optimization it might be nice to have "change rarely" globals
that the implementation can do specialization against guarded with
deoptimization upon write.

accept that different-typed functions will have equal pointers when

compared as ints (which Emscripten experience shows occasionally breaks
real code).

Seems like that could be handled by putting the expected table index in
the bits of the function index that will be masked off for the table
lookup. That would also provide some redundant information for a debugger
to catch inconsistencies in. Seems like requiring those semantics might be
a little too constraining on implementations though.

—
Reply to this email directly or view it on GitHub
#312 (comment).

lukewagner · 2015-08-26T16:48:16Z

@AndrewScheidecker That's an interesting idea, but ideally we'd get away from the asm.js forced masking of table indices (it hides bugs and might bite us with future optimizations).

@titzer Yes, at least that's what I'm proposing in #154 and #302. So when you load a dynamically linked module, first it acquires the space to hold the module's global data segment (I'm proposing that it calls a hook in #302, but that's not the only way), then it sets all the immutable pointers and this happens before the code is run so that compilation can fully specialize to the address. The case with function table offsets is symmetric, but no need to call a hook b/c the table isn't in linear memory.

AndrewScheidecker · 2015-08-26T17:51:50Z

To elaborate on my comment that the linker needs to be aware of the inheritance hierarchy: if you have a class that inherits from a base class defined in another module, you need some way to give that class a vtable index that works for consumers of the base class vtable as well as consumers of the derived class's vtable.

If you have knowledge up front of the entire class hierarchy, then you can just generate an exhaustive vtable for every class, including virtual functions that are only defined by derived classes. That doesn't work with dynamic linking, though.

Assuming the 2D function tables I described above:

Define function table signature to be the tuple mapping a static index to the type of functions in the table with that static index.
Define a subtyping relationship between function table signatures: given function table signatures A and B, B is a subtype of A iff A's valid static indices are a subset of B's valid static indices, and B's function type for each common static index is a subtype of A's function type for the same static index.
Specify the linker to guarantee: given two function tables whose signatures have a subtyping relationship (B <: A), and a static index that is valid for both, a dynamic index that is valid for B must denote the same function whether it is accessed through table A or B.

A simple way to implement it would be to store static index slices of a function table contiguously. When new function tables are linked in, their dynamic index space is appended to the global dynamic index space. Static index slices of function tables with subtypes in the new module are extended to include the dynamic indices of the function table in the new module. As long as all dynamic modules are static dependencies of the seed module, this could do a topological sort on function tables to get optimal sparsity.

That's equivalent in purpose MTables, but defined in terms of structural subtyping rather than based on a nominal class hierarchy. The subtyping relationship on function tables allows you to establish an order on dynamic indices into those tables without requiring you to build a table with all virtual function signatures for all vtables. However, it can still degenerate into that depending on how things are loaded. Since you can't reorder dynamic indices after the application has seen them, you never have the opportunity to do a global sort once you know all the modules that will be loaded. In the worst case you might have some static index slice of a function table that has values for dynamic index 0 and dynamic index 100000 and nothing in between. It doesn't seem possible to avoid that.

I haven't thought of a way this could be mapped to the existing function tables + names. You need to be able to express that an index into a function table representing virtual functions of a derived class must accept the same indices as a function table representing a function on the base class; sibling classes must extend the index space for the function table.

jfbastien · 2015-08-26T17:56:28Z

I recommend looking at the discussion from #89, as well as the work in LLVM on CFI. It includes work on class hierarchies.

lukewagner · 2015-10-22T17:56:40Z

Resolved as part of #392.

AndrewScheidecker closed this as completed Aug 26, 2015

AndrewScheidecker reopened this Aug 26, 2015

lukewagner closed this as completed Oct 22, 2015

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Idea for getting rid of extra indirection level for calls through virtual function table #312

Idea for getting rid of extra indirection level for calls through virtual function table #312

AndrewScheidecker commented Aug 26, 2015

lukewagner commented Aug 26, 2015

AndrewScheidecker commented Aug 26, 2015

AndrewScheidecker commented Aug 26, 2015

titzer commented Aug 26, 2015

lukewagner commented Aug 26, 2015

AndrewScheidecker commented Aug 26, 2015

AndrewScheidecker commented Aug 26, 2015

titzer commented Aug 26, 2015

lukewagner commented Aug 26, 2015

AndrewScheidecker commented Aug 26, 2015

jfbastien commented Aug 26, 2015

lukewagner commented Oct 22, 2015

Idea for getting rid of extra indirection level for calls through virtual function table #312

Idea for getting rid of extra indirection level for calls through virtual function table #312

Comments

AndrewScheidecker commented Aug 26, 2015

lukewagner commented Aug 26, 2015

AndrewScheidecker commented Aug 26, 2015

AndrewScheidecker commented Aug 26, 2015

titzer commented Aug 26, 2015

lukewagner commented Aug 26, 2015

AndrewScheidecker commented Aug 26, 2015

AndrewScheidecker commented Aug 26, 2015

titzer commented Aug 26, 2015

lukewagner commented Aug 26, 2015

AndrewScheidecker commented Aug 26, 2015

jfbastien commented Aug 26, 2015

lukewagner commented Oct 22, 2015