Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SharedArrayBuffer integration #152

Closed
annevk opened this issue Feb 20, 2017 · 17 comments
Closed

SharedArrayBuffer integration #152

annevk opened this issue Feb 20, 2017 · 17 comments
Labels
Milestone

Comments

@annevk
Copy link
Member

annevk commented Feb 20, 2017

It seems to me that without modification IDB would end up storing a pointer to mutable memory if whatwg/html#2361 landed and no changes to IDB were made. @domenic claims it's not true but won't tell me why unless I open a separate issue...

@domenic
Copy link
Contributor

domenic commented Feb 22, 2017

Let's walk through the spec. For example, https://w3c.github.io/IndexedDB/#dom-idbobjectstore-put. It uses StructuredClone into a user-agent defined Realm in step 10. I guess we'd want to specify that that user-agent defined Realm is inside the same agent cluster as the source realm, otherwise it'll error in some way.

OK, now it moves on and uses that clone, which contains a SAB whose [[ArrayBufferData]] contains all the bytes. (Not a pointer; SABs don't have pointers. It contains all the bytes. They're the same bytes contained by other SABs, but they're still bytes, not pointers.)

We get to https://w3c.github.io/IndexedDB/#steps-for-storing-a-record-into-an-object-store. The relevant step is:

Store a record in store containing key as its key and value as its value.

which is the idea of storing things on disk. Storing a SAB whose [[ArrayBufferData]] is a bunch of bytes on disk works fine, just like storing an ArrayBuffer whose [[ArrayBufferData]] is a bunch of bytes.

@annevk
Copy link
Member Author

annevk commented Feb 22, 2017

[[ArrayBufferData]] points to a block that is shared, no?

@annevk
Copy link
Member Author

annevk commented Feb 22, 2017

I mean, please explain it in terms of https://tc39.github.io/ecma262/#sec-data-blocks.

@domenic
Copy link
Contributor

domenic commented Feb 22, 2017

No, it's a data block, not a pointer to a data block.

@annevk
Copy link
Member Author

annevk commented Feb 22, 2017

Okay, I guess then I'd argue it matters how storing on disk is defined, since a naive implementation might copy the bytes lazily which would not work well for SharedArrayBuffer whereas it doesn't matter for ArrayBuffer.

@domenic
Copy link
Contributor

domenic commented Feb 22, 2017

I mean, if you're assuming that level of willful misinterpretation, it definitely matters for ArrayBuffer too. E.g. you could just modify it from the same thread, or perhaps even cause it to be GCed.

@annevk
Copy link
Member Author

annevk commented Feb 22, 2017

@domenic no, because the ArrayBuffer is cloned, including its bytes.

@inexorabletash
Copy link
Member

inexorabletash commented Feb 22, 2017

It's a spec fiction that there's still a SAB at the point of "Store a record in store...". At the point where the clone is made during the put() (or add() or update()) call, implementations will have done a serialization and the value will be a list of (non-shared) bytes. Changes to the SAB data after that point would not be serialized. This would be observable by issuing a put() call in a transaction that has definitely not started (i.e. it is queued behind an unfinished transaction).

The spec is currently written "as if" the clone is retained in some abstract realm (in practice, then serialized to disk), and cloned back into an actual realm when retrieved.

@annevk
Copy link
Member Author

annevk commented Feb 22, 2017

(I also still don't understand the non-pointer concept. How can that work with multiple objects sharing the same data?)

@annevk
Copy link
Member Author

annevk commented Feb 22, 2017

@inexorabletash what requires those bytes to be non-shared at that point? Sounds like implementations are relying on the only-sometimes-share semantics we're trying to disallow.

@inexorabletash
Copy link
Member

I can't speak to all implementations, but those I'm aware of need the entire value to be serialized into a buffer. In Blink that's then immediately sent across process boundaries, where it waits for the request to actually be processed, and ends up being the bytes written to disk. (Aside: Blobs are a special case, but they're also immutable.) There's no later step at which to then go fetch the actual bytes from the SAB.

@inexorabletash
Copy link
Member

If HTML had Serialize/Deserialize variants of StructuredClone (as @domenic has suggested somewhere...) then IDB would use that instead.

(put() and friends would actually need to Serialize and then potentially Deserialize immediately to compute keys against an unobservable, side-effect-free copy. That's what implementations must do today, internally.)

@annevk
Copy link
Member Author

annevk commented Feb 23, 2017

Serialize for SharedArrayBuffer can't do a clone of the bytes though, right? Those bytes are shared. If you also copy them that seems wasteful.

@inexorabletash
Copy link
Member

Either we copy or we throw. I'm fine with either.

@annevk
Copy link
Member Author

annevk commented Feb 23, 2017

I guess I still don't understand the setup then. @domenic claims this works as specified, but in Chrome it relies on StructuredClone copying the bytes, so it wouldn't work if StructuredClone didn't copy the bytes (as per @domenic's change to HTML). Something is amiss.

(I'd also still love to hear how shared memory works without pointers.)

@domenic
Copy link
Contributor

domenic commented Feb 23, 2017

In the end it seems pretty clear we're only going to be able to make you happy if we do this perfectly, i.e. overhaul structured clone, specify exactly how bytes are serialized to disk, etc. Whether we have time to do that before SharedArrayBuffer ships everywhere is an open question, but I can try my best. I was hoping to just get something implementers could implement and get compatible semantics with---even if spec people are able to poke holes in it by assuming malicious interpretations. But it sounds like you're not going to accept that.

@annevk
Copy link
Member Author

annevk commented Feb 23, 2017

I think you misunderstand. I'm happy to move forward with half-broken-but-better-than-nothing-solutions and help you out with that. I just want to know what is defined and what is not so we have issues filed and a shared understanding of what needs changing in implementations, tests, and standards.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants