
Workers: Dedicated execution thread + related syntactic matters #851

Closed

lars-t-hansen opened this issue Mar 10, 2016 · 20 comments
Labels
addition/proposal (New features or enhancements) · needs implementer interest (Moving the issue forward requires implementers to express interest) · topic: workers

Comments

@lars-t-hansen

At the moment the workers spec allows workers to share execution threads: each worker uses a run-to-completion model and there are few if any means of synchronous inter-worker communication; the worker must return to its event loop periodically to communicate, at which point the thread can be used to run another worker for a while. In particular, N workers can share M < N execution threads in a pool, which helps control system load.

The shared memory proposal adds the ability for workers to communicate safely and synchronously by means of atomic reads and writes on shared memory. Effectively, a worker may spinloop waiting for a cell to take on a value. A primitive mechanism is available to make that waiting efficient: it allows workers to block and to be woken. However, the model is still that the worker is "running" while it is waiting, and an efficient implementation of "blocking" will in fact use an implementation-dependent combination of spinloops, micro-waits, and actual blocking. In this model, then, it is not possible for N workers to share M < N execution threads without risking deadlock: M workers could all enter a wait and, by occupying every thread, prevent the remaining N-M workers from ever running the work needed to wake the waiters.
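To make the waiting pattern concrete, here is a minimal sketch, assuming ia is an Int32Array on a SharedArrayBuffer that another worker will update. (The blocking primitive is spelled Atomics.wait/Atomics.notify in today's API; earlier drafts called the latter Atomics.wake.)

// Naive spinloop: the worker is "running" the entire time it waits.
while (Atomics.load(ia, 0) === 0) {
  // burn cycles until another worker stores a nonzero value into ia[0]
}

// Efficient variant: block until another worker changes ia[0] away from 0
// and calls Atomics.notify(ia, 0).
Atomics.wait(ia, 0, 0);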

Effectively that situation forces every worker that sends or receives a shared memory object to have a dedicated execution thread, since the implementation can't guess how that worker will be doing communication. If the worker is only going to be synchronizing by postMessage (and using shared memory just to avoid data copying) there is no pressing need for it to have a dedicated thread, however. Arguably, not having a dedicated thread is a reasonable default, but absent a mechanism to override the default the implementation must always pessimize.
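By contrast, a worker in the postMessage-only style uses shared memory purely for zero-copy data transfer and never blocks. A minimal sketch, where "worker.js" is a hypothetical script that fills the buffer and posts back when done:

// Main thread: share the buffer, but synchronize only via postMessage.
const sab = new SharedArrayBuffer(1024);
const worker = new Worker("worker.js");
worker.postMessage(sab); // shares the memory; nothing is copied
worker.onmessage = () => {
  // The worker signalled completion; read its results out of the buffer.
  console.log(new Int32Array(sab)[0]);
};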

@sicking therefore suggested that there might be a way to pass a request to the worker constructor that a dedicated thread is required. The shared memory proposal already has a clause that allows the implementation to prevent a worker from using the built-in blocking mechanism; that clause would apply if the worker does not have a dedicated thread. (The clause already applies to the window's main thread, in practice.)

I'd like to hear whether there is any support for the idea to pass such a hint to workers.

(From an implementation point of view, there might be some kind of thread switching behind the scenes when a worker blocks, but as that effectively comes down to simulating a separate thread, with its own stack and context and so on, in the embedding rather than relying on OS threads, it really is not appealing, and the savings are uncertain. And I expect interaction with the existing threads in the embedding, locks and TLS and so on, would be truly nasty. I expect that an embedding that can choose between that complexity and just eating the cost of many OS threads will choose the latter.

There is also the possibility that it is "good enough" to only give dedicated threads to workers that send or receive shared memory; it's hard to predict, since not much code has been written yet with shared memory and we don't know what the communication patterns will look like.)

The strawman syntax proposal for this is that instead of the Worker constructor taking a URL, it takes an object containing attributes:

let w = new Worker({ url: ..., dedicatedThread: true })

The strawman syntax also opens up the possibility that the url field is absent and is replaced by, e.g., a src field containing the actual code, and probably allows for other possibilities as well; see the discussion around Jonas's note, referenced above. Both variants are sketched below.
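For illustration only; nothing here is specified, and the dedicatedThread and src fields are hypothetical:

// Strawman: options-object constructor with an explicit thread request.
let w1 = new Worker({ url: "worker.js", dedicatedThread: true });

// Strawman: inline source instead of a URL, no dedicated thread requested.
let w2 = new Worker({ src: "onmessage = e => postMessage(e.data * 2);" });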

@domenic (Member) commented Mar 10, 2016

Syntax-wise, the way to do this would be to add it to WorkerOptions, so you'd do new Worker(url, { dedicatedThread: true }).
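Spelled out as a sketch, with dedicatedThread as the hypothetical new WorkerOptions member:

// Hypothetical extension of the existing Worker(url, options) constructor.
const w = new Worker("worker.js", { dedicatedThread: true });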

Currently the only way the spec touches on these issues is

Create a separate parallel execution environment (i.e. a separate thread or process or equivalent construct), and run the rest of these steps in that context.

By my reading this already requires at least a separate thread, but I take it you disagree? How would you rephrase this, to say "separate thread if dedicatedThread is true, something else if not"?

Anyway, this seems like a pretty reasonable and easy addition, assuming implementer support. It sounds like there's support from Mozilla; have you talked to any other vendors in the course of your shared memory work? Could you get them to voice support here?

@domenic added the addition/proposal and needs implementer interest labels Mar 10, 2016
@lars-t-hansen (Author)

I think that the current spec language probably allows a thread to be shared among workers: workers are all run-to-completion, "equivalent construct" is pretty wide license, and there's no requirement about timeliness. I've made an argument here that I don't think the worker's thread can be the same thread as the thread that creates the worker, but that's a slightly different matter.

As for how to phrase it: For the ES agents spec we're trying to specify a notion of forward progress of an agent, so maybe it can be phrased in those terms instead of getting into exactly what the mechanism is. Certainly workers that have a dedicated thread must be guaranteed forward progress in a strong sense. Intuitively, if we have two workers A and B on dedicated threads and neither is waiting for the other or some external resource and A is looping forever, then B will eventually get work done anyway. (Workers that don't have dedicated threads are probably only guaranteed forward progress so long as -- informally -- none of those workers loop forever.)

I have not talked to other vendors about this; the idea came from @sicking - who really wants to have a thread pool for workers that don't need dedicated threads - and @annevk prodded me to record it here. I think it's a useful idea that allows for meaningful resource management. I'm not sure if I want to sign up as champion, however.

@domenic (Member) commented Mar 10, 2016

As for how to phrase it: For the ES agents spec we're trying to specify a notion of forward progress of an agent, so maybe it can be phrased in those terms instead of getting into exactly what the mechanism is.

Maybe the flag should be forwardProgress: true then?

I'm not sure if I want to sign up as champion, however.

Not sure exactly what you mean by this. We're happy to do the spec work, as it sounds like it won't require too much domain-specific knowledge (and I imagine you'll be able to help us with reviews to make sure we get things right). But I thought, given your ongoing collaboration on SharedArrayBuffer and such, you'd be well-positioned to bring this up with others.

@lars-t-hansen (Author)

Maybe the flag should be forwardProgress: true then?

I like that.

But I thought, given your ongoing collaboration on SharedArrayBuffer and such, you'd be well-positioned to bring this up with others.

In principle I am; my weakness here is a lack of contacts in the relevant parts of the orgs of the other browser vendors (I work almost exclusively on core JS things). But sure, I'll ask around a little.

@zcorpan (Member) commented Mar 10, 2016

FTR, Presto implemented workers despite, I believe, the whole browser being single-threaded (on some platforms, at least). This was possible because the JS engine was interruptible, unlike all other JS engines. cc @sof

@lars-t-hansen (Author)

Oh, it was better than that. When I took over responsibility for that engine in late 2000, it would actually timeslice multiple scripts on the page, running event handlers concurrently. Awesome responsiveness, but very strange behavior at times.

@lars-t-hansen (Author)

Maybe the flag should be forwardProgress: true then?

I like that.

Possibly even forwardProgress: "strict", to contrast with forwardProgress: "loose", which would be the default. This better captures the distinction I made earlier, where workers that share a thread can be guaranteed forward progress as long as none of them enters an infinite loop. (I don't intend to be bikeshedding here; I'm just thinking that we wouldn't want to imply that a worker that shares a thread has no reasonable guarantee of forward progress at all.)
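Under that naming, the strawman might read as follows (hypothetical values, with "loose" as the default):

// Strawman: request a strong forward-progress guarantee (own thread).
const strict = new Worker("worker.js", { forwardProgress: "strict" });

// Strawman default: progress guaranteed only while no co-scheduled
// worker monopolizes the shared thread.
const loose = new Worker("worker.js", { forwardProgress: "loose" });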

@pizlonator

I'm in favor of simply mandating native thread behavior for workers. M:N models always die eventually:

  • Java VMs used to do M:N but not anymore.
  • Solaris used to do M:N but not anymore.
  • There were attempts to add M:N to Linux but that died in a fire because of overwhelming evidence that native threads were always better.
  • The BSDs had similar experiences. I think that both FreeBSD and NetBSD initially implemented M:N threads but then had to painfully remove them. I think that FreeBSD is still trying to remove the M:N stuff. I believe that NetBSD has finished killing M:N.

The M:N approach is a dead end, and I hope it was only an accident that workers permitted it. Regardless of whether memory is shared or not, concurrent activities work best when they are actually concurrent. It would be sad if every user of workers had to explicitly state that they want real threads and not toy threads.

@annevk (Member) commented Mar 23, 2016

Taking into account that M:N models die, I think there are still three options for making SharedArrayBuffer shippable:

  1. We introduce forwardProgress and require it for now; eventually that argument simply has no effect, once toy threads go out of business. (Note that what we currently have in some implementations is worse than M:N, as I understand it, as there's no forward progress guarantee.)
  2. We require M:N, so that we at least have the forward progress guarantee (and no opt-in for it), and hope that those threads eventually become true threads as M:N dies out.
  3. We require true threads right out of the gate.

I don't really know what WebKit, Chromium, and Edge do for workers. Thus far it seemed to me that 1/2 would be easiest for Gecko, though @khuey probably knows more.

@sicking commented Mar 23, 2016

After having talked with @khuey it sounds like we abandoned M:N scheduling in Gecko long ago.

What we still do have is a limit on how many workers can be started. If a webpage creates 100 workers, we only start the first N workers.

The other workers aren't started until one of the N first workers dies (is GCed or closed/terminated). No exceptions are thrown for the workers that aren't immediately started, they simply wait in a stalled state and accumulate any messages sent to them.

I believe this behavior is the problem that @lars-t-hansen is running into.

To make matters more complicated, the limit N is affected by at least a per-origin limit, which means that in a given page you don't know for sure what the limit will be, since the user could have multiple tabs open. There might also be a global limit, though I'm not sure about that.

I believe that all browsers will have some limit on how many workers they can actually run at once. So we'd need to define some way for pages to know when a worker can't be immediately started, either through new default behavior or through a new API.
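One hypothetical shape such an API could take, purely to illustrate the idea (no such event exists today, and the event name is invented):

// Hypothetical: let the page observe that a worker could not start yet.
const w = new Worker("worker.js");
w.addEventListener("stalled", () => {
  // Hypothetical event fired when the UA hits its worker limit.
  console.warn("worker is queued, waiting for a free slot");
});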

@lars-t-hansen (Author)

@sicking points to the issues that are written up in the related w3c bug. I believe that, according to current specs, it's simply wrong to not always and immediately start a worker on a dedicated thread, and that the Firefox behavior is strictly speaking a bug. Other browsers' limitations might be bugs too, although one could at least argue that throwing an exception is an acceptable adaptation.

Personally I'm probably happiest if there's always a dedicated thread per worker and a clear error report (synchronous is nice, asynchronous is manageable, something we can all agree on is a requirement) if a thread can't be created for a new worker.

The proposal in the present bug is a different take on the problem that allows resource-constrained devices (IoT class) to have a reasonable number of classical run-to-completion workers without needing a lot of system resources. But from the point of view of the shared memory proposal it is not actually a simplification and not in any way a requirement.

@annevk (Member) commented Mar 23, 2016

To move forward here it would be good to know what WebKit, Chromium, and Edge are doing. From that we can probably derive what we can change compatibly and what needs new functionality.

@khuey commented Mar 23, 2016

Gecko does not currently use, and has no intent to switch to, an M:N threading model for web workers.

As sicking points out, we do limit the number of worker threads per origin (I believe the cap today is 50). Any further workers are stalled indefinitely until a slot becomes available when a different worker terminates. A way to communicate this to the page would be a good thing. Currently we've been relying on the fact that most authors don't create that many workers ...

@domenic (Member) commented Mar 23, 2016

I don't believe Gecko is in violation of the spec, per

User agents may impose implementation-specific limits on otherwise unconstrained inputs, e.g. to prevent denial of service attacks, to guard against running out of memory, or to work around platform-specific limitations.

and also

User agents may impose resource limitations on scripts, for example CPU quotas, memory limits, total execution time limits, or bandwidth limitations. When a script exceeds a limit, the user agent may either throw a QuotaExceededError exception, abort the script without an exception, prompt the user, or throttle script execution.

It sounds like in general though the original proposal here is not necessary (since nobody has plans to do anything but a dedicated thread). And now we've moved on to discussing a new API for implementations to signal their implementation-specific limits?

@lars-t-hansen (Author)

@domenic, I think that's right.

And is the w3c bug where that concern was first filed the proper locus for discussing an API to signal implementation-specific limits?

@domenic (Member) commented Mar 23, 2016

Nah, we're trying to move most discussion to GitHub, so this seems fine to me. Ideally that bug could be split up into a number of concise (!) feature requests or suggested changes, as GitHub issues, and then we could close it.

I'd say it's up to you whether you want to close this issue and start a new GitHub issue specifically about an API for implementations to signal their worker limitations, or just keep using this one.

@lars-t-hansen (Author)

@domenic, I'll create one or more new issues and close this one.

@kinu commented Mar 28, 2016

Looks like we're going to migrate to a few new bugs, but let me add a short note about Chromium's current implementation status for the record (it shouldn't affect the overall direction of the discussion). (/cc @mattto)

Regarding the Chromium implementation: we did (at least once) consider using a thread pool or something like it for worker execution, but we mostly dropped the idea because we didn't feel there was much value in doing so; making M worker scripts run on N V8 isolates needs more locking and complexity, and it seemed to violate some behavioral assumptions if one script's execution can be blocked by another's. It's interesting to see that we're adding a new, more explicit API around thread usage for workers, though, as there's also been constant discussion that we might want to limit or control the number of threads whenever possible...

@annevk (Member) commented Jun 20, 2016

So Mozilla discussed the worker limit talked about in #851 (comment) and decided to drop it. That means that, given infinite resources, you can have infinitely many parallel workers.

I think that while the specification allows that kind of limiting per its OOM recommendations, generally the specification doesn't want feature-specific limitations.

We might want to figure out how to phrase that now that implementations are going in that direction.

I'll leave making the worker lifetime more sensible to #1004.

@annevk (Member) commented Apr 25, 2017

This will end up being fixed by #2521, as each worker is an agent and agents are required to make forward progress per JavaScript.

What might be interesting at some point in the future is to expose the [[CanBlock]] bit of agents to APIs, which I guess is kinda what OP was after, but I think we should consider in a new issue whether we want something like that.
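Today [[CanBlock]] is observable only indirectly: Atomics.wait throws a TypeError in agents that cannot block, such as a window's main thread. A sketch of that probe, assuming SharedArrayBuffer is available:

// Probe [[CanBlock]]: the expected value 1 never matches ia[0] (which is 0),
// so if blocking is permitted the call returns "not-equal" without waiting.
function canBlock() {
  const ia = new Int32Array(new SharedArrayBuffer(4));
  try {
    Atomics.wait(ia, 0, 1, 0);
    return true;
  } catch (e) {
    return false; // TypeError: this agent is not allowed to block
  }
}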

annevk added a commit that referenced this issue Apr 25, 2017
Define the infrastructure for SharedArrayBuffer. This also clarifies
along which boundaries a browser implementation can use processes and
threads.

Tests: web-platform-tests/wpt#5569.

Follow-up to define similar-origin window agents upon a less shaky
foundation is #2528. Because of that, similar-origin window agents
are the best place to store state that would formerly go on unit of
related similar-origin browsing contexts.

tc39/ecma262#882 is follow-up to define
agents in more detail; in particular make their implicit realms slot
explicit. w3c/css-houdini-drafts#224 is
follow-up to define worklet ownership better which is needed to
define how they relate to agent (sub)clusters.

Fixes part of #2260. Fixes #851. Fixes
w3c/ServiceWorker#1115. Fixes most of
w3c/css-houdini-drafts#380 (no tests and no
nice grouping of multiple realms in a single agent as that is not
needed).
@annevk closed this as completed in 4db8654 Apr 26, 2017