Consider making agents and event loops 1:1 #5352

domenic · 2020-03-12T20:35:30Z

The specification currently says

Each agent has an associated event loop.

User agents may share an event loop across similar-origin window agents.

There must be one worker event loop per [dedicated/shared/service worker] agent.

and links to #4213 on worklet event loops.

Previously, I've pushed (against @annevk) to preserve this freedom. It turns out this was because I was mistaken in thinking that event loops and threads were 1:1 in implementations, so I'd like to revisit the issue.

In Chromium, we currently have one event loop per agent, although I believe there is a bug with ongoing work happening where the microtask queue is shared too widely. I'm pretty sure this is how we do worklets as well; even if they are in the same thread as the main page, they get a separate event loop. It would be good to have @bfgeek or @hoch confirm.

Would other implementations be up for saying that event loops and agents must be 1:1? I know @annevk is in favor, but I'm unsure if he's looked in to Gecko implementation architecture. @rniwa, any thoughts from WebKit?

The main effect of this would be guaranteeing that different-site iframes get separate event loops. It would also impact various isolation proposals (COOP+COEP, origin isolation, disallowdocumentaccess) that create more-tightly-scoped agents or agent clusters.

On the other hand, I think that having separate event loops vs. consolidate ones is only observable probabilistically. That is, you could write code which posts tasks from a bunch of different agents, and if they run in the order you posted them, then you are pretty sure that those agents share an event loop. So it's not clear what kind of normative, testable effect this spec change would really have.

bzbarsky · 2020-03-13T18:01:58Z

In Gecko we basically have a single event loop per thread at the moment, but are trying to move away from that.

And I agree that the only way to observe that event loops are separate is to observe ordering, and it's pretty hard to do because if you have a bunch of different agents, how do they even tell what order they posted tasks in? It's possible if they posted them very far apart in time (so e.g. one posts, sends a message to the other via BroadcastChannel or something, the other posts on receipt), but chances are in that case the tasks would run in posted order regardless of separate vs same event loop...

In particular, by observing reordering you can conclude with certainty that separate event loops are involved, but you can never conclude with certainty that a single one is shared, as opposed to multiple ones getting scheduled in order.

cc @smaug----

Yay295 · 2020-03-14T23:34:08Z

Would this affect requestAnimationFrame? requestAnimationFrame is supposed to fire once per "cycle", but if you had two iframe's on the same loop, would only one of them fire one cycle, and then the other on the next cycle?

annevk · 2020-03-15T07:07:17Z

That would depend on whether they share an agent.

gterzian · 2020-03-19T09:04:37Z

On the other hand, I think that having separate event loops vs. consolidate ones is only observable probabilistically.

In Chromium, we currently have one event loop per agent, although I believe there is a bug with ongoing work happening where the microtask queue is shared too widely.

I think the Chromium bug with microtasks mentioned is a good example of a noticeable problem(see https://docs.google.com/document/d/1GSHp-KoP4M1oq-vQaw8W8g93lm_k1zsykP_GAzRrbYo/edit): if a cross-site iframe doesn't have it's own event-loop(i.e. it's own microtask-queue), then if you cooperatively suspend a long-running task in the iframe, and start running a task for the main page(on the same thread and sort of the same event-loop), then if the suspended task in the iframe had queued any microtasks, they might be executed at the checkpoint following the task for the main page, then if you later reschedule the task of the iframe it will have had it's microtasks executed "right in the middle" of its current task.

So if tasks are run sequentially(without cooperative scheduling), and a microtask checkpoint always follows after each, then it's indeed hard to notice the difference between a similar-origin window agent having it's "own" event-loop, or sharing an event-loop, or having a thread run multiple event-loops(for dedicated worker agents there was an interesting discussion back at #851).

Isn't this in some way more about agent-clusters? I assume the Chromium bug is a non-issue when site-isolation is turned on?

The main effect of this would be guaranteeing that different-site iframes get separate event loops.

Is that guarantee in itself important? If a different-site iframe has it's own event-loop, but is not isolated in a separate process, it would seem that only a limited type of isolation would be guaranteed(not covering spectre attacks, crashes)?

I think the spec already separates a different-site iframe into a different agent-cluster from the page in which it would be embedded, so I think the note in the spec that "User agents may share an event loop across similar-origin window agents." might apply only to UAs that would not use any kind of process isolation(maybe a headless test runner)?

It turns out this was because I was mistaken in thinking that event loops and threads were 1:1 in implementations, so I'd like to revisit the issue.

I'm somewhat confused by the distinction between "potentially sharing event-loops between similar-origin window agents" and running multiple event-loops on the same thread.

As noted above, in practice it's hard to notice the difference between the two, as long as the sequential guarantees of the event-loop processing model are preserved.

it's almost like an event-loop implies a native thread, or at least a sequential "thread of execution".

I can imagine running multiple event-loops on the same thread in purely sequential fashion and remain spec compliant, however there would seem to be only limited benefit, if any, in doing that(it would simply be a kind of "various origin window agent"). And the Chromium experience I think shows it's really hard to follow a more aggressive scheduling strategy without breaking the processing model...

Against perhaps the more relevant distinction is whether agent clusters are physically isolated in processes(which would almost certainly mean event-loops are not shared between similar-origin window agents)?

annevk · 2020-03-19T11:23:19Z

It's not about agent clusters. Agent clusters don't have an event loop.

If you have two top-level browsing contexts that each load a document of the same origin, you have two agents (and yes, two agent clusters) that could share an event loop implementation-wise. Even if you don't process-isolate those it might still make sense to have separate event loops, so you can run the foreground browsing context more often than the background browsing context.

gterzian · 2020-03-19T14:13:04Z

It's not about agent clusters. Agent clusters don't have an event loop.

Yes ok, what I'm saying is that agent-clusters, and the BC group they belong to, provide a hierarchical structure that will tend to matter for the sharing of event-loops in practice.

I don't think the spec needs to mandate one event-loop per agent to allow for running two different same-origin tabs on two different event-loops as in your example, if anything a UA would have to go through actual troubles to follow the spec and still share an event-loop in that case(since the two BCs are part of different agent-cluster, and those are nested to different BC groups).

Unless that UA runs all window agents on one event-loop, for which there can be a usecase, for example headless browsers with no security requirements meant to run tests on a server.

If the two tabs in your example would have an auxiliary relationship, I think they would be part of the same BC, agent-cluster, and actually run on the same agent, in which case they would have to run on the same event-loop.

The problem in Chromium as I understand it, is that there wasn't a clear "event-loop" abstraction, instead things would run in ways that was either too coarse- or too fine-grained to match the spec, and the solution seems to have been implementing a new abstraction, the "Per-(eTLD+1) EventLoop", see https://docs.google.com/document/d/1bM_1SjE25NMQrD4IJFAPc1tI4i8tvSET2f0IgjukRho/edit#heading=h.93483qe1wgpw

Also, since I don't see the difference between "running multiple event-loop on one thread(in a spec-compliant way)" and "sharing an event-loop between mutiple window agents", I would rather have the spec mentions the possibility of sharing event-loops specifically, rather than mandate one loop per agent and then assume UA's can still multiplex them on one thread when necessary(the Chromium experience shows this easily leads to problems in practice).

annevk · 2020-03-19T14:22:56Z

I'm having a hard time following what you're trying to say. The example I gave sounds exactly like the thing Chromium is doing and is definitely what Firefox is doing as bz mentioned earlier.

(I think the specification should not aim to describe that as it's not directly observable and it's much simpler to have it be 1:1 with an agent.)

gterzian · 2020-03-19T14:33:53Z

What that I'm trying to say is that in your example above there would in practice be two event-loops in the most simple spec-compliant implementation already(without requiring the spec to mandate one loop per agent), because the two BCs are nested to different agent-clusters, and to different BC groups. To have them share an event-loop the UA would have to actually go through some extra trouble as a kind of optimization(to reduce the number of event-loops that are running I guess), or the UA would for some reason run everything on one event-loop to begin with(which could be useful for certain limited types of UAs).

Also if using one event-loop for two tabs would be a useful optimization, a UA could choose to do that and for example throttle tasks for tab currently in the background. I would rather have that reflected in the spec as a possibility, than have the spec mandate one loop per agent and then have UAs in practice multiplex them in some unspecified way as an optimization.

gterzian · 2020-03-25T08:52:05Z

Ok sorry, I think I blew this up into more than it needs to be! Having thought about it more, since running multiple window agents on the same event-loop, or running multiple event-loops on the same thread, appears equivalent, then we might as well get rid of what seems to be somewhat unclear language in the spec(the note currently allowing the sharing of an event-loop between window agents in some unspecified cases).

This closes #5352, by making event loops and agents 1:1. It also adds an explicit explanation that event loops/agents and implementation threads are not necessarily 1:1, apart from the restrictions imposed by JavaScript's forward progress guarantee. This also closes #4213, as worklet event loops work fine in this architecture. This also closes #4674, as browsing contexts changing event loops is not a problem in this architecture, since they change agents at the same time. Finally, this guards one step in the event loop processing model that only made sense for window event loops.

This closes #5352, by making event loops and agents 1:1. It also adds an explicit explanation that event loops/agents and implementation threads are not necessarily 1:1, apart from the restrictions imposed by JavaScript's forward progress guarantee. This also closes #4213, as worklet event loops work fine in this architecture. This also closes #4674, as browsing contexts changing event loops is not a problem in this architecture, since they change agents at the same time. Other problems remain around communication between agents (see e.g. #3691), and some of that communication is event-loop mediated, but the specific note discussed there is no longer relevant.

domenic added topic: event loop topic: agent The interaction with JavaScript's agent and agent cluster concepts. labels Mar 12, 2020

domenic mentioned this issue Mar 18, 2020

Guarantee event loop isolation? WICG/origin-agent-cluster#16

Closed

domenic mentioned this issue Mar 25, 2020

Clean up the event loop and agent correspondence #5396

Merged

domenic closed this as completed in #5396 Mar 26, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Consider making agents and event loops 1:1 #5352

Consider making agents and event loops 1:1 #5352

domenic commented Mar 12, 2020

bzbarsky commented Mar 13, 2020

Yay295 commented Mar 14, 2020

annevk commented Mar 15, 2020

gterzian commented Mar 19, 2020 •

edited

Loading

annevk commented Mar 19, 2020

gterzian commented Mar 19, 2020 •

edited

Loading

annevk commented Mar 19, 2020

gterzian commented Mar 19, 2020 •

edited

Loading

gterzian commented Mar 25, 2020

Consider making agents and event loops 1:1 #5352

Consider making agents and event loops 1:1 #5352

Comments

domenic commented Mar 12, 2020

bzbarsky commented Mar 13, 2020

Yay295 commented Mar 14, 2020

annevk commented Mar 15, 2020

gterzian commented Mar 19, 2020 • edited Loading

annevk commented Mar 19, 2020

gterzian commented Mar 19, 2020 • edited Loading

annevk commented Mar 19, 2020

gterzian commented Mar 19, 2020 • edited Loading

gterzian commented Mar 25, 2020

gterzian commented Mar 19, 2020 •

edited

Loading

gterzian commented Mar 19, 2020 •

edited

Loading

gterzian commented Mar 19, 2020 •

edited

Loading