Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

/sync API does not tell clients when the server's view of state changes outside the timeline #1209

Open
Tracked by #245
ara4n opened this issue Dec 13, 2017 · 17 comments
Labels
A-Client-Server Issues affecting the CS API feature Suggestion for a significant extension which needs considerable consideration

Comments

@ara4n
Copy link
Member

ara4n commented Dec 13, 2017

If the state graph forks and then merges again, some state events may be reset if they are in conflict. For instance:

alice sets topic1
 |
 |---------------------------.
 |                           |
 v                           v
alice sets topic2       bob kicks alice
 |                           |
 |     .---------------------'
 v     v
bob says something

The fact the bob kicked alice on one branch of the DAG may arrive late, causing alice's topic2 to be considered as reverted for the room's state going forwards. We currently do not tell clients via /sync that this has happened however, so the client's view of the room state will drift until they entirely resync (e.g. by flushing your cache in Riot).

Instead, we need a way of re-issuing the 'topic1' event into the state block in the /sync response to tell clients about the reset. This is potentially problematic as state changes like this may happen at any point in the timeline, whereas we only snapshot state at the beginning of a sync response. (However, if we include inlined state changes in the timeline at the 'right' point, perhaps this is enough?)

@richvdh richvdh changed the title /sync API does not tell clients when state changes due to state resets, causing their view to diverge from the HS's /sync API does not tell clients when state changes due to state reolution causing their view to diverge from the HS's Dec 13, 2017
@richvdh richvdh changed the title /sync API does not tell clients when state changes due to state reolution causing their view to diverge from the HS's /sync API does not tell clients when state changes due to state resolution causing their view to diverge from the HS's Dec 13, 2017
@richvdh
Copy link
Member

richvdh commented Dec 13, 2017

You're not using the term "reset" in the same way as we have been doing elsewhere.

We've been using "state reset" to refer to matrix-org/synapse#1953, which is normally where an attempt to resolve a recent room state against a quite old one causes a whole cascade of changes to the current room state with the net effect that much (but not all!) of the room state ends up back where it was a few months ago.

So a state reset is one reason that clients can get out of sync, but in general any state conflict resolution (such as the one in your diagram above) can result in a change in room state which we don't currently tell clients about.

@richvdh
Copy link
Member

richvdh commented Jan 29, 2019

this is a spec issue really; moving it over.

@richvdh richvdh transferred this issue from matrix-org/synapse Jan 29, 2019
@richvdh
Copy link
Member

richvdh commented Jan 29, 2019

oh bother. It's really not.

@richvdh richvdh transferred this issue from matrix-org/matrix-spec-proposals Jan 29, 2019
@ara4n
Copy link
Member Author

ara4n commented Jan 11, 2020

why not? i don't think the spec says how to handle this?

@ara4n ara4n changed the title /sync API does not tell clients when state changes due to state resolution causing their view to diverge from the HS's /sync API does not tell clients when state changes due to state resolution cause their view to diverge from the HS's Jan 11, 2020
@kegsay
Copy link
Member

kegsay commented Mar 30, 2020

The API for /sync provides a mechanism for resolving this, as it segregates state and timeline. Things in timeline appear in the timeline and things in state are room state deltas.

AFAIK it should be possible for a server to say "whoops, this state is wrong, and this state is right" and put it under the state key of the room and have it work transparently to the client. Assuming this is the approach we are taking, this is a synapse bug and not a spec bug, though definitely should be clarified somewhere (do we even have any places where we explain this issue anywhere in the spec?).

https://matrix.org/docs/spec/client_server/r0.6.0#get-matrix-client-r0-sync

@kegsay
Copy link
Member

kegsay commented Mar 30, 2020

This is overall pretty catastrophic from a UX perspective when this happens, and makes me feel Matrix is super flakey.

I encountered this from a state reset in a v1 room, which meant I was kicked but it was still present as joined in riot-web, and I couldn't leave it because I wasn't in the room :/ but this issue is applicable for any kind of fork in room state and so we will likely keep hitting edge cases manifesting as crap UX until we fix this.

@joepie91
Copy link

@kegsay Related, in that case: matrix-org/synapse#1597

@richvdh
Copy link
Member

richvdh commented Mar 30, 2020

AFAIK it should be possible for a server to say "whoops, this state is wrong, and this state is right" and put it under the state key of the room and have it work transparently to the client.

Except that entries in state are supposed to be real events, with real event ids. If you're trying to tell a client that you're backing out an earlier bit of state (in the example in the description, we need to back out the topic change), you'd have to do so by making up events. That might not be a crazy thing to do, but it's certainly not an obvious thing to do.

(do we even have any places where we explain this issue anywhere in the spec?)

not that I'm aware of, no.

@kegsay
Copy link
Member

kegsay commented Mar 30, 2020

@joepie91 thanks for this. It really depends on how we wish to communicate state conflicts to clients. We probably don't want to flood the timeline with state changes because the state res algo decided to kick in and pretend a bunch of previously sent state is no longer valid. This is why I'm suggesting it should reside in state.

matrix-org/synapse#1597 is mainly complaining that the spec says this should be impossible. It might be worth killilng two birds with one stone and saying "hey, under these circumstances, you may get them to appear twice (byte-for-byte identical)" so clients have some way of realising that something weird is going on.

@kegsay
Copy link
Member

kegsay commented Mar 30, 2020

you'd have to do so by making up events.

I would propose we just used the previous state which we already track under unsigned.prev_content. In the event there is no previous state, a redacted content {} for that state. That is to say, if there is a previous event use it, else yes, we'd need to make one up with a redacted content.

@richvdh
Copy link
Member

richvdh commented Mar 30, 2020

the proposal sounds reasonable, but it's obscure enough that I think if we're going to do it, it needs to end up in the spec.

@richvdh
Copy link
Member

richvdh commented Mar 30, 2020

incidentally, there is another glitch here. The timeline can consist of up to ten events; the state reset can occur anywhere in that series of events; by sticking it in state we imply that the state changes at the start of the timeline, so the timeline events may not be shown correctly.

It may be close enough, or we may need to work around it by (say) forcing a gappy sync.

@Mikaela
Copy link

Mikaela commented Feb 17, 2022

Is this the current iteration of the Hotel California issue force joining users who have left back to Matrix HQ and Element Android rooms amongst others?

@DMRobertson
Copy link
Contributor

AFAICS this is a spec (protocol-level) defect rather than Synapse per-se.

FWIW I've heard indirectly that sliding sync will fix this.

@richvdh
Copy link
Member

richvdh commented Aug 11, 2022

AFAICS this is a spec (protocol-level) defect rather than Synapse per-se.

it is, and I really don't know why I moved it back to this repo. Let's move it again.

@richvdh richvdh transferred this issue from matrix-org/synapse Aug 11, 2022
@richvdh richvdh added feature Suggestion for a significant extension which needs considerable consideration A-Client-Server Issues affecting the CS API labels Aug 11, 2022
@richvdh
Copy link
Member

richvdh commented Aug 11, 2022

It's worth noting that faster joins introduce a new way that clients can end up with "incorrect" state: it's possible for us to accept events that we shouldn't (or vice versa) during the "background resync" phase, and then we have no way to tell clients when we change our minds.

@richvdh richvdh changed the title /sync API does not tell clients when state changes due to state resolution cause their view to diverge from the HS's /sync API does not tell clients when the server's view of state changes outside the timeline Aug 11, 2022
@kegsay
Copy link
Member

kegsay commented Sep 20, 2022

It's worth noting that Sliding Sync provides a mechanism for telling clients about the updated state without inserting events into the timeline. It does this by:

  • Having a state block that is independent of the timeline (that is, it is the current state, not the state at the beginning of the timeline and relying on clients rolling forward to work out the current state).
  • Providing a way to invalidate a room, or even a connection entirely and force the client to resync from the server.

The intended flow here would be:

  • Server detects state has reset. E.g. some state has been deleted.
  • Server issues the room again in the rooms response, with initial: true. This needs to contain the entire room state all over again. Alternatively, the server can issue a delta in the rooms response with just that state in the required_state block. E.g
{
  rooms: {
    "!reset:room": {
      required_state: [
        { type: "m.room.member", state_key: "@alice:example.com", ... }
      ]
    }
  }
}

or:

{
  rooms: {
    "!reset:room": {
      initial: true,
      name: "Reset Room",
      required_state: [
        { type: "m.room.member", state_key: "@alice:example.com", ... },
        { type: "m.room.member", state_key: "@bob:example.com", ... },
        { type: "m.room.name", state_key: "", ... },
        { type: "m.room.create", state_key: "", ... },
        { type: "m.room.power_levels", state_key: "", ... },
        ...
      ],
      timeline: [ ... ]
    }
  }
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-Client-Server Issues affecting the CS API feature Suggestion for a significant extension which needs considerable consideration
Projects
None yet
Development

No branches or pull requests

6 participants