Element displays error that it failed to restore the session when the network is not yet connected when it launches #26967

Closed
bessw opened this issue Feb 3, 2024 · 8 comments · Fixed by matrix-org/matrix-react-sdk#12280
Assignees
Labels
O-Frequent Affects or can be seen by most users regularly or impacts most users' first experience S-Critical Prevents work, causes data loss and/or has no workaround T-Defect X-Regression

Comments

@bessw

bessw commented Feb 3, 2024

Steps to reproduce

  1. Disable your internet connection (in my case, Element launched after boot before the network was connected).
  2. Element opens with an error saying it could not restore the session: "Sitzungswiederherstellung fehlgeschlagen" ("Session restore failed").
  3. After a short time, Element loads my session behind the error screen.
  4. Enable your internet connection.
  5. Click into the empty space around the error message and it disappears.
  6. Everything works again as expected.
  7. Some reboots later it happens again.

Outcome

What did you expect?

A message telling me that my internet is not connected.

What happened instead?

An error screen appeared that suggests I clear the storage and log in again.

Operating system

Windows

Application version

Element version: 1.11.57
Crypto version: Olm 3.2.15

How did you install the app?

A long time ago, from the official website

Homeserver

local Synapse v1.99.0

Will you send logs?

Yes

@bessw bessw added the T-Defect label Feb 3, 2024
@bessw bessw changed the title from "Element displays error message that it couldn't restore the session, but it loads behind the error screen." to "Element displays error that it failed to restore the session when the network is not yet connected when it launches" Feb 3, 2024
@dbkr dbkr added S-Minor Impairs non-critical functionality or suitable workarounds exist O-Occasional Affects or can be seen by some users regularly or most users rarely labels Feb 6, 2024
@BillCarsonFr
Member

BillCarsonFr commented Feb 8, 2024

Steps to reproduce:

Put a breakpoint in client.getVersions(), wait for the process to reach that point, then switch to offline mode and wait.

It looks like the error bubbles up, is finally caught in LifeCycle#loadSession, and ends up being handled by handleLoadSessionFailure.

@BillCarsonFr
Member

BillCarsonFr commented Feb 12, 2024

Looks like there might be an issue with getVersions():
** Edited **

There are several places where getVersions() is called and could reject:

- In Lifecycle > restoreFromLocalStorage > await checkServerVersions()
- In ServerSupportUnstableFeatureController: await this.client.isVersionSupported(this.stableVersion)

There are other places where it is checked, but with some basic retries, like in DeviceListener > cli.isVersionSupported("v1.1").

Maybe there should be some global retry on getVersions(), and it should also throw something specific that handleLoadSessionFailure can check for in order to show something else?
Given the nature of getVersions(), we might want to retry forever on certain errors, though.
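A minimal sketch of the retry idea suggested above, not the actual matrix-js-sdk or matrix-react-sdk code. Every name here (`ServerUnreachableError`, `backoffDelayMs`, `withRetries`, `describeLoadFailure`, and the `fetchVersions` callback) is hypothetical and exists only for illustration. It retries a versions request with capped exponential backoff and, on giving up, throws a dedicated error type so that a failure handler could show a connectivity notice instead of the "clear storage" dialog.

```typescript
// Hypothetical error type: signals "server unreachable", not "session corrupt".
class ServerUnreachableError extends Error {
    constructor(message: string, public readonly attempts: number) {
        super(message);
        this.name = "ServerUnreachableError";
    }
}

// Capped exponential backoff: base, 2*base, 4*base, ... up to maxDelayMs.
function backoffDelayMs(attempt: number, baseDelayMs: number, maxDelayMs = 30000): number {
    return Math.min(maxDelayMs, baseDelayMs * Math.pow(2, attempt - 1));
}

// Retry an async operation. `fetchVersions` stands in for whatever performs
// the versions request; it is an assumption, not a real SDK signature.
async function withRetries<T>(
    fetchVersions: () => Promise<T>,
    maxAttempts = 5,
    baseDelayMs = 1000,
    sleep: (ms: number) => Promise<void> = (ms) => new Promise<void>((r) => setTimeout(r, ms)),
): Promise<T> {
    let lastError: unknown;
    for (let attempt = 1; attempt <= maxAttempts; attempt++) {
        try {
            return await fetchVersions();
        } catch (e) {
            lastError = e;
            // Wait before the next attempt, but not after the final one.
            if (attempt < maxAttempts) await sleep(backoffDelayMs(attempt, baseDelayMs));
        }
    }
    throw new ServerUnreachableError(
        `gave up after ${maxAttempts} attempts: ${String(lastError)}`,
        maxAttempts,
    );
}

// Hypothetical failure handler: decide which dialog to show for an error.
function describeLoadFailure(e: unknown): "offline-notice" | "session-restore-error" {
    const unreachable = e instanceof Error && e.name === "ServerUnreachableError";
    return unreachable ? "offline-notice" : "session-restore-error";
}
```

Retrying forever, as suggested for certain errors, would amount to dropping the `maxAttempts` bound for those error types; the sketch keeps a bound so that it terminates.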

@richvdh
Member

richvdh commented Feb 16, 2024

I saw this this morning; my server was briefly returning 500s in response to /_matrix/client/versions requests.

I feel like this has been mis-triaged. Clearly it's happening semi-frequently, and it's really, really bad because it encourages the user to throw away all their state.

@richvdh richvdh added S-Critical Prevents work, causes data loss and/or has no workaround O-Frequent Affects or can be seen by most users regularly or impacts most users' first experience and removed S-Minor Impairs non-critical functionality or suitable workarounds exist O-Occasional Affects or can be seen by some users regularly or most users rarely labels Feb 16, 2024
@opusforlife2

> and it's really, really bad because it encourages the user to throw away all their state.

This would not have been much of a problem if dehydrated devices had been a stable core feature of Element from the start. But we don't live in that parallel universe.

@Nils-Magnus

Nils-Magnus commented Feb 21, 2024

Same issue as the original poster: Element was started during boot when the Internet was not yet available because of a VPN that starts with a delay.
Element version: 1.11.58
Crypto version: Rust SDK 0.7.0 (691ec63), Vodozemac 0.5.0
Localization: German
[Screenshot of the error dialog]
Apparently there are only those two options available (clear stored data XOR send logs). I was expecting to find a "Cancel" button or the option to close the modal dialog with an "X".

Clicking on the surrounding main window helped, but that was very unintuitive.

The explanation of the error condition should be improved. It currently reads "We encountered a problem restoring your session." If this is due to a lack of connectivity, it should say so ("can't restore session from server XYZ ...").

The mitigation plan currently proposes "close window and return to more recent version". Both pieces of advice are problematic:

  1. The UX suggests that there are exactly two options (see above), and closing this window is not one of them (even though it is actually possible, by clicking into the main window).
  2. "return to more recent version" is an underspecified action for me: How do I "return" to this more recent version? Should I upgrade? But why then "return"? What is actually "more recent"? How to "return" to that?

@richvdh
Member

richvdh commented Feb 22, 2024

@t3chguy suggests that maybe this was introduced by us awaiting a thing that was previously done in the background.

@langleyd langleyd assigned MidhunSureshR and dbkr and unassigned MidhunSureshR Feb 22, 2024
@dbkr
Member

dbkr commented Feb 22, 2024

Regressed in matrix-org/matrix-react-sdk#12154 which adds an await with no explanation.
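To illustrate why a single added await can cause this kind of regression, here is a minimal sketch. The function names and the `check` callback are hypothetical and are not the code from the referenced pull request. When a network check is awaited inside the restore path, its rejection becomes a rejection of the whole restore; when the same check runs in the background with its own error handler, the restore still succeeds.

```typescript
type Check = () => Promise<void>;

// Blocking variant: a rejected check rejects the whole restore, so the caller
// sees a "failed to restore session" error even though local state is fine.
async function restoreSessionBlocking(check: Check): Promise<string> {
    await check();
    return "session restored";
}

// Background variant: the check runs detached, and a failure is only logged.
async function restoreSessionBackground(
    check: Check,
    log: (msg: string) => void,
): Promise<string> {
    check().catch((e) => log(`version check failed: ${String(e)}`));
    return "session restored";
}
```

With a `check` that rejects while offline, the first function throws and the second returns normally after logging the failure.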

dbkr added a commit to matrix-org/matrix-react-sdk that referenced this issue Feb 22, 2024
Move the server versions check to each time we reconnect to the server
rather than the first time, although, as per comment it will still only
trigger the first time, but it will avoid us awaiting and mean we know
we're connected to the server when we try, and get automatic retries.

Fixes element-hq/element-web#26967
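The approach described in this commit message can be sketched roughly as follows. This is an assumption-laden illustration, not the merged change: the class name, the two-state connection model, and the `isServerSupported` and `warn` callbacks are all hypothetical. The check runs whenever the connection comes up, nothing in the startup path awaits it, exceptions are caught, and it stops running once it has completed.

```typescript
type ConnectionState = "connected" | "disconnected";

class VersionCheckOnReconnect {
    // Set once a check has completed, so later reconnects skip it.
    private done = false;

    constructor(
        private readonly isServerSupported: () => Promise<boolean>,
        private readonly warn: () => void,
    ) {}

    // Intended to be called from a connection-state listener, not from startup.
    async onConnectionState(state: ConnectionState): Promise<void> {
        if (state !== "connected" || this.done) return;
        try {
            const supported = await this.isServerSupported();
            this.done = true;
            if (!supported) this.warn();
        } catch (e) {
            // Swallow the error: the next reconnect acts as an automatic retry.
        }
    }
}
```

Because the check is only triggered once the client believes it is connected, a machine that boots without network never runs it early, and a transient failure costs nothing more than a retry on the next reconnect.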
@realtyem

realtyem commented Feb 23, 2024

Is it possible that another sync handler is started up during this error? If so, then I think this is indirectly the cause of the recent presence load issue. When I saw this error, I was able to click the red button and then 'cancel' on the next window, which meant I did not have to sign out and back in, but it did lead to the behavior that led me to connect the dots. A full client restart brought it back to normal.

github-merge-queue bot pushed a commit to matrix-org/matrix-react-sdk that referenced this issue Feb 26, 2024
* Fix spurious session corruption error

Move the server versions check to each time we reconnect to the server
rather than the first time, although, as per comment it will still only
trigger the first time, but it will avoid us awaiting and mean we know
we're connected to the server when we try, and get automatic retries.

Fixes element-hq/element-web#26967

* Move test & add regression test

* Write some more tests

* More comments & catch exceptions in server versions check

* Note caching behaviour

* Typo

Co-authored-by: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>

* Remove the bit of the comment that might be wrong

---------

Co-authored-by: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>
RiotRobot pushed a commit to matrix-org/matrix-react-sdk that referenced this issue Feb 26, 2024
* Fix spurious session corruption error
(cherry picked from commit 1403cd8)
9 participants