Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Realm SyncManager takes up to 4-5 minutes to reconnect to ROS after waking device. #7003

Closed
helgijons opened this issue Jul 24, 2020 · 10 comments
Assignees

Comments

@helgijons
Copy link

helgijons commented Jul 24, 2020

Goal

Keep realm fully synchronized at all times.

Actual Results

SyncManager does not automatically reconnect/refresh access tokens after raising device from sleep until 4 minutes have elapsed.

SyncManager refresh connections has no effect, does not immediately reconnect to ros, is affected by same delay.
SyncManager.stop() & SyncManager.start() has no effect. Even when called in onStart & onStop respectively.

Logging sync user out and back in resolves the issue immediately.

Steps & Code to Reproduce

Before you start:

  1. Close realm in onDestroy

  2. Disconnect device from power charging.

  3. Login user

  4. Open realm asynchronously.

  5. Put device to sleep via power button with app in foreground.

  6. Wait 4-5 minutes.

  7. Wake device from sleep.

  8. Realm will not reconnect to ros until 4-5 minutes later. Which means that any actions user makes within that time window are not synchronised to other devices via ros.

Version of Realm and tooling

Realm version(s): ?
6.0.2

Realm Sync feature enabled: Yes

Android Studio version: 4.0

Android Build Tools version: 28.0.3

Gradle version: ?

Which Android version and device(s): api level 19+ and not manufacturer specific.

@helgijons
Copy link
Author

WifiLock and WakeLock does not fix the problem.

@helgijons
Copy link
Author

helgijons commented Jul 24, 2020

I think the issue lies in the SyncSession object in the getAccessToken() method and/or otherwise how the realm-java checks for network connectivity.

When looking at the following line:
if (!onGoingAccessTokenQuery.get() && NetworkStateReceiver.isOnline(SyncObjectServerFacade.getApplicationContext())) { authenticateRealm(authServer); }

and looking at the NetworkStateReceiver.isOnline() method:

/**
     * Attempt to detect if a device is online and can transmit or receive data.
     * This method is thread safe.
     * <p>
     * An emulator is always considered online, as `getActiveNetworkInfo()` does not report the correct value.
     *
     * @param context an Android context.
     * @return {@code true} if device is online, otherwise {@code false}.
     */
    public static boolean isOnline(Context context) {
        if (SyncManager.Debug.skipOnlineChecking) {
            return true;
        }
        ConnectivityManager cm = (ConnectivityManager) context.getSystemService(Context.CONNECTIVITY_SERVICE);
        NetworkInfo networkInfo = cm.getActiveNetworkInfo();
        return ((networkInfo != null && networkInfo.isConnectedOrConnecting()) || Util.isEmulator());
    }

The above method does not adequately check if the application has access to the "available" network. When battery optimisation and doze mode kicks in, the applications access to the network is restricted and limited. While the application may still have access to the cpu and being able to run code I believe the above code allows realm java to make illegitimate network requests that will always fail and thus causing an exponential backoff policy to kick in.

In other words I believe the isOnline method is only checking if the device is online but does not check if the application has access to the available network.

I believe that realm-java should also, in the isOnline method, check via the PowerManager if the application is in PowerSaveMode like so:

final PowerManager pm = (PowerManager) context.getSystemService(Context.POWER_SERVICE);
        if (pm.isPowerSaveMode()) {
            // do something
        } else {
            // do some else
        }

and so I recommend to update the above method like so:

        if (SyncManager.Debug.skipOnlineChecking) {
            return true;
        }
        ConnectivityManager cm = (ConnectivityManager) context.getSystemService(Context.CONNECTIVITY_SERVICE);
        NetworkInfo networkInfo = cm.getActiveNetworkInfo();
        final PowerManager pm = (PowerManager) context.getSystemService(Context.POWER_SERVICE);

        return ((networkInfo != null && networkInfo.isConnectedOrConnecting() && pm != null && !pm.isPowerSaveMode()) || Util.isEmulator());
    }

Unfortunately isPowerSaveMode is only accessible from api level 21+ and so that would need to be taken into effect.

Having tricked devices into powersaving mode and disabling charging via adb while connected to android studio, I did observe via logcat that realm java makes network requests that will always fail resolving the address of ros which is very indicative of network access having been restricted to the application.

After having read the realm java code I do believe the mindset has been to make sure that the library only makes network requests that are legitimate and that the code after the above check is not making any further considerations as to the network connectivity there after. Therefore I do believe there to be many cascading issues that arise as a side effect because of the current implementation of isOnline()

@cmelchior
Copy link
Contributor

Hi @helgijons
Sorry for the late response. Thank you very much for the detailed issue and explanation.

First: The isOnline() method is just supposed to do a reasonable effort to detect if we think the device is online, but there are a lot of things that can go wrong, e.g. just having a very slow connection. In general, the method tries to be optimistic instead of pessimistic about it.

That said, after going over the code. I think your explanation is mostly correct. The problem does happen inside getAccessToken(). But the issue is that we have two kinds of exponential backoff. One in the underlying C++ code, and one in Java impemplemented in https://github.com/realm/realm-java/blob/master/realm/realm-library/src/objectServer/java/io/realm/internal/network/ExponentialBackoffTask.java#L70

The problem is that calling SyncManager.refreshConnections() only resets the timers in the native code, not the one in Java, so if it actually got stuck there, it will wait the 5 minutes you observe.

This is very much a bug on our end we need to fix.

Optimizing isOnline will obviously help not hitting the issue as often, but the real fix needs to happen in SyncManager.refreshConnections().

@cmelchior cmelchior self-assigned this Aug 5, 2020
@cmelchior cmelchior added the T-Bug label Aug 5, 2020
@helgijons
Copy link
Author

Hi thanks for getting back.

I can confirm that SyncManager.refreshConnections() does not have any effect even when triggered multiple times. Good to know that you understand the issue. I think this is causing many bad situations where data shared between devices becomes very inconsistent if data is changed in disconnected clients.

Do you envision a fix to 5.x and 6.x or only 10.x?

What timeline are we looking at? I need to know because we are micro managing users, instructing them to make sure devices do not go to sleep during peek hours.

@cmelchior
Copy link
Contributor

This code path doesn't exist in 10.x, but that isn't out of BETA either. Generally, we only patch the latest release, so in this case, it would be in a 7.0.2 release.

I do realize that this crosses a major version boundary, so might be problematic. I'll discuss internally and see if it would make sense to do a patch for 6.1.0 as well.

@menting311
Copy link

@cmelchior - Have you determined if this fix can also be applied to the latest 6.x release?

@cmelchior
Copy link
Contributor

This has been fixed in 7.0.2. We will also release 6.1.1 shortly with the same fix.

@helgijons
Copy link
Author

Thank you for fixing this issue. Just saw this version is not referenced on https://realm.io/docs/java/latest/. Is it safe to use in production?

@cmelchior
Copy link
Contributor

Yes, it should be fine 👍

@helgijons
Copy link
Author

Roger, we just pushed out an update which only upgrades the realm version. Will report back in a week.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Mar 15, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

3 participants