Setup RTDB connection before auth token is fetched #2349

Open
AaronO wants to merge 1 commit into master

Conversation


@AaronO commented Nov 9, 2019


A lot of (web) apps follow this pattern (a minimal sketch follows the list):
1. App starts / initializes Firebase SDK (app instance)
2. Authenticates the user (e.g. from a session persisted in IndexedDB or otherwise)
3. Queries the RTDB to fetch data for the now authenticated user
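
A minimal sketch of that pattern with the 2019-era web SDK (config values and paths below are placeholders, not taken from any real app):

```ts
import * as firebase from "firebase/app";
import "firebase/auth";
import "firebase/database";

// 1. Initialize the SDK (placeholder config)
firebase.initializeApp({
  apiKey: "…",
  databaseURL: "https://<project-id>.firebaseio.com",
});

// 2. Wait for the persisted session (IndexedDB) to be restored
firebase.auth().onAuthStateChanged((user) => {
  if (!user) return;
  // 3. Query the RTDB for the now-authenticated user's data
  firebase
    .database()
    .ref(`users/${user.uid}`)
    .once("value")
    .then((snapshot) => console.log(snapshot.val()));
});
```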

In the current implementation, here's what happens (sketched below):
1. SDK reads persisted auth from IndexedDB
2. Makes an API request to Google Identity Toolkit (`getAccountInfo`) (technically two HTTPS requests, because of the CORS preflight)
3. Once the auth token is received, the SDK connects to the RTDB (usually via a WebSocket)
4. Once the connection is established, any pending queries are sent
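
As a sketch, the startup today is strictly sequential (all names below are illustrative placeholders, not real firebase-js-sdk internals, stubbed just to show the ordering):

```ts
const readPersistedAuth = async () => ({ uid: "…" });  // 1. IndexedDB read
const fetchAuthToken = async () => "<id-token>";       // 2. getAccountInfo roundtrip(s)
const openWebSocket = async () => {};                  // 3. ~4 * RTT wss handshake
const flushPendingQueries = (_token: string) => {};    // 4. queued listens/writes

async function connectSequentially(): Promise<void> {
  await readPersistedAuth();
  const token = await fetchAuthToken();  // the wss handshake hasn't started yet
  await openWebSocket();                 // only begins after the auth roundtrips finish
  flushPendingQueries(token);            // pending queries finally go out
}
```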

Step `3.` is actually quite expensive in practice (in my measurements it takes ~500ms from Europe).
Establishing a `wss` connection takes 4 roundtrips (1 for the TCP SYN/ACK, 2 for the TLS handshake, 1 for the HTTP upgrade).

So the time required to establish a (WebSocket) connection with the RTDB is `4 * RTT`, where RTT is the round-trip time between the Firebase RTDB node (which appears to be located in us-east/us-central) and the client (in my case `RTT ~= 125ms`, which explains the `~500ms`).
(And because each WebSocket requires its own TCP socket, it doesn't seem possible to accelerate this by using `<link rel="preconnect" ...>`.)

This latency negatively affects the end-user experience because it delays meaningful TTI (time to interactive).

From my debugging and a quick read of the code, there's no obvious reason why we wait for auth to finish before starting to set up the WebSocket connection (given that it takes time). I patched our copy of the Firebase SDK with the above change and there's no immediately obvious downside or issue.

Am I missing something? Is there a strong reason why we don't establish connections until we obtain auth tokens?

---

## Note

The time to establish RTDB connections (to `s-usc1c-nss-{x}.firebaseio.com`) could be reduced without modifying the SDK, by improvements to your backend.

Right now it appears that:
1. `*.firebaseio.com` resolves to an anycast IP (`35.201.97.85`), which is a "dumb" set of TCP load balancers/routers
2. All SSL termination (and the rest) happens on the RTDB node(s) in us-central

Above we defined `time = 4 * RTT` (1 TCP, 2 TLS, 1 HTTP upgrade). Of those 4 roundtrips, the first 3 could easily be terminated by the load balancer (if it had a wildcard cert for `*.firebaseio.com`); the 4th RTT (HTTP request/response) is more application-specific and probably out of scope for a load balancer.

Moving the first 3 roundtrips to the load balancers would reduce the overall time to `time = 3 * cheap_RTT + 1 * expensive_RTT`.
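
For example, assuming a ~20ms RTT to the nearest anycast edge (an illustrative figure, not a measurement) and the ~125ms RTT to us-central measured above, the handshake cost would drop from `4 * 125 ≈ 500ms` to roughly `3 * 20 + 125 ≈ 185ms`.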

These improvements would benefit all Firebase (RTDB) users.
@mikelehen (Contributor) commented:

@AaronO Thanks for bringing this up and taking the initiative to put a PR together.

For a bit of context, when the RTDB code was originally written, auth would typically provide us a cached token immediately and so there were no auth roundtrips we had to wait for. But you're right that today these auth roundtrips are causing unnecessary latency and your suggestion of starting up the WebSocket connection immediately is probably a good idea.

Unfortunately I think your PR will likely introduce a race condition where we may finish connecting to the backend and start sending requests before we have gotten the token back from auth (since this is now happening in parallel), leading to us accidentally sending the requests unauthenticated.

We would essentially need to delay most of our `PersistentConnection.onReady()` code until the connection is ready and we have received the auth token. In particular, we must not call `restoreState()` [since it'll send our auth token and then send any queued requests] or set `connected_` to true [since that will allow subsequent requests to be immediately sent to the backend].

I'm not sure of the best way to accomplish this while minimizing code churn (in order to minimize the risk of breaking something). It is a little tempting to introduce an "onReady" promise that we resolve when the Connection `onReady` callback is called and then do `Promise.all([onReadyPromise, authTokenPromise]).then(...)` or something. But we'd need to be careful to get all the error handling (`onDisconnect`, `onKill`, auth token failure, etc.) right. If you want to take another stab at it, feel free. Otherwise we can add this to the backlog, though I'm not sure when we will tackle it. Most of our SDK development effort is going into the Cloud Firestore SDK at present.
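
A rough sketch of that `Promise.all` approach (every identifier below is an illustrative placeholder, not an actual firebase-js-sdk symbol, and the error handling mentioned above is omitted):

```ts
const openWebSocket = async (): Promise<void> => {};           // resolves when Connection fires onReady
const fetchAuthToken = async (): Promise<string | null> => null;
const sendAuthAndRestoreState = (_token: string | null) => {}; // sends auth, replays queued requests
const markConnected = () => {};                                 // the `connected_ = true` step

async function establishConnection(): Promise<void> {
  // Start the wss handshake and the token fetch in parallel.
  const [, token] = await Promise.all([openWebSocket(), fetchAuthToken()]);
  // Only once BOTH have completed do we send auth, restore state, and
  // let queued requests flow, so nothing goes out unauthenticated.
  sendAuthAndRestoreState(token);
  markConnected();
}
```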

Thanks again!

@mikelehen left a comment:

Getting this out of my "Review requests" list. 😄

Note: I don't work for Google anymore, so somebody else will need to help with this if it moves forward...
