proposal: use Happy Eyeballs-like logic for dialing peers #1785

Closed
Tracked by #1808
marten-seemann opened this issue Sep 29, 2022 · 3 comments · Fixed by #2260
Assignees
Labels
effort/weeks Estimated to take multiple weeks exp/intermediate Prior experience is likely helpful kind/enhancement A net-new feature or improvement to an existing feature P1 High: Likely tackled by core team if no one steps up

Comments

@marten-seemann
Contributor

Happy Eyeballs, defined in RFC 8305, specifies how to dial a server for which one has both an IPv4 and an IPv6 address: to avoid overloading the network, the IPv6 address should be dialed first, and if no connection has been established within 250ms, a second dial attempt using IPv4 should be started in parallel. The application then uses whichever connection is established first.

Why?

We're in the process of adding more transports. As more and more nodes upgrade, we can expect their lists of advertised addresses to grow. libp2p/specs#353 will further increase the number of addresses. Dialing all these addresses in parallel puts a lot of load on our node, on the network and on the peer. We need to be smart and dial addresses such that 1. we end up with a connection over a transport that we prefer and 2. we have a high probability of successfully connecting on the first or second connection attempt.

Differences from RFC 8305

  • In the general case, we’ll start with a list of (multi)addrs that contain IPs and / or dnsaddrs (which need to be resolved first)
  • We don’t only need to rank IPv6 and IPv4, but first and foremost different transports

Proposed address ranking algorithm

Preprocessing: Bucket addresses into local / internet-wide addresses (⇒ 2 buckets). For every bucket, run:

  1. Filtering: if we have the same IP address or domain name for QUIC / WebTransport and TCP / WebSocket, remove WebTransport / WebSocket address
  2. Sorting:
    1. a single address of each transport in the order: QUIC > TCP > WebTransport > WebSocket > WebRTC > Circuit
    2. other addresses: we don’t really care, randomize?
  3. re-run filtering step
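
A rough sketch of these steps in Go, for illustration only. The `Addr` type, the `Local` flag and the function names are hypothetical (real code would work on multiaddrs). The filter is read here as "QUIC shadows WebTransport and TCP shadows WebSocket on the same host", and the "other addresses" keep their relative order instead of being randomized so that the output is deterministic.

```go
package main

import "sort"

// Transport values are declared in the preference order proposed above.
type Transport int

const (
	QUIC Transport = iota
	TCP
	WebTransport
	WebSocket
	WebRTC
	Circuit
)

// Addr is a simplified, hypothetical stand-in for a multiaddr.
type Addr struct {
	Host      string // IP address or domain name
	Transport Transport
	Local     bool // true for local (LAN) addresses
}

// filter drops a WebTransport address if a QUIC address exists for the same
// host, and a WebSocket address if a TCP address exists for the same host.
func filter(addrs []Addr) []Addr {
	type key struct {
		host string
		t    Transport
	}
	seen := make(map[key]bool)
	for _, a := range addrs {
		seen[key{a.Host, a.Transport}] = true
	}
	var out []Addr
	for _, a := range addrs {
		if a.Transport == WebTransport && seen[key{a.Host, QUIC}] {
			continue
		}
		if a.Transport == WebSocket && seen[key{a.Host, TCP}] {
			continue
		}
		out = append(out, a)
	}
	return out
}

// rankBucket runs the filter, sort, filter steps on a single bucket.
func rankBucket(addrs []Addr) []Addr {
	addrs = filter(addrs) // returns a fresh slice, the input is not modified
	sort.SliceStable(addrs, func(i, j int) bool {
		return addrs[i].Transport < addrs[j].Transport
	})
	// One address per transport goes to the front, everything else follows.
	var first, rest []Addr
	picked := make(map[Transport]bool)
	for _, a := range addrs {
		if picked[a.Transport] {
			rest = append(rest, a)
			continue
		}
		picked[a.Transport] = true
		first = append(first, a)
	}
	// Step 3: re-run the filter (a no-op in this simplified sketch).
	return filter(append(first, rest...))
}

// Rank buckets addresses into local and internet-wide ones and ranks each
// bucket independently.
func Rank(addrs []Addr) (local, public []Addr) {
	for _, a := range addrs {
		if a.Local {
			local = append(local, a)
		} else {
			public = append(public, a)
		}
	}
	return rankBucket(local), rankBucket(public)
}
```

The sketch returns the two buckets separately because the proposal does not say which bucket should be dialed first.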

Open Questions

  • What about IPv4 vs. IPv6: if given QUIC (v4 + v6) and TCP (v4 + v6), do we do
    • QUIC v6, QUIC v4, TCP v6, TCP v4 or
    • QUIC v6, TCP v6, QUIC v4, TCP v4
  • If given multiple IP addresses for the same transport, how do we select the one we dial
    • this really shouldn’t happen if we had decent address discovery on the sender side, but we don’t…
    • picking one at random seems fine
  • When (if at all) do we start DNS resolution, when we have some multiaddrs containing IP addresses?
  • Can we tell if IPv6 is not available (not all ISPs provide v6 functionality to their customers)?

Possible optimizations

  • Find out if UDP is blackholed in our network, and disable QUIC in that case. We should re-probe on a regular basis to see if things have changed.
  • Build an RTT estimation logic based on IP address. For exact matches (most likely re-dials, or dials to Hydra nodes), we can use a value based on the last RTT instead of the fixed 250ms. Using prefix matching, we might also be able to take an informed guess for an IP that's "close" to another IP that we know the RTT for.
@mxinden
Member

mxinden commented Oct 4, 2022

Thank you for writing this down @marten-seemann.

In regards to the open questions, I think Probe Lab (//CC @yiannisbot and @dennis-tra) could be a big help providing data and recommendations.

Long term, I think rust-libp2p should follow the learnings and strategy of go-libp2p, thus I would be in favor of this making it into libp2p/specs as a guideline for other implementations eventually. What do you think?

Can we tell if IPv6 is not available (not all ISPs provide v6 functionality to their customers)?

First indicators could be:

  1. whether one has an IPv6 listen address or not for LAN IPv6 support (not an option in the browser, arguably the most important one here)
  2. having an established IPv6 connection

Related discussion in rust-libp2p libp2p/rust-libp2p#1896.

@vyzo
Contributor

vyzo commented Feb 25, 2023

You need some smarts to avoid breaking hole punching if you go down that route.

@marten-seemann
Contributor Author

Indeed. I think we can afford to hole punch in parallel in the first iteration. That's probably not too bad. When we're at the point where we've decided to hole punch, we might as well pay the cost of a few parallel dials.

If we wanted to apply a similar logic to hole punches, we'd need some kind of mechanism to negotiate the order of the addresses, as well as the timeouts. This could be done by adding this information to the hole punch protobuf, but that's definitely a larger change.
