Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(discovery): discover triggered too frequently #3550

Open
guillaumemichel opened this issue Jul 3, 2024 · 4 comments · May be fixed by #3561
Open

feat(discovery): discover triggered too frequently #3550

guillaumemichel opened this issue Jul 3, 2024 · 4 comments · May be fixed by #3561
Assignees
Labels
external Issues created by non node team members kind:misc Attached to miscellaneous PRs

Comments

@guillaumemichel
Copy link

In the case a node doesn't have its quota of peers, it will send a new discover request every second. There are no guarantees that the discover request can complete within 1 second.

case <-t.C:
if !d.discover(ctx) {
// rerun discovery if the number of peers hasn't reached the limit
continue
}

This interval seems too aggressive and should probably be increased (e.g to 1 or even 5 minutes?)

@github-actions github-actions bot added needs:triage external Issues created by non node team members labels Jul 3, 2024
@ramin ramin self-assigned this Jul 4, 2024
@ramin ramin added the kind:misc Attached to miscellaneous PRs label Jul 4, 2024
@renaynay
Copy link
Member

renaynay commented Jul 4, 2024

@guillaumemichel i agree that the interval for retrying is a bit too aggressive and could be increased, but the actual deadline to FindPeers is a minute.

findCtx, findCancel := context.WithTimeout(ctx, findPeersTimeout)
defer func() {
// some workers could still be running, wait them to finish before canceling findCtx
wg.Wait() //nolint:errcheck
findCancel()
}()
peers, err := d.disc.FindPeers(findCtx, d.tag)

@guillaumemichel
Copy link
Author

So basically there will always be a lookup running until enough peers are discovered.

@Wondertan
Copy link
Member

So basically there will always be a lookup running until enough peers are discovered.

IIRC, that was the intention, but as you mentioned that might be too aggressive

@ramin ramin removed the needs:triage label Jul 4, 2024
@ramin ramin linked a pull request Jul 11, 2024 that will close this issue
@renaynay renaynay added the v0.15.0 Intended for v0.15.0 release label Jul 16, 2024
@walldiss
Copy link
Member

Sometimes, a node fails to discover any peers on startup. Without discovered peers, the node is unable to perform most of its P2P logic. The downside of a longer cooldown is that it will cause the node application to halt for the cooldown duration in such cases. If some peers are discovered, the node should still aim to find at least 5 (our default) to rely less on the performance and availability of a single peer.

The idea behind aggressive retries is to bootstrap node into a stable network condition as soon as possible, perhaps at the cost of more resources spent on aggressive discovery. So I think the defaults should stay low.

I think it might be valuable for some users to have the ability to increase retry/timeout values if they are less concerned about the node being connected to the FN network.

@renaynay renaynay removed the v0.15.0 Intended for v0.15.0 release label Aug 5, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
external Issues created by non node team members kind:misc Attached to miscellaneous PRs
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants