This repository has been archived by the owner on Apr 18, 2024. It is now read-only.

Cool down cid failures #52

Merged
merged 4 commits into feat/better-downvoting from feat/cid-failure-cooldown
Mar 1, 2023

Conversation

aarshkshah1992
Contributor

fixes #44

@aarshkshah1992 aarshkshah1992 requested review from willscott and removed request for willscott February 27, 2023 10:52
@aarshkshah1992 aarshkshah1992 changed the title Cool down cid failures [WIP] Cool down cid failures Feb 27, 2023
@aarshkshah1992 aarshkshah1992 changed the title [WIP] Cool down cid failures Cool down cid failures Feb 27, 2023
@@ -66,20 +75,38 @@ const DefaultSaturnGlobalBlockFetchTimeout = 60 * time.Second
const maxBlockSize = 4194305 // 4 MiB + 1 byte
const DefaultOrchestratorEndpoint = "https://orchestrator.strn.pl/nodes/nearby?count=1000"
const DefaultPoolRefreshInterval = 5 * time.Minute
const DefaultMaxCidFailures = 3
const DefaultCidCoolDownDuration = 10 * time.Minute
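For readers following the diff, here is a minimal stdlib-only sketch of what these two constants imply (hypothetical type and method names; the PR itself uses two go-cache instances guarded by a mutex): a CID that fails DefaultMaxCidFailures times enters a cool-down lasting DefaultCidCoolDownDuration, during which fetches for it are rejected.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

const (
	maxCidFailures      = 3                // DefaultMaxCidFailures in the PR
	cidCoolDownDuration = 10 * time.Minute // DefaultCidCoolDownDuration in the PR
)

// cidTracker is a hypothetical stand-in for the PR's two caches: it counts
// fetch failures per CID and marks a CID as cooling down once the failure
// count reaches maxCidFailures.
type cidTracker struct {
	mu            sync.Mutex // guards both maps so they update atomically
	failures      map[string]int
	coolDownUntil map[string]time.Time
}

func newCidTracker() *cidTracker {
	return &cidTracker{
		failures:      make(map[string]int),
		coolDownUntil: make(map[string]time.Time),
	}
}

// recordFailure bumps the failure count; at the threshold the CID enters
// cool-down and its failure counter is reset.
func (t *cidTracker) recordFailure(cid string) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.failures[cid]++
	if t.failures[cid] >= maxCidFailures {
		t.coolDownUntil[cid] = time.Now().Add(cidCoolDownDuration)
		delete(t.failures, cid)
	}
}

// isCoolingDown reports whether fetches for cid should currently be rejected.
func (t *cidTracker) isCoolingDown(cid string) bool {
	t.mu.Lock()
	defer t.mu.Unlock()
	return time.Now().Before(t.coolDownUntil[cid])
}

func main() {
	t := newCidTracker()
	for i := 0; i < maxCidFailures; i++ {
		t.recordFailure("example-cid")
	}
	fmt.Println(t.isCoolingDown("example-cid")) // true
}
```

Unlike go-cache, this sketch never evicts expired entries; the real implementation relies on the cache's expiry to bound memory.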
Contributor

@lidel may have thoughts on what our negative cache should be here.

Contributor

Currently, about half of the pool seems to fail to fetch some CID(s) every ~10-15m and gets punished for it (from bifrost-gw-staging-metrics):

[Screenshot: bifrost-gw staging metrics, Grafana dashboard, 2023-02-28]

Not sure what the right number is, but it is probably way shorter than 10m. Maybe play it safe and cool down for 1m for now?
We could increase it up to 5m later, but having these spikes and a cooldown longer than DefaultPoolRefreshInterval feels like a recipe for running out of useful nodes before the next refresh.

Contributor Author

Let's continue this discussion in #59

@@ -50,6 +50,10 @@ type pool struct {
refresh chan struct{} // refresh is used to signal the need for doing a refresh of the Saturn endpoints pool.
done chan struct{} // done is used to signal that we're shutting down the Saturn endpoints pool and don't need to refresh it anymore.

cidLk sync.RWMutex
cidFailureCache *cache.Cache // guarded by cidLk
Contributor

go-cache says:

Its major advantage is that, being essentially a thread-safe map[string]interface{}

Can we use its internal thread safety rather than needing an explicit lock for accesses?

Contributor Author

@willscott I want both cidFailureCache and cidCoolDownCache to be updated atomically.
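The atomicity point can be illustrated with a hedged stdlib sketch (not the PR's code): each of two individually thread-safe maps stays consistent on its own, but moving a CID from the failure set to the cool-down set touches both, so an outer lock (the role cidLk plays in the PR) is needed to prevent a concurrent reader from observing the CID in neither map, or in both.

```go
package main

import (
	"fmt"
	"sync"
)

// Two individually thread-safe maps: each Load/Store/Delete is safe on its
// own, but a "move" that touches both is not atomic without an outer lock.
var (
	mu          sync.Mutex // cidLk plays this role in the PR
	failures    sync.Map   // cid -> failure count
	coolingDown sync.Map   // cid -> struct{}
)

// promote moves a cid from the failure map to the cool-down map. Holding mu
// for the whole operation means no concurrent reader (also taking mu) can
// see the intermediate state where the cid is in neither map.
func promote(cid string) {
	mu.Lock()
	defer mu.Unlock()
	failures.Delete(cid)
	coolingDown.Store(cid, struct{}{})
}

func main() {
	failures.Store("example-cid", 3)
	promote("example-cid")
	_, inFailures := failures.Load("example-cid")
	_, inCoolDown := coolingDown.Load("example-cid")
	fmt.Println(inFailures, inCoolDown) // false true
}
```

With the outer mutex in place, plain maps would serve just as well; sync.Map is used here only to mirror go-cache's built-in per-cache thread safety, which (as this sketch shows) does not by itself make cross-cache updates atomic.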

@aarshkshah1992 aarshkshah1992 changed the base branch from main to feat/better-downvoting March 1, 2023 06:35
@aarshkshah1992 aarshkshah1992 merged commit 5d900a9 into feat/better-downvoting Mar 1, 2023
@aarshkshah1992 aarshkshah1992 deleted the feat/cid-failure-cooldown branch March 1, 2023 06:53
Successfully merging this pull request may close these issues.

Cool-down amplification of non-existent CIDs
3 participants