Implement series limit using ingester own series #6718
Conversation
Nice job! The overall logic is what I expected, so no surprises. I left a few minor comments. Looking forward to the tests!
```diff
@@ -629,7 +629,7 @@ func (t *Mimir) initIngesterService() (serv services.Service, err error) {
 	t.Cfg.Ingester.InstanceLimitsFn = ingesterInstanceLimits(t.RuntimeConfig)
 	t.tsdbIngesterConfig()

-	t.Ingester, err = ingester.New(t.Cfg.Ingester, t.Overrides, t.ActiveGroupsCleanup, t.Registerer, util_log.Logger)
+	t.Ingester, err = ingester.New(t.Cfg.Ingester, t.Overrides, t.Ring, t.ActiveGroupsCleanup, t.Registerer, util_log.Logger)
```
[non blocking] Unrelated. In a separate PR it would be nice to rename `t.Ring` to `t.IngesterRing`. This naming was done in an era when we only had the ingesters ring, but now we have many rings.
pkg/ingester/user_tsdb.go
Outdated
```go
ownedPrev, shardSizePrev := u.OwnedSeriesAndShards()

var ownedNew int
for {
```
This could potentially lead to an infinite loop, if new series are added continuously. I think it's a bad idea. I would limit the max number of attempts.
I don't think it can lead to an infinite loop, because eventually the user will hit the limit :) But I agree with your proposal to allow a small difference (say, up to 100 series). We can also limit the attempts to whatever value we pick, but also report an error so we retry later. WDYT?
As a best practice we should never have a potentially infinite loop. I don't even want to reason about whether it could be infinite or not because of "other conditions outside of the loop". However, I see you've added a max number of retries, so LGTM.
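For illustration only, the bounded approach discussed above could look roughly like the sketch below; the constants and helper functions are made up for this example and are not the PR's actual code.

```go
// Sketch: recompute owned series, accepting the result only if the series count
// did not change too much in the meantime, and give up after a few attempts.
const (
	maxAttempts   = 3
	maxSeriesDiff = 100
)

// recomputeWithRetries returns the recomputed value and whether it can be trusted.
// computeOwnedSeries and activeSeriesCount are hypothetical helpers.
func recomputeWithRetries(computeOwnedSeries, activeSeriesCount func() int) (int, bool) {
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		before := activeSeriesCount()
		owned := computeOwnedSeries()
		diff := activeSeriesCount() - before

		// Accept the result if the series count didn't move too much while computing.
		if diff >= 0 && diff <= maxSeriesDiff {
			return owned, true
		}
	}
	// All attempts exhausted: caller should report an error and retry later.
	return 0, false
}
```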
Force-pushed from cd43d14 to f558490.
- Remember a CHANGELOG entry
- Remember to list new experimental config params in docs/sources/mimir/configure/about-versioning.md
```go
}

// Updates token ranges and recomputes owned series for user, if necessary. If recomputation happened, true is returned.
func (oss *ownedSeriesService) updateTenant(userID string, db *userTSDB, ringChanged bool) bool {
```
I find the logic around `reason` a bit cumbersome here. Let's talk offline about it.
I agree it is cumbersome because it covers many different scenarios. I've extended the comment to explain which scenarios we cover, to help understand why it is like this.
Commits:
- …ce as a dependency
- … series. Handle situation when number of series changed while recomputing owned series.
- Update token ranges for new users soon, even if there was no ring change. Make sure to update shard size if it changed.
- …ged, but trigger was set.
- …nts.
- …ries difference.

All commits signed off by Peter Štibraný <pstibrany@gmail.com>.
Force-pushed from fa77048 to 800e1fb.
Very nice job! I haven't found any issues. I left a few last minor comments. Thanks!
```go
ownedSeriesCheckDuration: promauto.With(reg).NewHistogram(prometheus.HistogramOpts{
	Name:    "cortex_ingester_owned_series_check_duration",
	Help:    "How long does it take to check for owned series for all users.",
	Buckets: prometheus.DefBuckets,
```
I don't think the default buckets are appropriate here. We expect way lower timings (e.g. a 10s highest bucket looks infinitely high). I would customise the buckets.
I'm worried that on some of our large cells, this can indeed take several seconds. Let's measure and adjust buckets once we see it working.
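If the buckets do get customised later, one possible layout might look like the snippet below; the values are purely illustrative and would need to be confirmed against real measurements, including the multi-second checks expected on large cells.

```go
ownedSeriesCheckDuration: promauto.With(reg).NewHistogram(prometheus.HistogramOpts{
	Name:    "cortex_ingester_owned_series_check_duration",
	Help:    "How long does it take to check for owned series for all users.",
	// Illustrative buckets only, replacing prometheus.DefBuckets.
	Buckets: []float64{0.1, 0.25, 0.5, 1, 2.5, 5, 10, 30, 60},
}),
```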
pkg/ingester/user_tsdb.go
Outdated
```go
	recomputeOwnedSeriesMaxSeriesDiff = 1000
)

func (u *userTSDB) recomputeOwnedSeriesWithComputeFn(shardSize int, reason string, logger log.Logger, compute func() int) (retry bool, _ int) {
```
[nit] It looks a bit weird to me calling the return param `retry` (even in the callers) when it's effectively a `failed`.
I don't like calling the return value "failed" (I prefer "success"), hence I opted for "retry". I will rename it to `success` instead.
pkg/ingester/user_tsdb.go
Outdated
```go
		u.ownedSeriesMtx.Unlock()
	}

	level.Info(logger).Log("msg", "owned series: recomputed owned series for user",
```
I think the level should change to warning if we failed to update it after all attempts.
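A minimal sketch of that suggestion, assuming a `success` flag and the `userID` are available at the call site (go-kit's `level` package allows choosing the level dynamically):

```go
// Escalate to warning when all recompute attempts failed; stay at info otherwise.
logLevel := level.Info(logger)
if !success {
	logLevel = level.Warn(logger)
}
logLevel.Log("msg", "owned series: recomputed owned series for user", "user", userID)
```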
pkg/ingester/user_tsdb.go
Outdated
```go
		u.ownedSeriesMtx.Unlock()
	}

	level.Info(logger).Log("msg", "owned series: recomputed owned series for user",
```
[non blocking] Do we really have to log it for every tenant? I guess it can help with debugging in case of issues for now, but we may consider getting rid of it once the feature is stable.
Agree to remove it in the future.
pkg/ingester/user_tsdb.go
Outdated
```go
	// Check how many new series were added while we were computing owned series.
	seriesDiff := u.ownedSeriesCount - prevOwnedSeriesCount
	seriesDiffOk := seriesDiff >= 0 && seriesDiff <= recomputeOwnedSeriesMaxSeriesDiff // seriesDiff should always be >= 0, but in case it isn't, we can try again.
	if seriesDiffOk || attempts == recomputeOwnedSeriesMaxAttemps {
```
[non blocking] Generally speaking I'm not a big fan of checking the for-loop stop conditions inside the loop itself. Have you considered simply changing `reportError` into `success`? `success` gets initialised to false, and is switched to true once the value gets updated. The for loop could simply change to `for !success && attempts < recomputeOwnedSeriesMaxAttemps`.
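Roughly, the suggested structure would look like the sketch below, where `tryRecompute` stands in for a single recompute-and-update pass (a hypothetical helper, not the PR's actual code):

```go
success := false
for attempts := 0; !success && attempts < recomputeOwnedSeriesMaxAttemps; attempts++ {
	// tryRecompute performs one recompute pass and reports whether the result was stored.
	success = tryRecompute()
}
```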
Are you suggesting to move the update of `u.ownedSeriesCount` and `u.ownedSeriesShardSize` outside of the `for` loop? That would surely be nicer, but then the locking situation is more complicated. Let me try and see.
After our chat, I've reworked the method. PTAL
What this PR does
This PR implements limiting of user series by using only series owned by the ingester to check the limit.
How it works:
After the ingester opens all local TSDBs, it waits until it is "ACTIVE" in the ring. At that point, the ingester reads tokens from the ring to see which series it actually "owns", and updates the "owned series" count for all open TSDBs. Only after that does the ingester start accepting push requests.
When a new series is created, we assume that it was assigned to this ingester because the ingester "owns" it, and simply increment the "owned series" count for the TSDB.
When a series is deleted, we don't update the "owned series" counter for the TSDB. Deletion only happens during compaction, so we instead recompute "owned series" for the TSDB after compaction has finished.
We also recompute "owned series" periodically for each user's TSDB.
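As a toy model of this counting scheme (illustrative only; the names do not match Mimir's actual types): new series bump the counter immediately, while deletions and ring changes are only reconciled by replacing the counter with a freshly computed value.

```go
import "sync"

// ownedSeriesTracker is a toy model of the per-TSDB "owned series" counter.
type ownedSeriesTracker struct {
	mtx   sync.Mutex
	owned int
}

// onSeriesCreated is called when a new series is created; we assume this
// ingester owns it and increment the counter immediately.
func (t *ownedSeriesTracker) onSeriesCreated() {
	t.mtx.Lock()
	t.owned++
	t.mtx.Unlock()
}

// setRecomputed replaces the counter with a freshly computed value, e.g. after
// compaction (which is when deletions happen) or after a ring change.
func (t *ownedSeriesTracker) setRecomputed(owned int) {
	t.mtx.Lock()
	t.owned = owned
	t.mtx.Unlock()
}
```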
How user limits are affected:
Until this happens, the old shard size is used for the limits computation.
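For context, the per-ingester limit against which owned series are checked is derived from the global limit and the tenant's shard size; the function below is only a hedged sketch of that kind of computation, not Mimir's exact limiter logic.

```go
// Illustrative only: derive a per-ingester series limit from the global limit,
// the tenant's shard size and the replication factor.
func localSeriesLimit(globalLimit, shardSize, replicationFactor int) int {
	if globalLimit <= 0 || shardSize <= 0 {
		return 0 // treated as "no limit" in this sketch
	}
	return globalLimit * replicationFactor / shardSize
}
```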
Credits: the original code was created by @pr00se; I'm continuing his work until he returns.
Which issue(s) this PR fixes or relates to
Fixes #
Checklist
- `CHANGELOG.md` updated - the order of entries should be `[CHANGE]`, `[FEATURE]`, `[ENHANCEMENT]`, `[BUGFIX]`.
- `about-versioning.md` updated with experimental features.