[r282] backport Fix out of order exemplar error for native histograms pr 7640 #7820

Merged on Apr 5, 2024 (1 commit)
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -14,6 +14,7 @@
 * [BUGFIX] Store-gateway: account for `"other"` time in LabelValues and LabelNames requests. #7622
 * [BUGFIX] Query-frontend: Fix memory leak on every request. #7654
 * [BUGFIX] Ingester: turn native histogram validation errors in TSDB into soft ingester errors that result in returning 4xx to the end-user instead of 5xx. In the case of TSDB validation errors, the counter `cortex_discarded_samples_total` will be increased with the `reason` label set to `"invalid-native-histogram"`. #7736 #7773
+* [BUGFIX] Ingester: when receiving multiple exemplars for a native histogram via remote write, sort them and only report an error if all are older than the latest exemplar as this could be a partial update. #7640

 ### Mixin

19 changes: 19 additions & 0 deletions pkg/ingester/ingester.go
@@ -17,6 +17,7 @@ import (
 	"net/http"
 	"os"
 	"path/filepath"
+	"sort"
 	"strings"
 	"sync"
 	"time"
@@ -1436,6 +1437,15 @@ func (i *Ingester) pushSamplesToAppender(userID string, timeseries []mimirpb.Pre
 	})
 	stats.failedExemplarsCount += len(ts.Exemplars)
 } else { // Note that else is explicit, rather than a continue in the above if, in case of additional logic post exemplar processing.
+	if len(ts.Exemplars) > 1 {
+		// We can get multiple exemplars for native histograms.
+		// Sort exemplars by timestamp to ensure they are ingested in order.
+		// OpenTelemetry in particular does not order exemplars.
+		sort.Slice(ts.Exemplars, func(i, j int) bool {
+			return ts.Exemplars[i].TimestampMs < ts.Exemplars[j].TimestampMs
+		})
+	}
+	outOfOrderExemplars := 0
 	for _, ex := range ts.Exemplars {
 		if ex.TimestampMs > maxTimestampMs {
 			stats.failedExemplarsCount++
@@ -1458,6 +1468,15 @@ func (i *Ingester) pushSamplesToAppender(userID string, timeseries []mimirpb.Pre
 			continue
 		}

+		if errors.Is(err, storage.ErrOutOfOrderExemplar) {
+			outOfOrderExemplars++
+			// Only report out of order exemplars if all are out of order, otherwise this was a partial update
+			// to some existing set of exemplars.
+			if outOfOrderExemplars < len(ts.Exemplars) {
+				continue
+			}
+		}
+
 		// Error adding exemplar
 		updateFirstPartial(nil, func() softError {
 			return newTSDBIngestExemplarErr(err, model.Time(ex.TimestampMs), ts.Labels, ex.Labels)