Fix out of order exemplar error for native histograms (#7640)

Port of prometheus/prometheus#13021

When Mimir receives multiple exemplars for a native histogram via remote
write, sort them by timestamp and report an out-of-order error only if all of
them are out of order. If only some are older than the latest exemplar, the
request may be a partial update to an existing set of exemplars.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
krajorama authored Mar 19, 2024
1 parent 8ed42e1 commit def7165
Showing 4 changed files with 450 additions and 6 deletions.
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -14,6 +14,7 @@
* [BUGFIX] Store-gateway: account for `"other"` time in LabelValues and LabelNames requests. #7622
* [BUGFIX] Query-frontend: Don't panic when using the `-query-frontend.downstream-url` flag. #7651
* [BUGFIX] Query-frontend: Fix memory leak on every request. #7654
* [BUGFIX] Ingester: when receiving multiple exemplars for a native histogram via remote write, sort them and only report an error if all are older than the latest exemplar as this could be a partial update. #7640

### Mixin

19 changes: 19 additions & 0 deletions pkg/ingester/ingester.go
@@ -17,6 +17,7 @@ import (
	"net/http"
	"os"
	"path/filepath"
	"sort"
	"strings"
	"sync"
	"time"
@@ -1398,6 +1399,15 @@ func (i *Ingester) pushSamplesToAppender(userID string, timeseries []mimirpb.Pre
	})
	stats.failedExemplarsCount += len(ts.Exemplars)
} else { // Note that else is explicit, rather than a continue in the above if, in case of additional logic post exemplar processing.
	if len(ts.Exemplars) > 1 {
		// We can get multiple exemplars for native histograms.
		// Sort exemplars by timestamp to ensure they are ingested in order.
		// OpenTelemetry in particular does not order exemplars.
		sort.Slice(ts.Exemplars, func(i, j int) bool {
			return ts.Exemplars[i].TimestampMs < ts.Exemplars[j].TimestampMs
		})
	}
	outOfOrderExemplars := 0
	for _, ex := range ts.Exemplars {
		if ex.TimestampMs > maxTimestampMs {
			stats.failedExemplarsCount++
@@ -1420,6 +1430,15 @@ func (i *Ingester) pushSamplesToAppender(userID string, timeseries []mimirpb.Pre
		continue
	}

	if errors.Is(err, storage.ErrOutOfOrderExemplar) {
		outOfOrderExemplars++
		// Only report out of order exemplars if all are out of order, otherwise this was a partial update
		// to some existing set of exemplars.
		if outOfOrderExemplars < len(ts.Exemplars) {
			continue
		}
	}

	// Error adding exemplar
	updateFirstPartial(nil, func() softError {
		return newTSDBIngestExemplarErr(err, model.Time(ex.TimestampMs), ts.Labels, ex.Labels)