OTLP: Convert start timestamps to Prom created timestamps (#9131) (#9178)

* OTLP: Convert start timestamps to Mimir created timestamps

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>

* bump to latest mimir-prometheus

Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>

* make use of annotations

---------

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>
Co-authored-by: Jesus Vazquez <jesus.vazquez@grafana.com>
Co-authored-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
(cherry picked from commit b58a08b)

Co-authored-by: Arve Knudsen <arve.knudsen@gmail.com>
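
To illustrate the idea in the commit message, here is a minimal Go sketch of turning an OTel start timestamp into a zero sample that marks the start of a series. It is not the code from mimir-prometheus: the `sample` type, the `samplesForPoint` helper, and the condition that the start timestamp must be set and earlier than the point's own timestamp are all assumptions made for illustration.

```go
package main

import "fmt"

// sample is a hypothetical stand-in for a Prometheus-style (timestamp, value) pair.
type sample struct {
	TimestampMs int64
	Value       float64
}

// samplesForPoint returns the samples to write for one cumulative data point.
// When zeroIngestion is true and the point carries a usable start timestamp,
// a zero-valued sample is placed at the start timestamp ahead of the real one.
func samplesForPoint(startMs, tsMs int64, value float64, zeroIngestion bool) []sample {
	out := make([]sample, 0, 2)
	// Assumption: a start timestamp of 0 means "unset", and a start timestamp
	// at or after the point's timestamp carries no extra information.
	if zeroIngestion && startMs > 0 && startMs < tsMs {
		out = append(out, sample{TimestampMs: startMs, Value: 0})
	}
	return append(out, sample{TimestampMs: tsMs, Value: value})
}

func main() {
	fmt.Println("disabled:", samplesForPoint(1000, 16000, 42, false))
	fmt.Println("enabled: ", samplesForPoint(1000, 16000, 42, true))
}
```

With the option off, only the real sample is produced, which matches the default of `false` for the new limit.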
grafanabot and aknuds1 authored Sep 2, 2024
1 parent 5dc4593 commit 28db790
Showing 21 changed files with 174 additions and 75 deletions.
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -77,6 +77,7 @@
* [ENHANCEMENT] Update runtime configuration to read gzip-compressed files with `.gz` extension. #9074
* [ENHANCEMENT] Ingester: add `cortex_lifecycler_read_only` metric which is set to 1 when ingester's lifecycler is set to read-only mode. #9095
* [ENHANCEMENT] Add a new field, `encode_time_seconds` to query stats log messages, to record the amount of time it takes the query-frontend to encode a response. This does not include any serialization time for downstream components. #9062
+* [ENHANCEMENT] OTLP: If the flag `-distributor.otel-created-timestamp-zero-ingestion-enabled` is true, OTel start timestamps are converted to Prometheus zero samples to mark series start. #9131
* [BUGFIX] Ruler: add support for draining any outstanding alert notifications before shutting down. This can be enabled with the `-ruler.drain-notification-queue-on-shutdown=true` CLI flag. #8346
* [BUGFIX] Query-frontend: fix `-querier.max-query-lookback` enforcement when `-compactor.blocks-retention-period` is not set, and viceversa. #8388
* [BUGFIX] Ingester: fix sporadic `not found` error causing an internal server error if label names are queried with matchers during head compaction. #8391
11 changes: 11 additions & 0 deletions cmd/mimir/config-descriptor.json
@@ -4699,6 +4699,17 @@
"fieldType": "boolean",
"fieldCategory": "advanced"
},
+{
+"kind": "field",
+"name": "otel_created_timestamp_zero_ingestion_enabled",
+"required": false,
+"desc": "Whether to enable translation of OTel start timestamps to Prometheus zero samples in the OTLP endpoint.",
+"fieldValue": null,
+"fieldDefaultValue": false,
+"fieldFlag": "distributor.otel-created-timestamp-zero-ingestion-enabled",
+"fieldType": "boolean",
+"fieldCategory": "experimental"
+},
{
"kind": "field",
"name": "ingest_storage_read_consistency",
2 changes: 2 additions & 0 deletions cmd/mimir/help-all.txt.tmpl
@@ -1215,6 +1215,8 @@ Usage of ./cmd/mimir/mimir:
[experimental] Max size of the pooled buffers used for marshaling write requests. If 0, no max size is enforced.
-distributor.metric-relabeling-enabled
[experimental] Enable metric relabeling for the tenant. This configuration option can be used to forcefully disable metric relabeling on a per-tenant basis. (default true)
+-distributor.otel-created-timestamp-zero-ingestion-enabled
+[experimental] Whether to enable translation of OTel start timestamps to Prometheus zero samples in the OTLP endpoint.
-distributor.otel-metric-suffixes-enabled
Whether to enable automatic suffixes to names of metrics ingested through OTLP.
-distributor.remote-timeout duration
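
The flag added in this hunk is a boolean with no stated default in the help text, so it defaults to false. As a rough sketch of how such a flag behaves, the following uses only Go's standard `flag` package. The flag name and description are taken from the help text above; `parseCTZeroIngestion` and the flag set name are hypothetical, and this is not necessarily how Mimir registers its limits.

```go
package main

import (
	"flag"
	"fmt"
)

// Flag name as it appears in help-all.txt.tmpl.
const ctZeroIngestionFlag = "distributor.otel-created-timestamp-zero-ingestion-enabled"

// parseCTZeroIngestion is a hypothetical helper: it registers the boolean flag
// on a fresh flag set, parses args, and reports whether the flag was enabled.
func parseCTZeroIngestion(args []string) (bool, error) {
	fs := flag.NewFlagSet("mimir-sketch", flag.ContinueOnError)
	enabled := fs.Bool(ctZeroIngestionFlag, false,
		"Whether to enable translation of OTel start timestamps to Prometheus zero samples in the OTLP endpoint.")
	if err := fs.Parse(args); err != nil {
		return false, err
	}
	return *enabled, nil
}

func main() {
	// No arguments: the option stays at its default.
	off, err := parseCTZeroIngestion(nil)
	fmt.Println(off, err)

	// Explicitly enabled, in the same form used elsewhere in the CHANGELOG.
	on, err := parseCTZeroIngestion([]string{"-" + ctZeroIngestionFlag + "=true"})
	fmt.Println(on, err)
}
```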
2 changes: 2 additions & 0 deletions docs/sources/mimir/configure/about-versioning.md
@@ -81,6 +81,8 @@ The following features are currently experimental:
- `-distributor.max-request-pool-buffer-size`
- Enable direct translation from OTLP write requests to Mimir equivalents
- `-distributor.direct-otlp-translation-enabled`
+- Enable conversion of OTel start timestamps to Prometheus zero samples to mark series start
+- `-distributor.otel-created-timestamp-zero-ingestion-enabled`
- Hash ring
- Disabling ring heartbeat timeouts
- `-distributor.ring.heartbeat-timeout=0`
@@ -3707,6 +3707,11 @@ The `limits` block configures default and per-tenant limits imposed by component
# CLI flag: -distributor.otel-metric-suffixes-enabled
[otel_metric_suffixes_enabled: <boolean> | default = false]
+# (experimental) Whether to enable translation of OTel start timestamps to
+# Prometheus zero samples in the OTLP endpoint.
+# CLI flag: -distributor.otel-created-timestamp-zero-ingestion-enabled
+[otel_created_timestamp_zero_ingestion_enabled: <boolean> | default = false]
# (experimental) The default consistency level to enforce for queries when using
# the ingest storage. Supports values: strong, eventual.
# CLI flag: -ingest-storage.read-consistency
2 changes: 1 addition & 1 deletion go.mod
@@ -278,7 +278,7 @@ require (
)

// Using a fork of Prometheus with Mimir-specific changes.
-replace github.com/prometheus/prometheus => github.com/grafana/mimir-prometheus v0.0.0-20240830123921-fdf902dd68d9
+replace github.com/prometheus/prometheus => github.com/grafana/mimir-prometheus v0.0.0-20240830150301-6b342fac9c48

// client_golang v1.20.0 has some bugs https://github.com/prometheus/client_golang/issues/1605, https://github.com/prometheus/client_golang/issues/1607
// Stick to v1.19.1 until they are fixed.
4 changes: 2 additions & 2 deletions go.sum
@@ -1118,8 +1118,8 @@ github.com/grafana/gomemcache v0.0.0-20240229205252-cd6a66d6fb56 h1:X8IKQ0wu40wp
github.com/grafana/gomemcache v0.0.0-20240229205252-cd6a66d6fb56/go.mod h1:PGk3RjYHpxMM8HFPhKKo+vve3DdlPUELZLSDEFehPuU=
github.com/grafana/memberlist v0.3.1-0.20220714140823-09ffed8adbbe h1:yIXAAbLswn7VNWBIvM71O2QsgfgW9fRXZNR0DXe6pDU=
github.com/grafana/memberlist v0.3.1-0.20220714140823-09ffed8adbbe/go.mod h1:MS2lj3INKhZjWNqd3N0m3J+Jxf3DAOnAH9VT3Sh9MUE=
-github.com/grafana/mimir-prometheus v0.0.0-20240830123921-fdf902dd68d9 h1:B09XYT+dKsdwye3e52achSnT9WXs0vEcOoHtT770sJg=
-github.com/grafana/mimir-prometheus v0.0.0-20240830123921-fdf902dd68d9/go.mod h1:Sp9UNArUoyscK0pnnjTmmE5HfhEifkoY8hi3tzxZFZo=
+github.com/grafana/mimir-prometheus v0.0.0-20240830150301-6b342fac9c48 h1:SwY0fuJgoUGguKLOwY/1cUm2DAc0U+dk4UZBoTGd71c=
+github.com/grafana/mimir-prometheus v0.0.0-20240830150301-6b342fac9c48/go.mod h1:Sp9UNArUoyscK0pnnjTmmE5HfhEifkoY8hi3tzxZFZo=
github.com/grafana/opentracing-contrib-go-stdlib v0.0.0-20230509071955-f410e79da956 h1:em1oddjXL8c1tL0iFdtVtPloq2hRPen2MJQKoAWpxu0=
github.com/grafana/opentracing-contrib-go-stdlib v0.0.0-20230509071955-f410e79da956/go.mod h1:qtI1ogk+2JhVPIXVc6q+NHziSmy2W5GbdQZFUHADCBU=
github.com/grafana/prometheus-alertmanager v0.25.1-0.20240625192351-66ec17e3aa45 h1:AJKOtDKAOg8XNFnIZSmqqqutoTSxVlRs6vekL2p2KEY=
5 changes: 3 additions & 2 deletions pkg/api/handlers.go
@@ -250,7 +250,7 @@ func NewQuerierHandler(

const (
remoteWriteEnabled = false
-oltpEnabled = false
+otlpEnabled = false
)

api := v1.NewAPI(
@@ -281,7 +281,8 @@ func NewQuerierHandler(
nil,
remoteWriteEnabled,
nil,
-oltpEnabled,
+otlpEnabled,
+true,
)

api.InstallCodec(protobufCodec{})
22 changes: 15 additions & 7 deletions pkg/distributor/otel.go
@@ -49,6 +49,7 @@ const (

type OTLPHandlerLimits interface {
OTelMetricSuffixesEnabled(id string) bool
+OTelCreatedTimestampZeroIngestionEnabled(id string) bool
}

// OTLPHandler is an http.Handler accepting OTLP write requests.
@@ -162,18 +163,19 @@ func OTLPHandler(
return err
}
addSuffixes := limits.OTelMetricSuffixesEnabled(tenantID)
+enableCTZeroIngestion := limits.OTelCreatedTimestampZeroIngestionEnabled(tenantID)

pushMetrics.IncOTLPRequest(tenantID)
pushMetrics.ObserveUncompressedBodySize(tenantID, float64(uncompressedBodySize))

var metrics []mimirpb.PreallocTimeseries
if directTranslation {
-metrics, err = otelMetricsToTimeseries(ctx, tenantID, addSuffixes, discardedDueToOtelParseError, logger, otlpReq.Metrics())
+metrics, err = otelMetricsToTimeseries(ctx, tenantID, addSuffixes, enableCTZeroIngestion, discardedDueToOtelParseError, logger, otlpReq.Metrics())
if err != nil {
return err
}
} else {
-metrics, err = otelMetricsToTimeseriesOld(ctx, tenantID, addSuffixes, discardedDueToOtelParseError, logger, otlpReq.Metrics())
+metrics, err = otelMetricsToTimeseriesOld(ctx, tenantID, addSuffixes, enableCTZeroIngestion, discardedDueToOtelParseError, logger, otlpReq.Metrics())
if err != nil {
return err
}
@@ -401,10 +403,11 @@ func otelMetricsToMetadata(addSuffixes bool, md pmetric.Metrics) []*mimirpb.Metr
return metadata
}

-func otelMetricsToTimeseries(ctx context.Context, tenantID string, addSuffixes bool, discardedDueToOtelParseError *prometheus.CounterVec, logger log.Logger, md pmetric.Metrics) ([]mimirpb.PreallocTimeseries, error) {
+func otelMetricsToTimeseries(ctx context.Context, tenantID string, addSuffixes, enableCTZeroIngestion bool, discardedDueToOtelParseError *prometheus.CounterVec, logger log.Logger, md pmetric.Metrics) ([]mimirpb.PreallocTimeseries, error) {
converter := otlp.NewMimirConverter()
_, errs := converter.FromMetrics(ctx, md, otlp.Settings{
-AddMetricSuffixes: addSuffixes,
+AddMetricSuffixes: addSuffixes,
+EnableCreatedTimestampZeroIngestion: enableCTZeroIngestion,
})
mimirTS := converter.TimeSeries()
if errs != nil {
@@ -427,10 +430,11 @@ func otelMetricsToTimeseries(ctx context.Context, tenantID string, addSuffixes b
}

// Old, less efficient, version of otelMetricsToTimeseries.
-func otelMetricsToTimeseriesOld(ctx context.Context, tenantID string, addSuffixes bool, discardedDueToOtelParseError *prometheus.CounterVec, logger log.Logger, md pmetric.Metrics) ([]mimirpb.PreallocTimeseries, error) {
+func otelMetricsToTimeseriesOld(ctx context.Context, tenantID string, addSuffixes, enableCTZeroIngestion bool, discardedDueToOtelParseError *prometheus.CounterVec, logger log.Logger, md pmetric.Metrics) ([]mimirpb.PreallocTimeseries, error) {
converter := prometheusremotewrite.NewPrometheusConverter()
-_, errs := converter.FromMetrics(ctx, md, prometheusremotewrite.Settings{
-AddMetricSuffixes: addSuffixes,
+annots, errs := converter.FromMetrics(ctx, md, prometheusremotewrite.Settings{
+AddMetricSuffixes: addSuffixes,
+EnableCreatedTimestampZeroIngestion: enableCTZeroIngestion,
})
promTS := converter.TimeSeries()
if errs != nil {
@@ -448,6 +452,10 @@ func otelMetricsToTimeseriesOld(ctx context.Context, tenantID string, addSuffixe

level.Warn(logger).Log("msg", "OTLP parse error", "err", parseErrs)
}
+ws, _ := annots.AsStrings("", 0, 0)
+if len(ws) > 0 {
+level.Warn(logger).Log("msg", "Warnings translating OTLP metrics to Prometheus write request", "warnings", ws)
+}

mimirTS := mimirpb.PreallocTimeseriesSliceFromPool()
for _, ts := range promTS {
42 changes: 32 additions & 10 deletions pkg/distributor/otlp/helper_generated.go

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

15 changes: 8 additions & 7 deletions pkg/distributor/otlp/metrics_to_prw_generated.go

20 changes: 12 additions & 8 deletions pkg/distributor/otlp/number_data_points_generated.go

4 changes: 4 additions & 0 deletions pkg/distributor/push_test.go
@@ -1247,3 +1247,7 @@ type otlpLimitsMock struct{}
func (o otlpLimitsMock) OTelMetricSuffixesEnabled(_ string) bool {
return false
}

+func (o otlpLimitsMock) OTelCreatedTimestampZeroIngestionEnabled(_ string) bool {
+return false
+}
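
The mock above satisfies the extended `OTLPHandlerLimits` interface by always returning `false`. As a hedged sketch of how a per-tenant implementation could feed the converter settings, the following reuses the interface and method names from the diff. The `tenantOverrides` type, the `translationSettings` struct (a stand-in for the real `Settings` types), and `settingsFor` are hypothetical and exist only for this example.

```go
package main

import "fmt"

// OTLPHandlerLimits mirrors the interface from pkg/distributor/otel.go in this commit.
type OTLPHandlerLimits interface {
	OTelMetricSuffixesEnabled(id string) bool
	OTelCreatedTimestampZeroIngestionEnabled(id string) bool
}

// tenantOverrides is a hypothetical limits implementation: the map holds the
// tenants for which created-timestamp zero ingestion is switched on.
type tenantOverrides map[string]bool

func (t tenantOverrides) OTelMetricSuffixesEnabled(_ string) bool { return false }

func (t tenantOverrides) OTelCreatedTimestampZeroIngestionEnabled(id string) bool {
	return t[id] // missing tenants fall back to false, matching the default
}

// translationSettings is a hypothetical stand-in for the converter settings
// that the handler fills in; the field names follow the diff.
type translationSettings struct {
	AddMetricSuffixes                   bool
	EnableCreatedTimestampZeroIngestion bool
}

// settingsFor resolves both limits for one tenant, as the handler does before
// calling the conversion function.
func settingsFor(limits OTLPHandlerLimits, tenantID string) translationSettings {
	return translationSettings{
		AddMetricSuffixes:                   limits.OTelMetricSuffixesEnabled(tenantID),
		EnableCreatedTimestampZeroIngestion: limits.OTelCreatedTimestampZeroIngestionEnabled(tenantID),
	}
}

func main() {
	limits := tenantOverrides{"team-a": true}
	fmt.Printf("team-a: %+v\n", settingsFor(limits, "team-a"))
	fmt.Printf("team-b: %+v\n", settingsFor(limits, "team-b"))
}
```

Resolving the limit per request keeps the option tenant-scoped, so one tenant can try the experimental behavior without affecting others.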