Update module go.opentelemetry.io/collector/pdata to v1.15.0 (main) #9380

Merged


@renovate renovate bot commented Sep 23, 2024

This PR contains the following updates:

| Package | Change |
|---|---|
| go.opentelemetry.io/collector/pdata | v1.12.0 -> v1.15.0 |

Release Notes

open-telemetry/opentelemetry-collector (go.opentelemetry.io/collector/pdata)

v1.15.0

🛑 Breaking changes 🛑
  • scraperhelper: Remove deprecated ObsReport, ObsReportSettings, NewObsReport types/funcs (#​11086)
  • confmap: Remove stable confmap.strictlyTypedInput gate (#​11008)
  • confmap: Removes stable confmap.unifyEnvVarExpansion feature gate. (#​11007)
  • ballastextension: Removes the deprecated ballastextension (#​10671)
  • service: Removes stable service.disableOpenCensusBridge feature gate (#​11009)
🚩 Deprecations 🚩
  • processorhelper: These funcs are not used anywhere, marking them deprecated. (#​11083)
🚀 New components 🚀
  • extension/experimental/storage: Move extension/experimental/storage into a separate module (#​11022)
💡 Enhancements 💡
  • configtelemetry: Add guidelines for each level of component telemetry (#​10286)

  • service: move useOtelWithSDKConfigurationForInternalTelemetry gate to beta (#​11091)

  • service: implement a no-op tracer provider that doesn't propagate the context (#​11026)
    The no-op tracer provider supported by the SDK incurs a memory cost of propagating the context no matter
    what. This is not needed if tracing is not enabled in the Collector. This implementation of the no-op tracer
    provider removes the need to allocate memory when tracing is disabled.

  • envprovider: Mark module as stable (#​10982)

  • fileprovider: Mark module as stable (#​10983)

  • processor: Add incoming and outgoing counts for processors using processorhelper. (#​10910)
    Any processor using the processorhelper package (this is most processors) will automatically report
    incoming and outgoing item counts. The new metrics are:

    • otelcol_processor_incoming_spans
    • otelcol_processor_outgoing_spans
    • otelcol_processor_incoming_metric_points
    • otelcol_processor_outgoing_metric_points
    • otelcol_processor_incoming_log_records
    • otelcol_processor_outgoing_log_records
🧰 Bug fixes 🧰
  • configgrpc: Change the value of max_recv_msg_size_mib from uint64 to int to avoid a case where misconfiguration caused an integer overflow. (#​10948)
  • exporterqueue: Fix a bug in the persistent queue where Offer can become deadlocked when the queue is almost full (#11015)
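
The configgrpc fix above concerns a size given in MiB that is later converted to bytes. The following is a minimal sketch of that class of bug, not the collector's actual code: the function names `uncheckedBytes` and `checkedBytes` are hypothetical, and `int64` is used instead of `int` so the behaviour is the same on every platform.

```go
package main

import (
	"errors"
	"fmt"
	"math"
)

const bytesPerMiB = 1024 * 1024

// uncheckedBytes shows the risky pattern: multiply an unsigned MiB value
// and convert the product to a signed integer without any bounds check.
func uncheckedBytes(mib uint64) int64 {
	return int64(mib * bytesPerMiB)
}

// checkedBytes takes a signed MiB value and rejects anything that is
// negative or would not fit in an int64 once converted to bytes.
func checkedBytes(mib int64) (int64, error) {
	if mib < 0 || mib > math.MaxInt64/bytesPerMiB {
		return 0, errors.New("max_recv_msg_size_mib out of range")
	}
	return mib * bytesPerMiB, nil
}

func main() {
	// 2^43 MiB is 2^63 bytes, one more than the largest int64.
	huge := uint64(1) << 43
	fmt.Println(uncheckedBytes(huge)) // wraps around to a negative number

	_, err := checkedBytes(1 << 43)
	fmt.Println(err) // the checked variant reports the misconfiguration
}
```

With a signed input type, a misconfigured value can be rejected at validation time instead of silently wrapping.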

v1.14.1

🧰 Bug fixes 🧰
  • mdatagen: Fix a missing import in the generated test file (#​10969)

v1.14.0

🛑 Breaking changes 🛑
  • all: Added support for go1.23, bumped the minimum version to 1.22 (#​10869)
  • otelcol: Remove deprecated ConfmapProvider interface. (#​10934)
  • confmap: Mark confmap.strictlyTypedInput as stable (#​10552)
💡 Enhancements 💡
  • exporter/otlp: Add batching option to otlp exporter (#​8122)
  • builder: Add a --skip-new-go-module flag to skip creating a module in the output directory. (#​9252)
  • component: Add TelemetrySettings.LeveledMeterProvider func to replace MetricsLevel in the near future (#​10931)
  • mdatagen: Add LeveledMeter method to mdatagen (#​10933)
  • service: Adds level configuration option to service::telemetry::trace to allow users to disable the default TracerProvider (#​10892)
    This replaces the feature gate service.noopTracerProvider introduced in v0.107.0
  • componentstatus: Add new Reporter interface to define how to report a status via a component.Host implementation (#​10852)
  • mdatagen: support using a different github project in mdatagen README issues list (#​10484)
  • mdatagen: Updates mdatagen's usage to output a complete command line example, including the metadata.yaml file. (#​10886)
  • extension: Add ModuleInfo to extension.Settings to allow extensions to access component go module information. (#​10876)
  • confmap: Mark module as stable (#​9379)
🧰 Bug fixes 🧰
  • batchprocessor: Update units for internal telemetry (#​10652)
  • confmap: Fix bug where an unset env var used with a non-string field resulted in a panic (#​10950)
  • service: Fix memory leaks during service package shutdown (#​9165)
  • mdatagen: Update generated telemetry template to only include context import when there are async metrics. (#​10883)
  • mdatagen: Fixed bug in which setting SkipLifecycle & SkipShutdown to true would result in a generated file with an unused import confmaptest (#​10866)
  • confmap: Use string representation for field types where all primitive types are strings. (#​10937)
  • otelcol: Preserve internal representation when unmarshaling component configs (#​10552)

v1.13.0

🛑 Breaking changes 🛑
  • service: Remove OpenCensus bridge completely, mark feature gate as stable. (#​10414)
  • confmap: Set the confmap.unifyEnvVarExpansion feature gate to Stable. Expansion of $FOO env vars is no longer supported. Use ${FOO} or ${env:FOO} instead. (#​10508)
  • service: Remove otelcol from Prometheus configuration. This means that any metric that isn't explicitly prefixed with otelcol_ no longer has that prefix. (#9759)
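
For the confmap change above, configurations that still use bare `$FOO` references need to be rewritten to `${FOO}` or `${env:FOO}`. The sketch below is one possible way to do that rewrite on a configuration string. It is not part of the collector: the helper `migrateEnvSyntax` is hypothetical, and it assumes that `$$` is an escaped dollar sign that must be left alone.

```go
package main

import (
	"fmt"
	"regexp"
)

// bareVar matches either an escaped "$$" or a bare "$NAME" reference.
// "${NAME}" and "${env:NAME}" never match because "{" cannot start a name.
var bareVar = regexp.MustCompile(`\$\$|\$[A-Za-z_][A-Za-z0-9_]*`)

// migrateEnvSyntax rewrites bare "$NAME" references to "${env:NAME}".
func migrateEnvSyntax(config string) string {
	return bareVar.ReplaceAllStringFunc(config, func(m string) string {
		if m == "$$" {
			return m // assumed escape sequence: leave untouched
		}
		return "${env:" + m[1:] + "}"
	})
}

func main() {
	fmt.Println(migrateEnvSyntax("endpoint: $OTLP_HOST:4317"))
	fmt.Println(migrateEnvSyntax("endpoint: ${OTLP_HOST}:4317"))
}
```

The first call prints `endpoint: ${env:OTLP_HOST}:4317`; the second input already uses a supported form and is returned unchanged.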
💡 Enhancements 💡
  • mdatagen: export ScopeName in internal/metadata package (#​10845)
    This can be used by components that need to set their scope name manually. Will save component owners from having to store a variable, which may diverge from the scope name used by the component for emitting its own telemetry.

  • semconv: Add v1.26.0 semantic conventions package (#​10249, #​10829)

  • mdatagen: Expose a setting on tests::host to set up your own host initialization code (#​10765)
    Some receivers require a host that has additional capabilities such as exposing exporters.
    For those, we can expose a setting that allows them to place a different host in the generated code.

  • confmap: Allow using any YAML structure as a string when loading configuration. (#​10800)
    Previous to this change, slices could not be used as strings in configuration.

  • ocb: migrate build and release of ocb binaries to opentelemetry-collector-releases repository (#​10710)
    ocb binaries will now be released under open-telemetry/opentelemetry-collector-releases tagged as "cmd/builder/vX.XXX.X"

  • semconv: Add semantic conventions version v1.27.0 (#​10837)

  • client: Mark module as stable. (#​10775)

🧰 Bug fixes 🧰
  • configtelemetry: Add 10s read header timeout on the configtelemetry Prometheus HTTP server. (#​5699)

  • service: Allow users to disable the tracer provider via the feature gate service.noopTracerProvider (#​10858)
    The service is returning an instance of a SDK tracer provider regardless of whether there were any processors configured causing resources to be consumed unnecessarily.

  • processorhelper: Fix processor metrics not being reported initially with 0 values. (#​10855)

  • service: Implement the temporality_preference setting for internal telemetry exported via OTLP (#​10745)

  • configauth: Fix unmarshaling of authentication in HTTP servers. (#​10750)

  • confmap: If loading an invalid YAML string through a provider, use it verbatim instead of erroring out. (#​10759)
    This makes the ${env:ENV} syntax closer to how ${ENV} worked before unifying syntaxes.

  • component: Allow component names of up to 1024 characters in length. (#​10816)

  • confmap: Remove original string representation if invalid. (#​10787)


Configuration

📅 Schedule: Branch creation - "before 9am on Monday" (UTC), Automerge - At any time (no schedule defined).

🚦 Automerge: Disabled by config. Please merge this manually once you are satisfied.

Rebasing: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.

🔕 Ignore: Close this PR and you won't be reminded about this update again.


  • If you want to rebase/retry this PR, check this box

This PR was generated by Mend Renovate. View the repository job log.

@renovate renovate bot requested review from stevesg, grafanabot and a team as code owners September 23, 2024 08:24
@aknuds1 aknuds1 enabled auto-merge (squash) September 23, 2024 08:34
@aknuds1 aknuds1 merged commit ac16d23 into main Sep 23, 2024
29 checks passed
@aknuds1 aknuds1 deleted the deps-update/main-go.opentelemetry.io-collector-pdata-1.x branch September 23, 2024 08:43
dimitarvdimitrov added a commit that referenced this pull request Sep 27, 2024
Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

Use labels hasher

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

Use consistent title name

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

Use consistent title name

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

kafka replay speed: adjust batchingQueueCapacity (#9344)

* kafka replay speed: adjust batchingQueueCapacity

I made 2000 up when we were flushing individual series to the channel.
Then 2000 might have made sense, but when flushing whole WriteRequests
a capacity of 1 should be sufficient.

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Increase errCh capacity

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Explain why +1

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Set capacity to 5

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Update pkg/storage/ingest/pusher.go

Co-authored-by: gotjosh <josue.abreu@gmail.com>

* Improve test

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Update pkg/storage/ingest/pusher.go

---------

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>
Co-authored-by: gotjosh <josue.abreu@gmail.com>

kafka replay speed: rename CLI flags (#9345)

* kafka replay speed: rename CLI flags

Make them a bit more consistent on what they mean and add better descriptions.

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Clarify metrics

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Rename flags

Co-authored-by: gotjosh <josue.abreu@gmail.com>

* Update docs

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

---------

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>
Co-authored-by: gotjosh <josue.abreu@gmail.com>

kafka replay speed: add support for metadata & source (#9287)

* kafka replay speed: add support for metadata & source

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Remove completed TODO

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Use a single map

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Make tests compile again

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

---------

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

kafka replay speed: improve fetching tracing (#9361)

* Better span logging

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Better span logging

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Try to have more buffering in ordered batches

maybe waiting to send to ordered batches comes with too much overhead

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Correct local docker-compose config with new flags

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Maybe have more stable events

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Revert "Try to have more buffering in ordered batches"

This reverts commit 886b159.

* Maybe have more stable events

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Maybe have more stable events

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Propagate loggers in spans

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

---------

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

continuous-test: Make the User-Agent header for the Mimir client conf… (#9338)

* continuous-test: Make the User-Agent header for the Mimir client configurable

* Update CHANGELOG.md

* Run make reference-help

TestIngester_PushToStorage_CircuitBreaker: increase initial delay (#9351)

* TestIngester_PushToStorage_CircuitBreaker: increase initial delay

Fixes XXX
I believe there's a race between sending the first request and then collecting the metrics. It's possible that we collect the metrics longer than 200ms after the first request, at which point the CB has opened. I could reproduce XXX by reducing the initialDelay to 10ms.

This PR increases it to 1 hour so that we can be more confident the delay hasn't expired when we collect the metrics.

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Adjust all tests

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

---------

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

Update to latest commit of dskit main (#9356)

Specifically pulls in grafana/dskit#585

Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>

Update mimir-prometheus (#9358)

* Update mimir-prometheus

* Run make generate-otlp

query-tee: add equivalent errors for string expression for range queries (#9366)

* query-tee: add equivalent errors for string expression for range queries

* Add changelog entry

MQE: fix `rate()` over native histograms where first point in range is a counter reset (#9371)

* MQE: fix `rate()` over native histograms where first point is a counter reset

* Add changelog entry

Update module github.com/Azure/azure-sdk-for-go/sdk/storage/azblob to v1.4.1 (#9369)

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>

Use centralized 'Add to docs project' workflow with GitHub App auth (#9330)

* Use centralized 'Add to docs project' workflow with GitHub App auth

Until this is merged, it is likely that any issues labeled `type/docs` won't be added to the [organization project](https://github.com/orgs/grafana/projects/69).

The underlying action is centralized so that any future changes are made in one place (`grafana/writers-toolkit`). The action is versioned to protect workflows from breaking changes.

The action uses Vault secrets instead of the discouraged organization secrets.

The workflow uses a consistent name so that future changes can be made programmatically.

Relates to https://github.com/orgs/grafana/projects/279/views/9?pane=issue&itemId=44280262

Signed-off-by: Jack Baldry <jack.baldry@grafana.com>

* Remove unneeded checkout step

* Remove unneeded checkout step

---------

Signed-off-by: Jack Baldry <jack.baldry@grafana.com>

Update grafana/agent Docker tag to v0.43.1 (#9365)

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>

Update module github.com/hashicorp/vault/api/auth/userpass to v0.8.0 (#9375)

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>

Update module github.com/hashicorp/vault/api/auth/approle to v0.8.0 (#9374)

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>

Update module go.opentelemetry.io/collector/pdata to v1.15.0 (#9380)

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>

Update module github.com/hashicorp/vault/api/auth/kubernetes to v0.8.0 (#9377)

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>

Update module github.com/twmb/franz-go/plugin/kotel to v1.5.0 (#9379)

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>

kafka replay speed: ingestion metrics (#9346)

* kafka replay speed: ingestion metrics

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Separate batch processing time by batch contents

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Also set time on metadata

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Add tenant to metrics

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Add metrics for errors

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Rename batching queue metrics

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Pairing to address code review

Co-Authored-By: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Move the metrics into their own file

Co-Authored-By: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* go mod tidy

Signed-off-by: gotjosh <josue.abreu@gmail.com>

---------

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>
Signed-off-by: gotjosh <josue.abreu@gmail.com>
Co-authored-by: gotjosh <josue.abreu@gmail.com>

kafka replay speed: move error handling closer to actual ingestion (#9349)

* kafka replay speed: move error handling closer to actual ingestion

Previously, we'd let errors bubble up and only decide whether to abort the request at the very top (`pusherConsumer`). This meant that we could buffer more requests before detecting an error.

This change extracts error handling logic into a `Pusher` implementation: `clientErrorFilteringPusher`. This implementation logs client errors and then swallows them. We inject that implementation in front of the ingester. This means that the parallel storage implementation can abort ASAP instead of collecting and bubbling up the errors.

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>
Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

---------

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>
Signed-off-by: gotjosh <josue.abreu@gmail.com>
Co-authored-by: gotjosh <josue.abreu@gmail.com>

kafka replay speed: concurrency fetching improvements (#9389)

* fetched records include timestamps

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* try with defaultMinBytesWaitTime=3s

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* add fetch_min_bytes_max_wait

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Don't block on sending to the channel

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Remove wait for when we're fetching from the end

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Fix bug with blocking on fetch

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Slightly easier to follow lifecycle of previousResult

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Correct merging of results

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Avoid double-logging events

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Revert "add fetch_min_bytes_max_wait"

This reverts commit 6197d4b.

* Increase MinBytesWaitTime to 5s

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Add comment about warpstream and MinBytes

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Address review comments

Signed-off-by: gotjosh <josue.abreu@gmail.com>

* Add tests for concurrentFetchers

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Fix bugs in tracking lastReturnedRecord

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Renamed method

Signed-off-by: gotjosh <josue.abreu@gmail.com>

* use the older context

Signed-off-by: gotjosh <josue.abreu@gmail.com>

* Name variable correct variable name

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Reduce MaxWaitTime in PartitionReader tests

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Change test createConcurrentFetchers signature

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Sort imports

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

---------

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>
Signed-off-by: gotjosh <josue.abreu@gmail.com>
Co-authored-by: gotjosh <josue.abreu@gmail.com>

Make concurrentFetchers change its concurrency dynamically (#9437)

* Make concurrentFetchers change its concurrency dynamically

Signed-off-by: gotjosh <josue.abreu@gmail.com>

* address review comments

Signed-off-by: gotjosh <josue.abreu@gmail.com>

* `make doc`

Signed-off-by: gotjosh <josue.abreu@gmail.com>

* inline the stop method

Signed-off-by: gotjosh <josue.abreu@gmail.com>

* Fix panic when creating concurrent fetchers fails

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Disabled by default

Signed-off-by: gotjosh <josue.abreu@gmail.com>

* we don't need to handle the context in start

Signed-off-by: gotjosh <josue.abreu@gmail.com>

* don't store concurrency or records per fetch

Signed-off-by: gotjosh <josue.abreu@gmail.com>

* add validation to the flags

Signed-off-by: gotjosh <josue.abreu@gmail.com>

* Ensure we don't leak any goroutines.

Signed-off-by: gotjosh <josue.abreu@gmail.com>

* remove concurrent and recordsperfetch from the main struct

Signed-off-by: gotjosh <josue.abreu@gmail.com>

---------

Signed-off-by: gotjosh <josue.abreu@gmail.com>
Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>
Co-authored-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

kafka replay speed: fix concurrent fetching concurrency transition (#9447)

* kafka replay speed: fix concurrent fetching concurrency transition

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Update pkg/storage/ingest/reader.go

* Make sure we evaluate r.lastReturnedRecord WHEN we return

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Redistribute wg.Add

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Add tests

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Remove defer causing data race

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Move defer to a different place

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* WIP

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Give more time to catch up with target_lag

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Clarify comment

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

---------

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>
dimitarvdimitrov added a commit that referenced this pull request Sep 27, 2024
Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

Use labels hasher

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

Use consistent title name

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

Use consistent title name

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

kafka replay speed: adjust batchingQueueCapacity (#9344)

* kafka replay speed: adjust batchingQueueCapacity

I made 2000 up when we were flushing individual series to the channel.
Then 2000 might have made sense, but when flushing whole WriteRequests
a capacity of 1 should be sufficient.

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Increase errCh capacity

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Explain why +1

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Set capacity to 5

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Update pkg/storage/ingest/pusher.go

Co-authored-by: gotjosh <josue.abreu@gmail.com>

* Improve test

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Update pkg/storage/ingest/pusher.go

---------

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>
Co-authored-by: gotjosh <josue.abreu@gmail.com>

kafka replay speed: rename CLI flags (#9345)

* kafka replay speed: rename CLI flags

Make them a bit more consistent on what they mean and add better descriptions.

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Clarify metrics

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Rename flags

Co-authored-by: gotjosh <josue.abreu@gmail.com>

* Update docs

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

---------

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>
Co-authored-by: gotjosh <josue.abreu@gmail.com>

kafka replay speed: add support for metadata & source (#9287)

* kafka replay speed: add support for metadata & source

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Remove completed TODO

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Use a single map

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Make tests compile again

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

---------

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

kafka replay speed: improve fetching tracing (#9361)

* Better span logging

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Better span logging

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Try to have more buffering in ordered batches

maybe waiting to send to ordered batches comes with too much overhead

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Correct local docker-compose config with new flags

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Maybe have more stable events

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Revert "Try to have more buffering in ordered batches"

This reverts commit 886b159.

* Maybe have more stable events

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Maybe have more stable events

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Propagate loggers in spans

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

---------

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

continuous-test: Make the User-Agent header for the Mimir client conf… (#9338)

* continuous-test: Make the User-Agent header for the Mimir client configurable

* Update CHANGELOG.md

* Run make reference-help

TestIngester_PushToStorage_CircuitBreaker: increase initial delay (#9351)

* TestIngester_PushToStorage_CircuitBreaker: increase initial delay

Fixes XXX
I believe there's a race between sending the first request and then collecting the metrics. It's possible that we collect the metrics longer than 200ms after the first request, at which point the CB has opened. I could reproduce XXX by reducing the initialDelay to 10ms.

This PR increases it to 1 hour so that we're more sure that the delay hasn't expired when we're collecting the metrics.

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Adjust all tests

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

---------

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

Update to latest commit of dskit main (#9356)

Specifically pulls in grafana/dskit#585

Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>

Update mimir-prometheus (#9358)

* Update mimir-prometheus

* Run make generate-otlp

query-tee: add equivalent errors for string expression for range queries (#9366)

* query-tee: add equivalent errors for string expression for range queries

* Add changelog entry

MQE: fix `rate()` over native histograms where first point in range is a counter reset (#9371)

* MQE: fix `rate()` over native histograms where first point is a counter reset

* Add changelog entry

Update module github.com/Azure/azure-sdk-for-go/sdk/storage/azblob to v1.4.1 (#9369)

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>

Use centralized 'Add to docs project' workflow with GitHub App auth (#9330)

* Use centralized 'Add to docs project' workflow with GitHub App auth

Until this is merged, it is likely that any issues labeled `type/docs` won't be added to the [organization project](https://github.com/orgs/grafana/projects/69).

The underlying action is centralized so that any future changes are made in one place (`grafana/writers-toolkit`). The action is versioned to protect workflows from breaking changes.

The action uses Vault secrets instead of the discouraged organization secrets.

The workflow uses a consistent name so that future changes can be made programmatically.

Relates to https://github.com/orgs/grafana/projects/279/views/9?pane=issue&itemId=44280262

Signed-off-by: Jack Baldry <jack.baldry@grafana.com>

* Remove unneeded checkout step

* Remove unneeded checkout step

---------

Signed-off-by: Jack Baldry <jack.baldry@grafana.com>

Update grafana/agent Docker tag to v0.43.1 (#9365)

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>

Update module github.com/hashicorp/vault/api/auth/userpass to v0.8.0 (#9375)

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>

Update module github.com/hashicorp/vault/api/auth/approle to v0.8.0 (#9374)

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>

Update module go.opentelemetry.io/collector/pdata to v1.15.0 (#9380)

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>

Update module github.com/hashicorp/vault/api/auth/kubernetes to v0.8.0 (#9377)

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>

Update module github.com/twmb/franz-go/plugin/kotel to v1.5.0 (#9379)

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>

kafka replay speed: ingestion metrics (#9346)

* kafka replay speed: ingestion metrics

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Separate batch processing time by batch contents

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Also set time on metadata

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Add tenant to metrics

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Add metrics for errors

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Rename batching queue metrics

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Pairing to address code review

Co-Authored-By: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Move the metrics into their own file

Co-Authored-By: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* go mod tidy

Signed-off-by: gotjosh <josue.abreu@gmail.com>

---------

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>
Signed-off-by: gotjosh <josue.abreu@gmail.com>
Co-authored-by: gotjosh <josue.abreu@gmail.com>

kafka replay speed: move error handling closer to actual ingestion (#9349)

* kafka replay speed: move error handling closer to actual ingestion

Previously, we'd let error bubble-up and only take decisions on whether to abort the request or not at the very top (`pusherConsumer`). This meant that we'd potentially buffer more requests before we detect an error.

This change extracts error handling logic into a `Pusher` implementation: `clientErrorFilteringPusher`. This implementation logs client errors and then swallows them. We inject that implementation in front of the ingester. This means that the parallel storage implementation can abort ASAP instead of collecting and bubbling up the errors.

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>
Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

---------

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>
Signed-off-by: gotjosh <josue.abreu@gmail.com>
Co-authored-by: gotjosh <josue.abreu@gmail.com>

kafka replay speed: concurrency fetching improvements (#9389)

* fetched records include timestamps

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* try with defaultMinBytesWaitTime=3s

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* add fetch_min_bytes_max_wait

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Don't block on sending to the channel

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Remove wait for when we're fetching from the end

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Fix bug with blocking on fetch

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Slightly easier to follow lifecycle of previousResult

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Correct merging of results

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Avoid double-logging events

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Revert "add fetch_min_bytes_max_wait"

This reverts commit 6197d4b.

* Increase MinBytesWaitTime to 5s

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Add comment about warpstream and MinBytes

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Address review comments

Signed-off-by: gotjosh <josue.abreu@gmail.com>

* Add tests for concurrentFetchers

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Fix bugs in tracking lastReturnedRecord

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Renamed method

Signed-off-by: gotjosh <josue.abreu@gmail.com>

* use the older context

Signed-off-by: gotjosh <josue.abreu@gmail.com>

* Give the variable the correct name

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Reduce MaxWaitTime in PartitionReader tests

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Change test createConcurrentFetchers signature

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Sort imports

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

---------

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>
Signed-off-by: gotjosh <josue.abreu@gmail.com>
Co-authored-by: gotjosh <josue.abreu@gmail.com>

Make concurrentFetchers change its concurrency dynamically (#9437)

* Make concurrentFetchers change its concurrency dynamically

Signed-off-by: gotjosh <josue.abreu@gmail.com>

* address review comments

Signed-off-by: gotjosh <josue.abreu@gmail.com>

* `make doc`

Signed-off-by: gotjosh <josue.abreu@gmail.com>

* inline the stop method

Signed-off-by: gotjosh <josue.abreu@gmail.com>

* Fix panic when creating concurrent fetchers fails

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Disabled by default

Signed-off-by: gotjosh <josue.abreu@gmail.com>

* we don't need to handle the context in start

Signed-off-by: gotjosh <josue.abreu@gmail.com>

* don't store concurrency or records per fetch

Signed-off-by: gotjosh <josue.abreu@gmail.com>

* add validation to the flags

Signed-off-by: gotjosh <josue.abreu@gmail.com>

* Ensure we don't leak any goroutines.

Signed-off-by: gotjosh <josue.abreu@gmail.com>

* remove concurrent and recordsperfetch from the main struct

Signed-off-by: gotjosh <josue.abreu@gmail.com>

---------

Signed-off-by: gotjosh <josue.abreu@gmail.com>
Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>
Co-authored-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

kafka replay speed: fix concurrent fetching concurrency transition (#9447)

* kafka replay speed: fix concurrent fetching concurrency transition

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Update pkg/storage/ingest/reader.go

* Make sure we evaluate r.lastReturnedRecord WHEN we return

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Redistribute wg.Add

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Add tests

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Remove defer causing data race

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Move defer to a different place

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* WIP

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Give more time to catch up with target_lag

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Clarify comment

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

---------

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>