Expose block.GatherFileStats. #5400

Merged 1 commit on Jun 2, 2022

Conversation

pstibrany
Contributor

  • [na] I added CHANGELOG entry for this change.
  • Change is not relevant to the end user.

Changes

This PR exposes the block.GatherFileStats function so that it can be used outside of the Upload function. I've also removed the TODOs from it, as I don't think they should be implemented.

Verification

I haven't changed the function itself, only renamed it.

Signed-off-by: Peter Štibraný <pstibrany@gmail.com>
@pstibrany pstibrany changed the title from "Expose GatherFileStats." to "Expose block.GatherFileStats." May 31, 2022
Contributor

@yeya24 yeya24 left a comment


LGTM

@yeya24 yeya24 merged commit a6f6ce0 into thanos-io:main Jun 2, 2022
pstibrany added a commit to grafana/mimir that referenced this pull request Jun 3, 2022
jesusvazquez added a commit to grafana/mimir that referenced this pull request Jun 10, 2022
* Extend Makefile and Dockerfiles to support multiarch builds for all Go binaries. (#1759)

* Extend Dockerfiles to support multiarch builds for all Go binaries.

By calling any of

make push-multiarch-./cmd/metaconvert/.uptodate
make push-multiarch-./cmd/mimir/.uptodate
make push-multiarch-./cmd/query-tee/.uptodate
make push-multiarch-./cmd/mimir-continuous-test/.uptodate
make push-multiarch-./cmd/mimirtool/.uptodate
make push-multiarch-./operations/mimir-rules-action/.uptodate

Signed-off-by: Peter Štibraný <pstibrany@gmail.com>

* Update to latest dskit and memberlist fork (#1758)

* Update to latest dskit and memberlist fork

Fixes #1743

Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>

* Update changelog

Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>

* update cli parameter description (#1760)

Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>

* mimirtool config: Add more retained old defaults (#1762)

* mimirtool config: Add more retained old defaults

The following parameters have their old defaults retained even when
`--update-defaults` is used with `mimirtool config convert`:

* `activity_tracker.filepath`
* `alertmanager.data_dir`
* `blocks_storage.filesystem.dir`
* `compactor.data_dir`
* `ruler.rule_path`
* `ruler_storage.filesystem.dir`
* `graphite.querier.schemas.backend` (only in GEM)

These are filepaths for which the new defaults don't make more sense
than the old ones. In fact, updating these can lead to a subpar migration
experience because components start using directories that don't exist.
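
The rule above can be sketched as a lookup. This is only an illustration of the idea, not mimirtool's code: the parameter names come from the list above, while `retainedOldDefaults`, `pickDefault`, and the example values are hypothetical.

```go
package main

import "fmt"

// retainedOldDefaults lists parameters whose old default is kept even when
// defaults are being updated. Names are from the commit message above; the
// map itself is a hypothetical structure for this sketch.
var retainedOldDefaults = map[string]bool{
	"activity_tracker.filepath":         true,
	"alertmanager.data_dir":             true,
	"blocks_storage.filesystem.dir":     true,
	"compactor.data_dir":                true,
	"ruler.rule_path":                   true,
	"ruler_storage.filesystem.dir":      true,
	"graphite.querier.schemas.backend":  true, // only in GEM
}

// pickDefault returns the value to write for param when converting a config
// with default updating enabled.
func pickDefault(param, oldDefault, newDefault string) string {
	if retainedOldDefaults[param] {
		return oldDefault
	}
	return newDefault
}

func main() {
	// Placeholder values; the real defaults are not reproduced here.
	fmt.Println(pickDefault("compactor.data_dir", "old-dir", "new-dir"))
	fmt.Println(pickDefault("some.other.param", "old", "new"))
}
```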

Because activity_tracker.filepath changed its name since Cortex, the
tests needed to differentiate between old and new common options. This
mechanism was already there for GEM and was added for Cortex/Mimir too.

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Update CHANGELOG.md

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* dashboards: add flag to skip gateway (#1761)

* dashboards: add flag to skip gateway

The gateway component seems to be an enterprise component, so groups
that aren't running enterprise shouldn't need the empty panels and rows
in their dashboards. This patch adds a flag to drop gateway-related
widgets from the mixin dashboards.

Signed-off-by: Josh Carp <jm.carp@gmail.com>

* Update CHANGELOG.md

Co-authored-by: Marco Pracucci <marco@pracucci.com>

* Gracefully shutdown querier when using query-scheduler (#1756)

* Gracefully shutdown querier when using query-scheduler

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Fixed comment

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Added TestQueuesOnTerminatingQuerier

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Commented executionContext

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Added CHANGELOG entry

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Update pkg/querier/worker/util.go

Co-authored-by: Peter Štibraný <pstibrany@gmail.com>

* Fixed typo in suggestion

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Removed superfluous time sensitive assertion

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Commented newExecutionContext()

Signed-off-by: Marco Pracucci <marco@pracucci.com>

Co-authored-by: Peter Štibraný <pstibrany@gmail.com>

* Graceful shutdown querier without query-scheduler (#1767)

* Graceful shutdown querier when not using query-scheduler

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Updated CHANGELOG entry

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Improved comment

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Refactoring

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Increase continuous test query timeout (#1777)

* Increase mimir-continuous-test query timeout from 30s to 60s

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Added PR number to CHANGELOG entry

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Increased default -tests.run-interval from 1m to 5m (#1778)

* Increased default -tests.run-interval from 1m to 5m

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Added PR number to CHANGELOG entry

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Fix flaky tests on querier graceful shutdown (#1779)

* Fix flaky tests on querier graceful shutdown

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Remove spurious newline

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Update build image and GitHub workflow (#1781)

* Update build-image to use golang:1.17.8-bullseye, and add skopeo to build image.

Skopeo will be used in subsequent PR to push multiarch images.

Signed-off-by: Peter Štibraný <pstibrany@gmail.com>

* Update build image. Use ubuntu-latest for workflow steps.

Signed-off-by: Peter Štibraný <pstibrany@gmail.com>

* api: remove duplicated remote read querier handler (#1776)

* Publish multiarch images (#1772)

* Publish multiarch images.

Signed-off-by: Peter Štibraný <pstibrany@gmail.com>

* Tag with extra tag, if pushing tagged commit or release.

Signed-off-by: Peter Štibraný <pstibrany@gmail.com>

* Split building of docker images and archiving them into tar.

Signed-off-by: Peter Štibraný <pstibrany@gmail.com>

* When tagging with test, use --all.

Signed-off-by: Peter Štibraný <pstibrany@gmail.com>

* Only run deploy step on tags or weekly release branches.

Signed-off-by: Peter Štibraný <pstibrany@gmail.com>

* Don't tag with test anymore.

Signed-off-by: Peter Štibraný <pstibrany@gmail.com>

* Address review feedback.

Signed-off-by: Peter Štibraný <pstibrany@gmail.com>

* Fix license check.

Signed-off-by: Peter Štibraný <pstibrany@gmail.com>

* K6: Take into account HTTP status code 202 (#1787)

When using `K6_HA_REPLICAS > 1`, Mimir accepts all HTTP calls, but some of
those calls receive status code `202`. This commit treats that status code
as expected; otherwise the user receives the following error:
```
reads_inat write (file:///.../mimir-k6/load-testing-with-k6.js:254:8(137))
reads_inat native  executor=ramping-arrival-rate scenario=writing_metrics source=stacktrace
ERRO[0015] GoError: ERR: write failed. Status: 202. Body: replicas did not mach, rejecting sample: replica=replica_1, elected=replica_0
```

At the end of the benchmark, the summary displays errors:
```
     ✗ write worked
      ↳  20% — ✓ 23 / ✗ 92
```

Example of load testing:
```shell
./k6 run load-testing-with-k6.js \
    -e K6_SCHEME="https" \
    -e K6_WRITE_HOSTNAME="${mimir}" \
    -e K6_READ_HOSTNAME="${mimir}" \
    -e K6_USERNAME="${user}" \
    -e K6_WRITE_TOKEN="${password}" \
    -e K6_READ_TOKEN="${password}" \
    -e K6_HA_CLUSTERS="1" \
    -e K6_HA_REPLICAS="3" \
    -e K6_DURATION_MIN="5"
```
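
The load-testing script itself is JavaScript; the check it needs can be sketched in Go as follows. `writeSucceeded` is a hypothetical name, and the assumption is simply that both `200` and `202` should count as a successful write when HA replicas are in use.

```go
package main

import (
	"fmt"
	"net/http"
)

// writeSucceeded reports whether a write response status should count as a
// success in the load test. 202 is included because, with HA replicas,
// samples from non-elected replicas are accepted but not ingested.
func writeSucceeded(status int) bool {
	return status == http.StatusOK || status == http.StatusAccepted
}

func main() {
	for _, s := range []int{200, 202, 500} {
		fmt.Println(s, writeSucceeded(s))
	}
}
```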

Signed-off-by: Wilfried Roset <wilfriedroset@users.noreply.github.com>

* replace model.Metric with labels.Labels in distributor.MetricsForLabelMatchers() (#1788)

* Streaming remote read (#1735)

* implement read v2

* updated CHANGELOG.md

* extend maxBytesInFram comment.

* addressed PR feedback

* addressed PR feedback

* addressed PR feedback

* use indexed xor chunk function to assert stream remote read tests

* updated CHANGELOG.md

Co-authored-by: Miguel Ángel Ortuño <miguel.ortuno@grafana.com>

* Upgrade dskit (#1791)

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Fix mimir-continuous-test when changing configured num-series (#1775)

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Do not export per user and integration Alertmanager metrics when value is 0 (#1783)

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Print version+arch of Mimir loaded to Docker. (#1793)

* Print version+arch of Mimir loaded to Docker.

Signed-off-by: Peter Štibraný <pstibrany@gmail.com>

* Use debug log for distributor.

Signed-off-by: Peter Štibraný <pstibrany@gmail.com>

* Remove unused metrics cortex_distributor_ingester_queries_total and cortex_distributor_ingester_query_failures_total (#1797)

* Remove unused metrics cortex_distributor_ingester_queries_total and cortex_distributor_ingester_query_failures_total

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Remove unused fields

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Added options support to SendSumOfCountersPerUser() (#1794)

* Added options support to SendSumOfCountersPerUser()

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Renamed SkipZeroValueMetrics() to WithSkipZeroValueMetrics()

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Changed all Grafana dashboards UIDs to not conflict with Cortex ones, to let people install both while migrating from Cortex to Mimir (#1801)

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Adopt mixin convention to set dashboard UIDs based on md5(filename) (#1808)

Signed-off-by: Marco Pracucci <marco@pracucci.com>
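
The mixin itself is written in Jsonnet; as a rough Go illustration of the convention, a UID derived from `md5(filename)` is deterministic and 32 hex characters long. `dashboardUID` and the example filename are hypothetical, and whether the hashed name includes the file extension is an assumption here.

```go
package main

import (
	"crypto/md5"
	"encoding/hex"
	"fmt"
)

// dashboardUID returns the hex-encoded MD5 digest of the dashboard filename.
// The same filename always yields the same UID.
func dashboardUID(filename string) string {
	sum := md5.Sum([]byte(filename))
	return hex.EncodeToString(sum[:])
}

func main() {
	// Hypothetical filename, used only to show the shape of the result.
	fmt.Println(dashboardUID("example-dashboard.json"))
}
```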

* Add support for store_gateway_zone args (#1807)

Allow customizing mimir cli flags per zone for the store gateway.
Copied the same solution as we have for ingesters.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>

* Add protection to store-gateway to not drop all blocks if unhealthy in the ring (#1806)

* Add protection to store-gateway to not drop all blocks if unhealthy in the ring

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Added CHANGELOG entry

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Update CHANGELOG.md

Co-authored-by: Peter Štibraný <pstibrany@gmail.com>

Co-authored-by: Peter Štibraný <pstibrany@gmail.com>

* Removed cortex_distributor_ingester_appends_total and cortex_distributor_ingester_append_failures_total unused metrics (#1799)

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Remove unused clientConfig from ingester (#1814)

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Add tracing to `mimir-continuous-test` (#1795)

* Extract and test TracerTransport functionality

We need to use a TracerTransport in mimir-continuous-test. We have that
in the frontend package, but I don't want to import frontend from
mimir-continuous-test, so we extract it to util/instrumentation.

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>

* Set up global tracer in mimir-continuous-test

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>

* Add tracing to the client and spans to the tests

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>

* Add jaeger-mixin to mimir-continuous test container

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>

* make license

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>

* Add traces to the write path

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>

* Update CHANGELOG.md

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>

* Chore: remove unused code from BucketStore (#1816)

* Removed unused Info() and advLabelSets from BucketStore

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Removed unused FilterConfig from BucketStore

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Removed unused relabelConfig from store-gateway tests

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Removed unused function expectedTouchedBlockOps()

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Removed unused recorder from BucketStore tests

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* go mod vendor

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Refactoring: force removal of all blocks when BucketStore is closed (#1817)

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Simplify FilterUsers() logic in store-gateway (#1819)

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Migrate admin CSS to bootstrap 5 (#1821)

* Migrate admin CSS to bootstrap 5

When I added bootstrap, for some reason I imported bootstrap 3 which was
originally launched in 2013.

Before adding more CSS styles, let's migrate to modern Bootstrap 5
launched in 2021.

This doesn't require an explicit jquery dependency anymore.

Also re-styled admin header to adapt properly to mobile devices screens.

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>

* Update CHANGELOG.md

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>

* ruler: make use of dskit `grpcclient.Config` on remote evaluation client (#1818)

* ruler: use dskit grpc client for remote evaluation

* addressed PR feedback

* Memberlist status page CSS (#1824)

* Update CHANGELOG.md

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>

* Update dskit to 4d7238067788a04f3dd921400dcf7a7657116907

This includes changes from https://github.com/grafana/dskit/pull/163

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>

* Custom memberlist status template

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>

* Include `import` in jsonnet snippets (#1826)

* Do not drop blocks in the store-gateway if missing in the ring (#1823)

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Upgraded dskit to fix temporary partial query results when shuffle sharding is enabled and hash ring backend storage is flushed / reset (#1829)

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Docs: ruler remote evaluation  (#1714)

* include documentation for remote rule evaluation

* Update docs/sources/operators-guide/configuring/configuring-to-evaluate-rules-using-query-frontend.md

Co-authored-by: Ursula Kallio <ursula.kallio@grafana.com>

* Update docs/sources/operators-guide/configuring/configuring-to-evaluate-rules-using-query-frontend.md

Co-authored-by: Ursula Kallio <ursula.kallio@grafana.com>

* Update docs/sources/operators-guide/configuring/configuring-to-evaluate-rules-using-query-frontend.md

Co-authored-by: Ursula Kallio <ursula.kallio@grafana.com>

* Update docs/sources/operators-guide/configuring/configuring-to-evaluate-rules-using-query-frontend.md

Co-authored-by: Ursula Kallio <ursula.kallio@grafana.com>

* Update docs/sources/operators-guide/configuring/configuring-to-evaluate-rules-using-query-frontend.md

Co-authored-by: Ursula Kallio <ursula.kallio@grafana.com>

* address PR feedback

* Update docs/sources/operators-guide/architecture/components/ruler/index.md

Co-authored-by: Marco Pracucci <marco@pracucci.com>

* Update docs/sources/operators-guide/architecture/components/ruler/index.md

Co-authored-by: Marco Pracucci <marco@pracucci.com>

* Update docs/sources/operators-guide/architecture/components/ruler/index.md

Co-authored-by: Marco Pracucci <marco@pracucci.com>

* Update docs/sources/operators-guide/architecture/components/ruler/index.md

Co-authored-by: Marco Pracucci <marco@pracucci.com>

* Update docs/sources/operators-guide/architecture/components/ruler/index.md

Co-authored-by: Marco Pracucci <marco@pracucci.com>

* addressed PR feedback

* addressed PR feedback

* Update docs/sources/operators-guide/architecture/components/ruler/index.md

Co-authored-by: Marco Pracucci <marco@pracucci.com>

* Update docs/sources/operators-guide/running-production-environment/planning-capacity.md

Co-authored-by: Marco Pracucci <marco@pracucci.com>

* Update docs/sources/operators-guide/running-production-environment/planning-capacity.md

Co-authored-by: Marco Pracucci <marco@pracucci.com>

* addressed PR feedback

Co-authored-by: Ursula Kallio <ursula.kallio@grafana.com>
Co-authored-by: Marco Pracucci <marco@pracucci.com>

* Alertmanager: Do not validate alertmanager configuration if it's not running. (#1835)

Allows other targets to start up even if an invalid alertmanager configuration
is passed in.

Fixes #1784

* Alertmanager: Allow usage with `local` storage type, with appropriate warnings. (#1836)

An oversight when we removed non-sharding modes of operation is that the `local`
storage type stopped working. Unfortunately it is not conceptually simple to
support this type fully, as alertmanager requires remote storage shared between
all replicas, to support recovering tenant state to an arbitrary replica
following an all-replica outage.

To support provisioning of alerts with `local` storage, but persisting of state
to remote storage, we would need to allow different storage configurations.

This change fixes the issue in a more naive way, so that the alertmanager can at
least be started up for testing or development purposes, but persisting state
will always fail. A second PR will propose allowing the `Persister` to be
disabled.

Although this configuration is not recommended for production use, as long as
the number of replicas is equal to the replication factor, then tenants will
never move between replicas, and so the local snapshot behaviour of the upstream
alertmanager will be sufficient.

Fixes #1638

* Mixin: Additions to Top tenants dashboard regarding sample rate and discard rate. (#1842)

Adds the following rows to the "Top tenants" dashboard:

- By samples rate growth
- By discarded samples rate
- By discarded samples rate growth

These queries are useful for determining what tenants are potentially putting excess
load on distributors and ingesters (and if it increased recently).

* Use concurrent open/close operations in compactor unit tests (#1844)

Open and close files concurrently in compactor unit tests to expose bugs
that implicitly rely on ordering.

Exposes bugs such as https://github.com/prometheus/prometheus/pull/10108

Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>

* Mixin: Show ingestion rate limit and rule group limit on Tenants dashboard. (#1845)

Whilst diagnosing a recent issue, we thought it would be useful to show the
current ingestion rate limit for the tenant. As the limit is applied to
`cortex_distributor_received_samples_total`, the limit is shown on the panel
which displays this metric. ("Distributor samples received (accepted) rate").

Also added `ruler_max_rule_groups_per_tenant` while in the area.

We don't currently display the number of exemplars in storage on the dashboard
anywhere, so cannot add `max_global_exemplars_per_user` right now.

* Jsonnet: Preparatory refactoring to simplify deploying parallel query paths. (#1846)

This change extracts some of the jsonnet used to build query deployments
(querier, query-scheduler, query-frontend) such that it is easier to deploy
secondary query paths. The use case for this is primarily to develop a
query path deployment for ruler remote-evaluation, but there may be other
use cases too.

* Removed double space in Log (#1849)

* Reference 'monolithic mode' instead of 'single binary' in logs (#1847)

Signed-off-by: Marco Pracucci <marco@pracucci.com>
Co-authored-by: Ursula Kallio <ursula.kallio@grafana.com>

Co-authored-by: Ursula Kallio <ursula.kallio@grafana.com>

* Extend safeTemplateFilepath to cover more cases. (#1833)

* Extend safeTemplateFilepath to cover more cases.

- template name ../tmpfile, stored into /tmp dir
- empty template name
- template name being just "."
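
A minimal sketch of a check that rejects the three cases listed above. This is not Mimir's `safeTemplateFilepath`; `safeJoin` is a hypothetical helper. It uses `filepath.Rel` rather than a string-prefix comparison, because `/tmpfile` starts with `/tmp` as a string even though it lies outside the `/tmp` directory.

```go
package main

import (
	"errors"
	"fmt"
	"path/filepath"
	"strings"
)

// safeJoin joins dir and name and returns an error if the result would not
// be a file strictly inside dir. Hypothetical helper for illustration.
func safeJoin(dir, name string) (string, error) {
	if name == "" || name == "." {
		return "", errors.New("invalid template name")
	}
	base := filepath.Clean(dir)
	full := filepath.Join(base, name)
	rel, err := filepath.Rel(base, full)
	if err != nil {
		return "", err
	}
	// rel == "." means name resolved to dir itself; a leading ".." means
	// the path escaped dir.
	if rel == "." || rel == ".." || strings.HasPrefix(rel, ".."+string(filepath.Separator)) {
		return "", fmt.Errorf("template name %q escapes %q", name, dir)
	}
	return full, nil
}

func main() {
	for _, name := range []string{"ok.tmpl", "../tmpfile", "", "."} {
		p, err := safeJoin("/tmp", name)
		fmt.Printf("%q -> %q, err=%v\n", name, p, err)
	}
}
```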

Signed-off-by: Peter Štibraný <pstibrany@gmail.com>

* Relax mimir-continuous-test pressure when deployed with Jsonnet (#1853)

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Add 2.1.0-rc.0 header (#1857)

* Prepare release 2.1 (#1859)

* Update VERSION to 2.1-rc.0

* Add relevant changelog entries for user facing PRs since mimir-2.0.0

* Add patch in semver VERSION

* Adding updated ruler diagrams. (#1861)

* Create v2-1.md (#1848)

* Create v2-1.md

* Update and rename v2-1.md to v2.1.md

updated the header and renamed the file.

* Update v2.1.md

Missing the upgrade configurations.

* Update v2.1.md

added bug description

* Update v2.1.md

bug fix writeup.

* Update v2.1.md

Added the series count description

* Apply suggestions from code review

Co-authored-by: Peter Štibraný <pstibrany@gmail.com>
Co-authored-by: Marco Pracucci <marco@pracucci.com>

* Update v2.1.md

* Update v2.1.md

updated tsdb isolation wording.

* Ran make doc.

* Fixed a broken relref.

* Update docs/sources/release-notes/v2.1.md

Co-authored-by: Peter Štibraný <pstibrany@gmail.com>
Co-authored-by: Marco Pracucci <marco@pracucci.com>

* Allow custom data source regex in mixin dashboards (#1802)

* dashboards: update grafana-builder

This commit updates the grafana-builder version and brings in:
* enable tooltip by default (#665)
* Add 'Data Source' label for the default datasource template variable. (#672)
* add dashboard link func (#683)
* make allValue configurable (#703)
* Allow datasource's regex to be configured

Signed-off-by: Wilfried Roset <wilfriedroset@users.noreply.github.com>

* Allow custom data source regex in mixin dashboards

The current dashboards offer the possibility to select a data source
among all Prometheus data sources in the organization. Depending on the
number of data sources, the list can be rather long (>10). Not all data
sources host Mimir metrics, so listing all of them is not helpful for
users.

Signed-off-by: Wilfried Roset <wilfriedroset@users.noreply.github.com>

* Revert back change that was enabling shared tooltips

Signed-off-by: Marco Pracucci <marco@pracucci.com>

Co-authored-by: Marco Pracucci <marco@pracucci.com>

* Dashboards: Fix `container_memory_usage_bytes:sum` recording rule (#1865)

* Dashboards: Fix `container_memory_usage_bytes:sum` recording rule

This change causes recording rules that reference
`container_memory_usage_bytes` to omit series that do not contain the
required labels for rules to run successfully, by requiring a non-empty
`image` label.

Signed-off-by: Peter Fern <github@0xc0dedbad.com>

* Update CHANGELOG

Signed-off-by: Peter Fern <github@0xc0dedbad.com>

* Add compiled rules

Signed-off-by: Peter Fern <github@0xc0dedbad.com>

Co-authored-by: Marco Pracucci <marco@pracucci.com>

* Deprecate -distributor.extend-writes and set it always to false (#1856)

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Remove DCO from contributors guidelines (#1867)

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Bump version to 2.1.0-rc.1 to include cherry-picked

* List Johanna as 2.1.0 release shepherd (#1871)

* fix(mixin): add missing alertmanager hashring members (#1870)

* fix(mixin): add missing alertmanager hashring members

* docs(CHANGELOG): add changelog entry

* Docs: clarify 'Set rule group' API specification (#1869)

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Simplify documentation publishing logic (#1820)

* Simplify documentation publishing logic

Split into two pipelines, one that runs on main and one that runs on
release branches and tags.

Use `has-matching-release-tag` workflow to determine whether to release
documentation on release branch and tags.

`has-matching-release-tag` is documented in https://github.com/grafana/grafana-github-actions/blob/main/has-matching-release-tag/action.yaml

Signed-off-by: Jack Baldry <jack.baldry@grafana.com>

* Remove script no longer used for documentation releases

Signed-off-by: Jack Baldry <jack.baldry@grafana.com>

* Add missing clone step for the website-sync action

Signed-off-by: Jack Baldry <jack.baldry@grafana.com>

* Update RELEASE instructions to reflect automated docs publishing

Signed-off-by: Jack Baldry <jack.baldry@grafana.com>

* Remove conditional from website clone for next publishing

Signed-off-by: Jack Baldry <jack.baldry@grafana.com>

* Fix capitalization of Jsonnet and Tanka (#1875)

Signed-off-by: Jack Baldry <jack.baldry@grafana.com>

* Checkout the repository as part of the documentation sync (#1876)

* Checkout the repository as part of the documentation sync

I assumed this was already done but the GitHub docs confirm that it is
required:
https://docs.github.com/en/github-ae@latest/actions/using-workflows/about-workflows#about-workflows

Signed-off-by: Jack Baldry <jack.baldry@grafana.com>

* Allow manual triggering of workflow

Signed-off-by: Jack Baldry <jack.baldry@grafana.com>

* Fix manual workflow dispatch (#1877)

TIL that if you edit the workflow in the GitHub UI, it will lint your workflow file and make sure that all the keys conform to the schema.

* Chore: cleanup unused alertmanager config in Mimir jsonnet (#1873)

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Update mimir-prometheus to ceaa77f1 (#1883)

* Update mimir-prometheus to ceaa77f1

This includes the fix
https://github.com/grafana/mimir-prometheus/pull/234
for https://github.com/grafana/mimir/issues/1866

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>

* Update CHANGELOG.md

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>

* Fix changelog

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>

* Bump version to 2.1.0-rc.1 to include cherry-picked (#1872)

* Increased default configuration for -server.grpc-max-recv-msg-size-bytes and -server.grpc-max-send-msg-size-bytes from 4MB to 100MB (#1884)

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Split mimir_queries rule group so that it doesn't have more than 20 rules (#1885)

* Split mimir_queries rule group so that it doesn't have more than 20 rules.
* Add check for number of rules in the group.
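
Both steps can be sketched in Go, although the mixin itself is not written in Go. `splitRules` and `checkGroups` are hypothetical names, rules are represented as plain strings, and the limit of 20 is taken from the commit title.

```go
package main

import "fmt"

// splitRules partitions rules into consecutive groups of at most max rules.
func splitRules(rules []string, max int) [][]string {
	if max <= 0 {
		return nil
	}
	var groups [][]string
	for len(rules) > 0 {
		n := max
		if len(rules) < n {
			n = len(rules)
		}
		groups = append(groups, rules[:n])
		rules = rules[n:]
	}
	return groups
}

// checkGroups returns an error if any group holds more than max rules.
func checkGroups(groups [][]string, max int) error {
	for i, g := range groups {
		if len(g) > max {
			return fmt.Errorf("group %d has %d rules, limit is %d", i, len(g), max)
		}
	}
	return nil
}

func main() {
	rules := make([]string, 45)
	groups := splitRules(rules, 20)
	fmt.Println(len(groups), checkGroups(groups, 20))
}
```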

Signed-off-by: Peter Štibraný <pstibrany@gmail.com>

* Add alert for store-gateways without blocks (#1882)

* Add alert for store-gateways without blocks

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Update CHANGELOG.md

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Clarify messages

Co-authored-by: Marco Pracucci <marco@pracucci.com>

* Replace "Store Gateway" with "store-gateway"

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Rename alert to StoreGatewayNoSyncedTenants

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Rebuild mixin

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Update CHANGELOG.md

Co-authored-by: Marco Pracucci <marco@pracucci.com>

Co-authored-by: Marco Pracucci <marco@pracucci.com>

* Fix flaky integration tests caused by 'metric not found' (#1891)

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Docs: Explain the runtime override of active series matchers (#1868)

* Updated docs/sources/operators-guide/configuring/configuring-custom-trackers.md; made some tweaks to the examples; changed name interesting-service and also-interesting-service to service1 and service2 respectively

Co-authored-by: Ursula Kallio <ursula.kallio@grafana.com>
Co-authored-by: Jennifer Villa <jen.villa@grafana.com>

* Update to latest Thanos for Memcached fixes (#1837)

Update our vendor of Thanos to pull in the most recent changes to the
Memcached client. In particular, these changes prevent the client from
starting many goroutines as part of batching before they are able to
make progress.

Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>

* Fixed deceiving error log "failed to update cached shipped blocks after shipper initialisation" (#1893)

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Fix TestRulerEvaluationDelay flakiness (#1892)

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Fix `MimirRulerMissedEvaluations` text and add playbook (#1895)

* Correct magnitude on MimirRulerMissedEvaluations

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Add playbook for MimirRulerMissedEvaluations

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Update CHANGELOG.md

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Remove trailing spaces

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Update CHANGELOG.md

Co-authored-by: Marco Pracucci <marco@pracucci.com>

Co-authored-by: Marco Pracucci <marco@pracucci.com>

* Conform to tech doc style. (#1904)

* Use a dedicated threadpool for store-gateway requests (#1812)

Remove the use of a dedicated threadpool for index-header operations
because the call overhead is prohibitively expensive. Instead, use a
dedicated threadpool for entire store-gateway requests so that the cost
of switching between threads is only paid a single time. This allows
for isolation in the case of page faults during mmap accesses without
too much overhead.
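
The idea of handing a whole request to a dedicated pool, so the hand-off cost is paid once per request rather than once per low-level operation, can be sketched as below. This is an assumption-laden illustration, not Mimir's implementation: `Pool`, `NewPool`, `Execute`, and `Close` are hypothetical, and pinning workers with `runtime.LockOSThread` is one possible way to get thread-level isolation.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// task is one whole request plus the channel its result is returned on.
type task struct {
	fn   func() int
	done chan int
}

// Pool runs submitted functions on a fixed set of worker goroutines,
// each pinned to its own OS thread.
type Pool struct {
	tasks chan task
	wg    sync.WaitGroup
}

// NewPool starts n workers.
func NewPool(n int) *Pool {
	p := &Pool{tasks: make(chan task)}
	p.wg.Add(n)
	for i := 0; i < n; i++ {
		go func() {
			defer p.wg.Done()
			runtime.LockOSThread()
			defer runtime.UnlockOSThread()
			for t := range p.tasks {
				t.done <- t.fn()
			}
		}()
	}
	return p
}

// Execute runs fn on a pool worker and blocks until it returns.
func (p *Pool) Execute(fn func() int) int {
	t := task{fn: fn, done: make(chan int, 1)}
	p.tasks <- t
	return <-t.done
}

// Close stops accepting work and waits for the workers to exit.
func (p *Pool) Close() {
	close(p.tasks)
	p.wg.Wait()
}

func main() {
	p := NewPool(2)
	defer p.Close()
	// The whole "request" runs inside one Execute call.
	fmt.Println(p.Execute(func() int { return 6 * 7 }))
}
```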

Fixes #1804

Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>

* Upgrade consideration for active_series_custom_trackers_config (#1897)

* Upgrade consideration for active_series_custom_trackers_config

* Update docs/sources/release-notes/v2.1.md

Co-authored-by: Jennifer Villa <jen.villa@grafana.com>

* Update docs/sources/release-notes/v2.1.md

Co-authored-by: Marco Pracucci <marco@pracucci.com>
Co-authored-by: Jennifer Villa <jen.villa@grafana.com>

* fix(mixin): do not trigger TooMuchMemory alerts if no container limits are supplied (#1905)

* fix(mixin): do not trigger `MimirAllocatingTooMuchMemory` or `EtcdAllocatingTooMuchMemory` alerts if no container limits are supplied

* Update CHANGELOG.md

Co-authored-by: Marco Pracucci <marco@pracucci.com>

* Fix MimirCompactorHasNotUploadedBlocks alert false positive when Mimir is deployed in monolithic mode (#1902)

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Set defaults to query ingesters, not store, for recent data (#1909)

Set queriers to _not_ query storage (store-gateways) for recent data
and set the store-gateways to ignore recent uncompacted blocks.

Default values are set to match what we use in the Mimir jsonnet.

Fixes #1639

Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>

* Revert distributor log level to warn in integration tests (#1910)

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Improved error returned by -querier.query-store-after validation (#1914)

* Improved error returned by -querier.query-store-after validation

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Update pkg/querier/querier.go

Co-authored-by: Ursula Kallio <ursula.kallio@grafana.com>

Co-authored-by: Ursula Kallio <ursula.kallio@grafana.com>

* Remove jsonnet configuration settings that match default values (#1915)

* Remove jsonnet configuration settings that match default values

Follow up to #1909

Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>

* Update CHANGELOG.md

Co-authored-by: Marco Pracucci <marco@pracucci.com>

* Docs: recommend fast disks for ingesters and store-gateways (#1903)

* Docs: recommend fast disks for ingesters and store-gateways

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Apply suggestions from code review

Co-authored-by: Ursula Kallio <ursula.kallio@grafana.com>

* Update docs/sources/operators-guide/running-production-environment/production-tips/index.md

Co-authored-by: Ursula Kallio <ursula.kallio@grafana.com>

* Update docs/sources/operators-guide/running-production-environment/production-tips/index.md

Co-authored-by: Ursula Kallio <ursula.kallio@grafana.com>

Co-authored-by: Ursula Kallio <ursula.kallio@grafana.com>

* Improve series, sample, metadata and exemplars validation errors (#1907)

* Improved error messages returned by ValidateSample(), ValidateExemplar(), ValidateMetadata() and ValidateLabels()

Signed-off-by: Marco Pracucci <marco@pracucci.com>
Co-authored-by: Ursula Kallio <ursula.kallio@grafana.com>

* Apply suggestions from code review

Co-authored-by: Ursula Kallio <ursula.kallio@grafana.com>

* Fixed unit tests after error messages edit

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Manually applied a suggestion to error message

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Renamed globalerrors pkg to singular form

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Cleanup globalerror package based on Oleg's feedback

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Removed formatting support from globalerror.ID's message generation function

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Changed another error message based on feedback

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Added CHANGELOG entry

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Update operations/mimir-mixin/docs/playbooks.md

Co-authored-by: Ursula Kallio <ursula.kallio@grafana.com>

* Rephrased label name/value length error message based on feedback received in the test file

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Final fixes to error messages

Signed-off-by: Marco Pracucci <marco@pracucci.com>

Co-authored-by: Ursula Kallio <ursula.kallio@grafana.com>

* mixin-tool: adapt screenshots dockerimage to support arm64 (#1916)

Signed-off-by: Miguel Ángel Ortuño <ortuman@gmail.com>

* Ingester ring endpoint fix (#1918)

* /ingester/ring is also available via distributor.

Signed-off-by: Peter Štibraný <pstibrany@gmail.com>

* Revert unintended change.

Signed-off-by: Peter Štibraný <pstibrany@gmail.com>

* Configuration files for GrafanaCon 2022 presentation. (#1881)

* Configuration files for GrafanaCon 2022 presentation.

Signed-off-by: Peter Štibraný <pstibrany@gmail.com>

* Update dskit to bring "Parallelize memberlist notified message processing" PR (#1912)

* Update dskit to bring "Parallelize memberlist notified message processing" PR.

Signed-off-by: Peter Štibraný <pstibrany@gmail.com>

* CHANGELOG.md

Signed-off-by: Peter Štibraný <pstibrany@gmail.com>

* Account for StatefulSets and Depl-s named by the helm chart (#1913)

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Change shuffle sharding ingester lookback default config (#1921)

* Change shuffle sharding ingester lookback default config

Use the same default value for ingester lookback as the "query ingesters
within" setting to reduce the number of things that need to be changed from
their defaults. This change also removes use of the
`-blocks-storage.tsdb.close-idle-tsdb-timeout` flag in jsonnet since the
value being used matches the default.

Follow up to #1915

Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>

* Changelog

Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>

* Improved ValidateMetadata() errors (#1919)

* Improved ValidateMetadata() errors

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Added PR number to CHANGELOG

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Update pkg/util/validation/errors.go

Co-authored-by: Oleg Zaytsev <mail@olegzaytsev.com>

* Converted all ValidationError to be non-pointers

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Removed unused variable

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Fixed unit test

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Fixed markdown linter

Signed-off-by: Marco Pracucci <marco@pracucci.com>

Co-authored-by: Oleg Zaytsev <mail@olegzaytsev.com>

* mixin/dashboards: ruler query path dashboards (#1911)

* mixin: added ruler query path dashboards

Signed-off-by: Miguel Ángel Ortuño <ortuman@gmail.com>

* addressed PR feedback

Signed-off-by: Miguel Ángel Ortuño <ortuman@gmail.com>

* docs: added ruler reads & ruler reads resources dashboard screenshots

Signed-off-by: Miguel Ángel Ortuño <ortuman@gmail.com>

* addressed PR feedback

Signed-off-by: Miguel Ángel Ortuño <ortuman@gmail.com>

* updated CHANGELOG.md

Signed-off-by: Miguel Ángel Ortuño <ortuman@gmail.com>

* Mark query_ingesters_within and query_store_after as advanced (#1929)

* Mark query_ingesters_within and query_store_after as advanced

Now that they have good defaults that match what we run in production,
they shouldn't need to be tuned by users in most cases.

Fixes #1924

Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>

* Update CHANGELOG.md

Co-authored-by: Marco Pracucci <marco@pracucci.com>

Co-authored-by: Marco Pracucci <marco@pracucci.com>

* Remove empty chunks panel from Queries dashboard (#1928)

* Remove empty chunks panel from Queries dashboard

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Update CHANGELOG.md

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Make MimirGossipMembersMismatch less sensitive, and make it fire fewer alerts. (#1926)

* Make MimirGossipMembersMismatch less sensitive, and make it fire fewer alerts.

Signed-off-by: Peter Štibraný <pstibrany@gmail.com>

* CHANGELOG.md

Signed-off-by: Peter Štibraný <pstibrany@gmail.com>

* Update config value for -querier.query-ingesters-within to work with … (#1930)

* Update config value for -querier.query-ingesters-within to work with new default value for -querier.query-store-after

* Remove config for -querier.query-ingesters-within as they are set to default

* Update Thanos vendor for memcache improvements (#1920)

Update our vendor of Thanos so that memcache keys are grouped by the
server they are owned by before being split into batches.

Fixes #423

Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>
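
A minimal sketch of the idea described in this commit message: keys are grouped by the server that owns them first, and only then split into batches, so no batch spans more than one server. The function name `batchKeysByServer`, the `serverFor` callback, and the batch size are assumptions made for illustration; this is not the Thanos Memcached client code.

```go
package main

import "fmt"

// batchKeysByServer groups keys by owning server, then splits each group
// into batches of at most batchSize keys. Hypothetical helper, not the
// Thanos implementation.
func batchKeysByServer(keys []string, serverFor func(key string) string, batchSize int) map[string][][]string {
	if batchSize < 1 {
		batchSize = 1 // guard against a non-positive batch size
	}

	// Step 1: group keys by the server that owns them.
	grouped := make(map[string][]string)
	for _, k := range keys {
		s := serverFor(k)
		grouped[s] = append(grouped[s], k)
	}

	// Step 2: split each per-server group into batches.
	batches := make(map[string][][]string, len(grouped))
	for server, ks := range grouped {
		for start := 0; start < len(ks); start += batchSize {
			end := start + batchSize
			if end > len(ks) {
				end = len(ks)
			}
			batches[server] = append(batches[server], ks[start:end])
		}
	}
	return batches
}

func main() {
	// Assumption for the demo: the first byte of the key picks the server.
	serverFor := func(key string) string { return "server-" + key[:1] }
	keys := []string{"a1", "b1", "a2", "a3", "b2"}
	fmt.Println(batchKeysByServer(keys, serverFor, 2))
}
```

Grouping before batching means each batch maps to exactly one server, which is the property the commit message describes.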

* Move usage generation to separate package (#1934)

* Move usage function into a separate package and export it

Signed-off-by: Patryk Prus <patryk.prus@grafana.com>

* Add function to add to flag category overrides at runtime

Signed-off-by: Patryk Prus <patryk.prus@grafana.com>

* Document CHANGELOG scopes

* Add documentation about changelog scopes
* update CHANGELOG for #1934

* Improve instance limits, ingester limits, query limiter, some querier errors (#1888)

* Add errors IDs to pkg/ingester/instance_limits.go

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Add errors IDs to pkg/ingester/limiter.go

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Add errors IDs to pkg/querier/blocks_store_queryable.go

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Differentiate max-ingester-ingestion-rate from distributor

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Update playbooks.md

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Correct misspelled flags

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Correct strings in tests as well

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Re-iterated on ingesters limit errors

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Re-iterated on ingesters per-tenant limit errors

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Apply suggestions from code review

Co-authored-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Re-iterated on query per-tenant limit errors

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Added PR number to CHANGELOG entry

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Apply suggestions from code review

Co-authored-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Mention the cardinality API endpoint in the err-mimir-max-series-per-metric runbook

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Update operations/mimir-mixin/docs/playbooks.md

Co-authored-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Fixed InstanceLimits receiver name to be consistent

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Clarify metadata is stored in memory

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Fixed linter and tests

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Fixed more tests

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Update pkg/querier/blocks_store_queryable.go

Co-authored-by: Oleg Zaytsev <mail@olegzaytsev.com>

* Fix English grammar about 'how to fix it'

Signed-off-by: Marco Pracucci <marco@pracucci.com>

Co-authored-by: Marco Pracucci <marco@pracucci.com>
Co-authored-by: Oleg Zaytsev <mail@olegzaytsev.com>

* make ingesters use heartbeat timeout instead of period to fix the bug… (#1933)

* make ingesters use heartbeat timeout instead of period to fix the bug where they sometimes appear as unhealthy

* Update CHANGELOG.md

Co-authored-by: Marco Pracucci <marco@pracucci.com>

* Update VERSION to 2.1.0

* Update dashboard screenshots (#1940)

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Fix version in changelog

* Update mimir tests to use new 2.1.0 image

* Add minimum Grafana version to mixin dashboards (#1943)

Signed-off-by: Patrick Oyarzun <patrick.oyarzun@grafana.com>

* Bump grafana/mimir image to 2.1.0 for backward compatibility testing (#1942)

* Chore: renamed source files for remote ruler dashboards (#1937)

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Move the mimir-distributed helm chart into the mimir repository (#1925)

* Initial copy of mimir-distributed helm chart

This commit is not expected to work in CI.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>

* Update github action for helm lint and test

Set the working directory for the helm GitHub Actions.
Set a more consistent name for the GitHub Actions.
Set the chart name for testing.
Ignore the generated helm doc in prettier.
Do not release the helm chart for now.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>

* Add bucket prefix configuration (#1686)

* Add bucket prefix configuration

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Add allowed chars validation for storage prefix

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Add unit tests for PrefixedBucketClient

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Add CHANGELOG entry

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Use grafana/regexp instead of regexp

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Improve validation of storage_prefix

Update docs and add validation for `..` and `.`

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Add some tests for AM and ruler bucket validation

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Add tests for bucket prefix with filesystem client

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Update helm text too

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Update everything

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Simplify validation for storage_prefix

Only accept alphanumeric characters for the storage_prefix to prevent
mistypings and misunderstandings when the prefix ends with a slash or
contains slashes and dots

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Update CHANGELOG.md

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Make stronger assertions in bucket validation test

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Make stronger assertions in bucket prefix test

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Assert on errors, not on strings

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Exclude YAML field names from error message

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Include full image tag on rollout dashboard (#1932)

* Make version matcher in rollout dashboard work for non-weekly images

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Add CHANGELOG.md entry

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Update CHANGELOG.md

Co-authored-by: Marco Pracucci <marco@pracucci.com>

* docs: move federated rule groups documentation to its own section (#1906)

* docs: move federated rule groups documentation to its own section

Signed-off-by: Miguel Ángel Ortuño <ortuman@gmail.com>

* addressed PR feedback

Signed-off-by: Miguel Ángel Ortuño <ortuman@gmail.com>

* Make networking panels pod matchers work with helm chart (#1927)

* Make networking panels pod matchers work with helm chart

The pods created by the helm chart follow a format of
<helm_release_name>-mimir-<ingester|distributor|...>.

This is a problem for all places that use the per_instance_label for
matching. The per_instance_label is mostly used in aggregations (sum by
(pod), count by (pod), ...). The networking panels are the only ones
that use it for matching.

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Replace .* with a stronger regex in pod matchers

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Add CHANGELOG.md entry

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Add max query length error to errors catalog (#1939)

* Add max query length error to errors catalogue

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Added PR number to CHANGELOG entry

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Apply suggestions from code review

Co-authored-by: Ursula Kallio <ursula.kallio@grafana.com>

Co-authored-by: Ursula Kallio <ursula.kallio@grafana.com>

* Remove image spec from demo file. (#1946)

* Remove image spec from demo file.

Signed-off-by: Peter Štibraný <pstibrany@gmail.com>

* Fix rejected identity accept encoding (#1864)

* Fix rejected identity accept-encoding

When a request comes in with header:
    Accept-Encoding: gzip;q=1, identity;q=0

we should gzip the response even if it's smaller than the defined
minimum size.

We achieve this by fixing the github.com/nytimes/gziphandler code, and
bringing the fixed code into this repository since:
- they don't seem to be maintaining it anymore
- we don't want to use a replace directive as it's very likely to be
  lost in codebases depending on this.
- it's a small amount of code (500 lines)

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
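
A rough sketch of the decision described above, assuming a hypothetical `rejectsIdentity` helper (the name is borrowed from a later commit in this series) and a hypothetical `shouldGzip` wrapper. It is not the code copied from github.com/nytimes/gziphandler: wildcard (`*`) entries are ignored, and gzip support is detected with a crude substring check.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// rejectsIdentity reports whether the Accept-Encoding value explicitly
// refuses the uncompressed "identity" encoding, e.g. "identity;q=0".
// Simplified sketch: wildcard ("*") entries are not considered.
func rejectsIdentity(acceptEncoding string) bool {
	for _, part := range strings.Split(acceptEncoding, ",") {
		fields := strings.Split(part, ";")
		if !strings.EqualFold(strings.TrimSpace(fields[0]), "identity") {
			continue
		}
		for _, param := range fields[1:] {
			name, value, ok := strings.Cut(strings.TrimSpace(param), "=")
			if !ok || !strings.EqualFold(strings.TrimSpace(name), "q") {
				continue
			}
			q, err := strconv.ParseFloat(strings.TrimSpace(value), 64)
			return err == nil && q == 0
		}
		return false // identity listed without a q-value is acceptable
	}
	return false
}

// shouldGzip compresses when the body reaches minSize, or when the client
// leaves no uncompressed alternative. The gzip check is a crude substring
// match used only to keep the sketch short.
func shouldGzip(acceptEncoding string, bodySize, minSize int) bool {
	if !strings.Contains(strings.ToLower(acceptEncoding), "gzip") {
		return false
	}
	return bodySize >= minSize || rejectsIdentity(acceptEncoding)
}

func main() {
	fmt.Println(shouldGzip("gzip;q=1, identity;q=0", 10, 1024))
	fmt.Println(shouldGzip("gzip", 10, 1024))
}
```

With `identity;q=0` the small response is still compressed, because the client has said it will not accept an uncompressed body.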

* Add API test for gzip

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>

* make lint pkg/util/gziphandler

Mostly handling errors, also removed the deprecated http.CloseNotifier
functionality and related code.

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>

* Update CHANGELOG.md

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>

* Fix comment

Co-authored-by: Marco Pracucci <marco@pracucci.com>

* Add faillint for github.com/nytimes/gziphandler

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>

* make lint

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>

* Fix faillint paths

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>

* If there's content-encoding, start plain write

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>

* If less than min-size, don't encode

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>

* Refactor `handleContentType` to handle by default

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>

* Rename acceptsIdentity to rejectsIdentity

Hopefully this will minimise the number of double negations, making the
code clearer.

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>

* Fix comment

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>

Co-authored-by: Marco Pracucci <marco@pracucci.com>

* Distributor: added per-tenant request limit (#1843)

* distributor: added request limiter logic

Signed-off-by: Miguel Ángel Ortuño <ortuman@gmail.com>

* updated CHANGELOG.md

* addressed PR feedback

Signed-off-by: Miguel Ángel Ortuño <ortuman@gmail.com>

* distributor: added type plans rate limits

Assuming a minimum sane value of 100 samples per request, we've set default request limits for each user tier.

* docs: added request limit distributor documentation

* rebuilt jsonnet test output

* make linter happy

* addressed PR feedback

Signed-off-by: Miguel Ángel Ortuño <ortuman@gmail.com>

* addressed PR feedback

Signed-off-by: Miguel Ángel Ortuño <ortuman@gmail.com>

* addressed PR feedback

Signed-off-by: Miguel Ángel Ortuño <ortuman@gmail.com>

* addressed PR feedback

Signed-off-by: Miguel Ángel Ortuño <ortuman@gmail.com>

* updated reference help

Signed-off-by: Miguel Ángel Ortuño <ortuman@gmail.com>

* addressed PR feedback

Signed-off-by: Miguel Ángel Ortuño <ortuman@gmail.com>

* Add bucket prefix to experimental features (#1951)

* Add bucket prefix to experimental features

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Update flag status of storage_prefix to experimental

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Copy thanos shipper (#1957)

* Copy shipper from Thanos.
* Remove support for uploading compacted blocks.
* Always allow out-of-order uploads. Removed unused overlap checker.
* Rename Shipper interface to BlocksUploader, and ThanosShipper to Shipper.
* Extract readShippedBlocks method from user_tsdb.go
* Added shipper unit tests (copied and adapted from original tests)
* Add faillint rule to avoid using Thanos shipper.

Signed-off-by: Peter Štibraný <pstibrany@gmail.com>

* Adjust the name of the tag expected by documentation publishing (#1974)

Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>

* Use github.com/colega/grafana-tools-sdk fork (#1973)

* Use github.com/colega/grafana-tools-sdk fork

See https://github.com/grafana/cortex-tools/pull/248 for more context (this is
the same change). The grafana-tools/sdk dependency will eventually be removed entirely
from analyse commands.

Signed-off-by: hjet <hjet@users.noreply.github.com>

* Update CHANGELOG.md

Signed-off-by: hjet <hjet@users.noreply.github.com>

* mod tidy

* Deprecate -ingester.ring.join-after (#1965)

* Deprecate -ingester.ring.join-after

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Addressed review feedback

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Dashboards: disable gateway panels by default (#1955)

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Docs: rename 'playbooks' to 'runbooks' and move them to doc (#1970)

* Docs: rename 'playbooks' to 'runbooks' and move them to doc

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Named runbooks folder as 'mimir-runbooks/' to make it easy to import in Grafana Labs internal infrastructure as code

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Fix anchors check because they're case insensitive

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Apply suggestions from code review

Co-authored-by: Ursula Kallio <ursula.kallio@grafana.com>

Co-authored-by: Ursula Kallio <ursula.kallio@grafana.com>

* Preparation of e2eutils for Thanos indexheader unit tests. (#1982)

We want to pull in the indexheader package from Thanos so that we can add some experimental alternative implementations of BinaryReader. In order to also pull in the unit tests for this package, we need the replacements for e2eutil.Copy and e2eutil.CreateBlock. This change does two things:

1. Copy in e2eutil/copy.go and fix it up accordingly.
2. Move CreateBlock into a package to avoid circular imports.

* Make propagation of forwarding errors configurable (#1978)

* make propagation of forwarding errors optional

Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>

* add test for disabled error propagation

Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>

* leave error propagation enabled by default

Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>

* update help

Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>

* update docs

* better wording

Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>

* Release the mimir-distributed-beta helm chart (#1948)

Use the common workflow from the helm-chart repo.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>

* Copy Thanos block/indexheader package (#1983)

* Copy thanos/pkg/block/indexheader.

* Update provenance.

* Fix linter error due to error variable name.

* Use require instead of e2eutil.

* Replace usage of e2eutil.Copy

* Replace usage of e2eutil.CreateBlock with local version.

* Replace use of Thanos indexheader with local copy.

* Add faillint check for upstream indexheader.

* Fix goleak ignore for NewReaderPool.

* Update vendor directory.

* Prepare mimir beta chart release (#1995)

* Rename chart back to mimir-distributed

Apparently the helm option --devel is needed to trigger using beta
versions. This should be enough protection for accidental use. Avoids
renaming issues.

* Version bump helm chart

Do version bump to a beta version but nothing else until we double check
that such a beta chart cannot be accidentally selected with helm tooling.

* Enable helm chart release from main branch

Release process tested ok on test branch.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>

* Bump version of helm chart (#1996)

Test if helm release triggers correctly.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>

* Update gopkg.in/yaml.v3 (#1989)

This updates to a version that contains the fix to CVE-2022-28948.

* Remove hardlinking in Shipper code. (#1969)

* Remove hardlinking in Shipper code.

Signed-off-by: Peter Štibraný <pstibrany@gmail.com>

* [helm] use grpc round robin for distributor clients (#1991)

* Use GRPC round-robin for gateway -> distributor requests

Fixes https://github.com/grafana/mimir/issues/1987
Update chart version and changelog
Use the headless distributor service for the nginx gateway

Signed-off-by: Patrick Oyarzun <patrick.oyarzun@grafana.com>

* Fix binary_reader.go header text. (#1999)

Mistakenly left two lines when updating the provenance for the file.

* Workaround to keep using old memcached bitnami chart for now (#1998)

* Workaround to keep using old memcached bitnami chart for now

See also: https://github.com/grafana/helm-charts/pull/1438
Also clean up unused chart repositories from ct.yaml.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Co-authored-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* [helm] add results cache (#1993)

* [helm] Add query-frontend results cache

Fixes https://github.com/grafana/helm-charts/issues/1403

* Add PR to CHANGELOG

Signed-off-by: Patrick Oyarzun <patrick.oyarzun@grafana.com>

* Fix README

Signed-off-by: Patrick Oyarzun <patrick.oyarzun@grafana.com>

* Disable distributor.extend-writes & ingester.ring.unregister-on-shutdown (#1994)

Signed-off-by: Patrick Oyarzun <patrick.oyarzun@grafana.com>

* Update CHANGELOG.md (#1992)

* [helm] Prepare image bump for 2.1 release (#2001)

* Prepare image bump for 2.1 release

Signed-off-by: Patrick Oyarzun <patrick.oyarzun@grafana.com>

* Fix README template to reference 2.1

Signed-off-by: Patrick Oyarzun <patrick.oyarzun@grafana.com>

* Add nice link text to CHANGELOG

Signed-off-by: Patrick Oyarzun <patrick.oyarzun@grafana.com>

* Update CHANGELOG.md

* Publish helm charts from release branches (#2002)

* Update Thanos with https://github.com/thanos-io/thanos/pull/5400. (#2006)

* Replace hardcoded intervals with $__rate_interval in dashboards (#2011)

* Replace hardcoded intervals with $__rate_interval in dashboards

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Add CHANGELOG.md entry

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Standardise error messages for distributor instance limits (#1984)

* standardise error messages for distributor instance limits

* Apply suggestions from code review

Co-authored-by: Marco Pracucci <marco@pracucci.com>

* Apply suggestions from code review

Co-authored-by: Ursula Kallio <ursula.kallio@grafana.com>

* apply code review suggestions to rest of doc for consistency

* manually apply suggestion from code review

Co-authored-by: Marco Pracucci <marco@pracucci.com>
Co-authored-by: Ursula Kallio <ursula.kallio@grafana.com>

* Remove tutorials/ symlink (#2007)

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Add querier autoscaler support to jsonnet (#2013)

* Add querier autoscaler support to jsonnet

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Fixed autoscaling.libsonnet import

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Add a check to Mimir jsonnet to ensure query-scheduler is enabled when enabling querier autoscaling (#2023)

* Add a check to Mimir jsonnet to ensure query-scheduler is enabled when enabling querier autoscaling

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Shouldn't be an exported object

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Don't include external labels in blocks uploaded by Ingester (#1972)

* Remove support for external labels.
* Fixed comments.
* Don't use TenantID label. Filter out the label during compaction.
* CHANGELOG.md
* Use public function from Thanos.
* Use new UploadBlock function, move GrpcContextMetadataTenantID constant.
* Rename tsdb2 import to mimir_tsdb.
* Fix tests.

Signed-off-by: Peter Štibraný <pstibrany@gmail.com>

* Enhance MimirRequestLatency runbook with more advice (#1967)

* Enhance MimirRequestLatency runbook with more advice

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
Co-authored-by: Marco Pracucci <marco@pracucci.com>

* Include helm-docs in build and CI (#2026)

* Update the mimir build image and its build doc

Dockerfile: Add helm-docs package to the image.
how-to: Describe the build requirements in more detail. Add
information about building on Linux.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>

* Expand make doc with helm-docs command

This enables generating the helm chart README with the same make doc
command as all other documentation.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>

* Update docs/internal/how-to-update-the-build-image.md

Co-authored-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Update contributing guides for the helm chart (#2008)

* Update contributing guides for the helm chart

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>

* Turn off helm version increment check in CI

This enables periodic releases, as opposed to requiring version bump
for release at every PR.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>

* Add extraEnvFrom to all services and enable injection into mimir config (#2017)

Add `extraEnvFrom` capability to all Mimir services to enable injecting
secrets via environment variables.

Enable `-config.expand-env=true` option in all Mimir services to be able
to take secrets/settings from the environment and inject them into the
Mimir configuration file.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>

* Docs: fix mimir-mixin installation instructions (#2015)

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Docs: make documentation a first class citizen in CHANGELOG (#2025)

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Helm: add global.extraEnv and global.extraEnvFrom (#2031)

* Helm: add global.extraEnv and global.extraEnvFrom

Enables setting environment and env injection in one place for
mimir + nginx.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>

* Upgrade alpine to 3.16.0 (#2028)

* Upgrade alpine to 3.16.0

* Enhance MimirRequestLatency runbook with more advice (#1967)

* Enhance MimirRequestLatency runbook with more advice

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
Co-authored-by: Marco Pracucci <marco@pracucci.com>

* Include helm-docs in build and CI (#2026)

* Update the mimir build image and its build doc

Dockerfile: Add helm-docs package to the image.
how-to: Write down the requirements for build in more detail. Add
information about build on linux.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>

* Expand make doc with helm-docs command

This enables generating the helm chart README with the same make doc
command as all other documentation.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>

* Update docs/internal/how-to-update-the-build-image.md

Co-authored-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Update contributing guides for the helm chart (#2008)

* Update contributing guides for the helm chart

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>

* Turn off helm version increment check in CI

This enables periodic releases, as opposed to requiring version bump
for release at every PR.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>

* Add extraEnvFrom to all services and enable injection into mimir config (#2017)

Add `extraEnvFrom` capability to all Mimir services to enable injecting
secrets via environment variables.

Enable `-config.expand-env=true` option in all Mimir services to be able
to take secrets/settings from the environment and inject them into the
Mimir configuration file.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
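
The idea of injecting environment values into a configuration file can be sketched with Go's standard library. This is only an illustration, not Mimir's implementation: `expandConfig` and the `S3_SECRET_KEY` variable are hypothetical names, and Mimir's own expansion may treat defaults or missing variables differently.

```go
package main

import (
	"fmt"
	"os"
)

// expandConfig replaces ${VAR} references in raw with the values returned
// by lookup. Hypothetical helper for illustration only.
func expandConfig(raw string, lookup func(string) string) string {
	return os.Expand(raw, lookup)
}

func main() {
	// S3_SECRET_KEY is an assumed variable name; os.Getenv returns ""
	// when the variable is not set.
	raw := "secret_access_key: ${S3_SECRET_KEY}\n"
	fmt.Print(expandConfig(raw, os.Getenv))
}
```

Passing the lookup function as a parameter keeps the sketch easy to test without touching the real process environment.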

* Docs: fix mimir-mixin installation instructions (#2015)

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Docs: make documentation a first class citizen in CHANGELOG (#2025)

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* upgrade to alpine 3.16.0

* upgrade alpine to 3.16.0

Co-authored-by: Arve Knudsen <arve.knudsen@gmail.com>
Co-authored-by: Marco Pracucci <marco@pracucci.com>
Co-authored-by: George Krajcsovits <krajorama@users.noreply.github.com>
Co-authored-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Helm: release our first weekly (#2033)

This should be automated, bu…
jesusvazquez added a commit to grafana/mimir that referenced this pull request Jun 20, 2022
* Extend Makefile and Dockerfiles to support multiarch builds for all Go binaries. (#1759)

* Extend Dockerfiles to support multiarch builds for all Go binaries.

By calling any of

make push-multiarch-./cmd/metaconvert/.uptodate
make push-multiarch-./cmd/mimir/.uptodate
make push-multiarch-./cmd/query-tee/.uptodate
make push-multiarch-./cmd/mimir-continuous-test/.uptodate
make push-multiarch-./cmd/mimirtool/.uptodate
make push-multiarch-./operations/mimir-rules-action/.uptodate

Signed-off-by: Peter Štibraný <pstibrany@gmail.com>

* Update to latest dskit and memberlist fork (#1758)

* Update to latest dskit and memberlist fork

Fixes #1743

Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>

* Update changelog

Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>

* update cli parameter description (#1760)

Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>

* mimirtool config: Add more retained old defaults (#1762)

* mimirtool config: Add more retained old defaults

The following parameters have their old defaults retained even when
`--update-defaults` is used with `mimirtool config convert`:

* `activity_tracker.filepath`
* `alertmanager.data_dir`
* `blocks_storage.filesystem.dir`
* `compactor.data_dir`
* `ruler.rule_path`
* `ruler_storage.filesystem.dir`
* `graphite.querier.schemas.backend` (only in GEM)

These are file paths for which the new defaults don't make more sense
than the old ones. In fact, updating these can lead to a subpar migration
experience because components start using directories that don't exist.

Because activity_tracker.filepath changed its name since Cortex, the
tests needed to differentiate between old common options and new
ones. This was already supported for GEM and was added
for Cortex/Mimir too.

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Update CHANGELOG.md

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* dashboards: add flag to skip gateway (#1761)

* dashboards: add flag to skip gateway

The gateway component seems to be an enterprise component, so groups
that aren't running enterprise shouldn't need the empty panels and rows
in their dashboards. This patch adds a flag to drop gateway-related
widgets from the mixin dashboards.

Signed-off-by: Josh Carp <jm.carp@gmail.com>

* Update CHANGELOG.md

Co-authored-by: Marco Pracucci <marco@pracucci.com>

* Gracefully shutdown querier when using query-scheduler (#1756)

* Gracefully shutdown querier when using query-scheduler

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Fixed comment

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Added TestQueuesOnTerminatingQuerier

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Commented executionContext

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Added CHANGELOG entry

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Update pkg/querier/worker/util.go

Co-authored-by: Peter Štibraný <pstibrany@gmail.com>

* Fixed typo in suggestion

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Removed superfluous time sensitive assertion

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Commented newExecutionContext()

Signed-off-by: Marco Pracucci <marco@pracucci.com>

Co-authored-by: Peter Štibraný <pstibrany@gmail.com>

* Graceful shutdown querier without query-scheduler (#1767)

* Graceful shutdown querier when not using query-scheduler

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Updated CHANGELOG entry

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Improved comment

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Refactoring

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Increase continuous test query timeout (#1777)

* Increase mimir-continuous-test query timeout from 30s to 60s

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Added PR number to CHANGELOG entry

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Increased default -tests.run-interval from 1m to 5m (#1778)

* Increased default -tests.run-interval from 1m to 5m

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Added PR number to CHANGELOG entry

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Fix flaky tests on querier graceful shutdown (#1779)

* Fix flaky tests on querier graceful shutdown

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Remove spurious newline

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Update build image and GitHub workflow (#1781)

* Update build-image to use golang:1.17.8-bullseye, and add skopeo to build image.

Skopeo will be used in subsequent PR to push multiarch images.

Signed-off-by: Peter Štibraný <pstibrany@gmail.com>

* Update build image. Use ubuntu-latest for workflow steps.

Signed-off-by: Peter Štibraný <pstibrany@gmail.com>

* api: remove duplicated remote read querier handler (#1776)

* Publish multiarch images (#1772)

* Publish multiarch images.

Signed-off-by: Peter Štibraný <pstibrany@gmail.com>

* Tag with extra tag, if pushing tagged commit or release.

Signed-off-by: Peter Štibraný <pstibrany@gmail.com>

* Split building of docker images and archiving them into tar.

Signed-off-by: Peter Štibraný <pstibrany@gmail.com>

* When tagging with test, use --all.

Signed-off-by: Peter Štibraný <pstibrany@gmail.com>

* Only run deploy step on tags or weekly release branches.

Signed-off-by: Peter Štibraný <pstibrany@gmail.com>

* Don't tag with test anymore.

Signed-off-by: Peter Štibraný <pstibrany@gmail.com>

* Address review feedback.

Signed-off-by: Peter Štibraný <pstibrany@gmail.com>

* Fix license check.

Signed-off-by: Peter Štibraný <pstibrany@gmail.com>

* K6: Take into account HTTP status code 202 (#1787)

When using `K6_HA_REPLICAS > 1`, Mimir accepts all HTTP calls, but some
of those calls receive the status code `202`. This commit treats that
status code as expected; otherwise, users receive the following error:
```
reads_inat write (file:///.../mimir-k6/load-testing-with-k6.js:254:8(137))
reads_inat native  executor=ramping-arrival-rate scenario=writing_metrics source=stacktrace
ERRO[0015] GoError: ERR: write failed. Status: 202. Body: replicas did not mach, rejecting sample: replica=replica_1, elected=replica_0
```

At the end of the benchmark, the summary displays errors:
```
     ✗ write worked
      ↳  20% — ✓ 23 / ✗ 92
```

Example of load testing:
```shell
./k6 run load-testing-with-k6.js \
    -e K6_SCHEME="https" \
    -e K6_WRITE_HOSTNAME="${mimir}" \
    -e K6_READ_HOSTNAME="${mimir}" \
    -e K6_USERNAME="${user}" \
    -e K6_WRITE_TOKEN="${password}" \
    -e K6_READ_TOKEN="${password}" \
    -e K6_HA_CLUSTERS="1" \
    -e K6_HA_REPLICAS="3" \
    -e K6_DURATION_MIN="5"
```

Signed-off-by: Wilfried Roset <wilfriedroset@users.noreply.github.com>
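
The check described above can be sketched in Go (the actual load test is a k6 JavaScript script, so this is only an illustration). The function name `writeSucceeded` is hypothetical.

```go
package main

import (
	"fmt"
	"net/http"
)

// writeSucceeded reports whether a remote-write response should count as a
// success in a load test. 202 is included because, per the commit message,
// some writes are answered with that status when K6_HA_REPLICAS > 1.
func writeSucceeded(status int) bool {
	return status == http.StatusOK || status == http.StatusAccepted
}

func main() {
	for _, status := range []int{200, 202, 500} {
		fmt.Println(status, writeSucceeded(status))
	}
}
```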

* replace model.Metric with labels.Labels in distributor.MetricsForLabelMatchers() (#1788)

* Streaming remote read (#1735)

* implement read v2

* updated CHANGELOG.md

* extend maxBytesInFrame comment.

* addressed PR feedback

* addressed PR feedback

* addressed PR feedback

* use indexed xor chunk function to assert stream remote read tests

* updated CHANGELOG.md

Co-authored-by: Miguel Ángel Ortuño <miguel.ortuno@grafana.com>

* Upgrade dskit (#1791)

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Fix mimir-continuous-test when changing configured num-series (#1775)

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Do not export per user and integration Alertmanager metrics when value is 0 (#1783)

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Print version+arch of Mimir loaded to Docker. (#1793)

* Print version+arch of Mimir loaded to Docker.

Signed-off-by: Peter Štibraný <pstibrany@gmail.com>

* Use debug log for distributor.

Signed-off-by: Peter Štibraný <pstibrany@gmail.com>

* Remove unused metrics cortex_distributor_ingester_queries_total and cortex_distributor_ingester_query_failures_total (#1797)

* Remove unused metrics cortex_distributor_ingester_queries_total and cortex_distributor_ingester_query_failures_total

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Remove unused fields

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Added options support to SendSumOfCountersPerUser() (#1794)

* Added options support to SendSumOfCountersPerUser()

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Renamed SkipZeroValueMetrics() to WithSkipZeroValueMetrics()

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Changed all Grafana dashboards UIDs to not conflict with Cortex ones, to let people install both while migrating from Cortex to Mimir (#1801)

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Adopt mixin convention to set dashboard UIDs based on md5(filename) (#1808)

Signed-off-by: Marco Pracucci <marco@pracucci.com>
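
As a rough illustration of the md5(filename) convention, the following Go sketch derives a stable identifier from a dashboard file name. The mixin itself is written in Jsonnet; `dashboardUID` and the example file name are assumptions made for this sketch.

```go
package main

import (
	"crypto/md5"
	"encoding/hex"
	"fmt"
)

// dashboardUID returns the hex-encoded MD5 digest of the file name, so the
// same file name always maps to the same 32-character UID.
func dashboardUID(filename string) string {
	sum := md5.Sum([]byte(filename))
	return hex.EncodeToString(sum[:])
}

func main() {
	// "mimir-writes.json" is an example name used for illustration.
	fmt.Println(dashboardUID("mimir-writes.json"))
}
```

Deriving the UID from the file name keeps it deterministic across builds while making collisions with UIDs chosen by another project unlikely.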

* Add support for store_gateway_zone args (#1807)

Allow customizing mimir cli flags per zone for the store gateway.
Copied the same solution as we have for ingesters.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>

* Add protection to store-gateway to not drop all blocks if unhealthy in the ring (#1806)

* Add protection to store-gateway to not drop all blocks if unhealthy in the ring

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Added CHANGELOG entry

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Update CHANGELOG.md

Co-authored-by: Peter Štibraný <pstibrany@gmail.com>

* Removed cortex_distributor_ingester_appends_total and cortex_distributor_ingester_append_failures_total unused metrics (#1799)

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Remove unused clientConfig from ingester (#1814)

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Add tracing to `mimir-continuous-test` (#1795)

* Extract and test TracerTransport functionality

We need to use a TracerTransport in mimir-continuous-test. We have that
in the frontend package, but I don't want to import frontend from the
mimir-continuous-test, so we extract it to util/instrumentation.

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>

* Set up global tracer in mimir-continuous-test

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>

* Add tracing to the client and spans to the tests

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>

* Add jaeger-mixin to mimir-continuous test container

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>

* make license

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>

* Add traces to the write path

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>

* Update CHANGELOG.md

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>

* Chore: remove unused code from BucketStore (#1816)

* Removed unused Info() and advLabelSets from BucketStore

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Removed unused FilterConfig from BucketStore

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Removed unused relabelConfig from store-gateway tests

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Removed unused function expectedTouchedBlockOps()

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Removed unused recorder from BucketStore tests

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* go mod vendor

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Refactoring: force removal of all blocks when BucketStore is closed (#1817)

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Simplify FilterUsers() logic in store-gateway (#1819)

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Migrate admin CSS to bootstrap 5 (#1821)

* Migrate admin CSS to bootstrap 5

When I added bootstrap, for some reason I imported bootstrap 3 which was
originally launched in 2013.

Before adding more CSS styles, let's migrate to modern Bootstrap 5
launched in 2021.

This doesn't require an explicit jquery dependency anymore.

Also re-styled admin header to adapt properly to mobile devices screens.

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>

* Update CHANGELOG.md

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>

* ruler: make use of dskit `grpcclient.Config` on remote evaluation client (#1818)

* ruler: use dskit grpc client for remote evaluation

* addressed PR feedback

* Memberlist status page CSS (#1824)

* Update CHANGELOG.md

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>

* Update dskit to 4d7238067788a04f3dd921400dcf7a7657116907

This includes changes from https://github.com/grafana/dskit/pull/163

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>

* Custom memberlist status template

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>

* Include `import` in jsonnet snippets (#1826)

* Do not drop blocks in the store-gateway if missing in the ring (#1823)

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Upgraded dskit to fix temporary partial query results when shuffle sharding is enabled and hash ring backend storage is flushed / reset (#1829)

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Docs: ruler remote evaluation  (#1714)

* include documentation for remote rule evaluation

* Update docs/sources/operators-guide/configuring/configuring-to-evaluate-rules-using-query-frontend.md

Co-authored-by: Ursula Kallio <ursula.kallio@grafana.com>

* Update docs/sources/operators-guide/configuring/configuring-to-evaluate-rules-using-query-frontend.md

Co-authored-by: Ursula Kallio <ursula.kallio@grafana.com>

* Update docs/sources/operators-guide/configuring/configuring-to-evaluate-rules-using-query-frontend.md

Co-authored-by: Ursula Kallio <ursula.kallio@grafana.com>

* Update docs/sources/operators-guide/configuring/configuring-to-evaluate-rules-using-query-frontend.md

Co-authored-by: Ursula Kallio <ursula.kallio@grafana.com>

* Update docs/sources/operators-guide/configuring/configuring-to-evaluate-rules-using-query-frontend.md

Co-authored-by: Ursula Kallio <ursula.kallio@grafana.com>

* address PR feedback

* Update docs/sources/operators-guide/architecture/components/ruler/index.md

Co-authored-by: Marco Pracucci <marco@pracucci.com>

* Update docs/sources/operators-guide/architecture/components/ruler/index.md

Co-authored-by: Marco Pracucci <marco@pracucci.com>

* Update docs/sources/operators-guide/architecture/components/ruler/index.md

Co-authored-by: Marco Pracucci <marco@pracucci.com>

* Update docs/sources/operators-guide/architecture/components/ruler/index.md

Co-authored-by: Marco Pracucci <marco@pracucci.com>

* Update docs/sources/operators-guide/architecture/components/ruler/index.md

Co-authored-by: Marco Pracucci <marco@pracucci.com>

* addressed PR feedback

* addressed PR feedback

* Update docs/sources/operators-guide/architecture/components/ruler/index.md

Co-authored-by: Marco Pracucci <marco@pracucci.com>

* Update docs/sources/operators-guide/running-production-environment/planning-capacity.md

Co-authored-by: Marco Pracucci <marco@pracucci.com>

* Update docs/sources/operators-guide/running-production-environment/planning-capacity.md

Co-authored-by: Marco Pracucci <marco@pracucci.com>

* addressed PR feedback

Co-authored-by: Ursula Kallio <ursula.kallio@grafana.com>
Co-authored-by: Marco Pracucci <marco@pracucci.com>

* Alertmanager: Do not validate alertmanager configuration if it's not running. (#1835)

Allows other targets to start up even if an invalid alertmanager configuration
is passed in.

Fixes #1784

* Alertmanager: Allow usage with `local` storage type, with appropriate warnings. (#1836)

An oversight when we removed non-sharding modes of operation is that the `local`
storage type stopped working. Unfortunately it is not conceptually simple to
support this type fully, as alertmanager requires remote storage shared between
all replicas, to support recovering tenant state to an arbitrary replica
following an all-replica outage.

To support provisioning of alerts with `local` storage, but persisting of state
to remote storage, we would need to allow different storage configurations.

This change fixes the issue in a more naive way, so that the alertmanager can at
least be started up for testing or development purposes, but persisting state
will always fail. A second PR will propose allowing the `Persister` to be
disabled.

Although this configuration is not recommended for production use, as long as
the number of replicas is equal to the replication factor, tenants will
never move between replicas, and so the local snapshot behaviour of the upstream
alertmanager will be sufficient.

Fixes #1638

* Mixin: Additions to Top tenants dashboard regarding sample rate and discard rate. (#1842)

Adds the following rows to the "Top tenants" dashboard:

- By samples rate growth
- By discarded samples rate
- By discarded samples rate growth

These queries are useful for determining what tenants are potentially putting excess
load on distributors and ingesters (and if it increased recently).

* Use concurrent open/close operations in compactor unit tests (#1844)

Open and close files concurrently in compactor unit tests to expose bugs
that implicitly rely on ordering.

Exposes bugs such as https://github.com/prometheus/prometheus/pull/10108

Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>

* Mixin: Show ingestion rate limit and rule group limit on Tenants dashboard. (#1845)

Whilst diagnosing a recent issue, we thought it would be useful to show the
current ingestion rate limit for the tenant. As the limit is applied to
`cortex_distributor_received_samples_total`, the limit is shown on the panel
which displays this metric. ("Distributor samples received (accepted) rate").

Also added `ruler_max_rule_groups_per_tenant` while in the area.

We don't currently display the number of exemplars in storage on the dashboard
anywhere, so cannot add `max_global_exemplars_per_user` right now.

* Jsonnet: Preparatory refactoring to simplify deploying parallel query paths. (#1846)

This change extracts some of the jsonnet used to build query deployments
(querier, query-scheduler, query-frontend) such that it is easier to deploy
secondary query paths. The use case for this is primarily to develop a
query path deployment for ruler remote-evaluation, but there may be other
use cases too.

* Removed double space in Log (#1849)

* Reference 'monolithic mode' instead of 'single binary' in logs (#1847)

Signed-off-by: Marco Pracucci <marco@pracucci.com>
Co-authored-by: Ursula Kallio <ursula.kallio@grafana.com>

* Extend safeTemplateFilepath to cover more cases. (#1833)

* Extend safeTemplateFilepath to cover more cases.

- template name ../tmpfile, stored into /tmp dir
- empty template name
- template name being just "."

Signed-off-by: Peter Štibraný <pstibrany@gmail.com>
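
The three cases listed above can be illustrated with a minimal Go sketch of a path check. This is not the code from the PR: `safeJoin` is a hypothetical helper, and it is deliberately strict in that it also rejects names containing subdirectories.

```go
package main

import (
	"errors"
	"fmt"
	"path/filepath"
)

var errInvalidName = errors.New("invalid template name")

// safeJoin returns dir/name only if the cleaned result is a file located
// directly inside dir. Hypothetical helper for illustration.
func safeJoin(dir, name string) (string, error) {
	if name == "" {
		return "", errInvalidName
	}
	dir = filepath.Clean(dir)
	full := filepath.Join(dir, name)
	// Reject names that resolve to dir itself (such as ".") or that end up
	// anywhere other than directly inside dir (such as "../tmpfile").
	if full == dir || filepath.Dir(full) != dir {
		return "", errInvalidName
	}
	return full, nil
}

func main() {
	for _, name := range []string{"alert.tmpl", "../tmpfile", "", "."} {
		path, err := safeJoin("/tmp", name)
		fmt.Printf("%q -> %q, %v\n", name, path, err)
	}
}
```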

* Relax mimir-continuous-test pressure when deployed with Jsonnet (#1853)

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Add 2.1.0-rc.0 header (#1857)

* Prepare release 2.1 (#1859)

* Update VERSION to 2.1-rc.0

* Add relevant changelog entries for user facing PRs since mimir-2.0.0

* Add patch in semver VERSION

* Adding updated ruler diagrams. (#1861)

* Create v2-1.md (#1848)

* Create v2-1.md

* Update and rename v2-1.md to v2.1.md

updated the header and renamed the file.

* Update v2.1.md

Missing the upgrade configurations.

* Update v2.1.md

added bug description

* Update v2.1.md

bug fix writeup.

* Update v2.1.md

Added the series count description

* Apply suggestions from code review

Co-authored-by: Peter Štibraný <pstibrany@gmail.com>
Co-authored-by: Marco Pracucci <marco@pracucci.com>

* Update v2.1.md

* Update v2.1.md

updated tsdb isolation wording.

* Ran make doc.

* Fixed a broken relref.

* Update docs/sources/release-notes/v2.1.md

Co-authored-by: Peter Štibraný <pstibrany@gmail.com>
Co-authored-by: Marco Pracucci <marco@pracucci.com>

* Allow custom data source regex in mixin dashboards (#1802)

* dashboards: update grafana-builder

The following commit updates the grafana-builder version and brings in:
* enable tooltip by default (#665)
* Add 'Data Source' label for the default datasource template variable. (#672)
* add dashboard link func (#683)
* make allValue configurable (#703)
* Allow datasource's regex to be configured

Signed-off-by: Wilfried Roset <wilfriedroset@users.noreply.github.com>

* Allow custom data source regex in mixin dashboards

The current dashboards offer the possibility to select a data source
among all Prometheus data sources in the organization. Depending on the
number of data sources, the list can be rather long (>10). Not all data
sources host Mimir metrics, so listing all of them is not helpful for
users.

Signed-off-by: Wilfried Roset <wilfriedroset@users.noreply.github.com>

* Revert back change that was enabling shared tooltips

Signed-off-by: Marco Pracucci <marco@pracucci.com>

Co-authored-by: Marco Pracucci <marco@pracucci.com>

* Dashboards: Fix `container_memory_usage_bytes:sum` recording rule (#1865)

* Dashboards: Fix `container_memory_usage_bytes:sum` recording rule

This change causes recording rules that reference
`container_memory_usage_bytes` to omit series that do not contain the
required labels for rules to run successfully, by requiring a non-empty
`image` label.

Signed-off-by: Peter Fern <github@0xc0dedbad.com>

* Update CHANGELOG

Signed-off-by: Peter Fern <github@0xc0dedbad.com>

* Add compiled rules

Signed-off-by: Peter Fern <github@0xc0dedbad.com>

Co-authored-by: Marco Pracucci <marco@pracucci.com>

* Deprecate -distributor.extend-writes and set it always to false (#1856)

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Remove DCO from contributors guidelines (#1867)

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Bump version to 2.1.0-rc.1 to include cherry-picked

* List Johanna as 2.1.0 release shepherd (#1871)

* fix(mixin): add missing alertmanager hashring members (#1870)

* fix(mixin): add missing alertmanager hashring members

* docs(CHANGELOG): add changelog entry

* Docs: clarify 'Set rule group' API specification (#1869)

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Simplify documentation publishing logic (#1820)

* Simplify documentation publishing logic

Split into two pipelines, one that runs on main and one that runs on
release branches and tags.

Use `has-matching-release-tag` workflow to determine whether to release
documentation on release branch and tags.

`has-matching-release-tag` is documented in https://github.com/grafana/grafana-github-actions/blob/main/has-matching-release-tag/action.yaml

Signed-off-by: Jack Baldry <jack.baldry@grafana.com>

* Remove script no longer used for documentation releases

Signed-off-by: Jack Baldry <jack.baldry@grafana.com>

* Add missing clone step for the website-sync action

Signed-off-by: Jack Baldry <jack.baldry@grafana.com>

* Update RELEASE instructions to reflect automated docs publishing

Signed-off-by: Jack Baldry <jack.baldry@grafana.com>

* Remove conditional from website clone for next publishing

Signed-off-by: Jack Baldry <jack.baldry@grafana.com>

* Fix capitalization of Jsonnet and Tanka (#1875)

Signed-off-by: Jack Baldry <jack.baldry@grafana.com>

* Checkout the repository as part of the documentation sync (#1876)

* Checkout the repository as part of the documentation sync

I assumed this was already done but the GitHub docs confirm that it is
required.
https://docs.github.com/en/github-ae@latest/actions/using-workflows/about-workflows#about-workflows

Signed-off-by: Jack Baldry <jack.baldry@grafana.com>

* Allow manual triggering of workflow

Signed-off-by: Jack Baldry <jack.baldry@grafana.com>

* Fix manual workflow dispatch (#1877)

TIL that if you edit the workflow in the GitHub UI, it will lint your workflow file and make sure that all the keys conform to the schema.

* Chore: cleanup unused alertmanager config in Mimir jsonnet (#1873)

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Update mimir-prometheus to ceaa77f1 (#1883)

* Update mimir-prometheus to ceaa77f1

This includes the fix
https://github.com/grafana/mimir-prometheus/pull/234
for https://github.com/grafana/mimir/issues/1866

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>

* Update CHANGELOG.md

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>

* Fix changelog

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>

* Bump version to 2.1.0-rc.1 to include cherry-picked (#1872)

* Increased default configuration for -server.grpc-max-recv-msg-size-bytes and -server.grpc-max-send-msg-size-bytes from 4MB to 100MB (#1884)

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Split mimir_queries rule group so that it doesn't have more than 20 rules (#1885)

* Split mimir_queries rule group so that it doesn't have more than 20 rules.
* Add check for number of rules in the group.

Signed-off-by: Peter Štibraný <pstibrany@gmail.com>
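
A size limit like this can be enforced mechanically. The Go sketch below shows one way to split a list of rules into groups of at most 20 and to verify the limit afterwards. The function names are hypothetical, and the PR itself works on the Jsonnet mixin rather than Go code.

```go
package main

import "fmt"

// maxRulesPerGroup is the limit mentioned in the commit message.
const maxRulesPerGroup = 20

// splitRules cuts rules into consecutive groups of at most max entries.
func splitRules(rules []string, max int) [][]string {
	if max <= 0 {
		return nil
	}
	var groups [][]string
	for len(rules) > 0 {
		n := max
		if len(rules) < n {
			n = len(rules)
		}
		groups = append(groups, rules[:n])
		rules = rules[n:]
	}
	return groups
}

// checkGroupSizes returns an error for the first group exceeding max.
func checkGroupSizes(groups [][]string, max int) error {
	for i, g := range groups {
		if len(g) > max {
			return fmt.Errorf("group %d has %d rules, limit is %d", i, len(g), max)
		}
	}
	return nil
}

func main() {
	rules := make([]string, 45)
	groups := splitRules(rules, maxRulesPerGroup)
	for i, g := range groups {
		fmt.Println("group", i, "size", len(g))
	}
	fmt.Println("check:", checkGroupSizes(groups, maxRulesPerGroup))
}
```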

* Add alert for store-gateways without blocks (#1882)

* Add alert for store-gateways without blocks

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Update CHANGELOG.md

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Clarify messages

Co-authored-by: Marco Pracucci <marco@pracucci.com>

* Replace "Store Gateway" with "store-gateway"

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Rename alert to StoreGatewayNoSyncedTenants

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Rebuild mixin

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Update CHANGELOG.md

Co-authored-by: Marco Pracucci <marco@pracucci.com>

* Fix flaky integration tests caused by 'metric not found' (#1891)

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Docs: Explain the runtime override of active series matchers (#1868)

* Updated docs/sources/operators-guide/configuring/configuring-custom-trackers.md; made some tweaks to the examples; changed name interesting-service and also-interesting-service to service1 and service2 respectively

Co-authored-by: Ursula Kallio <ursula.kallio@grafana.com>
Co-authored-by: Jennifer Villa <jen.villa@grafana.com>

* Update to latest Thanos for Memcached fixes (#1837)

Update our vendor of Thanos to pull in the most recent changes to the
Memcached client. In particular, these changes prevent the client from
starting many goroutines as part of batching before they are able to
make progress.

Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>

* Fixed deceiving error log "failed to update cached shipped blocks after shipper initialisation" (#1893)

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Fix TestRulerEvaluationDelay flakiness (#1892)

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Fix `MimirRulerMissedEvaluations` text and add playbook (#1895)

* Correct magnitude on MimirRulerMissedEvaluations

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Add playbook for MimirRulerMissedEvaluations

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Update CHANGELOG.md

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Remove trailing spaces

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Update CHANGELOG.md

Co-authored-by: Marco Pracucci <marco@pracucci.com>

* Conform to tech doc style. (#1904)

* Use a dedicated threadpool for store-gateway requests (#1812)

Remove the use of a dedicated threadpool for index-header operations
because the call overhead is prohibitively expensive. Instead, use a
dedicated threadpool for entire store-gateway requests so that the cost
of switching between threads is only paid a single time. This allows
for isolation in the case of page faults during mmap accesses without
too much overhead.

Fixes #1804

Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>
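
The trade-off described above, paying the hand-off cost once per request instead of once per low-level operation, can be sketched as follows. This is a simplified illustration under assumptions, not the store-gateway code: `requestPool` and its methods are hypothetical names, and pinning workers with `runtime.LockOSThread` is an assumed way to model a "dedicated thread" in Go.

```go
package main

import (
	"fmt"
	"runtime"
)

// result carries the outcome of one request back to the caller.
type result struct {
	value interface{}
	err   error
}

// requestPool is a hypothetical pool of goroutines pinned to OS threads.
type requestPool struct {
	tasks chan func()
}

// newRequestPool starts n workers, each locked to its own OS thread.
func newRequestPool(n int) *requestPool {
	p := &requestPool{tasks: make(chan func())}
	for i := 0; i < n; i++ {
		go func() {
			runtime.LockOSThread()
			for task := range p.tasks {
				task()
			}
		}()
	}
	return p
}

// Execute runs the whole request on a pool worker, so the switch between
// goroutines happens once per request.
func (p *requestPool) Execute(request func() (interface{}, error)) (interface{}, error) {
	done := make(chan result, 1)
	p.tasks <- func() {
		v, err := request()
		done <- result{value: v, err: err}
	}
	r := <-done
	return r.value, r.err
}

// Close stops the workers after in-flight tasks finish.
func (p *requestPool) Close() { close(p.tasks) }

func main() {
	pool := newRequestPool(2)
	defer pool.Close()
	v, err := pool.Execute(func() (interface{}, error) { return "series data", nil })
	fmt.Println(v, err)
}
```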

* Upgrade consideration for active_series_custom_trackers_config (#1897)

* Upgrade consideration for active_series_custom_trackers_config

* Update docs/sources/release-notes/v2.1.md

Co-authored-by: Jennifer Villa <jen.villa@grafana.com>

* Update docs/sources/release-notes/v2.1.md

Co-authored-by: Marco Pracucci <marco@pracucci.com>
Co-authored-by: Jennifer Villa <jen.villa@grafana.com>

* fix(mixin): do not trigger TooMuchMemory alerts if no container limits are supplied (#1905)

* fix(mixin): do not trigger `MimirAllocatingTooMuchMemory` or `EtcdAllocatingTooMuchMemory` alerts if no container limits are supplied

* Update CHANGELOG.md

Co-authored-by: Marco Pracucci <marco@pracucci.com>

* Fix MimirCompactorHasNotUploadedBlocks alert false positive when Mimir is deployed in monolithic mode (#1902)

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Set defaults to query ingesters, not store, for recent data (#1909)

Set queriers to _not_ query storage (store-gateways) for recent data
and set the store-gateways to ignore recent uncompacted blocks.

Default values are set to match what we use in the Mimir jsonnet.

Fixes #1639

Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>

* Revert distributor log level to warn in integration tests (#1910)

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Improved error returned by -querier.query-store-after validation (#1914)

* Improved error returned by -querier.query-store-after validation

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Update pkg/querier/querier.go

Co-authored-by: Ursula Kallio <ursula.kallio@grafana.com>

Co-authored-by: Ursula Kallio <ursula.kallio@grafana.com>

* Remove jsonnet configuration settings that match default values (#1915)

* Remove jsonnet configuration settings that match default values

Follow up to #1909

Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>

* Update CHANGELOG.md

Co-authored-by: Marco Pracucci <marco@pracucci.com>

* Docs: recommend fast disks for ingesters and store-gateways (#1903)

* Docs: recommend fast disks for ingesters and store-gateways

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Apply suggestions from code review

Co-authored-by: Ursula Kallio <ursula.kallio@grafana.com>

* Update docs/sources/operators-guide/running-production-environment/production-tips/index.md

Co-authored-by: Ursula Kallio <ursula.kallio@grafana.com>

* Update docs/sources/operators-guide/running-production-environment/production-tips/index.md

Co-authored-by: Ursula Kallio <ursula.kallio@grafana.com>

Co-authored-by: Ursula Kallio <ursula.kallio@grafana.com>

* Improve series, sample, metadata and exemplars validation errors (#1907)

* Improved error messages returned by ValidateSample(), ValidateExemplar(), ValidateMetadata() and ValidateLabels()

Signed-off-by: Marco Pracucci <marco@pracucci.com>
Co-authored-by: Ursula Kallio <ursula.kallio@grafana.com>

* Apply suggestions from code review

Co-authored-by: Ursula Kallio <ursula.kallio@grafana.com>

* Fixed unit tests after error messages edit

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Manually applied a suggestion to error message

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Renamed globalerrors pkg to singular form

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Cleanup globalerror package based on Oleg's feedback

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Removed formatting support from globalerror.ID's message generation function

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Changed another error message based on feedback

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Added CHANGELOG entry

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Update operations/mimir-mixin/docs/playbooks.md

Co-authored-by: Ursula Kallio <ursula.kallio@grafana.com>

* Rephrased label name/value length error message based on feedback received in the test file

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Final fixes to error messages

Signed-off-by: Marco Pracucci <marco@pracucci.com>

Co-authored-by: Ursula Kallio <ursula.kallio@grafana.com>

* mixin-tool: adapt screenshots dockerimage to support arm64 (#1916)

Signed-off-by: Miguel Ángel Ortuño <ortuman@gmail.com>

* Ingester ring endpoint fix (#1918)

* /ingester/ring is also available via distributor.

Signed-off-by: Peter Štibraný <pstibrany@gmail.com>

* Revert unintended change.

Signed-off-by: Peter Štibraný <pstibrany@gmail.com>

* Configuration files for GrafanaCon 2022 presentation. (#1881)

* Configuration files for GrafanaCon 2022 presentation.

Signed-off-by: Peter Štibraný <pstibrany@gmail.com>

* Update dskit to bring "Parallelize memberlist notified message processing" PR (#1912)

* Update dskit to bring "Parallelize memberlist notified message processing" PR.

Signed-off-by: Peter Štibraný <pstibrany@gmail.com>

* CHANGELOG.md

Signed-off-by: Peter Štibraný <pstibrany@gmail.com>

* Account for StatefulSets and Deployments named by the helm chart (#1913)

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Change shuffle sharding ingester lookback default config (#1921)

* Change shuffle sharding ingester lookback default config

Use the same default value for ingester lookback as the "query ingesters
within" setting to reduce the number of things that need to be changed from
their defaults. This change also removes use of the
`-blocks-storage.tsdb.close-idle-tsdb-timeout` flag in jsonnet since the
value being used matches the default.

Follow up to #1915

Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>

* Changelog

Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>

* Improved ValidateMetadata() errors (#1919)

* Improved ValidateMetadata() errors

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Added PR number to CHANGELOG

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Update pkg/util/validation/errors.go

Co-authored-by: Oleg Zaytsev <mail@olegzaytsev.com>

* Converted all ValidationError to be non-pointers

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Removed unused variable

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Fixed unit test

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Fixed markdown linter

Signed-off-by: Marco Pracucci <marco@pracucci.com>

Co-authored-by: Oleg Zaytsev <mail@olegzaytsev.com>

* mixin/dashboards: ruler query path dashboards (#1911)

* mixin: added ruler query path dashboards

Signed-off-by: Miguel Ángel Ortuño <ortuman@gmail.com>

* addressed PR feedback

Signed-off-by: Miguel Ángel Ortuño <ortuman@gmail.com>

* docs: added ruler reads & ruler reads resources dashboard screenshots

Signed-off-by: Miguel Ángel Ortuño <ortuman@gmail.com>

* addressed PR feedback

Signed-off-by: Miguel Ángel Ortuño <ortuman@gmail.com>

* updated CHANGELOG.md

Signed-off-by: Miguel Ángel Ortuño <ortuman@gmail.com>

* Mark query_ingesters_within and query_store_after as advanced (#1929)

* Mark query_ingesters_within and query_store_after as advanced

Now that they have good defaults that match what we run in production,
they shouldn't need to be tuned by users in most cases.

Fixes #1924

Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>

* Update CHANGELOG.md

Co-authored-by: Marco Pracucci <marco@pracucci.com>

Co-authored-by: Marco Pracucci <marco@pracucci.com>

* Remove empty chunks panel from Queries dashboard (#1928)

* Remove empty chunks panel from Queries dashboard

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Update CHANGELOG.md

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Make MimirGossipMembersMismatch less sensitive, and make it fire fewer alerts. (#1926)

* Make MimirGossipMembersMismatch less sensitive, and make it fire fewer alerts.

Signed-off-by: Peter Štibraný <pstibrany@gmail.com>

* CHANGELOG.md

Signed-off-by: Peter Štibraný <pstibrany@gmail.com>

* Update config value for -querier.query-ingesters-within to work with … (#1930)

* Update config value for -querier.query-ingesters-within to work with new default value for -querier.query-store-after

* Remove config for -querier.query-ingesters-within as they are set to default

* Update Thanos vendor for memcache improvements (#1920)

Update our vendor of Thanos so that memcache keys are grouped by the
server they are owned by before being split into batches.
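
The ordering described above (group first, then batch) can be sketched as follows. This is not the Thanos code: `groupAndBatch` and the `serverFor` callback are hypothetical, and the hash-based server selection in `main` is only a stand-in for whatever selector the client actually uses.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// groupAndBatch groups keys by the server that owns them, then splits each
// group into batches of at most batchSize keys. Batching after grouping
// means every batch targets exactly one server.
func groupAndBatch(keys []string, serverFor func(string) string, batchSize int) map[string][][]string {
	if batchSize <= 0 {
		batchSize = len(keys) + 1 // treat non-positive sizes as "no batching"
	}
	byServer := make(map[string][]string)
	for _, k := range keys {
		s := serverFor(k)
		byServer[s] = append(byServer[s], k)
	}
	batches := make(map[string][][]string, len(byServer))
	for s, ks := range byServer {
		for start := 0; start < len(ks); start += batchSize {
			end := start + batchSize
			if end > len(ks) {
				end = len(ks)
			}
			batches[s] = append(batches[s], ks[start:end])
		}
	}
	return batches
}

func main() {
	servers := []string{"memcached-0", "memcached-1"}
	// Hypothetical selector: hash the key and pick a server.
	serverFor := func(key string) string {
		h := fnv.New32a()
		h.Write([]byte(key))
		return servers[h.Sum32()%uint32(len(servers))]
	}
	keys := []string{"k1", "k2", "k3", "k4", "k5", "k6", "k7"}
	for server, b := range groupAndBatch(keys, serverFor, 2) {
		fmt.Println(server, b)
	}
}
```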

Fixes #423

Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>

* Move usage generation to separate package (#1934)

* Move usage function into a separate package and export it

Signed-off-by: Patryk Prus <patryk.prus@grafana.com>

* Add function to add to flag category overrides at runtime

Signed-off-by: Patryk Prus <patryk.prus@grafana.com>

* Document CHANGELOG scopes

* Add documentation about changelog scopes
* update CHANGELOG for #1934

* Improve instance limits, ingester limits, query limiter, some querier errors (#1888)

* Add errors IDs to pkg/ingester/instance_limits.go

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Add errors IDs to pkg/ingester/limiter.go

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Add errors IDs to pkg/querier/blocks_store_queryable.go

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Differentiate max-ingester-ingestion-rate from distributor

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Update playbooks.md

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Correct misspelled flags

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Correct strings in tests as well

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Re-iterated on ingesters limit errors

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Re-iterated on ingesters per-tenant limit errors

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Apply suggestions from code review

Co-authored-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Re-iterated on query per-tenant limit errors

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Added PR number to CHANGELOG entry

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Apply suggestions from code review

Co-authored-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Mention the cardinality API endpoint in the err-mimir-max-series-per-metric runbook

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Update operations/mimir-mixin/docs/playbooks.md

Co-authored-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Fixed InstanceLimits receiver name to be consistent

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Clarify metadata is stored in memory

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Fixed linter and tests

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Fixed more tests

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Update pkg/querier/blocks_store_queryable.go

Co-authored-by: Oleg Zaytsev <mail@olegzaytsev.com>

* Fix English grammar about 'how to fix it'

Signed-off-by: Marco Pracucci <marco@pracucci.com>

Co-authored-by: Marco Pracucci <marco@pracucci.com>
Co-authored-by: Oleg Zaytsev <mail@olegzaytsev.com>

* make ingesters use heartbeat timeout instead of period to fix the bug… (#1933)

* make ingesters use heartbeat timeout instead of period to fix the bug where they sometimes appear as unhealthy

* Update CHANGELOG.md

Co-authored-by: Marco Pracucci <marco@pracucci.com>

* Update VERSION to 2.1.0

* Update dashboard screenshots (#1940)

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Fix version in changelog

* Update mimir tests to use new 2.1.0 image

* Add minimum Grafana version to mixin dashboards (#1943)

Signed-off-by: Patrick Oyarzun <patrick.oyarzun@grafana.com>

* Bump grafana/mimir image to 2.1.0 for backward compatibility testing (#1942)

* Chore: renamed source files for remote ruler dashboards (#1937)

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Move the mimir-distributed helm chart into the mimir repository (#1925)

* Initial copy of mimir-distributed helm chart

This commit is not expected to work in CI.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>

* Update github action for helm lint and test

Set the working directory for github actions for helm actions.
Set more consistent name for github actions.
Set chart name for testing.
Ignore generated helm doc from prettier.
Do not release the helm chart for now.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>

* Add bucket prefix configuration (#1686)

* Add bucket prefix configuration

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Add allowed chars validation for storage prefix

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Add unit tests for PrefixedBucketClient

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Add CHANGELOG entry

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Use grafana/regexp instead of regexp

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Improve validation of storage_prefix

Update docs and add validation for `..` and `.`

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Add some tests for AM and ruler bucket validation

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Add tests for bucket prefix with filesystem client

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Update helm text too

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Update everything

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Simplify validation for storage_prefix

Only accept alphanumeric characters for the storage_prefix to prevent
mistypings and misunderstandings when the prefix ends with a slash or
contains slashes and dots
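
A minimal sketch of an alphanumeric-only check, under two assumptions: an empty prefix means "no prefix" and is allowed, and the standard library `regexp` package stands in for `grafana/regexp` mentioned earlier in this commit series. The function name and error text are hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
)

// validPrefix accepts ASCII letters and digits only.
var validPrefix = regexp.MustCompile(`^[0-9a-zA-Z]+$`)

// validateStoragePrefix rejects anything that is not purely alphanumeric,
// which rules out slashes, dots, and trailing separators in one check.
func validateStoragePrefix(prefix string) error {
	if prefix == "" {
		return nil // assumption: no prefix configured
	}
	if !validPrefix.MatchString(prefix) {
		return fmt.Errorf("storage prefix %q must contain only alphanumeric characters", prefix)
	}
	return nil
}

func main() {
	for _, p := range []string{"tenantdata", "prefix/", "../blocks", "a.b"} {
		fmt.Printf("%-12q -> %v\n", p, validateStoragePrefix(p))
	}
}
```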

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Update CHANGELOG.md

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Make stronger assertions in bucket validation test

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Make stronger assertions in bucket prefix test

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Assert on errors, not on strings

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Exclude YAML field names from error message

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Include full image tag on rollout dashboard (#1932)

* Make version matcher in rollout dashboard work for non-weekly images

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Add CHANGELOG.md entry

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Update CHANGELOG.md

Co-authored-by: Marco Pracucci <marco@pracucci.com>

* docs: move federated rule groups documentation to its own section (#1906)

* docs: move federated rule groups documentation to its own section

Signed-off-by: Miguel Ángel Ortuño <ortuman@gmail.com>

* addressed PR feedback

Signed-off-by: Miguel Ángel Ortuño <ortuman@gmail.com>

* Make networking panels pod matchers work with helm chart (#1927)

* Make networking panels pod matchers work with helm chart

The pods created by the helm chart follow a format of
<helm_release_name>-mimir-<ingester|distributor|...>.

This is a problem for all places that use the per_instance_label for
matching. The per_instance_label is mostly used in aggregations (sum by
(pod), count by (pod), ...). The networking panels are the only ones
that use it for matching.

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Replace .* with a stronger regex in pod matchers

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Add CHANGELOG.md entry

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Add max query length error to errors catalog (#1939)

* Add max query length error to errors catalogue

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Added PR number to CHANGELOG entry

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Apply suggestions from code review

Co-authored-by: Ursula Kallio <ursula.kallio@grafana.com>

Co-authored-by: Ursula Kallio <ursula.kallio@grafana.com>

* Remove image spec from demo file. (#1946)

* Remove image spec from demo file.

Signed-off-by: Peter Štibraný <pstibrany@gmail.com>

* Fix rejected identity accept encoding (#1864)

* Fix rejected identity accept-encoding

When a request comes in with header:
    Accept-Encoding: gzip;q=1, identity;q=0

we should gzip the response even if it's smaller than the defined
minimum size.

We achieve this by fixing the github.com/nytimes/gziphandler code, and
bringing the fixed code into this repository since:
- they don't seem to be maintaining it anymore
- we don't want to use a replace directive as it's very likely to be
  lost in codebases depending on this.
- it's a small amount of code (500 lines)
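
The behaviour described in this commit message can be sketched as below. This is not the code from `pkg/util/gziphandler`: `qValue` and `shouldGzip` are hypothetical helpers, the parsing is deliberately simplified (for example, it ignores the `*` wildcard), and the decision logic only reflects the description above.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// qValue returns the weight of a coding in an Accept-Encoding header and
// whether the coding is listed at all. A coding without a q parameter
// has weight 1.
func qValue(header, coding string) (float64, bool) {
	for _, part := range strings.Split(header, ",") {
		fields := strings.Split(part, ";")
		if !strings.EqualFold(strings.TrimSpace(fields[0]), coding) {
			continue
		}
		q := 1.0
		for _, param := range fields[1:] {
			kv := strings.SplitN(strings.TrimSpace(param), "=", 2)
			if len(kv) == 2 && strings.EqualFold(strings.TrimSpace(kv[0]), "q") {
				if parsed, err := strconv.ParseFloat(strings.TrimSpace(kv[1]), 64); err == nil {
					q = parsed
				}
			}
		}
		return q, true
	}
	return 0, false
}

// shouldGzip compresses small bodies too when the client explicitly
// rejects the identity encoding.
func shouldGzip(header string, bodySize, minSize int) bool {
	gzipQ, ok := qValue(header, "gzip")
	if !ok || gzipQ == 0 {
		return false // client does not accept gzip
	}
	if identityQ, listed := qValue(header, "identity"); listed && identityQ == 0 {
		return true // identity rejected: a plain response is not acceptable
	}
	return bodySize >= minSize
}

func main() {
	fmt.Println(shouldGzip("gzip;q=1, identity;q=0", 10, 1024))
	fmt.Println(shouldGzip("gzip", 10, 1024))
}
```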

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>

* Add API test for gzip

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>

* make lint pkg/util/gziphandler

Mostly handling errors, also removed the deprecated http.CloseNotifier
functionality and related code.

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>

* Update CHANGELOG.md

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>

* Fix comment

Co-authored-by: Marco Pracucci <marco@pracucci.com>

* Add faillint for github.com/nytimes/gziphandler

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>

* make lint

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>

* Fix faillint paths

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>

* If there's content-encoding, start plain write

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>

* If less than min-size, don't encode

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>

* Refactor `handleContentType` to handle by default

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>

* Rename acceptsIdentity to rejectsIdentity

Hopefully this will minimise the number of double negations, making the
code clearer.

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>

* Fix comment

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>

Co-authored-by: Marco Pracucci <marco@pracucci.com>

* Distributor: added per-tenant request limit (#1843)

* distributor: added request limiter logic

Signed-off-by: Miguel Ángel Ortuño <ortuman@gmail.com>

* updated CHANGELOG.md

* addressed PR feedback

Signed-off-by: Miguel Ángel Ortuño <ortuman@gmail.com>

* distributor: added type plans rate limits

Assuming a minimum sane value of 100 samples per request, we've set default request limits for each user tier.

* docs: added request limit distributor documentation

* rebuilt jsonnet test output

* make linter happy

* addressed PR feedback

Signed-off-by: Miguel Ángel Ortuño <ortuman@gmail.com>

* addressed PR feedback

Signed-off-by: Miguel Ángel Ortuño <ortuman@gmail.com>

* addressed PR feedback

Signed-off-by: Miguel Ángel Ortuño <ortuman@gmail.com>

* addressed PR feedback

Signed-off-by: Miguel Ángel Ortuño <ortuman@gmail.com>

* updated reference help

Signed-off-by: Miguel Ángel Ortuño <ortuman@gmail.com>

* addressed PR feedback

Signed-off-by: Miguel Ángel Ortuño <ortuman@gmail.com>

* Add bucket prefix to experimental features (#1951)

* Add bucket prefix to experimental features

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Update flag status of storage_prefix to experimental

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Copy thanos shipper (#1957)

* Copy shipper from Thanos.
* Remove support for uploading compacted blocks.
* Always allow out-of-order uploads. Removed unused overlap checker.
* Rename Shipper interface to BlocksUploader, and ThanosShipper to Shipper.
* Extract readShippedBlocks method from user_tsdb.go
* Added shipper unit tests (copied and adapted from original tests)
* Add faillint rule to avoid using Thanos shipper.

Signed-off-by: Peter Štibraný <pstibrany@gmail.com>

* Adjust the name of the tag expected by documentation publishing (#1974)

Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>

* Use github.com/colega/grafana-tools-sdk fork (#1973)

* Use github.com/colega/grafana-tools-sdk fork

See https://github.com/grafana/cortex-tools/pull/248 for more context (this is
the same change). The grafana-tools/sdk dependency will eventually be removed entirely
from analyse commands.

Signed-off-by: hjet <hjet@users.noreply.github.com>

* Update CHANGELOG.md

Signed-off-by: hjet <hjet@users.noreply.github.com>

* mod tidy

* Deprecate -ingester.ring.join-after (#1965)

* Deprecate -ingester.ring.join-after

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Addressed review feedback

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Dashboards: disable gateway panels by default (#1955)

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Docs: rename 'playbooks' to 'runbooks' and move them to doc (#1970)

* Docs: rename 'playbooks' to 'runbooks' and move them to doc

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Named runbooks folder as 'mimir-runbooks/' to make it easy to import in Grafana Labs internal infrastructure as code

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Fix anchors check because they're case insensitive

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Apply suggestions from code review

Co-authored-by: Ursula Kallio <ursula.kallio@grafana.com>

Co-authored-by: Ursula Kallio <ursula.kallio@grafana.com>

* Preparation of e2eutils for Thanos indexheader unit tests. (#1982)

We want to pull in the indexheader package from Thanos so that we can add some experimental alternative implementations of BinaryReader. In order to also pull in the unit tests for this package, we need the replacements for e2eutil.Copy and e2eutil.CreateBlock. This change does two things:

1. Copy in e2eutil/copy.go and fix it up accordingly.
2. Move CreateBlock into a package to avoid circular imports.

* Make propagation of forwarding errors configurable (#1978)

* make propagation of forwarding errors optional

Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>

* add test for disabled error propagation

Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>

* leave error propagation enabled by default

Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>

* update help

Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>

* update docs

* better wording

Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>

* Release the mimir-distributed-beta helm chart (#1948)

Use the common workflow from the helm-chart repo.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>

* Copy Thanos block/indexheader package (#1983)

* Copy thanos/pkg/block/indexheader.

* Update provenance.

* Fix linter error due to error variable name.

* Use require instead of e2eutil.

* Replace usage of e2eutil.Copy

* Replace usage of e2eutil.CreateBlock with local version.

* Replace use of Thanos indexheader with local copy.

* Add faillint check for upstream indexheader.

* Fix goleak ignore for NewReaderPool.

* Update vendor directory.

* Prepare mimir beta chart release (#1995)

* Rename chart back to mimir-distributed

Apparently the helm option --devel is needed to trigger using beta
versions. This should be enough protection for accidental use. Avoids
renaming issues.

* Version bump helm chart

Do version bump to a beta version but nothing else until we double check
that such beta chart cannot be accidentally selected with helm tooling.

* Enable helm chart release from main branch

Release process tested ok on test branch.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>

* Bump version of helm chart (#1996)

Test if helm release triggers correctly.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>

* Update gopkg.in/yaml.v3 (#1989)

This updates to a version that contains the fix to CVE-2022-28948.

* Remove hardlinking in Shipper code. (#1969)

* Remove hardlinking in Shipper code.

Signed-off-by: Peter Štibraný <pstibrany@gmail.com>

* [helm] use grpc round robin for distributor clients (#1991)

* Use GRPC round-robin for gateway -> distributor requests

Fixes https://github.com/grafana/mimir/issues/1987
Update chart version and changelog
Use the headless distributor service for the nginx gateway

Signed-off-by: Patrick Oyarzun <patrick.oyarzun@grafana.com>

* Fix binary_reader.go header text. (#1999)

Mistakenly left two lines when updating the provenance for the file.

* Workaround to keep using old memcached bitnami chart for now (#1998)

* Workaround to keep using old memcached bitnami chart for now

See also: https://github.com/grafana/helm-charts/pull/1438
Also clean up unused chart repositories from ct.yaml.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Co-authored-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* [helm] add results cache (#1993)

* [helm] Add query-frontend results cache

Fixes https://github.com/grafana/helm-charts/issues/1403

* Add PR to CHANGELOG

Signed-off-by: Patrick Oyarzun <patrick.oyarzun@grafana.com>

* Fix README

Signed-off-by: Patrick Oyarzun <patrick.oyarzun@grafana.com>

* Disable distributor.extend-writes & ingester.ring.unregister-on-shutdown (#1994)

Signed-off-by: Patrick Oyarzun <patrick.oyarzun@grafana.com>

* Update CHANGELOG.md (#1992)

* [helm] Prepare image bump for 2.1 release (#2001)

* Prepare image bump for 2.1 release

Signed-off-by: Patrick Oyarzun <patrick.oyarzun@grafana.com>

* Fix README template to reference 2.1

Signed-off-by: Patrick Oyarzun <patrick.oyarzun@grafana.com>

* Add nice link text to CHANGELOG

Signed-off-by: Patrick Oyarzun <patrick.oyarzun@grafana.com>

* Update CHANGELOG.md

* Publish helm charts from release branches (#2002)

* Update Thanos with https://github.com/thanos-io/thanos/pull/5400. (#2006)

* Replace hardcoded intervals with $__rate_interval in dashboards (#2011)

* Replace hardcoded intervals with $__rate_interval in dashboards

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Add CHANGELOG.md entry

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Standardise error messages for distributor instance limits (#1984)

* standardise error messages for distributor instance limits

* Apply suggestions from code review

Co-authored-by: Marco Pracucci <marco@pracucci.com>

* Apply suggestions from code review

Co-authored-by: Ursula Kallio <ursula.kallio@grafana.com>

* apply code review suggestions to rest of doc for consistency

* manually apply suggestion from code review

Co-authored-by: Marco Pracucci <marco@pracucci.com>
Co-authored-by: Ursula Kallio <ursula.kallio@grafana.com>

* Remove tutorials/ symlink (#2007)

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Add querier autoscaler support to jsonnet (#2013)

* Add querier autoscaler support to jsonnet

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Fixed autoscaling.libsonnet import

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Add a check to Mimir jsonnet to ensure query-scheduler is enabled when enabling querier autoscaling (#2023)

* Add a check to Mimir jsonnet to ensure query-scheduler is enabled when enabling querier autoscaling

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Shouldn't be an exported object

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Don't include external labels in blocks uploaded by Ingester (#1972)

* Remove support for external labels.
* Fixed comments.
* Don't use TenantID label. Filter out the label during compaction.
* CHANGELOG.md
* Use public function from Thanos.
* Use new UploadBlock function, move GrpcContextMetadataTenantID constant.
* Rename tsdb2 import to mimir_tsdb.
* Fix tests.

Signed-off-by: Peter Štibraný <pstibrany@gmail.com>

* Enhance MimirRequestLatency runbook with more advice (#1967)

* Enhance MimirRequestLatency runbook with more advice

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
Co-authored-by: Marco Pracucci <marco@pracucci.com>

* Include helm-docs in build and CI (#2026)

* Update the mimir build image and its build doc

Dockerfile: Add helm-docs package to the image.
how-to: Write down the requirements for build in more detail. Add
information about build on linux.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>

* Expand make doc with helm-docs command

This enables generating the helm chart README with the same make doc
command as all other documentation.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>

* Update docs/internal/how-to-update-the-build-image.md

Co-authored-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Update contributing guides for the helm chart (#2008)

* Update contributing guides for the helm chart

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>

* Turn off helm version increment check in CI

This enables periodic releases, as opposed to requiring version bump
for release at every PR.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>

* Add extraEnvFrom to all services and enable injection into mimir config (#2017)

Add `extraEnvFrom` capability to all Mimir services to enable injecting
secrets via environment variables.

Enable `-config.expand-env=true` option in all Mimir services to be able
to take secrets/settings from the environment and inject them into the
Mimir configuration file.
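
To illustrate the general idea of injecting environment values into a configuration file, here is a minimal sketch using the standard library's `os.Expand`. It does not reproduce how Mimir performs the substitution; `expandConfig`, the variable name `EXAMPLE_S3_SECRET`, and the YAML field are made up for the example.

```go
package main

import (
	"fmt"
	"os"
)

// expandConfig replaces ${VAR} references in raw using lookup.
// Passing os.Getenv as lookup reads values from the process environment.
func expandConfig(raw string, lookup func(string) string) string {
	return os.Expand(raw, lookup)
}

func main() {
	// In a pod, this variable would come from a Secret referenced via extraEnvFrom.
	os.Setenv("EXAMPLE_S3_SECRET", "s3cr3t")

	raw := "secret_access_key: ${EXAMPLE_S3_SECRET}\n"
	fmt.Print(expandConfig(raw, os.Getenv))
}
```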

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>

* Docs: fix mimir-mixin installation instructions (#2015)

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Docs: make documentation a first class citizen in CHANGELOG (#2025)

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Helm: add global.extraEnv and global.extraEnvFrom (#2031)

* Helm: add global.extraEnv and global.extraEnvFrom

Enables setting environment and env injection in one place for
mimir + nginx.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>

* Upgrade alpine to 3.16.0 (#2028)

* Upgrade alpine to 3.16.0

* Enhance MimirRequestLatency runbook with more advice (#1967)

* Enhance MimirRequestLatency runbook with more advice

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
Co-authored-by: Marco Pracucci <marco@pracucci.com>

* Include helm-docs in build and CI (#2026)

* Update the mimir build image and its build doc

Dockerfile: Add helm-docs package to the image.
how-to: Write down the requirements for build in more detail. Add
information about build on linux.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>

* Expand make doc with helm-docs command

This enables generating the helm chart README with the same make doc
command as all other documentation.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>

* Update docs/internal/how-to-update-the-build-image.md

Co-authored-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Update contributing guides for the helm chart (#2008)

* Update contributing guides for the helm chart

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>

* Turn off helm version increment check in CI

This enables periodic releases, as opposed to requiring version bump
for release at every PR.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>

* Add extraEnvFrom to all services and enable injection into mimir config (#2017)

Add `extraEnvFrom` capability to all Mimir services to enable injecting
secrets via environment variables.

Enable `-config.exand-env=true` option in all Mimir services to be able
to take secrets/settings from the environment and inject them into the
 Mimir configuration file.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>

* Docs: fix mimir-mixin installation instructions (#2015)

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Docs: make documentation a first class citizen in CHANGELOG (#2025)

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* upgrade to alpine 3.16.0

* upgrade alpine to 3.16.0

Co-authored-by: Arve Knudsen <arve.knudsen@gmail.com>
Co-authored-by: Marco Pracucci <marco@pracucci.com>
Co-authored-by: George Krajcsovits <krajorama@users.noreply.github.com>
Co-authored-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Helm: release our first weekly (#2033)

This should be automated, but…
openshift-merge-robot pushed a commit to stolostron/thanos that referenced this pull request Dec 8, 2022
* Remove debug line (#5245)

Signed-off-by: Matej Gera <matejgera@gmail.com>

* e2e: fix compact test's flakiness (#5246)

Fix the compact test's flakiness by running this sub-test sequentially.
The further steps depend on this test's results, so it's wrong to run it
as a sub-test.

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>

* bump prometheus version to v2.33.5 (#5256)

Signed-off-by: Ben Ye <ben.ye@bytedance.com>

* info: Return store info only when the service is ready (#5255)

* return store info only when the service is ready

Signed-off-by: Ben Ye <ben.ye@bytedance.com>

* fix test

Signed-off-by: Ben Ye <ben.ye@bytedance.com>

* Merge release 0.25 to main (#5210)

* Cut 0.25.0-rc.0 (#5184)

Signed-off-by: Matej Gera <matejgera@gmail.com>

* Cut v0.25.0 (#5209)

Signed-off-by: Matej Gera <matejgera@gmail.com>

* Create v0.25.1 built with Go 1.17.8 (#5226)

The binaries published with this release are built with Go1.17.8 to
avoid
[CVE-2022-24921](https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2022-24921).

Signed-off-by: Matthias Loibl <mail@matthiasloibl.com>

* *: Cut 0.25.2 rc.0 (#5247)

* fix: add null check to exemplar data (#5202)

Signed-off-by: Thomas Mota <tmm@danskecommodities.com>

* Ruler: Fix WAL directory in stateless mode (#5242)

Signed-off-by: Matej Gera <matejgera@gmail.com>

* Update CHANGELOG, VERSION

Signed-off-by: Matej Gera <matejgera@gmail.com>

* Updates busybox SHA (#5234)

Signed-off-by: GitHub <noreply@github.com>

Co-authored-by: yeya24 <yeya24@users.noreply.github.com>

Co-authored-by: Tomás Mota <tomasrebelomota@gmail.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: yeya24 <yeya24@users.noreply.github.com>

* Cut v0.25.2

Signed-off-by: Matej Gera <matejgera@gmail.com>

Update tutorials

Signed-off-by: Matej Gera <matejgera@gmail.com>

Co-authored-by: Matthias Loibl <mail@matthiasloibl.com>
Co-authored-by: Tomás Mota <tomasrebelomota@gmail.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: yeya24 <yeya24@users.noreply.github.com>

* Implement GRPC query API (#5250)

With the current GRPC APIs, layering Thanos Queriers results in
the root querier getting all of the samples and executing the query
in memory. As a result, the intermediary Queriers do not do any
intensive work and merely transport samples from the Stores to the
root Querier.

When data is perfectly sharded, users can implement a pattern where
the root Querier instructs the intermediary ones to execute the queries
from their stores and return back results. The results can then be
concatenated by the root querier and returned to the user.

In order to support this use case, this commit implements a GRPC API
in the Querier which is analogous to the HTTP Query API exposed
by Prometheus.

Signed-off-by: fpetkovski <filip.petkovsky@gmail.com>
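
The fan-out pattern described above (a root querier asks each intermediary to evaluate the query over its own shard, then concatenates the results) can be sketched as follows. This is only an illustration under the stated assumption that data is perfectly sharded; `sample`, `shardQuery`, `fanOut` and `staticShard` are hypothetical names and not the Thanos gRPC API.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// sample is a simplified query result row (hypothetical type).
type sample struct {
	series string
	value  float64
}

// shardQuery evaluates a query against one shard (hypothetical signature).
type shardQuery func(ctx context.Context, promql string) ([]sample, error)

// fanOut runs the query on every shard concurrently and concatenates the
// results in shard order. It returns the first error found, if any.
func fanOut(ctx context.Context, promql string, shards []shardQuery) ([]sample, error) {
	results := make([][]sample, len(shards))
	errs := make([]error, len(shards))
	var wg sync.WaitGroup
	for i, q := range shards {
		wg.Add(1)
		go func(i int, q shardQuery) {
			defer wg.Done()
			results[i], errs[i] = q(ctx, promql)
		}(i, q)
	}
	wg.Wait()

	var out []sample
	for i := range shards {
		if errs[i] != nil {
			return nil, errs[i]
		}
		out = append(out, results[i]...)
	}
	return out, nil
}

// staticShard returns a shard that always answers with the given samples.
func staticShard(samples ...sample) shardQuery {
	return func(context.Context, string) ([]sample, error) { return samples, nil }
}

// demoFanOut queries two fake shards and reports how many samples came back.
func demoFanOut() int {
	shards := []shardQuery{
		staticShard(sample{"up{shard=\"a\"}", 1}),
		staticShard(sample{"up{shard=\"b\"}", 1}, sample{"up{shard=\"b2\"}", 0}),
	}
	out, err := fanOut(context.Background(), "up", shards)
	if err != nil {
		return -1
	}
	return len(out)
}

func main() {
	fmt.Println("samples:", demoFanOut())
}
```

Plain concatenation is only correct when no series is split across shards, which is why the commit message restricts the pattern to perfectly sharded data.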

* Change error cleanup in `objstore.DownloadDir` to delete files not destination dir (#5229)

* Change error cleanup in objstore.DownloadDir to delete files not directories

Dst is always a directory. If any file after the first fails to download,
the cleanup will fail because the destination already contains at least one file.
This commit changes the cleanup logic to clean up successfully downloaded files one by one
instead of attempting to clean up the whole dst directory.

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>
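
A minimal sketch of the cleanup strategy described in this commit, assuming a simplified download loop. `downloadAll`, `fetch` and `demoCleanup` are hypothetical names; the real `objstore.DownloadDir` has a different signature and works against a bucket.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
)

// downloadAll writes each named file into dst. If any step fails, it removes
// only the files it already wrote, one by one, instead of trying to remove
// the dst directory itself.
func downloadAll(dst string, names []string, fetch func(name string) ([]byte, error)) (err error) {
	var written []string
	defer func() {
		if err == nil {
			return
		}
		// Best-effort cleanup of successfully downloaded files.
		for _, p := range written {
			_ = os.Remove(p)
		}
	}()

	for _, name := range names {
		data, ferr := fetch(name)
		if ferr != nil {
			return fmt.Errorf("fetch %s: %w", name, ferr)
		}
		p := filepath.Join(dst, name)
		if werr := os.WriteFile(p, data, 0o644); werr != nil {
			return fmt.Errorf("write %s: %w", p, werr)
		}
		written = append(written, p)
	}
	return nil
}

// demoCleanup simulates a failure on the second file and reports whether the
// download failed and how many files were left behind in dst.
func demoCleanup() (failed bool, leftover int) {
	dst, err := os.MkdirTemp("", "download")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dst)

	fetch := func(name string) ([]byte, error) {
		if name == "b" {
			return nil, errors.New("simulated failure")
		}
		return []byte("data"), nil
	}
	err = downloadAll(dst, []string{"a", "b"}, fetch)
	entries, _ := os.ReadDir(dst)
	return err != nil, len(entries)
}

func main() {
	failed, leftover := demoCleanup()
	fmt.Println("failed:", failed, "leftover files:", leftover)
}
```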

* Add cleanup of root dst directory.

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Add unit test for cleanup of DownloadDir

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Fix linter

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Update index.html (#5264)

* Add SumUp logo to adopters (#5267)

Signed-off-by: Guilherme Souza <101073+guilhermef@users.noreply.github.com>

* receive: Added tenant ID error handling of remote write requests. (#5269)

Plus better explanation.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Add TIXnGO logo to adopters (#5273)

Signed-off-by: Pierre Hanselmann <pierre.hanselmann@gmail.com>

* Fix miekgdns resolver to work with CNAME records too (#5271)

* Fix miekgdns resolver to work with CNAME records too

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Remove unused context

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Update pkg/discovery/dns/miekgdns/resolver.go

Signed-off-by: Marco Pracucci <marco@pracucci.com>
Co-authored-by: Lucas Servén Marín <lserven@gmail.com>

Co-authored-by: Lucas Servén Marín <lserven@gmail.com>

* UI: Remove old ui (#5145)

* remove old ui

Signed-off-by: Augustin Husson <husson.augustin@gmail.com>

* add changelog

Signed-off-by: Augustin Husson <husson.augustin@gmail.com>

* update assets

Signed-off-by: Augustin Husson <husson.augustin@gmail.com>

* Updates busybox SHA (#5283)

Signed-off-by: GitHub <noreply@github.com>

Co-authored-by: yeya24 <yeya24@users.noreply.github.com>

* build(deps): bump moment from 2.29.1 to 2.29.2 in /pkg/ui/react-app (#5274)

Bumps [moment](https://github.com/moment/moment) from 2.29.1 to 2.29.2.
- [Release notes](https://github.com/moment/moment/releases)
- [Changelog](https://github.com/moment/moment/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/moment/moment/compare/2.29.1...2.29.2)

---
updated-dependencies:
- dependency-name: moment
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* docs: fix URLs preventing generation and unblock CI (#5285)

* docs: fix Ian Billett's GitHub handle

I noticed that CI was failing [0] for PR
https://github.com/thanos-io/thanos/pull/5284 because Ian had changed
his GitHub handle from @ianbillett to @bill3tt. This commit fixes this.

[0] https://github.com/thanos-io/thanos/runs/6050355497?check_suite_focus=true#step:5:135

Signed-off-by: Lucas Servén Marín <lserven@gmail.com>

* docs: fix broken links to GitHub docs

Currently, documentation generation is failing because mdox can't fetch
some GitHub documentation pages since the URLs for the help content have
changed. This commit updates the links to use the correct URLs.

Signed-off-by: Lucas Servén Marín <lserven@gmail.com>

* MAINTAINERS.md: regenerate

Signed-off-by: Lucas Servén Marín <lserven@gmail.com>

* UI: Update vulnerable dependencies (#5233)

* refactor global window typings

Use declaration merging for better window types

Signed-off-by: Gabriel Bernal <gbernal@redhat.com>

* bump vulnerable react-scripts version

Signed-off-by: Gabriel Bernal <gbernal@redhat.com>

* Add Vestiaire Collective as adopter (#5289)

Signed-off-by: claude ebaneck <claudeforlife@gmail.com>

Co-authored-by: claude ebaneck <claude.ebaneck@vestiairecollective.com>

* Implement Query API discovery (#5291)

A recent commit (#5250) added a GRPC API to Thanos Query which allows
executing PromQL over GRPC. This API is currently not discoverable
through endpointsets which makes it hard for other Thanos components
to use it.

This commit extends endpointsets with a GetQueryAPIClients method
which returns Query API clients to all components which support
this API.

Signed-off-by: fpetkovski <filip.petkovsky@gmail.com>

* Added support for ppc64le (#5290)

* Added support for ppc64le

Signed-off-by: Marvin Giessing <marvin.giessing@gmail.com>

* Updated Changelog

Signed-off-by: Marvin Giessing <marvin.giessing@gmail.com>

* Updated promu & protoc

Signed-off-by: Marvin Giessing <marvin.giessing@gmail.com>

* Updated Makefile comment

Signed-off-by: Marvin Giessing <marvin.giessing@gmail.com>

* Added target API tests (+goleak). (#5260)

Attempted to repro https://github.com/thanos-io/thanos/issues/5257, but no luck.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Revert "Added target API tests (+goleak). (#5260)" (#5297)

This reverts commit 955ea6dcae2529ad5b5b97a6a11150a5906d775a.

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>

* Use correct filesystem/network path separators when uploading blocks (#5281)

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>

* query-frontend: Don't cache request with dedup=false (#5300)

* query-frontend: Added repro for dedup affecting precision of querying.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* QFE does not cache request with dedup=false.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Move info about queries that skip cache logic to docs

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Update CHANGELOG

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Run docs formatter

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Fix e2e tests where caching logic is desired

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>

* mixin: Fix typo in ThanosCompactHalted alert (#5306)

Signed-off-by: Pedro Araujo <pedro.araujo@saltpay.co>

* Avoid starting goroutines for memcached batch requests before gate (#5301)

Use the doWithBatch function to avoid starting goroutines to fetch batched
results from memcached before they are allowed to run via the concurrency
Gate. This avoids starting many goroutines which cannot make any progress
due to a concurrency limit.

Fixes #4967

Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>
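
The idea of waiting on the concurrency gate before starting a goroutine can be sketched as below. This is not the Thanos `doWithBatch` function: `runBatches` and `countBatches` are hypothetical names, and a buffered channel stands in for the Gate.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync"
)

// runBatches splits [0, total) into batches and calls fn for each one.
// A slot in the gate is acquired BEFORE the goroutine is started, so at most
// `concurrency` goroutines exist at any time. If ctx is cancelled while
// waiting for a slot, no further batches are started.
func runBatches(ctx context.Context, total, batchSize, concurrency int, fn func(start, end int) error) error {
	if batchSize <= 0 || concurrency <= 0 {
		return errors.New("batchSize and concurrency must be positive")
	}
	gate := make(chan struct{}, concurrency)
	var (
		wg       sync.WaitGroup
		mu       sync.Mutex
		firstErr error
	)
	for start := 0; start < total; start += batchSize {
		end := start + batchSize
		if end > total {
			end = total
		}
		select {
		case gate <- struct{}{}: // slot acquired
		case <-ctx.Done():
			wg.Wait()
			return ctx.Err()
		}
		wg.Add(1)
		go func(start, end int) {
			defer wg.Done()
			defer func() { <-gate }() // release the slot
			if err := fn(start, end); err != nil {
				mu.Lock()
				if firstErr == nil {
					firstErr = err
				}
				mu.Unlock()
			}
		}(start, end)
	}
	wg.Wait()
	return firstErr
}

// countBatches reports how many items and batches were processed.
func countBatches(total, batchSize, concurrency int) (items, batches int) {
	var mu sync.Mutex
	_ = runBatches(context.Background(), total, batchSize, concurrency, func(start, end int) error {
		mu.Lock()
		defer mu.Unlock()
		items += end - start
		batches++
		return nil
	})
	return items, batches
}

func main() {
	items, batches := countBatches(10, 3, 2)
	fmt.Println("items:", items, "batches:", batches)
}
```

Acquiring the slot in the loop rather than inside the goroutine is the design choice the commit describes: blocked work costs one waiting loop iteration instead of many parked goroutines.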

* Cut readme for 0.26 (#5311)

Co-authored-by: Wiard van Rij <wvanrij@roku.com>

* Reviewed and updated Changelog for 0.26-rc0 (#5313)

Signed-off-by: Wiard van Rij <wvanrij@roku.com>

Co-authored-by: Wiard van Rij <wvanrij@roku.com>

* Cut 0.26.0-rc.0 set version correctly (#5317)

Signed-off-by: Wiard van Rij <wvanrij@roku.com>

Co-authored-by: Wiard van Rij <wvanrij@roku.com>

* docs: Fix broken link to introduction blog (#5319)

Signed-off-by: jmjf <jamee.mikell@gmail.com>

* Ensure memcached batched requests handle context cancelation (#5314)

* Ensure memcached batched requests handle context cancellation

Ensure that when the context used for Memcached GetMulti is cancelled,
getMultiBatched does not hang waiting for results that will never be
generated (since the batched requests will not run if the context has
been cancelled).

Fixes an issue introduced in #5301

Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>

* Lint fixes

Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>

* Code review changes: run batches unconditionally

Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>

* stalebot: add generic label to avoid stalebot (#5322)

Add a generic label which tells stalebot not to close issues marked with
it.

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>

* Use proper replicalabels in GRPC Query API (#5308)

The GRPC Query API uses only the replica labels coming from the
RPC request and ignores the ones configured when starting the querier.

This commit ensures that the API falls back on the preconfigured
replica labels when they are not provided in the request.

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>
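
The fallback described above amounts to the following small sketch. `effectiveReplicaLabels` is a hypothetical helper name, and the label values are examples only; the actual Thanos code is structured differently.

```go
package main

import "fmt"

// effectiveReplicaLabels returns the replica labels sent with the request,
// or the labels configured at querier startup when the request has none.
func effectiveReplicaLabels(requested, configured []string) []string {
	if len(requested) > 0 {
		return requested
	}
	return configured
}

func main() {
	configured := []string{"replica"}
	fmt.Println(effectiveReplicaLabels(nil, configured))
	fmt.Println(effectiveReplicaLabels([]string{"prometheus_replica"}, configured))
}
```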

* groupcache: reduce log severity (#5323)

Sometimes certain operations can fail with some error(-s) being expected
e.g. a deletion marker might or might not exist. Thus, these log lines
could get triggered even though nothing bad is happening. Since the
expected errors are known only at the very end, near the call site, and
because `error`s are already logged in other places, and because these
Fetch()/Store() functions are working in best-effort scenario, I propose
reducing the severity of these log lines to `debug`.

Fixes https://github.com/thanos-io/thanos/issues/5265.

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>

* Update release process (#5325)

* update release process

Signed-off-by: Wiard van Rij <wvanrij@roku.com>

* Add info about VERSION file

Signed-off-by: Wiard van Rij <wvanrij@roku.com>

* query-frontend: improve docs on requests excluded from cache (#5326)

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* cut release 0.26.0 (#5330)

Signed-off-by: Wiard van Rij <wvanrij@roku.com>

* Updates busybox SHA (#5336)

Signed-off-by: GitHub <noreply@github.com>

Co-authored-by: yeya24 <yeya24@users.noreply.github.com>

* receive: fix deadlock on interrupt in routerOnly mode (#5339)

* fix receive router deadlock on interrupt

Signed-off-by: François Gouteroux <francois.gouteroux@gmail.com>

* Update changelog

Signed-off-by: François Gouteroux <francois.gouteroux@gmail.com>

* docs: Updated information about our community call. (#5309)

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* reloader: Force trigger reload when config is rolled back (#5324)

* Add Cache metrics to groupcache (#5352)

Add metrics about the hot and main caches[0].
* Number of bytes in each cache.
* Number of items in each cache.
* Counter of evictions from each cache.

[0]: https://pkg.go.dev/github.com/vimeo/galaxycache#CacheStats

Signed-off-by: SuperQ <superq@gmail.com>

* e2e: Refactored service helpers to be consistent with new API. (#5348)

* test: Added Alert compatibility test.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Tmp.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Update.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* update.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* update.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* e2e: Refactored service helpers for newest e2e version.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Removed alert compatibility test for now.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Fixed lint.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Fixed lint2.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Fixed nginx service.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Fixes.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Fix.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Fix.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* fix.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Refactored ruler.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Fixed test.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* fixes.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Fix.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Fixed compactor.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Fix.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* What about now?

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* groupcache: fix handling of slashes (#5357)

Use https://github.com/julienschmidt/httprouter#catch-all-parameters for
the groupcache route; otherwise slashes in the cache's key get
interpreted by the router and thus groupcache's function never gets
invoked, and all clients get 404.

Remove test regarding cache hit because now Thanos Store during test
constantly generates cache hits due to 1s delay between block
information refreshes.

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>

* Adds more info about the formatting part. (#5347)

* Adds more info about the formatting part. Closes #5282

Signed-off-by: Wiard van Rij <wvanrij@roku.com>

* adds extra newline

Signed-off-by: Wiard van Rij <wvanrij@roku.com>

* Update promdoc to solve #5344 (#5345)

Signed-off-by: Wiard van Rij <wvanrij@roku.com>

* e2e: Refactored Receive Builder to be consistent with other helpers. (#5358)

* e2e: Refactored Receive Builder to be consistent with other helpers.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Addressed comments.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Updates busybox SHA (#5365)

Signed-off-by: GitHub <noreply@github.com>

Co-authored-by: yeya24 <yeya24@users.noreply.github.com>

* e2e: Fixed exemplar support in receive helper. (#5372)

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Enforce memcached concurrency limit with unbatched requests (#5360)

* Enforce memcached concurrency limit with unbatched requests

This ensures that requests that are _not_ split into batches still count
towards the concurrency limit that the client enforces.

This fixes an issue introduced in #5301

Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>

* Lint fix

Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>

* docs: fix link (#5379)

I think I've found a replacement for the dead link.

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>

* cache: do not copy data in groupcache (#5378)

Add an unsafe codec which uses the given byte slices directly to avoid
copying - we are doing ioutil.ReadAll() either way so there is no need
to copy anything.

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>

* fix ruler send empty alerts (#5377)

Signed-off-by: Ben Ye <ben.ye@bytedance.com>

* Add custom `errors` package with stack trace functionality (#5239)

* feat: a simple stacktrace utility

Signed-off-by: Bisakh Mondal <bisakhmondal00@gmail.com>

* feat: custom errors package with new, errorf, wrapping, unwrapping and stacktrace

Signed-off-by: Bisakh Mondal <bisakhmondal00@gmail.com>

* chore: update existing errors import (small subset)

Signed-off-by: Bisakh Mondal <bisakhmondal00@gmail.com>

* chore: update comments

Signed-off-by: Bisakh Mondal <bisakhmondal00@gmail.com>

* add errors into skip-files linter config

Signed-off-by: Bisakh Mondal <bisakhmondal00@gmail.com>

* introduce UnwrapTillCause to suffice the limitation of Unwrap

Signed-off-by: Bisakh Mondal <bisakhmondal00@gmail.com>

* Revert "chore: update existing errors import (small subset)"

This reverts commit d27f0177fe6c8a357ba10e4ac8bfee87c8bf985c.

Signed-off-by: Bisakh Mondal <bisakhmondal00@gmail.com>

* revert makefile && golangcilint file

Signed-off-by: Bisakh Mondal <bisakhmondal00@gmail.com>

* apply PR feedback

Signed-off-by: Bisakh Mondal <bisakhmondal00@gmail.com>

* stacktrace and errors test

Signed-off-by: Bisakh Mondal <bisakhmondal00@gmail.com>

* fix typo

Signed-off-by: Bisakh Mondal <bisakhmondal00@gmail.com>

* update stacktrace testing regex

Signed-off-by: Bisakh Mondal <bisakhmondal00@gmail.com>

* add lint ignore for standard errors import inside errors pkg

Signed-off-by: Bisakh Mondal <bisakhmondal00@gmail.com>

* [test files] add copyright headers

Signed-off-by: Bisakh Mondal <bisakhmondal00@gmail.com>

* add no lint to avoid false misspell detection of keyword Tast

Signed-off-by: Bisakh Mondal <bisakhmondal00@gmail.com>

* update stacktrace output test line number with regex pattern

Signed-off-by: Bisakh Mondal <bisakhmondal00@gmail.com>

* return pc slice with reduced capacity

Signed-off-by: Bisakh Mondal <bisakhmondal00@gmail.com>

* segregate formatted vs non formatted methods

Signed-off-by: Bisakh Mondal <bisakhmondal00@gmail.com>

* update with only f functions

Signed-off-by: Bisakh Mondal <bisakhmondal00@gmail.com>

* Group memcached keys based on server when performing batch gets (#5356)

* Group memcached keys based on server when performing batch gets

Order and group keys during batch get operations based on the memcached
server they will be sharded to. This reduces the number of connections
that must be made within each batch of get operations.

Fixes #5353

Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>
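
A minimal sketch of ordering keys by the server they map to, so that each batch cut from the ordered list touches fewer servers. The FNV-modulo selection in `pickServer` is an assumption made for illustration and is not how the memcached client actually shards keys; `groupKeysByServer` and `serverSwitches` are hypothetical names.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sort"
)

// pickServer maps a key to a server using a simple hash-modulo scheme.
// This selection scheme is illustrative only.
func pickServer(key string, servers []string) string {
	if len(servers) == 0 {
		return ""
	}
	h := fnv.New32a()
	_, _ = h.Write([]byte(key))
	return servers[int(h.Sum32()%uint32(len(servers)))]
}

// groupKeysByServer returns a copy of keys ordered so that keys mapping to
// the same server are adjacent. The original relative order is kept within
// each server's group.
func groupKeysByServer(keys, servers []string) []string {
	sorted := append([]string(nil), keys...)
	sort.SliceStable(sorted, func(i, j int) bool {
		return pickServer(sorted[i], servers) < pickServer(sorted[j], servers)
	})
	return sorted
}

// serverSwitches counts how often adjacent keys map to different servers.
func serverSwitches(keys, servers []string) int {
	n := 0
	for i := 1; i < len(keys); i++ {
		if pickServer(keys[i], servers) != pickServer(keys[i-1], servers) {
			n++
		}
	}
	return n
}

func main() {
	servers := []string{"mc-0:11211", "mc-1:11211", "mc-2:11211"}
	var keys []string
	for i := 0; i < 12; i++ {
		keys = append(keys, fmt.Sprintf("key-%d", i))
	}
	grouped := groupKeysByServer(keys, servers)
	fmt.Println("switches before:", serverSwitches(keys, servers))
	fmt.Println("switches after: ", serverSwitches(grouped, servers))
}
```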

* Code review changes

Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>

* Fix error in testutil method added

Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>

* Code review: comments for selector interface

Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>

* QueryFrontend: pre-compile regexp (#5383)

* pre compile regexp

Signed-off-by: Jin Dong <djdongjin95@gmail.com>

* rename oppattern to labelvaluespattern

Signed-off-by: Jin Dong <djdongjin95@gmail.com>

* [FEAT] adding thanos consul blogpost (#5387)

Signed-off-by: Nicolas Takashi <nicolas.tcs@hotmail.com>

* Fix empty $externalLabels when templating labels in rule. (#5394)

Signed-off-by: Rostislav Benes <r.dee.b.b@gmail.com>

Co-authored-by: Rostislav Benes <r.dee.b.b@gmail.com>

* support series relabeling on Thanos receiver (#5391)

* support series relabeling on Thanos receiver

Signed-off-by: Ben Ye <ben.ye@bytedance.com>

* add changelog

Signed-off-by: Ben Ye <ben.ye@bytedance.com>

* fix lint

Signed-off-by: Ben Ye <ben.ye@bytedance.com>

* update lint

Signed-off-by: Ben Ye <ben.ye@bytedance.com>

* fix e2e test

Signed-off-by: Ben Ye <ben.ye@bytedance.com>

* fix relabel config pass

Signed-off-by: Ben Ye <ben.ye@bytedance.com>

* cleanup white space

Signed-off-by: Ben Ye <ben.ye@bytedance.com>

* address review comments

Signed-off-by: Ben Ye <ben.ye@bytedance.com>

* address comments

Signed-off-by: Ben Ye <ben.ye@bytedance.com>

* update comment

Signed-off-by: Ben Ye <ben.ye@bytedance.com>

* Expose GatherFileStats. (#5400)

Signed-off-by: Peter Štibraný <pstibrany@gmail.com>

* Rule: Error out earlier when building alertmanager config (#5405)

* Error out earlier when building alertmanager config

Signed-off-by: Jéssica Lins <jessicaalins@gmail.com>

* Add test case for empty host

Signed-off-by: Jéssica Lins <jlins@redhat.com>

* [5130] [.*:] Upgrade Minio used for local development and e2e tests (#5392)

* add updated bingo .gitignore

Signed-off-by: B0go <victorbogo@icloud.com>

* update bingo minio version to commit 91130e884b5df59d66a45a0aad4f48db88f5ca63

Signed-off-by: B0go <victorbogo@icloud.com>

* trigger CI

Signed-off-by: B0go <victorbogo@icloud.com>

* Submit a proposal for vertical query sharding (#5350)

Signed-off-by: fpetkovski <filip.petkovsky@gmail.com>

* query: Close() after using query (#5410)

* query: Close() after using query

This should reduce memory usage because Close() returns points back to a
sync.Pool.

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
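
Why calling `Close()` can reduce memory usage is easiest to see in a small sketch of the pooling pattern. The `query` type, `pointPool` and `add` below are hypothetical and much simpler than the Prometheus query engine; only `sync.Pool` itself is the real standard-library type.

```go
package main

import (
	"fmt"
	"sync"
)

// pointPool holds reusable sample buffers (illustrative only).
var pointPool = sync.Pool{
	New: func() interface{} {
		s := make([]float64, 0, 64)
		return &s
	},
}

// query borrows a buffer from the pool for the duration of its evaluation.
type query struct {
	points *[]float64
}

func newQuery() *query {
	return &query{points: pointPool.Get().(*[]float64)}
}

func (q *query) add(v float64) {
	*q.points = append(*q.points, v)
}

// Close hands the buffer back to the pool so a later query can reuse it
// instead of allocating a new one. Calling Close twice is a no-op.
func (q *query) Close() {
	if q.points == nil {
		return
	}
	*q.points = (*q.points)[:0]
	pointPool.Put(q.points)
	q.points = nil
}

func main() {
	q := newQuery()
	q.add(1)
	q.add(2)
	fmt.Println("points:", len(*q.points))
	q.Close()
}
```

If `Close()` is never called, the buffer is simply garbage collected and the pool has nothing to hand out, so every query allocates afresh.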

* CHANGELOG: add item

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>

* query: call Close() in gRPC API too

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>

* avoided potential panic due to divide by 0 (#5412)

Signed-off-by: Aditi Ahuja <ahuja.aditi@gmail.com>

* sidecar/compact/store/receiver - Add the prefix option to buckets (#5337)

* Create prefixed bucket

Signed-off-by: jademcosta <jade.costa@nubank.com.br>

* started PrefixedBucket tests

Signed-off-by: Maria Eduarda Duarte <dudammduarte@yahoo.com.br>

* finish objstore tests

Signed-off-by: Maria Eduarda Duarte <dudammduarte@yahoo.com.br>

* Simplify string removal logic

Signed-off-by: jademcosta <jade.costa@nubank.com.br>

* Test more prefix cases on PrefixedBucket

Signed-off-by: jademcosta <jade.costa@nubank.com.br>

* Only use a prefixedbucket if we have a valid prefix

Signed-off-by: jademcosta <jade.costa@nubank.com.br>

* Add single unit test for prefixedBucket prefix

Signed-off-by: jademcosta <jade.costa@nubank.com.br>

* test other prefixes on UsesPrefixTest

Signed-off-by: Maria Eduarda Duarte <dudammduarte@yahoo.com.br>

* add remaining methods to UsesPrefixTest

Signed-off-by: Maria Eduarda Duarte <dudammduarte@yahoo.com.br>

* add prefix to docs examples

Signed-off-by: Maria Eduarda Duarte <dudammduarte@yahoo.com.br>

* Simplify Iter method

Signed-off-by: jademcosta <jade.costa@nubank.com.br>

* add prefix explanation to S3 docs

Signed-off-by: Maria Eduarda Duarte <dudammduarte@yahoo.com.br>

* Conclusion of prefix sentence on docs

Signed-off-by: jademcosta <jade.costa@nubank.com.br>

* Use DirDelim instead of magic string

Signed-off-by: jademcosta <jade.costa@nubank.com.br>

* Add log when using prefixed bucket

Signed-off-by: jademcosta <jade.costa@nubank.com.br>

* Remove "@" from test string to make them simpler

Signed-off-by: jademcosta <jade.costa@nubank.com.br>

* fix BucketConfig Config type - back to interface

Signed-off-by: Maria Eduarda Duarte <dudammduarte@yahoo.com.br>

* add changelog

Signed-off-by: Maria Eduarda Duarte <dudammduarte@yahoo.com.br>

* add missing checks in UsesPrefixTest

Signed-off-by: Maria Eduarda Duarte <dudammduarte@yahoo.com.br>

* fix linter and test errors

Signed-off-by: Maria Eduarda Duarte <dudammduarte@yahoo.com.br>

* Add license to new files

Signed-off-by: jademcosta <jade.costa@nubank.com.br>

* Remove autogenerated docs

Signed-off-by: jademcosta <jade.costa@nubank.com.br>

* Remove duplicated transformation of string->[]byte

Signed-off-by: jademcosta <jade.costa@nubank.com.br>

* Add prefixed bucket on all e2e tests for S3

The idea is that if it works, we can add for all other providers.

Signed-off-by: jademcosta <jade.costa@nubank.com.br>

* Add e2e tests using prefixed bucket to all providers

Signed-off-by: jademcosta <jade.costa@nubank.com.br>

* refactor: move validPrefix to prefixed_bucket logic

Signed-off-by: Maria Eduarda Duarte <dudammduarte@yahoo.com.br>

* Enhance the documentation about prefix.

Signed-off-by: jademcosta <jademcosta@gmail.com>

* Fix format

Signed-off-by: jademcosta <jademcosta@gmail.com>

* Add prefix entry on bucket config example

Signed-off-by: jademcosta <jade.costa@nubank.com.br>

* Removing redundancies on prefix checks and tests

We already check that the prefix is not empty when creating the bucket.

Signed-off-by: jademcosta <jade.costa@nubank.com.br>

* Remove redundant YAML unmarshal

Signed-off-by: jademcosta <jade.costa@nubank.com.br>

* Remove unused parameter

Signed-off-by: jademcosta <jade.costa@nubank.com.br>

* Remove docs that should be auto-generated

Signed-off-by: jademcosta <jade.costa@nubank.com.br>

* refactor: move prefix to config root level

Signed-off-by: Maria Eduarda Duarte <dudammduarte@yahoo.com.br>

* add auto generated docs

Signed-off-by: Maria Eduarda Duarte <dudammduarte@yahoo.com.br>

* fix changelog

Signed-off-by: Maria Eduarda Duarte <dudammduarte@yahoo.com.br>

Co-authored-by: Maria Eduarda Duarte <dudammduarte@yahoo.com.br>

* Ruler: Change default evaluation interval to 1m (#5417)

* Change default eval interval to 1m

Signed-off-by: Matej Gera <matejgera@gmail.com>

* Update CHANGELOG

Signed-off-by: Matej Gera <matejgera@gmail.com>

* Updates busybox SHA (#5423)

Signed-off-by: GitHub <noreply@github.com>

Co-authored-by: yeya24 <yeya24@users.noreply.github.com>

* receive: Added Ketama Consistent hashing (#5408)

* Add support for consistent hashing in receivers

This commit adds support for distributing series in Receivers using
consistent hashing based on the libketama algorithm.

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>

* Use require package for test assertions

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>

* Rename algorithm from consistent to ketama

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>

* S3: Add config option to enforce the minio DNS lookup (#5409)

* Add config option to enforce the minio DNS lookup

Signed-off-by: Jakob Hahn <jakob.hahn@hetzner.com>

* Use enums instead of boolean for bucket_lookup_type

Signed-off-by: Jakob Hahn <jakob.hahn@hetzner.com>

* Expose tsdb status in receiver (#5402)

* Expose tsdb status in receiver

This commit implements the api/v1/status/tsdb API in the Receiver.

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>

* Add docs and todo

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>

* Fix tests

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>

* Receive: option to extract tenant from client certificate (#5153)

* added option to extract tenant from client certificate

Signed-off-by: Magnus Kaiser <magnus.kaiser@gec.io>

* added suggestions from PR

Signed-off-by: Magnus Kaiser <magnus.kaiser@gec.io>

* removed else cases

Signed-off-by: Magnus Kaiser <magnus.kaiser@gec.io>

* corrected location of certificate field check

Signed-off-by: Magnus Kaiser <magnus.kaiser@gec.io>

* fixed issue with err definition

Signed-off-by: Magnus Kaiser <magnus.kaiser@gec.io>

* updated docs

Signed-off-by: Magnus Kaiser <magnus.kaiser@gec.io>

* corrected comment

Signed-off-by: Magnus Kaiser <magnus.kaiser@gec.io>

Co-authored-by: Magnus Kaiser <magnus.kaiser@gec.io>

* Improve ketama hashring replication (#5427)

With the Ketama hashring, replication is currently handled by choosing
subsequent nodes in the list of endpoints. This can lead to existing nodes
getting more series when the hashring is scaled.

This commit changes replication to choose subsequent nodes from the hashring
which should not create new series in old nodes when the hashring is scaled.

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>
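
The replication change can be sketched with a toy hash ring: replicas are found by walking the ring from the key's position and collecting distinct nodes, rather than by taking the next entries in the configured endpoint list. This is an illustration only. FNV hashing, the `section`/`ring` types and the section count are assumptions, and the real implementation follows the libketama algorithm.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sort"
)

// section is one point on the ring owned by a node.
type section struct {
	hash uint64
	node string
}

type ring struct {
	sections []section
}

func hashOf(s string) uint64 {
	h := fnv.New64a()
	_, _ = h.Write([]byte(s))
	return h.Sum64()
}

// newRing places sectionsPerNode points on the ring for every node.
func newRing(nodes []string, sectionsPerNode int) *ring {
	r := &ring{}
	for _, n := range nodes {
		for i := 0; i < sectionsPerNode; i++ {
			r.sections = append(r.sections, section{hash: hashOf(fmt.Sprintf("%s-%d", n, i)), node: n})
		}
	}
	sort.Slice(r.sections, func(i, j int) bool { return r.sections[i].hash < r.sections[j].hash })
	return r
}

// replicas returns up to n distinct nodes for key, walking the ring
// clockwise from the first section whose hash is >= the key's hash.
func (r *ring) replicas(key string, n int) []string {
	if len(r.sections) == 0 || n <= 0 {
		return nil
	}
	h := hashOf(key)
	start := sort.Search(len(r.sections), func(i int) bool { return r.sections[i].hash >= h })
	seen := map[string]bool{}
	var out []string
	for i := 0; i < len(r.sections) && len(out) < n; i++ {
		s := r.sections[(start+i)%len(r.sections)]
		if !seen[s.node] {
			seen[s.node] = true
			out = append(out, s.node)
		}
	}
	return out
}

func main() {
	r := newRing([]string{"receive-0", "receive-1", "receive-2"}, 50)
	fmt.Println(r.replicas(`http_requests_total{job="api"}`, 2))
}
```

Because the replica set is derived from ring positions, adding a node only inserts new sections into the walk; it does not shift which list neighbour an existing node replicates to.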

* Cut readme for 0.27 (#5429)

Signed-off-by: Wiard van Rij <wvanrij@roku.com>

* Added alert compliance test for Thanos (#5315)

* test: Added Alert compatibility test.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Tmp.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Update.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* update.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* update.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* e2e: Refactored service helpers for newest e2e version.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Removed alert compatibility test for now.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* e2e: Added test for compatibility.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Added Querier /alerts API.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* e2e: Added replica labels.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Option to remove replica-label.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* skip.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Use stateful ruler and default resend delay

Signed-off-by: Matej Gera <matejgera@gmail.com>

* Update docs

Signed-off-by: Matej Gera <matejgera@gmail.com>

Co-authored-by: Matej Gera <matejgera@gmail.com>

* 0.27-rc0 Update readme and version (#5430)

* Update readme and version

Signed-off-by: Wiard van Rij <wvanrij@roku.com>

* Fix newlines

Signed-off-by: Wiard van Rij <wvanrij@roku.com>

* Fixes typo

Signed-off-by: Wiard van Rij <wvanrij@roku.com>

* fixes noise

Signed-off-by: Wiard van Rij <wvanrij@roku.com>

* Alert Compliance: Fix wrong ruler configuration (#5433)

* [receive] Export metrics about remote write requests per tenant (#5424)

* Add write metrics to Thanos Receive

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Let the middleware count inflight HTTP requests

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Update Receive write metrics type & definition

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Put option back in its place to avoid big diff

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Fetch tenant from headers instead of context

It might not be in the context in some cases.

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Delete unnecessary tenant parser middleware

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Refactor & reuse code for HTTP instrumentation

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Add missing copyright to some files

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Add changelog entry for Receive & new HTTP metrics

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Remove TODO added by accident

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Make error handling code shorter

Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>
Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Make switch statement simpler

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Remove method label from timeseries' metrics

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Count samples of all series instead of each

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Remove in-flight requests metric

Will add this in a follow-up PR to keep this small.

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Change timeseries/samples metrics to histograms

The buckets were picked based on the fact that Prometheus' default
remote write configuration
(see https://prometheus.io/docs/practices/remote_write/#memory-usage)
sets a max of 500 samples sent per second.

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Fix Prometheus registry for histograms

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Fix comment in NewHandler functions

There are now four metrics instead of five.

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>

* remove unused block-sync-concurrency flag (#5426)

* remove unused block-sync-concurrency flag

Signed-off-by: Ben Ye <ben.ye@bytedance.com>

* add changelog

Signed-off-by: Ben Ye <ben.ye@bytedance.com>

* update

Signed-off-by: Ben Ye <ben.ye@bytedance.com>

* fix e2e test

Signed-off-by: Ben Ye <ben.ye@bytedance.com>

* fix tests

Signed-off-by: Ben Ye <ben.ye@bytedance.com>

* fix docs typo in metric thanos_compact_halted (#5448)

Signed-off-by: Nikita Matveenko <nikitapecasa@gmail.com>

* Implement tenant expiration (#5420)

* Implement tenant expiration

This commit adds dynamic TSDB pruning for tenants which have not
received new samples within a certain period of time.

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>

* Add link to receiver tenant-lifecycle-management

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>

* Docs: Remove Katacoda links (#5454)

* Remove Katacoda links

Signed-off-by: Matej Gera <matejgera@gmail.com>

* Remove one more reference

Signed-off-by: Matej Gera <matejgera@gmail.com>

* Fixed lint on Go 1.18.3+ (#5459)

Signed-off-by: bwplotka <bwplotka@gmail.com>

* Add HTTP metrics for in-flight requests (#5440)

* Add HTTP metrics for in-flight requests

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Fix changelog entry after PR creation

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Fix link in old CHANGELOG entry

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Fix style in the CHANGELOG

All the entries should end up with a period.

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Improve help for in-flight HTTP requests metric

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Move changelog entry pending PR

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Add a method label to the in-flight HTTP requests

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* docs: Fix heading level of "Excluded from caching" (#5455)

* Refactor DefaultTransport() from objstore to package exthttp (#5447)

* Refactoring the DefaultTransport func in package exthttp

Signed-off-by: Srushti Sapkale <srushtiisapkale@gmail.com>

* Refactoring the DefaultTransport func from s3 in package exthttp

Signed-off-by: Srushti Sapkale <srushtiisapkale@gmail.com>

* Updated helpers.go

corrected argument for DefaultTransport() in helpers.go

Signed-off-by: Srushti (sroo-sh-tee) <73685894+SrushtiSapkale@users.noreply.github.com>

* Changed the argument type in getContainerURL

Signed-off-by: Srushti Sapkale <srushtiisapkale@gmail.com>

* Update pkg/exthttp/transport.go

Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>

Signed-off-by: Srushti (sroo-sh-tee) <73685894+SrushtiSapkale@users.noreply.github.com>

* Update pkg/exthttp/transport.go

Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>

Signed-off-by: Srushti (sroo-sh-tee) <73685894+SrushtiSapkale@users.noreply.github.com>

* Removed the use of NewTransport() in cos.go

Signed-off-by: Srushti Sapkale <srushtiisapkale@gmail.com>

* Moved TLSConfig struct and functions that need it from objstore to exthttp

Signed-off-by: Srushti Sapkale <srushtiisapkale@gmail.com>

* Changed s3.go

Signed-off-by: Srushti Sapkale <srushtiisapkale@gmail.com>

* Kept s3.go and helpers.go unchanged to not break the cortex deps

Signed-off-by: Srushti Sapkale <srushtiisapkale@gmail.com>

* Consistency changes made while pair++ programming.

Signed-off-by: bwplotka <bwplotka@gmail.com>

* Created a new tlsconfig in exthttp and minor changes in cos.go

Signed-off-by: Srushti Sapkale <srushtiisapkale@gmail.com>

* Commented in s3.go

Signed-off-by: Srushti Sapkale <srushtiisapkale@gmail.com>

* Minor changes in transport.go

Signed-off-by: Srushti Sapkale <srushtiisapkale@gmail.com>

* Changed transport.go

Signed-off-by: Srushti Sapkale <srushtiisapkale@gmail.com>

* Changed transport.go and tlsconfig.go

Signed-off-by: Srushti Sapkale <srushtiisapkale@gmail.com>

* Removed changes from prometheus.mod and prometheus.sum

Signed-off-by: Srushti Sapkale <srushtiisapkale@gmail.com>

* Minor update in cos.go

Signed-off-by: Srushti Sapkale <srushtiisapkale@gmail.com>

Co-authored-by: bwplotka <bwplotka@gmail.com>

* receive: Fix race condition when pruning tenants (#5460)

Pruning Receiver tenants has a race condition caused by concurrently
removing items from the tenants map.

This commit addresses the issue by using a mutex to guard the tenants map.

Signed-off-by: fpetkovski <filip.petkovsky@gmail.com>

* Adding SCMP as an adopter (#5466)

Signed-off-by: Chris Ng <2509212+chris-ng-scmp@users.noreply.github.com>

* Updated busybox version. (#5471)

Signed-off-by: bwplotka <bwplotka@gmail.com>

* Refactor endpoint ref clients

Signed-off-by: Matej Gera <matejgera@gmail.com>

* Fix E2E test env name clash

Signed-off-by: Matej Gera <matejgera@gmail.com>

* Build with Go 1.18 (#5258)

* Build with Go 1.18

Signed-off-by: Sylvain Rabot <sylvain@abstraction.fr>

* Try something

Signed-off-by: Sylvain Rabot <sylvain@abstraction.fr>

* Upgrade minio

Signed-off-by: Sylvain Rabot <sylvain@abstraction.fr>

* Replace json-iterator/reflect2 in bingo

Signed-off-by: Sylvain Rabot <sylvain@abstraction.fr>

* Ignore 405 errors for prometheus buildVersion API requests (#5477)

Older versions of prometheus (such as 2.7 which is shipped by Debian
buster) return a 405 error for non-existent API endpoints instead of the
404 returned by more recent versions.

Signed-off-by: Nicolas Dandrimont <olasd@softwareheritage.org>
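
As a rough sketch of the behaviour this commit describes, the check below treats both 404 and 405 as "endpoint not available". The function name `buildVersionUnsupported` is an assumption made for illustration and does not come from the Thanos code base.

```go
package main

import (
	"fmt"
	"net/http"
)

// buildVersionUnsupported reports whether a status code suggests that the
// queried Prometheus does not serve the build-version endpoint at all.
// Recent versions answer 404 for unknown endpoints; older ones answer 405.
func buildVersionUnsupported(statusCode int) bool {
	switch statusCode {
	case http.StatusNotFound, http.StatusMethodNotAllowed:
		return true
	default:
		return false
	}
}

func main() {
	for _, code := range []int{200, 404, 405, 500} {
		fmt.Printf("status %d: unsupported=%v\n", code, buildVersionUnsupported(code))
	}
}
```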

* *: Cut 0.27.0 (#5473)

* Cut 0.27.0

Signed-off-by: Matej Gera <matejgera@gmail.com>

* Updated busybox version. (#5471)

Signed-off-by: bwplotka <bwplotka@gmail.com>
Signed-off-by: Matej Gera <matejgera@gmail.com>

* Docs: Remove Katacoda links (#5454)

* Remove Katacoda links

Signed-off-by: Matej Gera <matejgera@gmail.com>

* Remove one more reference

Signed-off-by: Matej Gera <matejgera@gmail.com>

Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>
Signed-off-by: Matej Gera <matejgera@gmail.com>

* Update compact.md (#5465)

* During 1h downsampling skip XOR chunks that may erroneously be present in 5m resolution blocks (#5453)

* Add fpetkovski to triage list

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>

* Use Azure BlobURL.Download instead of in-memory buffer (#5451)

Modify the azure.Bucket get methods to use BlobURL.Download for fetching
blobs and blob ranges. This avoids the need to allocate a buffer for storing
the entire expected size of the object in memory. Instead, use a ReaderCloser
view of the body returned by the download method.

See grafana/mimir#2229

Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>

* Update storage.md (#5486)

* [receive] Add per-tenant charts to Receive's example dashboard  (#5472)

* Start to add tenant charts to Receive

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Properly filter HTTP status codes

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Fix tenant error rate chart

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Refactor to improve readability and consistency

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Refactor one more usage of code and tenant labels

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Filter tenant metrics to the Receive handler

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Format math expression properly

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Update CHANGELOG

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Add samples charts to series & samples row

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Bump Go version in all the GH Actions (#5487)

* Bump go version in go mod

This is a follow up to #5258, which made the project be built with Go 1.18.

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Update Go version in all GH Actions

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Run go mod tidy

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Added changelog entry

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Put back Go 1.17 in go.mod

We don't use any Go 1.18 features yet, so it's not needed.

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Update go.sum after changing go.mod to go 1.17

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Remove non-user-impacting entry for changelog

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* objstore: Download and Upload block files in parallel (#5475)

* Parallel Chunks

Signed-off-by: Alan Protasio <approtas@amazon.com>
Signed-off-by: alanprot <alanprot@gmail.com>
Signed-off-by: Alan Protasio <approtas@amazon.com>

* test

Signed-off-by: Alan Protasio <approtas@amazon.com>
Signed-off-by: alanprot <alanprot@gmail.com>
Signed-off-by: Alan Protasio <approtas@amazon.com>

* Changelog

Signed-off-by: Alan Protasio <approtas@amazon.com>
Signed-off-by: alanprot <alanprot@gmail.com>
Signed-off-by: Alan Protasio <approtas@amazon.com>

* making ApplyDownloadOptions private

Signed-off-by: Alan Protasio <approtas@amazon.com>
Signed-off-by: alanprot <alanprot@gmail.com>
Signed-off-by: Alan Protasio <approtas@amazon.com>

* upload concurrency

Signed-off-by: alanprot <alanprot@gmail.com>
Signed-off-by: Alan Protasio <approtas@amazon.com>
Signed-off-by: alanprot <alanprot@gmail.com>
Signed-off-by: Alan Protasio <approtas@amazon.com>

* Upload Test

Signed-off-by: Alan Protasio <approtas@amazon.com>
Signed-off-by: alanprot <alanprot@gmail.com>
Signed-off-by: Alan Protasio <approtas@amazon.com>

* update change log

Signed-off-by: Alan Protasio <approtas@amazon.com>
Signed-off-by: alanprot <alanprot@gmail.com>
Signed-off-by: Alan Protasio <approtas@amazon.com>

* Change comments

Signed-off-by: Alan Protasio <approtas@amazon.com>
Signed-off-by: alanprot <alanprot@gmail.com>
Signed-off-by: Alan Protasio <approtas@amazon.com>

* Address comments

Signed-off-by: Alan Protasio <approtas@amazon.com>
Signed-off-by: alanprot <alanprot@gmail.com>
Signed-off-by: Alan Protasio <approtas@amazon.com>

* Remove duplicate entries on changelog

Signed-off-by: Alan Protasio <approtas@amazon.com>
Signed-off-by: alanprot <alanprot@gmail.com>
Signed-off-by: Alan Protasio <approtas@amazon.com>

* Addressing Comments

Signed-off-by: alanprot <alanprot@gmail.com>
Signed-off-by: Alan Protasio <approtas@amazon.com>

* update golang.org/x/sync

Signed-off-by: alanprot <alanprot@gmail.com>
Signed-off-by: Alan Protasio <approtas@amazon.com>

* Adding Comments

Signed-off-by: Alan Protasio <approtas@amazon.com>

* Use default HTTP config for E2E S3 tests (#5483)

Signed-off-by: Matej Gera <matejgera@gmail.com>

* chore: Included githubactions in the dependabot config (#5364)

This should help with keeping the GitHub actions updated on new releases. This will also help with keeping it secure.

Dependabot helps in keeping the supply chain secure https://docs.github.com/en/code-security/dependabot

GitHub actions up to date https://docs.github.com/en/code-security/dependabot/working-with-dependabot/keeping-your-actions-up-to-date-with-dependabot

https://github.com/ossf/scorecard/blob/main/docs/checks.md#dependency-update-tool
Signed-off-by: naveensrinivasan <172697+naveensrinivasan@users.noreply.github.com>

* bump codemirror and promql editor to the last version (#5491)

Signed-off-by: Augustin Husson <husson.augustin@gmail.com>

* receiver: Expose stats for all tenants (#5470)

* receiver: Expose stats for all tenants

Thanos Receiver supports the Prometheus tsdb status API and can expose
TSDB stats for a single tenant.

This commit extends that functionality and allows users to request
TSDB stats for all tenants using the all_tenants=true query parameter.

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>
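
A minimal sketch of how a handler might detect the `all_tenants=true` query parameter mentioned above. The helper name `wantsAllTenants` and the strict comparison against the string `true` are assumptions for illustration; the actual Receiver handler may parse the parameter differently.

```go
package main

import (
	"fmt"
	"net/url"
)

// wantsAllTenants reports whether the raw query string asks for stats
// across all tenants. Anything other than all_tenants=true falls back
// to the single-tenant behaviour.
func wantsAllTenants(rawQuery string) bool {
	values, err := url.ParseQuery(rawQuery)
	if err != nil {
		return false
	}
	return values.Get("all_tenants") == "true"
}

func main() {
	for _, q := range []string{"", "all_tenants=true", "all_tenants=false", "limit=10&all_tenants=true"} {
		fmt.Printf("%q -> %v\n", q, wantsAllTenants(q))
	}
}
```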

* Add back chunk count

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>

* Simplify TSDBStats interface

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>

* Return empty result for no stats

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>

* CHANGELOG.md: regenerate (#5495)

* receive: Fix stats nil pointer panic (#5494)

When fetching TSDB stats from receivers, certain TSDBs might not be
initialized yet. This can lead to a nil pointer access when the
status endpoint is accessed before all TSDBs are initialized.

This commit adds an explicit check for each tenant's TSDB when
exporting TSDB stats.

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>
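
The explicit check can be sketched as follows, assuming a simplified model in which an uninitialized tenant TSDB is represented by a nil pointer. The types `tenantDB` and `stats` and the function `collectStats` are hypothetical and only illustrate the guard.

```go
package main

import "fmt"

// stats is a hypothetical, reduced stand-in for TSDB statistics.
type stats struct {
	NumSeries int
}

// tenantDB is a hypothetical per-tenant handle; head stays nil until
// the underlying TSDB has finished initializing.
type tenantDB struct {
	head *stats
}

// collectStats gathers stats for every initialized tenant and skips
// the ones that are not ready, instead of dereferencing a nil pointer.
func collectStats(dbs map[string]*tenantDB) map[string]stats {
	out := make(map[string]stats, len(dbs))
	for id, db := range dbs {
		if db == nil || db.head == nil {
			continue // TSDB not initialized yet
		}
		out[id] = *db.head
	}
	return out
}

func main() {
	dbs := map[string]*tenantDB{
		"ready":    {head: &stats{NumSeries: 42}},
		"starting": {head: nil},
		"missing":  nil,
	}
	fmt.Println(collectStats(dbs))
}
```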

* Update query.md (#5496)

Fix typo of parameter --store.sd-files

Signed-off-by: Firxiao <Firxiao@users.noreply.github.com>

* Parallel download blocks - Follow up of #5475 (#5493)

* Download blocks in parallel

Signed-off-by: Alan Protasio <approtas@amazon.com>

* remove the go func

Signed-off-by: Alan Protasio <approtas@amazon.com>

* Doc

Signed-off-by: Alan Protasio <approtas@amazon.com>

* CHANGELOG

Signed-off-by: Alan Protasio <approtas@amazon.com>

* doc

Signed-off-by: alanprot <alanprot@gmail.com>

* Address comments

Signed-off-by: alanprot <alanprot@gmail.com>

* fix typo

Signed-off-by: Alan Protasio <approtas@amazon.com>

* Upgrade mdox with cache and some http settings to reduce CI failures (#5500)

* Pin mdox to latest master commit

It now supports a cache for link validation and some HTTP
configuration that can be used to help avoid intermittent
CI failures.

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Add mdox cache and HTTP configuration

The cache has a default TTL (5 days)

A timeout of 1m and 10 connections per host at transport
level should help us reduce the intermittent failures if
we have to invalidate the cache.

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Add Github Action cache for the mdox cache

Using the hash of the md files as cache key.

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Upgrade cache actions to v3 and add restore key

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Empty commit to test CI build cache

Signed-off-by: GitHub <noreply@github.com>

* Use 2.5 days as jitter for mdox cache

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Fix bad editor auto-formatting again

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Updated minio-go to latest; removed fork. (#5474)

* Updated minio-go fork to latest.

NOTE: Optimization is proposed to upstream to avoid fork in future.

Relates to https://github.com/thanos-io/thanos/issues/5101 and https://github.com/thanos-io/thanos/issues/5130

Signed-off-by: bwplotka <bwplotka@gmail.com>


* Removed fork.

Signed-off-by: bwplotka <bwplotka@gmail.com>

* Added comment.

Signed-off-by: bwplotka <bwplotka@gmail.com>

* Receiver: Handle storage exemplar multi-error (#5502)

* Handle exemplar store errors as conflict

Signed-off-by: Matej Gera <matejgera@gmail.com>

* Adjust tests

Signed-off-by: Matej Gera <matejgera@gmail.com>

* Update CHANGELOG

Signed-off-by: Matej Gera <matejgera@gmail.com>

* Fixing race condition introduced by #5493 (#5503)

* Update busybox image versions (#5506)

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

* Updates busybox SHA (#5507)

Signed-off-by: GitHub <noreply@github.com>

Co-authored-by: yeya24 <yeya24@users.noreply.github.com>

* chore: Update Prometheus dependency (#5484)

* chore: Update Prometheus dependency

Update Prometheus from v2.33.5 to v2.36.2.

Signed-off-by: SuperQ <superq@gmail.com>

* Update query tests for cortex changes.

Signed-off-by: SuperQ <superq@gmail.com>

* Use the default rules.RuleGroupPostProcessFunc.

Signed-off-by: SuperQ <superq@gmail.com>

* Update QueryStats use.

Signed-off-by: SuperQ <superq@gmail.com>

* Update Cortex.

Signed-off-by: SuperQ <superq@gmail.com>

* Update queryfrontend for Cortex changes.

Signed-off-by: SuperQ <superq@gmail.com>

* Bump pprof.

Signed-off-by: SuperQ <superq@gmail.com>

* Add changelog entry.

Signed-off-by: SuperQ <superq@gmail.com>

* Adapt to changed query stats API

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

* Sync dependencies

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

* Reflect changed metric names

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

Co-authored-by: Kemal Akkoyun <kakkoyun@gmail.com>
Co-authored-by: Kemal Akkoyun <kakkoyun@users.noreply.github.com>

* chore: Vendor Cortex dependency as an internal package (#5504)

* Vendor Cortex dependency as an internal package

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

* Add gitattributes

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

* Skip checks for vendored directory

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

* Add copyright headers for Cortex

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

* *: Move objstore out of repo (#5510)

* *: Move objstore out of repo

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

* Fix doc checks

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

* chore: Update Prometheus to v2.37.0 (#5511)

* chore: Update Prometheus to v2.37.0

Update Prometheus to the latest release. Note that Prometheus
upstream now tags v0.x.y to map to the 2.x.y releases.

Signed-off-by: SuperQ <superq@gmail.com>

* Cleanup direct/indirect go.mod requirements.

Signed-off-by: SuperQ <superq@gmail.com>

* chore: Update Go modules (#5516)

* Update weaveworks/common to remove node_exporter indirect dep.
* Update simonpasquier/klog-gokit/v2.
* Update google.golang.org/grpc lock to v1.45.0.
* Cleanup replacements that are now handled by indirect requirements.
* Fixup grpc.WithInsecure() use.

Signed-off-by: SuperQ <superq@gmail.com>

* chore: Update Go modules (#5518)

* Reuse upstream TSDB status structs (#5526)

This commit replaces the copied TSDB status structs with direct
references from prometheus/prometheus.

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>

* Fix proposal on website (#5530)

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Update all bingo dependencies (#5525)

This commit updates all bingo dependencies to their latest versions.

It pins golang.org/x/sys to v0.0.0-20220715151400-c0bba94af5f8 for
the github.com/google/go-jsonnet dependency in order to prevent
failures when running make docs on Mac OS.

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>

* delete_katacoda (#5529)

Signed-off-by: Akshit42-hue <patelakshit2025@gmail.com>

* Remove empty RuleGroups in api/v1/rules when using matchers (#5537)

* Remove empty RuleGroups in api/v1/rules

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Implement suggestion

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Rename variables

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* fix(api): When querying api query on endpoint alerts return a json struct with alerts in lowercase. (#5534)

This matches the result returned by the Prometheus API.
Signed-off-by: Guillaume audic <audic.gui@gmail.com>

* Receiver: Add benchmark for receive writer (#5533)

* Add benchmark for receive writer

Signed-off-by: Matej Gera <matejgera@gmail.com>

* Incorporate feedback

- Clearer parameter naming; use a separate temp dir for bench

Signed-off-by: Matej Gera <matejgera@gmail.com>

* Submit a proposal for Active Series Limiting for Hashring Topology (#5415)

* Add proposal for Active Series Limiting for Hashring Topology

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Resize images

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Add Observatorium as an alternative

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Implement suggestions; add TODO

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Update proposal

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Implement suggestions: add sections numbers

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Refactor EndpointSet (#5538)

* Refactor EndpointSet

This commit refactors the EndpointSet struct in order to make it easier
to understand and work with.

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>

* Handle context cancellation in endpoint mock

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>

* Make additions and removals of refs atomic.

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>

* Fix changed-docs grep regex (#5556)

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Added Vertical Query Sharding to Query-Frontend (#5342)

* Update faillint to v1.10.0

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>

* Implement query sharding

This commit implements query sharding for grouping PromQL expressions.

Sharding is initiated by analyzing the PromQL and extracting
grouping labels. Extracted labels are propagated down to Stores which
partition the response by hashmoding all series on those labels.

If a query is shardable, the partitioning and merging process will be
initiated by the Query Frontend. The Query Frontend will make N distinct
queries across a set of Queriers and merge the results back before
presenting them to the user.

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>
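
The hashmod partitioning described above can be sketched as below. This is an illustration under stated assumptions, not the Thanos implementation: it uses FNV-1a from the standard library as the hash, models a series as a plain `map[string]string`, and the name `shardFor` is hypothetical.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sort"
)

// shardFor hashes only the grouping labels of a series and reduces the
// hash modulo totalShards. Series that agree on the grouping labels end
// up in the same shard, so each shard can be aggregated independently.
func shardFor(series map[string]string, groupingLabels []string, totalShards uint64) uint64 {
	// Sort a copy so the result does not depend on the caller's label order.
	names := append([]string(nil), groupingLabels...)
	sort.Strings(names)

	h := fnv.New64a()
	for _, name := range names {
		h.Write([]byte(name))
		h.Write([]byte{0xff}) // separator to avoid ambiguous concatenations
		h.Write([]byte(series[name]))
		h.Write([]byte{0xff})
	}
	return h.Sum64() % totalShards
}

func main() {
	grouping := []string{"pod"}
	a := map[string]string{"__name__": "http_requests_total", "pod": "p1", "instance": "i1"}
	b := map[string]string{"__name__": "http_requests_total", "pod": "p1", "instance": "i2"}

	// Both series share pod="p1", so they land in the same shard.
	fmt.Println(shardFor(a, grouping, 4) == shardFor(b, grouping, 4))
}
```

For a query such as `sum by (pod) (http_requests_total)`, hashing on `pod` alone means no group is split across shards, which is what lets the Query Frontend merge the N partial results by simple concatenation.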

* First code review pass

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>

* Use sync pool to reuse sharding buffers

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>

* Add test for binary expression with constant

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>

* Include external labels in series sharding

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>

* Rule: Fix e2e test flake (#5558)

* Rule: Fix e2e test flake

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Fix lint

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Check errors

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Change to github.com/thanos-io/thanos/pkg/errors

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Implement suggestion

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Fix multi-tenant exemplar matchers (#5554)

* Fix multi-tenant exemplar matchers

The exemplar proxy synthesizes a query based on PromQL expression matchers
and individual store's label sets. When a store has multiple label sets
with same label names but different values (e.g. multitenant Receivers),
each exemplar matcher will be repeated once for each label set. Because of this,
a receiver hosting 200 tenants can get the same exemplar matcher 200 times. This leads
to the underlying stores slowing down and timing out when asked for exemplars.

This commit modifies the exemplar proxy to deduplicate matchers and only send
a matcher once to an underlying store.

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>
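
The deduplication step can be sketched as follows, assuming matchers are represented as plain strings. The function name `dedupeMatchers` is hypothetical; the real exemplar proxy works with structured matchers.

```go
package main

import "fmt"

// dedupeMatchers returns the input with repeated matchers removed,
// keeping the first occurrence of each and preserving order.
func dedupeMatchers(matchers []string) []string {
	seen := make(map[string]struct{}, len(matchers))
	out := make([]string, 0, len(matchers))
	for _, m := range matchers {
		if _, ok := seen[m]; ok {
			continue
		}
		seen[m] = struct{}{}
		out = append(out, m)
	}
	return out
}

func main() {
	// One copy of the same matcher per label set, as with a 200-tenant receiver.
	var matchers []string
	for i := 0; i < 200; i++ {
		matchers = append(matchers, `{job="api"}`)
	}
	fmt.Println(len(dedupeMatchers(matchers)))
}
```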

* Address CR comments

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>

* Receive: add per request limits for remote write (#5527)

* Add per request limits for remote write

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Remove useless TODO item

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Refactor write request limits test

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Add write concurrency limit to Receive

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Change write limits config option name

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Document remote write concurrency limit

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Add changelog entry

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Format docs

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Extract request limiting logic from handler

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Add copyright header

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Add a TODO for per-tenant limits

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Add default value and hide the request limit flags

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Improve TODO comment in request limits

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Update Receive docs after flags were made hidden

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Add note about WIP in Receive request limits doc

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Fix typo in Receive docs

Co-authored-by: Filip Petkovski <filip.petkovsky@gmail.com>

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Fix help text for concurrent request limit

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Use byte unit helpers for improved readability

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Removed check for nil writeGate

The constructor sets the writeGate to a noopGate.

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Better organize linebreaks

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Fix help text for limits hit metric

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Apply some English feedback

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Improve limits & gates documentation

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Fix import clause

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Use a 3 node hashring for write limits test

This should ensure the request fanout logic cannot somehow interfere
with the request limit logic.

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Fix comment

Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Announce sharding in ruler and store proxy (#5560)

The ruler and store proxy currently support series sharding
through the components that they use. However, this capability is not
announced to the querier.

This commit modifies their Info calls to indicate to the querier
that it doesn't need to shard the response it receives from rulers
and other store proxies.

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>

* Fix flaky e2e tests (#5563)

* Tools: Fix e2e test flake

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Metadata: Fix flaky e2e test

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Compact: Fix flaky e2e test

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Bumping actions/cache to v3 for e2e tests

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Add missing e2e.WaitMissingMetrics

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Meta-monitoring based active series limiting (#5520)

* Add initial PoC for meta-monitoring Receive active series limits

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Add e2e tests, rebase

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Add multitenant test + remake diagrams

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Implement suggestions; Make naming consistent; Rm/Add metrics

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Reuse meta-monitoring client

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Fix panic

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Cache meta-monitoring query result

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Fix lint

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Fail fast when limiting

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Implement suggestions: docs + mutex + struct

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Add interface and no-op

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Add changelog entry

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Add seriesLimitSupported to handler

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Remove tools fork

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Change docs header

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Remove usage of ioutil (#5564)

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* docs/contribution.md: Update required Go version  (#5557)

* delete_katacoda

Signed-off-by: Akshit42-hue <patelakshit2025@gmail.com>

* updated go version

Signed-off-by: Akshit42-hue <patelakshit2025@gmail.com>

* update golang version

Signed-off-by: Akshit42-hue <patelakshit2025@gmail.com>

* updated

Signed-off-by: Akshit42-hue <patelakshit2025@gmail.com>

* Retrigger CI

Signed-off-by: Akshit42-hue <patelakshit2025@gmail.com>

* Retrigger CI

Signed-off-by: Akshit42-hue <patelakshit2025@gmail.com>

* fix an expression param in a link to an alert in the rules page (#5562)

Signed-off-by: Rostislav Benes <r.dee.b.b@gmail.com>

Co-authored-by: Rostislav Benes <r.dee.b.b@gmail.com>

* Receiver: Validate labels in write requests (#5508)

* Add label set validation method

Signed-off-by: Matej Gera <matejgera@g…