From 60457e220b0e1f8e874b70ae3732e697827970f4 Mon Sep 17 00:00:00 2001
From: Xiang Dai <764524258@qq.com>
Date: Mon, 23 Dec 2019 14:26:39 +0800
Subject: [PATCH] docs: Document Thanos Sharding

Signed-off-by: Xiang Dai <764524258@qq.com>
---
 docs/sharing.md | 95 +++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 95 insertions(+)
 create mode 100644 docs/sharing.md

diff --git a/docs/sharing.md b/docs/sharing.md
new file mode 100644
index 00000000000..b49ef8cb0ea
--- /dev/null
+++ b/docs/sharing.md
@@ -0,0 +1,95 @@
+---
+title: Sharding
+type: docs
+menu: thanos
+slug: /Sharing.md
+---
+
+# Sharding
+
+Sharding is for Long Term Retention Storage.
+
+## Background
+
+Currently all components that read from object store assume that all operations and functionality should be based
+on **all** the available blocks that are present in a given bucket's root directory.
+
+This is in most cases totally fine, however with time and with blocks from multiple `Sources` being stored in the same bucket,
+the number of objects in a bucket can grow drastically.
+
+This means that with time you might want to scale out certain components, e.g.:
+
+* Compactor: A larger number of objects does not matter much, however the compactor has to scale (CPU, network) with the number of Sources pushing blocks to the object storage.
+If you have multiple sources handled by the same compactor, with slower network and CPU you might not compact/downsample quickly enough to cope with incoming blocks.
+  * This happens a lot if no compactor is deployed for longer periods and it thus has to quickly catch up with a large number of blocks (e.g. a couple of months).
+* Store Gateway: Queries against the store gateway which touch a large number of Sources might be expensive, so it has to scale up with the number of Sources if we assume those queries.
+  * Orthogonally we did not advertise any labels on Store Gateway's Info. This means that the querier was not able to do any pre-filtering, so all store gateways in the system are always touched for each query.
+
+### Reminder: What is a Source
+
+`Source` is any component that creates new metrics in the form of Thanos TSDB blocks uploaded to the object storage. We differentiate Sources by `external labels`.
+Having unique sources has several benefits:
+
+ * Sources do not need to inject "global" source labels into all metrics (like `cluster, env, replica`). Those are all the same for all metrics produced by a source, so we can assume that the whole block has them.
+ * We can track what blocks are "duplicates": e.g. in HA groups, where 2 replicas of Prometheus-es are scraping the same targets.
+ * We can track what source produced the metrics in case of any problems.
+
+Example Sources are: Sidecar, Rule, Thanos Receive.
+
+## Solution
+
+Now, we can specify the `--selector.relabel-config` (and corresponding `--selector.relabel-config-file`) flag that will be used to filter which blocks are selected for operations in the store and compact components.
+
+For example:
+
+* We want to run Compactor only for blocks with `external_labels` being `cluster=A`. We will run a second Compactor for blocks with `cluster=B` external labels.
+* We want to browse only blocks with `external_labels` being `cluster=A` from object storage. We will run StoreGateway with a selector of `cluster=A` from the external labels of blocks.
+
+### Relabelling
+
+Similar to [promtail](https://github.com/grafana/loki/blob/master/docs/promtail.md#scrape-configs) this config will follow native
+[Prometheus relabel-config](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#relabel_config) syntax.
+
+The relabel config will define the filtering process done on **every** synchronization with object storage.
+
+We will potentially allow manipulating several inputs:
+
+* `__block_id`
+* External labels:
+  * ``
+* `__block_objstore_bucket_endpoint`
+* `__block_objstore_bucket_name`
+* `__block_objstore_bucket_path`
+
+Output:
+
+* If output is empty, drop block.
+
+By default, on empty relabel-config, all external labels are assumed.
+Intuitively blocks without any external labels will be ignored.
+
+All blocks should compose a set of labels to advertise. The input should be based on the original meta files, NOT on relabelling.
+The reasoning is covered in the [`Next Steps`](#Future-Work) section.
+
+Example usages would be:
+
+* Drop blocks which contain external labels cluster=A
+
+```yaml
+- action: drop
+  regex: "A"
+  source_labels:
+  - cluster
+```
+
+* Keep only blocks which contain external labels cluster=A
+```yaml
+- action: keep
+  regex: "A"
+  source_labels:
+  - cluster
+```
+
+## Plan
+
+What we plan to do or not is documented in the [sharding proposal](./proposals/201909_thanos_sharding.md).
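
As a further illustration of the relabel syntax the patch describes, a selector can also match on more than one external label at once. The fragment below is an untested sketch, not a configuration taken from the Thanos repository. It assumes blocks carry both `cluster` and `env` external labels (names the document uses only as examples), and the value `prod` is hypothetical. It relies on standard Prometheus relabelling behaviour: the values of `source_labels` are joined with `;` by default, and `regex` has to match the whole joined string.

```yaml
# Hypothetical selector: keep only blocks whose external labels are
# cluster=A AND env=prod. "prod" is an assumed value for illustration.
- action: keep
  source_labels:
  - cluster
  - env
  # The label values are joined with the default separator ";" before
  # matching, and the regex is anchored to the whole joined string.
  regex: "A;prod"
```

A fragment like this would be supplied through the `--selector.relabel-config` flag (or `--selector.relabel-config-file`) mentioned in the document. A second instance could then use `regex: "B;prod"` to handle the `cluster=B` blocks.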